Keras, sparse matrix issue
I am trying to feed a large sparse matrix to a Keras model. Since the dataset does not fit into RAM, the way to go is to train the model on data generated batch by batch by a generator.
To test this approach and make sure my solution works, I slightly modified a simple Keras MLP on the Reuters newswire topic classification task. The idea is to compare the original and the edited model. I simply convert numpy.ndarray to scipy.sparse.csr.csr_matrix and feed it to the model.
But the model crashes at some point, and I need a hand to figure out the reason.
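For reference, the conversion itself is a single call (a minimal sketch with random toy data standing in for the vectorized Reuters matrix):

from scipy import sparse
import numpy as np

X_dense = (np.random.rand(8982, 1000) < 0.02).astype('float32')  # toy stand-in for the binary bag-of-words matrix
X_sparse = sparse.csr_matrix(X_dense)  # CSR stores only the non-zero entries
print(X_sparse.shape, X_sparse.nnz)    # same shape, far fewer stored values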
Here is the original model, with my additions below it:
from __future__ import print_function
import numpy as np
np.random.seed(1337)  # for reproducibility

from keras.datasets import reuters
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.utils import np_utils
from keras.preprocessing.text import Tokenizer

max_words = 1000
batch_size = 32
nb_epoch = 5

print('Loading data...')
(X_train, y_train), (X_test, y_test) = reuters.load_data(nb_words=max_words, test_split=0.2)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')

nb_classes = np.max(y_train) + 1
print(nb_classes, 'classes')

print('Vectorizing sequence data...')
tokenizer = Tokenizer(nb_words=max_words)
X_train = tokenizer.sequences_to_matrix(X_train, mode='binary')
X_test = tokenizer.sequences_to_matrix(X_test, mode='binary')
print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)

print('Convert class vector to binary class matrix (for use with categorical_crossentropy)')
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
print('Y_train shape:', Y_train.shape)
print('Y_test shape:', Y_test.shape)

print('Building model...')
model = Sequential()
model.add(Dense(512, input_shape=(max_words,)))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

history = model.fit(X_train, Y_train,
                    nb_epoch=nb_epoch, batch_size=batch_size,
                    verbose=1)  # , validation_split=0.1)
score = model.evaluate(X_test, Y_test,
                       batch_size=batch_size, verbose=1)
print('Test score:', score[0])
print('Test accuracy:', score[1])
It outputs:
Loading data...
8982 train sequences
2246 test sequences
46 classes
Vectorizing sequence data...
X_train shape: (8982, 1000)
X_test shape: (2246, 1000)
Convert class vector to binary class matrix (for use with categorical_crossentropy)
Y_train shape: (8982, 46)
Y_test shape: (2246, 46)
Building model...
Epoch 1/5
8982/8982 [==============================] - 5s - loss: 1.3932 - acc: 0.6906
Epoch 2/5
8982/8982 [==============================] - 4s - loss: 0.7522 - acc: 0.8234
Epoch 3/5
8982/8982 [==============================] - 5s - loss: 0.5407 - acc: 0.8681
Epoch 4/5
8982/8982 [==============================] - 5s - loss: 0.4160 - acc: 0.8980
Epoch 5/5
8982/8982 [==============================] - 5s - loss: 0.3338 - acc: 0.9136
Test score: 1.01453569163
Test accuracy: 0.797417631398
Finally, here is my part:
from scipy import sparse

X_train_sparse = sparse.csr_matrix(X_train)

def batch_generator(X, y, batch_size):
    n_batches_for_epoch = X.shape[0] // batch_size
    for i in range(n_batches_for_epoch):
        index_batch = range(X.shape[0])[batch_size * i:batch_size * (i + 1)]
        X_batch = X[index_batch, :].todense()  # densify only the current batch
        y_batch = y[index_batch, :]
        yield (np.array(X_batch), y_batch)

model.fit_generator(generator=batch_generator(X_train_sparse, Y_train, batch_size),
                    nb_epoch=nb_epoch,
                    samples_per_epoch=X_train_sparse.shape[0])
The crash:
Exception Traceback (most recent call last)
<ipython-input-120-6722a4f77425> in <module>()
1 model.fit_generator(generator=batch_generator(X_train_sparse, Y_train, batch_size),
2 nb_epoch=nb_epoch,
----> 3 samples_per_epoch=X_train_sparse.shape[0])
/home/kk/miniconda2/envs/tensorflow/lib/python2.7/site-packages/keras/models.pyc in fit_generator(self, generator, samples_per_epoch, nb_epoch, verbose, callbacks, validation_data, nb_val_samples, class_weight, max_q_size, **kwargs)
648 nb_val_samples=nb_val_samples,
649 class_weight=class_weight,
--> 650 max_q_size=max_q_size)
651
652 def evaluate_generator(self, generator, val_samples, max_q_size=10, **kwargs):
/home/kk/miniconda2/envs/tensorflow/lib/python2.7/site-packages/keras/engine/training.pyc in fit_generator(self, generator, samples_per_epoch, nb_epoch, verbose, callbacks, validation_data, nb_val_samples, class_weight, max_q_size)
1356 raise Exception('output of generator should be a tuple '
1357 '(x, y, sample_weight) '
-> 1358 'or (x, y). Found: ' + str(generator_output))
1359 if len(generator_output) == 2:
1360 x, y = generator_output
Exception: output of generator should be a tuple (x, y, sample_weight) or (x, y). Found: None
I believe the problem is due to an incorrect setting of samples_per_epoch. I would really appreciate it if someone could comment on this.
@BigBoy1337, here is my solution.
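First, the cause: fit_generator expects a generator that keeps yielding batches indefinitely, but my original batch_generator is built on a finite for loop and is exhausted after a single pass, so Keras ends up with None (the "Found: None" in the traceback). A minimal plain-Python sketch of that failure mode:

def finite_gen():
    # like the original batch_generator: a finite for loop under the hood
    for i in range(3):
        yield i

g = finite_gen()
print(next(g), next(g), next(g))  # 0 1 2 - one full pass
print(next(g, None))              # None - the generator is exhausted

The fixed generator below therefore loops forever (while 1) and reshuffles once per epoch: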
def batch_generator(X, y, batch_size):
    # X may be a scipy sparse matrix; y is a dense array
    number_of_batches = X.shape[0] // batch_size
    counter = 0
    shuffle_index = np.arange(np.shape(y)[0])
    np.random.shuffle(shuffle_index)
    while 1:
        index_batch = shuffle_index[batch_size * counter:batch_size * (counter + 1)]
        X_batch = X[index_batch, :].todense()  # densify only the current batch
        y_batch = y[index_batch]
        counter += 1
        yield (np.array(X_batch), y_batch)
        if counter >= number_of_batches:
            # reshuffle once per epoch so batches differ between epochs
            np.random.shuffle(shuffle_index)
            counter = 0
In my case, X is a sparse matrix and y is a dense array.
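A quick sanity check of the generator on toy data (a sketch; the shapes mirror the Reuters run above):

from scipy import sparse
import numpy as np

X_toy = sparse.csr_matrix(np.eye(100, 1000, dtype='float32'))  # 100 sparse rows
Y_toy = np.eye(100, 46, dtype='float32')
gen = batch_generator(X_toy, Y_toy, batch_size=32)
xb, yb = next(gen)
print(xb.shape, yb.shape)  # (32, 1000) (32, 46)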
If you can use Lasagne instead of Keras, I have written a small MLP class with the following features:
- supports both dense and sparse matrices (see the sketch after this list)
- supports dropout and hidden layers
- supports full probability distributions instead of single labels, to allow multilabel training
- supports a scikit-learn-like API (fit, predict, accuracy, etc.)
- is very easy to configure and modify
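The dense-or-sparse support typically comes down to one branch per batch. A minimal sketch of that idea (my own illustration, not the actual Lasagne class):

import numpy as np
from scipy import sparse

def to_dense_batch(X_batch):
    # accept either a scipy sparse matrix or a plain numpy array
    if sparse.issparse(X_batch):
        return np.asarray(X_batch.todense())
    return np.asarray(X_batch)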