Keras, sparse matrix issue (2)

I am trying to feed a large sparse matrix to a Keras model. Since the dataset does not fit in RAM, the way to go is to train the model on data generated batch by batch by a generator.

To test this approach and make sure my solution works, I slightly modified a simple Keras MLP on the Reuters newswire topic classification task. So, the idea is to compare the original and the edited model. I simply convert the numpy.ndarray into a scipy.sparse.csr.csr_matrix and pass it to the model.
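As a minimal illustration of that conversion step (the array below is a random stand-in, not the actual vectorized Reuters data):

import numpy as np
from scipy import sparse

X_dense = np.random.randint(0, 2, size=(8, 1000)).astype('float32')  # stand-in for the 0/1 token matrix
X_sparse = sparse.csr_matrix(X_dense)  # same values, compressed sparse row storage
print(type(X_sparse), X_sparse.shape)  # zeros are no longer stored explicitly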

But my model crashes at some point, and I need a hand to figure out the reason.

Here is the original model; my additions follow further down.

from __future__ import print_function
import numpy as np
np.random.seed(1337)  # for reproducibility

from keras.datasets import reuters
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.utils import np_utils
from keras.preprocessing.text import Tokenizer

max_words = 1000
batch_size = 32
nb_epoch = 5

print('Loading data...')
(X_train, y_train), (X_test, y_test) = reuters.load_data(nb_words=max_words, test_split=0.2)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')

nb_classes = np.max(y_train) + 1
print(nb_classes, 'classes')

print('Vectorizing sequence data...')
tokenizer = Tokenizer(nb_words=max_words)
X_train = tokenizer.sequences_to_matrix(X_train, mode='binary')
X_test = tokenizer.sequences_to_matrix(X_test, mode='binary')
print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)

print('Convert class vector to binary class matrix (for use with categorical_crossentropy)')
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
print('Y_train shape:', Y_train.shape)
print('Y_test shape:', Y_test.shape)

print('Building model...')
model = Sequential()
model.add(Dense(512, input_shape=(max_words,)))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

history = model.fit(X_train, Y_train,
                    nb_epoch=nb_epoch, batch_size=batch_size,
                    verbose=1)  # , validation_split=0.1)
score = model.evaluate(X_test, Y_test,
                       batch_size=batch_size, verbose=1)
print('Test score:', score[0])
print('Test accuracy:', score[1])

It outputs:

Loading data...
8982 train sequences
2246 test sequences
46 classes
Vectorizing sequence data...
X_train shape: (8982, 1000)
X_test shape: (2246, 1000)
Convert class vector to binary class matrix (for use with categorical_crossentropy)
Y_train shape: (8982, 46)
Y_test shape: (2246, 46)
Building model...
Epoch 1/5
8982/8982 [==============================] - 5s - loss: 1.3932 - acc: 0.6906
Epoch 2/5
8982/8982 [==============================] - 4s - loss: 0.7522 - acc: 0.8234
Epoch 3/5
8982/8982 [==============================] - 5s - loss: 0.5407 - acc: 0.8681
Epoch 4/5
8982/8982 [==============================] - 5s - loss: 0.4160 - acc: 0.8980
Epoch 5/5
8982/8982 [==============================] - 5s - loss: 0.3338 - acc: 0.9136
Test score: 1.01453569163
Test accuracy: 0.797417631398

Finally, here is my part:

from scipy import sparse  # import needed for sparse.csr_matrix

X_train_sparse = sparse.csr_matrix(X_train)

def batch_generator(X, y, batch_size):
    n_batches_for_epoch = X.shape[0] // batch_size
    for i in range(n_batches_for_epoch):
        index_batch = range(X.shape[0])[batch_size * i:batch_size * (i + 1)]
        X_batch = X[index_batch, :].todense()  # densify only the current batch
        y_batch = y[index_batch, :]
        yield (np.array(X_batch), y_batch)

model.fit_generator(generator=batch_generator(X_train_sparse, Y_train, batch_size),
                    nb_epoch=nb_epoch,
                    samples_per_epoch=X_train_sparse.shape[0])

The crash:

Exception                                 Traceback (most recent call last)
<ipython-input-120-6722a4f77425> in <module>()
      1 model.fit_generator(generator=batch_generator(X_trainSparse, Y_train, batch_size),
      2                     nb_epoch=nb_epoch,
----> 3                     samples_per_epoch=X_trainSparse.shape[0])

/home/kk/miniconda2/envs/tensorflow/lib/python2.7/site-packages/keras/models.pyc in fit_generator(self, generator, samples_per_epoch, nb_epoch, verbose, callbacks, validation_data, nb_val_samples, class_weight, max_q_size, **kwargs)
    648                               nb_val_samples=nb_val_samples,
    649                               class_weight=class_weight,
--> 650                               max_q_size=max_q_size)
    651
    652     def evaluate_generator(self, generator, val_samples, max_q_size=10, **kwargs):

/home/kk/miniconda2/envs/tensorflow/lib/python2.7/site-packages/keras/engine/training.pyc in fit_generator(self, generator, samples_per_epoch, nb_epoch, verbose, callbacks, validation_data, nb_val_samples, class_weight, max_q_size)
   1356                     raise Exception('output of generator should be a tuple '
   1357                                     '(x, y, sample_weight) '
-> 1358                                     'or (x, y). Found: ' + str(generator_output))
   1359                 if len(generator_output) == 2:
   1360                     x, y = generator_output

Exception: output of generator should be a tuple (x, y, sample_weight) or (x, y). Found: None

I believe the problem is due to a wrong setup of samples_per_epoch. I would really appreciate it if someone could comment on this.


@BigBoy1337, here is my solution:

def batch_generator(X, y, batch_size):
    # samples_per_epoch is read from the enclosing scope
    number_of_batches = samples_per_epoch / batch_size
    counter = 0
    shuffle_index = np.arange(np.shape(y)[0])
    np.random.shuffle(shuffle_index)
    X = X[shuffle_index, :]
    y = y[shuffle_index]
    while 1:  # loop forever: fit_generator keeps pulling batches across epochs
        index_batch = shuffle_index[batch_size * counter:batch_size * (counter + 1)]
        X_batch = X[index_batch, :].todense()  # densify one batch at a time
        y_batch = y[index_batch]
        counter += 1
        yield (np.array(X_batch), y_batch)
        if counter == number_of_batches:  # end of epoch: reshuffle and start over
            np.random.shuffle(shuffle_index)
            counter = 0

In my case, X is a sparse matrix and y is an array.
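To show how this generator is wired up, here is a hypothetical call mirroring the Keras 1.x call from the question; note that samples_per_epoch must be defined before the generator runs, because the function reads it from the enclosing scope:

samples_per_epoch = X_train_sparse.shape[0]  # read by batch_generator from the enclosing scope
model.fit_generator(generator=batch_generator(X_train_sparse, Y_train, batch_size),
                    nb_epoch=nb_epoch,
                    samples_per_epoch=samples_per_epoch)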


If you can use Lasagne instead of Keras, I have written a small MLP class with the following features (a rough sketch of the interface appears after the list):

supports both dense and sparse matrices

supports dropout and a hidden layer

supports full probability distributions instead of single labels, so multilabel training is possible

scikit-learn compatible API (fit, predict, accuracy, etc.)

it is very easy to configure and modify
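For illustration only, here is a rough sketch of what such a scikit-learn-style MLP interface could look like. This is not the answerer's class (which uses Lasagne); it is a self-contained plain-NumPy stand-in, it omits dropout, and every name in it is invented for the example:

import numpy as np
from scipy import sparse

class TinyMLP(object):
    """Sketch: one hidden ReLU layer + softmax with a fit/predict/score API.
    X may be a dense ndarray or a scipy sparse matrix; y holds full
    probability distributions (one row per sample), not single labels."""

    def __init__(self, n_hidden=512, lr=0.1, n_epochs=5, batch_size=32):
        self.n_hidden, self.lr = n_hidden, lr
        self.n_epochs, self.batch_size = n_epochs, batch_size

    def _to_dense(self, X):
        # Densify sparse input only when needed, so RAM use stays low.
        return np.asarray(X.todense()) if sparse.issparse(X) else np.asarray(X)

    def _softmax(self, logits):
        e = np.exp(logits - logits.max(axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)

    def fit(self, X, y):
        rng = np.random.RandomState(0)
        self.W1 = rng.randn(X.shape[1], self.n_hidden) * 0.01
        self.b1 = np.zeros(self.n_hidden)
        self.W2 = rng.randn(self.n_hidden, y.shape[1]) * 0.01
        self.b2 = np.zeros(y.shape[1])
        for _ in range(self.n_epochs):
            for start in range(0, X.shape[0], self.batch_size):
                xb = self._to_dense(X[start:start + self.batch_size])
                yb = np.asarray(y[start:start + self.batch_size])
                h = np.maximum(0, xb.dot(self.W1) + self.b1)   # hidden ReLU
                p = self._softmax(h.dot(self.W2) + self.b2)
                # Cross-entropy gradient against a full target distribution.
                d_logits = (p - yb) / len(xb)
                dh = d_logits.dot(self.W2.T) * (h > 0)
                self.W2 -= self.lr * h.T.dot(d_logits)
                self.b2 -= self.lr * d_logits.sum(axis=0)
                self.W1 -= self.lr * xb.T.dot(dh)
                self.b1 -= self.lr * dh.sum(axis=0)
        return self

    def predict(self, X):
        h = np.maximum(0, self._to_dense(X).dot(self.W1) + self.b1)
        return self._softmax(h.dot(self.W2) + self.b2)

    def score(self, X, y):
        # Accuracy of the argmax class against the argmax of the target rows.
        return float(np.mean(self.predict(X).argmax(axis=1) ==
                             np.asarray(y).argmax(axis=1)))

During fit, only the current batch is densified, which is the same trick the batch_generator above relies on (predict here densifies all of X for brevity).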