python - TensorFlow accuracy at 0.99 but horrible predictions


Errors in the code

There are several errors in your code:

- You should not call tf.nn.sigmoid_cross_entropy_with_logits with the output of a softmax layer, but with the unscaled logits:

  WARNING: This op expects unscaled logits, since it performs a softmax on logits internally for efficiency. Do not call this op with the output of softmax, as it will produce incorrect results.

- In fact, since you have 2 classes, you should use a softmax loss, via tf.nn.softmax_cross_entropy_with_logits.
- When you use tf.argmax(pred, 1), argmax is applied only over axis 1, which is the height of the output image. You should use tf.argmax(pred, 3) so that it runs over the last axis (of size 2).
  - This could explain why you get an accuracy of 0.99.
  - On the output image, the argmax over the image height will be 0 by default (since all the values are equal for each channel).
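
To make the first point concrete, here is a minimal NumPy sketch (not TensorFlow's own implementation) of a softmax cross-entropy computed from raw logits, compared with the same loss fed with values that have already been through a softmax. The helper names `softmax` and `softmax_cross_entropy` and the toy logits are assumptions made for illustration only.

```python
import numpy as np

def softmax(z, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def softmax_cross_entropy(logits, labels, axis=-1):
    # Stable log-softmax, then the usual -sum(labels * log_probs).
    z = logits - logits.max(axis=axis, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=axis, keepdims=True))
    return -(labels * log_probs).sum(axis=axis)

# Toy "image" with 1 row and 2 pixels, 2 classes per pixel:
# shape is (batch, height, width, classes) = (1, 1, 2, 2).
logits = np.array([[[[8.0, -8.0], [-8.0, 8.0]]]])
labels = np.array([[[[1.0, 0.0], [0.0, 1.0]]]])

# Correct usage: raw logits go into the loss.
loss_from_logits = softmax_cross_entropy(logits, labels).mean()

# Incorrect usage: softmax outputs are passed as if they were logits,
# so a second softmax is applied inside the loss.
loss_from_probs = softmax_cross_entropy(softmax(logits), labels).mean()

print(loss_from_logits, loss_from_probs)
```

In this toy case the predictions are essentially perfect, yet the loss computed on softmax outputs cannot fall much below log(1 + e^-1), because probabilities are confined to [0, 1] and the second softmax flattens them. The optimizer then sees a distorted signal.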

Wrong model

The biggest drawback is that your model as a whole will be very hard to optimize.

- You have a softmax over 40,000 classes, which is huge.
- You do not take advantage at all of the fact that you want to output an image (the foreground / background prediction).
  - For example, prediction 2,345 is highly correlated with prediction 2,346 and prediction 2,545 (its neighbors to the right and below, if the 200x200 image is flattened row by row), but you do not take that into account.
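
The neighbor relationship behind those indices can be checked with a little arithmetic. This sketch assumes the 200x200 output is flattened in row-major order (the default for a reshape); the helper name `to_row_col` is hypothetical.

```python
WIDTH = 200  # image width taken from the question (n_input = 200)

def to_row_col(flat_index, width=WIDTH):
    # Row-major flattening: index = row * width + col.
    return divmod(flat_index, width)

for idx in (2345, 2346, 2545):
    row, col = to_row_col(idx)
    print(idx, "-> row", row, "col", col)
```

Under that assumption, 2,346 is the pixel immediately to the right of 2,345, and 2,545 is the pixel directly below it. A fully connected softmax over 40,000 outputs has no built-in notion of this adjacency.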

I recommend reading a bit about semantic segmentation first:

- this paper: Fully Convolutional Networks for Semantic Segmentation
- these slides from CS231n (Stanford): especially the part about upsampling and deconvolution

Recommendations

If you want to work with TensorFlow, you should start small. First try a very simple network with maybe 1 hidden layer.

You need to print the shapes of all your tensors to make sure they match what you expected. For example, if you had printed the shape of tf.argmax(y, 1), you would have noticed that it is [batch_size, 200, 2] instead of the expected [batch_size, 200, 200].
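
The shape check, and its effect on the reported accuracy, can be reproduced with NumPy standing in for the TensorFlow ops. Everything here is a synthetic assumption: the mask with a 40x40 foreground square, the batch size of 2, and a "model" that outputs the same value everywhere.

```python
import numpy as np

batch_size, n_input = 2, 200

# Synthetic ground truth: a 40x40 foreground square on a background.
mask = np.zeros((batch_size, n_input, n_input))
mask[:, 80:120, 80:120] = 1.0

# Two-channel labels as in the question: channel 0 = foreground,
# channel 1 = background.
y = np.stack([mask, 1.0 - mask], axis=-1)

# A degenerate prediction that is constant everywhere.
pred = np.full(y.shape, 0.5)

print(np.argmax(y, axis=1).shape)  # argmax over the image height
print(np.argmax(y, axis=3).shape)  # argmax over the class axis

# "Accuracy" as computed in the question (axis 1) versus per pixel (axis 3).
acc_axis1 = np.mean(np.argmax(pred, axis=1) == np.argmax(y, axis=1))
acc_axis3 = np.mean(np.argmax(pred, axis=3) == np.argmax(y, axis=3))
print(acc_axis1, acc_axis3)
```

With this particular mask the axis-1 comparison scores 0.9 even though the prediction carries no information, while the per-pixel comparison scores only 0.04. The exact numbers depend on the mask, but the inflation mechanism is the one described above.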

TensorBoard is your friend: you should try plotting the input image there, as well as your predictions, to see what they look like.

Try a small dataset of 10 images and see whether you can overfit it and predict almost exactly the right answer.
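
As a rough illustration of that sanity check, the sketch below fits a tiny per-pixel logistic model to 10 synthetic images and confirms that the training loss drops and the training predictions nearly match the labels. It is a stand-in, not the network from the question: the data, the thresholded labels, the two-parameter model, and the hyperparameters are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
images = rng.random((10, 8, 8))        # 10 tiny synthetic grayscale images
masks = (images > 0.5).astype(float)   # hypothetical ground-truth masks

x = images.reshape(-1) - 0.5           # centered pixel intensities
t = masks.reshape(-1)                  # per-pixel targets (0 or 1)

w, b = 0.0, 0.0                        # per-pixel logistic model: sigmoid(w*x + b)
lr = 1.0

def forward(w, b):
    return 1.0 / (1.0 + np.exp(-(w * x + b)))

def loss_fn(p, eps=1e-12):
    # Binary cross-entropy averaged over all pixels.
    return -np.mean(t * np.log(p + eps) + (1.0 - t) * np.log(1.0 - p + eps))

initial_loss = loss_fn(forward(w, b))
for step in range(2000):
    p = forward(w, b)
    grad = p - t                       # gradient of the loss w.r.t. the logit
    w -= lr * np.mean(grad * x)
    b -= lr * np.mean(grad)

final_p = forward(w, b)
final_loss = loss_fn(final_p)
train_accuracy = np.mean((final_p > 0.5) == (t == 1.0))
print(initial_loss, final_loss, train_accuracy)
```

If a pipeline cannot drive the training loss down on a handful of examples like this, the problem is more likely in the loss, the labels, or the shapes than in the amount of data.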

To conclude, I am not sure about all of my suggestions, but they are worth trying, and I hope this helps you on the road to success.

Maybe I'm making the predictions wrong?

Here's the project... I have a grayscale input image that I am trying to segment. The segmentation is a simple binary classification (think foreground vs. background). So the ground truth (y) is a matrix of 0's and 1's, which means there are 2 classes. Oh, and the input image is square, so I just use one variable called n_input.

My accuracy essentially converges to 0.99, but when I make a prediction I get all zeros. EDIT -> there is a single 1 in each of the output matrices, both in the same place...

Here is my session code (everything else is working)...

    with tf.Session() as sess:
        sess.run(init)
        summary = tf.train.SummaryWriter('/tmp/logdir/', sess.graph_def)
        step = 1
        from tensorflow.contrib.learn.python.learn.datasets.scroll import scroll_data
        data = scroll_data.read_data('/home/kendall/Desktop/')
        # Keep training until reach max iterations
        flag = 0
        # while flag == 0:
        while step * batch_size < training_iters:
            batch_y, batch_x = data.train.next_batch(batch_size)
            # pdb.set_trace()
            # batch_x = batch_x.reshape((batch_size, n_input))
            batch_x = batch_x.reshape((batch_size, n_input, n_input))
            batch_y = batch_y.reshape((batch_size, n_input, n_input))
            batch_y = convert_to_2_channel(batch_y, batch_size)
            # batch_y = batch_y.reshape((batch_size, n_output, n_classes))
            batch_y = batch_y.reshape((batch_size, 200, 200, n_classes))
            sess.run(optimizer, feed_dict={x: batch_x, y: batch_y, keep_prob: dropout})
            if step % display_step == 0:
                flag = 1
                # Calculate batch loss and accuracy
                loss, acc = sess.run([cost, accuracy], feed_dict={x: batch_x, y: batch_y, keep_prob: 1.})
                print "Iter " + str(step*batch_size) + ", Minibatch Loss= " + \
                      "{:.6f}".format(loss) + ", Training Accuracy= " + \
                      "{:.5f}".format(acc)
            step += 1
        print "Optimization Finished!"
        save_path = "model.ckpt"
        saver.save(sess, save_path)
        im = Image.open('/home/kendall/Desktop/HA900_frames/frame0635.tif')
        batch_x = np.array(im)
        pdb.set_trace()
        batch_x = batch_x.reshape((1, n_input, n_input))
        batch_x = batch_x.astype(float)
        # pdb.set_trace()
        prediction = sess.run(pred, feed_dict={x: batch_x, keep_prob: 1.})
        print prediction
        arr1 = np.empty((n_input,n_input))
        arr2 = np.empty((n_input,n_input))
        for i in xrange(n_input):
            for j in xrange(n_input):
                for k in xrange(2):
                    if k == 0:
                        arr1[i][j] = prediction[0][i][j][k]
                    else:
                        arr2[i][j] = prediction[0][i][j][k]
        # prediction = np.asarray(prediction)
        # prediction = np.reshape(prediction, (200,200))
        # np.savetxt("prediction.csv", prediction, delimiter=",")
        np.savetxt("prediction1.csv", arr1, delimiter=",")
        np.savetxt("prediction2.csv", arr2, delimiter=",")

Since there are two classes, that final part (with the nested loops) just splits the prediction into two 200x200 matrices, one per class.
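
For reference, the same split can be written with NumPy slicing instead of nested loops. This is a sketch: the random `prediction` array is only a stand-in for the result of sess.run(pred, ...), assumed to have shape (1, n_input, n_input, 2).

```python
import numpy as np

n_input = 200
rng = np.random.default_rng(0)

# Stand-in for the network output, shape (1, n_input, n_input, 2).
prediction = rng.random((1, n_input, n_input, 2))

# Take the first (only) batch element and one class channel each.
arr1 = prediction[0, :, :, 0]
arr2 = prediction[0, :, :, 1]

print(arr1.shape, arr2.shape)
```

Both results are views of shape (200, 200) and can be passed to np.savetxt exactly as in the loop-based version.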

I saved the prediction matrices to CSV files and, as I said, they were all zeros.

I have also confirmed that all the data is correct (dimensions and values).

Why does training converge while the predictions are terrible?

If you want to see the whole code, here it is...

    import tensorflow as tf
    import pdb
    import numpy as np
    from numpy import genfromtxt
    from PIL import Image

    # Import MINST data
    # from tensorflow.examples.tutorials.mnist import input_data
    # mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

    # Parameters
    learning_rate = 0.001
    training_iters = 20000
    batch_size = 128
    display_step = 1

    # Network Parameters
    n_input = 200 # MNIST data input (img shape: 28*28)
    n_output = 40000 # MNIST total classes (0-9 digits)
    n_classes = 2
    #n_input = 200
    dropout = 0.75 # Dropout, probability to keep units

    # tf Graph input
    x = tf.placeholder(tf.float32, [None, n_input, n_input])
    y = tf.placeholder(tf.float32, [None, n_input, n_input, n_classes])
    keep_prob = tf.placeholder(tf.float32) #dropout (keep probability)

    # Create some wrappers for simplicity
    def conv2d(x, W, b, strides=1):
        # Conv2D wrapper, with bias and relu activation
        x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')
        x = tf.nn.bias_add(x, b)
        return tf.nn.relu(x)

    def maxpool2d(x, k=2):
        # MaxPool2D wrapper
        return tf.nn.max_pool(x, ksize=[1, k, k, 1], strides=[1, k, k, 1], padding='SAME')

    # Create model
    def conv_net(x, weights, biases, dropout):
        # Reshape input picture
        x = tf.reshape(x, shape=[-1, n_input, n_input, 1])

        # Convolution Layer
        conv1 = conv2d(x, weights['wc1'], biases['bc1'])
        # Max Pooling (down-sampling)
        conv1 = maxpool2d(conv1, k=2)
        conv1 = tf.nn.local_response_normalization(conv1)

        # Convolution Layer
        conv2 = conv2d(conv1, weights['wc2'], biases['bc2'])
        # Max Pooling (down-sampling)
        conv2 = tf.nn.local_response_normalization(conv2)
        conv2 = maxpool2d(conv2, k=2)

        # Convolution Layer
        conv3 = conv2d(conv2, weights['wc3'], biases['bc3'])
        # Max Pooling (down-sampling)
        conv3 = tf.nn.local_response_normalization(conv3)
        conv3 = maxpool2d(conv3, k=2)
        # pdb.set_trace()

        # Fully connected layer
        # Reshape conv2 output to fit fully connected layer input
        fc1 = tf.reshape(conv3, [-1, weights['wd1'].get_shape().as_list()[0]])
        fc1 = tf.add(tf.matmul(fc1, weights['wd1']), biases['bd1'])
        fc1 = tf.nn.relu(fc1)
        # Apply Dropout
        fc1 = tf.nn.dropout(fc1, dropout)

        output = []
        for i in xrange(2):
            output.append(tf.nn.softmax(tf.add(tf.matmul(fc1, weights['out']), biases['out'])))
        return output
        # return tf.nn.softmax(tf.add(tf.matmul(fc1, weights['out']), biases['out']))

    # Store layers weight & bias
    weights = {
        # 5x5 conv, 1 input, 32 outputs
        'wc1': tf.Variable(tf.random_normal([5, 5, 1, 32])),
        # 5x5 conv, 32 inputs, 64 outputs
        'wc2': tf.Variable(tf.random_normal([5, 5, 32, 64])),
        # 5x5 conv, 32 inputs, 64 outputs
        'wc3': tf.Variable(tf.random_normal([5, 5, 64, 128])),
        # fully connected, 7*7*64 inputs, 1024 outputs
        'wd1': tf.Variable(tf.random_normal([25*25*128, 1024])),
        # 1024 inputs, 10 outputs (class prediction)
        'out': tf.Variable(tf.random_normal([1024, n_output]))
    }

    biases = {
        'bc1': tf.Variable(tf.random_normal([32])),
        'bc2': tf.Variable(tf.random_normal([64])),
        'bc3': tf.Variable(tf.random_normal([128])),
        'bd1': tf.Variable(tf.random_normal([1024])),
        'out': tf.Variable(tf.random_normal([n_output]))
    }

    # Construct model
    pred = conv_net(x, weights, biases, keep_prob)
    # pdb.set_trace()
    pred = tf.pack(tf.transpose(pred,[1,2,0]))
    pred = tf.reshape(pred, [-1,n_input,n_input,n_classes])

    # Define loss and optimizer
    cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(pred, y))
    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

    # Evaluate model
    correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

    # Initializing the variables
    init = tf.initialize_all_variables()
    saver = tf.train.Saver()

    def convert_to_2_channel(x, batch_size):
        #assume input has dimension (batch_size,x,y)
        #output will have dimension (batch_size,x,y,2)
        output = np.empty((batch_size, 200, 200, 2))
        temp_arr1 = np.empty((batch_size, 200, 200))
        temp_arr2 = np.empty((batch_size, 200, 200))
        for i in xrange(batch_size):
            for j in xrange(200):
                for k in xrange(200):
                    if x[i][j][k] == 1:
                        temp_arr1[i][j][k] = 1
                        temp_arr2[i][j][k] = 0
                    else:
                        temp_arr1[i][j][k] = 0
                        temp_arr2[i][j][k] = 1
        for i in xrange(batch_size):
            for j in xrange(200):
                for k in xrange(200):
                    for l in xrange(2):
                        if l == 0:
                            output[i][j][k][l] = temp_arr1[i][j][k]
                        else:
                            output[i][j][k][l] = temp_arr2[i][j][k]
        return output

    # Launch the graph
    with tf.Session() as sess:
        sess.run(init)
        summary = tf.train.SummaryWriter('/tmp/logdir/', sess.graph_def)
        step = 1
        from tensorflow.contrib.learn.python.learn.datasets.scroll import scroll_data
        data = scroll_data.read_data('/home/kendall/Desktop/')
        # Keep training until reach max iterations
        flag = 0
        # while flag == 0:
        while step * batch_size < training_iters:
            batch_y, batch_x = data.train.next_batch(batch_size)
            # pdb.set_trace()
            # batch_x = batch_x.reshape((batch_size, n_input))
            batch_x = batch_x.reshape((batch_size, n_input, n_input))
            batch_y = batch_y.reshape((batch_size, n_input, n_input))
            batch_y = convert_to_2_channel(batch_y, batch_size)
            # batch_y = batch_y.reshape((batch_size, n_output, n_classes))
            batch_y = batch_y.reshape((batch_size, 200, 200, n_classes))
            sess.run(optimizer, feed_dict={x: batch_x, y: batch_y, keep_prob: dropout})
            if step % display_step == 0:
                flag = 1
                # Calculate batch loss and accuracy
                loss, acc = sess.run([cost, accuracy], feed_dict={x: batch_x, y: batch_y, keep_prob: 1.})
                print "Iter " + str(step*batch_size) + ", Minibatch Loss= " + \
                      "{:.6f}".format(loss) + ", Training Accuracy= " + \
                      "{:.5f}".format(acc)
            step += 1
        print "Optimization Finished!"
        save_path = "model.ckpt"
        saver.save(sess, save_path)
        im = Image.open('/home/kendall/Desktop/HA900_frames/frame0635.tif')
        batch_x = np.array(im)
        pdb.set_trace()
        batch_x = batch_x.reshape((1, n_input, n_input))
        batch_x = batch_x.astype(float)
        # pdb.set_trace()
        prediction = sess.run(pred, feed_dict={x: batch_x, keep_prob: 1.})
        print prediction
        arr1 = np.empty((n_input,n_input))
        arr2 = np.empty((n_input,n_input))
        for i in xrange(n_input):
            for j in xrange(n_input):
                for k in xrange(2):
                    if k == 0:
                        arr1[i][j] = prediction[0][i][j][k]
                    else:
                        arr2[i][j] = prediction[0][i][j][k]
        # prediction = np.asarray(prediction)
        # prediction = np.reshape(prediction, (200,200))
        # np.savetxt("prediction.csv", prediction, delimiter=",")
        np.savetxt("prediction1.csv", arr1, delimiter=",")
        np.savetxt("prediction2.csv", arr2, delimiter=",")
        # Calculate accuracy for 256 mnist test images
        print "Testing Accuracy:", \
            sess.run(accuracy, feed_dict={x: data.test.images[:256], y: data.test.labels[:256], keep_prob: 1.})
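
As a side note on the label conversion in the code above, the per-pixel loops in convert_to_2_channel can be expressed as a vectorized NumPy operation that should produce the same two-channel layout (channel 0 for pixels equal to 1, channel 1 for everything else). The function name `convert_to_2_channel_vectorized` is hypothetical and the small input is only for illustration.

```python
import numpy as np

def convert_to_2_channel_vectorized(labels):
    # labels: array of shape (batch_size, height, width) holding 0s and 1s.
    # Returns an array of shape (batch_size, height, width, 2).
    foreground = (labels == 1).astype(np.float64)
    return np.stack([foreground, 1.0 - foreground], axis=-1)

labels = np.array([[[0, 1],
                    [1, 0]]])          # shape (1, 2, 2)
one_hot = convert_to_2_channel_vectorized(labels)
print(one_hot.shape)
print(one_hot[0, 0, 0], one_hot[0, 0, 1])
```

This does not change the model's behavior. It only removes the Python-level loops over 200x200 pixels for every batch.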