python - tutorial - How to have multiple softmax outputs in TensorFlow?


Your code does not define the logits for the size-10 softmax layer, and you have to do that explicitly.

Once that is done, you can use tf.nn.softmax, applying it separately to each of your two logit tensors.

For example, for your 20-class softmax tensor:

softmax20 = tf.nn.softmax(logits[0])

For the other layer, you could do:

output_layer[1] = Layer.W(1 * hidden_layer_size, 10, 'OutputLayer10')
output_bias[1] = Layer.b(10, 'OutputBias10')
logits[1] = tf.matmul(tf.concat([f.h for f in fstate[0]], 1), output_layer[1]) + output_bias[1]
softmax10 = tf.nn.softmax(logits[1])
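To make the shapes concrete, here is a minimal NumPy sketch (not TensorFlow) of the same idea: one shared feature matrix feeds two independent linear heads, each followed by its own softmax. The sizes, the random weights, and the helper names softmax and two_heads are assumptions made up for illustration.

```python
import numpy as np

def softmax(logits):
    """Softmax over the last axis, shifted by the row max for numerical stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def two_heads(features, w20, b20, w10, b10):
    """Apply two separate linear layers, each with its own softmax, to the same features."""
    logits20 = features @ w20 + b20   # shape [batch, 20]
    logits10 = features @ w10 + b10   # shape [batch, 10]
    return softmax(logits20), softmax(logits10)

rng = np.random.default_rng(0)
batch_size, hidden_size = 4, 8        # hypothetical sizes
features = rng.normal(size=(batch_size, hidden_size))  # stands in for the RNN features

softmax20, softmax10 = two_heads(
    features,
    rng.normal(size=(hidden_size, 20)), np.zeros(20),
    rng.normal(size=(hidden_size, 10)), np.zeros(10),
)
print(softmax20.shape, softmax10.shape)
```

Each head has its own weights and bias, so the two probability distributions are normalized independently: every row of softmax20 and of softmax10 sums to 1.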

There is also tf.contrib.layers.softmax, which lets you apply softmax along the final axis of a tensor with more than 2 dimensions, but it does not look like you need that here. tf.nn.softmax should work.
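For readers wondering what "softmax along the final axis" means for a tensor with more than 2 dimensions, here is a small NumPy sketch. It illustrates the operation only and is not the TensorFlow implementation; the helper name softmax_last_axis and the example shape [batch, time, classes] are assumptions.

```python
import numpy as np

def softmax_last_axis(x):
    """Normalize every vector along the last axis into a probability distribution."""
    x = np.asarray(x, dtype=float)
    shifted = x - x.max(axis=-1, keepdims=True)   # stability shift, does not change the result
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Hypothetical 3-D logits: 2 sequences, 3 time steps, 4 classes.
logits_3d = np.arange(24, dtype=float).reshape(2, 3, 4)
probs_3d = softmax_last_axis(logits_3d)

# One distribution per (sequence, time step) pair.
print(probs_3d.shape)
print(probs_3d.sum(axis=-1))
```

With 2-D logits of shape [batch, classes], as in this question, the same function reduces to the ordinary row-wise softmax.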

A note on output_layer: it is not the best name for that list; the name should say something about weights. These weights and biases (output_layer, output_bias) also do not represent the output layer of your network (since that will come from whatever you do with your softmax outputs, right?). [Sorry, I couldn't help it]

I am trying to create a network in TensorFlow with multiple softmax outputs, each of a different size. The network architecture is: Input -> LSTM -> Dropout. Then I have 2 softmax layers: a softmax with 10 outputs and a softmax with 20 outputs. The reason is that I want to produce two sets of outputs (10 and 20) and then combine them into a final result. I am not sure how to do this in TensorFlow.

Previously, to build a network like the one described but with a single softmax, I think I could do something like this.

inputs = tf.placeholder(tf.float32, [batch_size, maxlength, vocabsize])
lengths = tf.placeholder(tf.int32, [batch_size])
embeddings = tf.Variable(tf.random_uniform([vocabsize, 256], -1, 1))

lstm = {}
lstm[0] = tf.contrib.rnn.LSTMCell(hidden_layer_size, state_is_tuple=True, initializer=tf.contrib.layers.xavier_initializer(seed=random_seed))
lstm[0] = tf.contrib.rnn.DropoutWrapper(lstm[0], output_keep_prob=0.5)
lstm[0] = tf.contrib.rnn.MultiRNNCell(cells=[lstm[0]] * 1, state_is_tuple=True)

output_layer = {}
output_layer[0] = Layer.W(1 * hidden_layer_size, 20, 'OutputLayer')
output_bias = {}
output_bias[0] = Layer.b(20, 'OutputBias')

outputs = {}
fstate = {}
with tf.variable_scope("lstm0"):
    # create the rnn graph at run time
    outputs[0], fstate[0] = tf.nn.dynamic_rnn(lstm[0], tf.nn.embedding_lookup(embeddings, inputs), sequence_length=lengths, dtype=tf.float32)

logits = {}
logits[0] = tf.matmul(tf.concat([f.h for f in fstate[0]], 1), output_layer[0]) + output_bias[0]

loss = {}
loss[0] = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits[0], labels=labels[0]))

Now, however, I want my RNN output (after dropout) to flow into 2 softmax layers, one of size 10 and another of size 20. Does anyone have an idea of how to do this?

Thanks

Edit: Ideally I would like to use a version of softmax like the one defined here in the Knet Julia library. Does TensorFlow have an equivalent? https://github.com/denizyuret/Knet.jl/blob/1ef934cc58f9671f2d85063f88a3d6959a49d088/deprecated/src7/op/actf.jl#L103

You can do the following on the output of dynamic_rnn (which you called outputs[0]) to compute the two softmaxes and the corresponding losses:

# 2-D RNN features of shape [batch_size, hidden size]. The final hidden state is used here,
# as in the question, because outputs[0] itself is 3-D ([batch, time, hidden]).
rnn_output = tf.concat([f.h for f in fstate[0]], 1)

with tf.variable_scope("softmax_0"):
    # Transform your RNN output to the right output size = 10
    W = tf.get_variable("kernel_0", [rnn_output.get_shape()[1], 10])
    logits_0 = tf.matmul(rnn_output, W)
    # Apply the softmax function to the logits (of size 10)
    output_0 = tf.nn.softmax(logits_0, name="softmax_0")
    # Compute the loss (as you did in your question) with softmax_cross_entropy_with_logits applied directly to the logits
    loss_0 = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits_0, labels=labels[0]))

with tf.variable_scope("softmax_1"):
    # Transform your RNN output to the right output size = 20
    W = tf.get_variable("kernel_1", [rnn_output.get_shape()[1], 20])
    logits_1 = tf.matmul(rnn_output, W)
    # Apply the softmax function to the logits (of size 20)
    output_1 = tf.nn.softmax(logits_1, name="softmax_1")
    # Compute the loss (as you did in your question) with softmax_cross_entropy_with_logits applied directly to the logits
    loss_1 = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits_1, labels=labels[1]))

You can then combine the two losses, if that is relevant for your application:

total_loss = loss_0 + loss_1
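As a rough illustration of what the summed loss computes, here is a NumPy sketch (not the TensorFlow op) of cross-entropy taken directly from logits through a stable log-softmax, evaluated for both heads and added together. The helper names, the one-hot labels, and the optional weight_0 and weight_1 parameters are assumptions; with the default weights of 1.0 the result matches loss_0 + loss_1.

```python
import numpy as np

def cross_entropy_from_logits(logits, labels):
    """Per-example cross-entropy, computed from logits via a stable log-softmax."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -(labels * log_probs).sum(axis=-1)

def combined_loss(logits_0, labels_0, logits_1, labels_1, weight_0=1.0, weight_1=1.0):
    """Mean cross-entropy of each head, combined as a (hypothetically weighted) sum."""
    loss_0 = cross_entropy_from_logits(logits_0, labels_0).mean()
    loss_1 = cross_entropy_from_logits(logits_1, labels_1).mean()
    return weight_0 * loss_0 + weight_1 * loss_1

# All-zero logits give uniform predictions, so each loss is log(number of classes).
logits_0 = np.zeros((2, 10))
logits_1 = np.zeros((2, 20))
labels_0 = np.eye(10)[[3, 7]]    # one-hot labels for the 10-class head
labels_1 = np.eye(20)[[0, 19]]   # one-hot labels for the 20-class head

total_loss = combined_loss(logits_0, labels_0, logits_1, labels_1)
print(total_loss)
```

Working from logits rather than from the softmax outputs avoids taking the log of probabilities that have underflowed to zero, which is why the loss above is applied to the logits.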

EDIT: To answer your question in the comments about what specifically you need to do with the two softmax outputs, you can do roughly the following:

with tf.variable_scope("second_part"):
    W1 = tf.get_variable("W_1", [output_0.get_shape()[1], n])
    W2 = tf.get_variable("W_2", [output_1.get_shape()[1], n])
    prediction = tf.matmul(output_0, W1) + tf.matmul(output_1, W2)

with tf.variable_scope("optimization_part"):
    loss = tf.reduce_mean(tf.squared_difference(prediction, label))

You only need to define n, the number of columns of W1 and W2.
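A shape check for this combination step, sketched in NumPy under assumed sizes: the 10-way and 20-way probability vectors are each projected to n columns and summed, then compared against a target with a mean squared difference. The names probs_10, probs_20, the value of n, and the all-zero label are hypothetical placeholders.

```python
import numpy as np

def combine_softmax_outputs(probs_a, probs_b, w1, w2):
    """Project each softmax output to n columns and add the two projections."""
    return probs_a @ w1 + probs_b @ w2

def mean_squared_difference(prediction, label):
    """Mean of the element-wise squared difference."""
    return np.mean((prediction - label) ** 2)

rng = np.random.default_rng(0)
batch_size, n = 4, 3                          # hypothetical sizes
probs_10 = np.full((batch_size, 10), 1 / 10)  # stand-in for the 10-way softmax output
probs_20 = np.full((batch_size, 20), 1 / 20)  # stand-in for the 20-way softmax output
w1 = rng.normal(size=(10, n))                 # 10 rows to match probs_10, n columns
w2 = rng.normal(size=(20, n))                 # 20 rows to match probs_20, n columns

prediction = combine_softmax_outputs(probs_10, probs_20, w1, w2)
label = np.zeros((batch_size, n))             # hypothetical target
loss = mean_squared_difference(prediction, label)
print(prediction.shape, loss)
```

The row counts of W1 and W2 are fixed by the two softmax sizes, so n is the only free dimension, and it has to match the width of the target.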