python - How to add regularizations in TensorFlow?


Some of the answers left me more confused, so here are two methods that should make it clear.

    import tensorflow as tf

    # 1. Add all regularization terms by hand
    var1 = tf.get_variable(name='v1', shape=[1], dtype=tf.float32)
    var2 = tf.Variable(name='v2', initial_value=1.0, dtype=tf.float32)
    regularizer = tf.contrib.layers.l1_regularizer(0.1)
    reg_term = tf.contrib.layers.apply_regularization(regularizer, [var1, var2])
    # here reg_term is a scalar

    # 2. Added and collected automatically, but requires get_variable
    with tf.variable_scope('x',
                           regularizer=tf.contrib.layers.l2_regularizer(0.1)):
        var1 = tf.get_variable(name='v1', shape=[1], dtype=tf.float32)
        var2 = tf.get_variable(name='v2', shape=[1], dtype=tf.float32)
    reg_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
    # here reg_losses is a list and should be summed

Either result can then be added to the total loss.
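To make the two penalties concrete, here is a plain-Python sketch (no TensorFlow required) of what the L1 and L2 regularizers compute for a flat list of weights. It assumes the usual definitions: the L1 penalty is scale * sum(|w|), and the L2 penalty is scale * sum(w^2) / 2, following the tf.nn.l2_loss convention. The helper names l1_penalty and l2_penalty are made up for this illustration and are not TensorFlow APIs.

```python
def l1_penalty(weights, scale):
    """L1 penalty: scale times the sum of absolute values."""
    return scale * sum(abs(w) for w in weights)


def l2_penalty(weights, scale):
    """L2 penalty: scale times half the sum of squares."""
    return scale * sum(w * w for w in weights) / 2.0


weights = [1.0, -2.0, 3.0]          # made-up example weights
print(l1_penalty(weights, 0.1))     # roughly 0.6
print(l2_penalty(weights, 0.1))     # roughly 0.7
```

Either value is a single scalar, which is why it can simply be added to the data loss.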

In much of the available neural network code implemented with TensorFlow, I found that regularization terms are often implemented by manually adding an extra term to the loss value.

My questions are:

Is there a more elegant or recommended way to regularize than doing it manually?
I also find that get_variable has a regularizer argument. How should it be used? From what I have observed, if we pass it a regularizer (such as tf.contrib.layers.l2_regularizer), a tensor representing the regularization term is computed and added to a graph collection named tf.GraphKeys.REGULARIZATION_LOSSES. Will that collection be used automatically by TensorFlow (for example, by the optimizers during training), or am I expected to use that collection myself?

Some aspects of the existing answer were not immediately clear to me, so here is a step-by-step guide:

Define a regularizer. This is where the regularization constant can be set, for example:

    regularizer = tf.contrib.layers.l2_regularizer(scale=0.1)

Create variables via:

    weights = tf.get_variable(
        name="weights",
        regularizer=regularizer,
        ...
    )

Equivalently, variables can be created via the regular weights = tf.Variable(...) constructor, followed by tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, weights).

Define some loss term and add the regularization term:

    reg_variables = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
    reg_term = tf.contrib.layers.apply_regularization(regularizer, reg_variables)
    loss += reg_term

Note: It seems that tf.contrib.layers.apply_regularization is implemented as an AddN, so it is more or less equivalent to sum(reg_variables). Be aware that tf.get_variable with a regularizer stores the already-computed penalty tensor in the collection, not the variable itself. For variables created that way, summing the collection is enough, and passing its contents through apply_regularization would apply the regularizer a second time.
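The collection mechanism described in these steps can be mimicked in a few lines of plain Python. This is only a toy model under stated assumptions: REGULARIZATION_LOSSES, get_variable and l2_regularizer below are stand-ins written for this sketch, not the TensorFlow objects of the same name, and the L2 penalty is assumed to be scale * sum(w^2) / 2.

```python
REGULARIZATION_LOSSES = []  # toy stand-in for the graph collection


def l2_regularizer(scale):
    """Return a function that computes scale * sum(w**2) / 2."""
    def penalty(values):
        return scale * sum(v * v for v in values) / 2.0
    return penalty


def get_variable(values, regularizer=None):
    """Toy variable factory: registers the penalty, not the variable."""
    if regularizer is not None:
        REGULARIZATION_LOSSES.append(regularizer(values))
    return values


regularizer = l2_regularizer(0.1)
w1 = get_variable([1.0, 2.0], regularizer=regularizer)  # penalty about 0.25
w2 = get_variable([3.0], regularizer=regularizer)       # penalty about 0.45

data_loss = 1.0                          # made-up unregularized loss
reg_term = sum(REGULARIZATION_LOSSES)    # the collection already holds penalties
loss = data_loss + reg_term
print(loss)
```

The point of the sketch is that the collection holds one scalar penalty per regularized variable, so the final step is a plain sum.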

As you say in your second point, using the regularizer argument is the recommended way. You can pass it to get_variable, or set it once in your variable_scope and have all of your variables regularized.

The losses are collected in the graph, and you need to add them to your cost function manually, like this:

    reg_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
    reg_constant = 0.01  # Choose an appropriate one.
    loss = my_normal_loss + reg_constant * sum(reg_losses)
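Whether the constant lives in the regularizer's scale or is multiplied in when the loss is built should not matter, as long as it is applied only once. The plain-Python sketch below checks this on made-up numbers, assuming an L2 penalty of scale * sum(w^2) / 2. The helper l2_penalty and all values are hypothetical.

```python
def l2_penalty(weights, scale):
    """L2 penalty: scale times half the sum of squares."""
    return scale * sum(w * w for w in weights) / 2.0


weights_per_layer = [[0.5, -1.5], [2.0]]  # made-up weights for two layers
my_normal_loss = 0.8                      # made-up data loss
reg_constant = 0.01

# Option 1: unit scale in the regularizer, constant applied in the loss.
reg_losses = [l2_penalty(w, 1.0) for w in weights_per_layer]
loss_outside = my_normal_loss + reg_constant * sum(reg_losses)

# Option 2: constant baked into the regularizer's scale, plain sum afterwards.
reg_losses_scaled = [l2_penalty(w, reg_constant) for w in weights_per_layer]
loss_inside = my_normal_loss + sum(reg_losses_scaled)

print(loss_outside, loss_inside)
```

If a non-unit scale is passed to the regularizer and the sum is multiplied by reg_constant as well, the two factors multiply, which is usually not what is intended.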

Hope that helps!

Another option is to do this with the contrib.learn library, as follows, based on the Deep MNIST tutorial on the TensorFlow website. First, assuming you have imported the relevant libraries (such as import tensorflow.contrib.layers as layers), you can define a network in a separate method:

    def easier_network(x, reg):
        """ A network based on tf.contrib.learn, with input `x`. """
        with tf.variable_scope('EasyNet'):
            out = layers.flatten(x)
            out = layers.fully_connected(out,
                    num_outputs=200,
                    weights_initializer=layers.xavier_initializer(uniform=True),
                    weights_regularizer=layers.l2_regularizer(scale=reg),
                    activation_fn=tf.nn.tanh)
            out = layers.fully_connected(out,
                    num_outputs=200,
                    weights_initializer=layers.xavier_initializer(uniform=True),
                    weights_regularizer=layers.l2_regularizer(scale=reg),
                    activation_fn=tf.nn.tanh)
            out = layers.fully_connected(out,
                    num_outputs=10,  # Because there are ten digits!
                    weights_initializer=layers.xavier_initializer(uniform=True),
                    weights_regularizer=layers.l2_regularizer(scale=reg),
                    activation_fn=None)
            return out

Then, in a main method, you can use the following code snippet:

    def main(_):
        mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)
        x = tf.placeholder(tf.float32, [None, 784])
        y_ = tf.placeholder(tf.float32, [None, 10])

        # Make a network with regularization
        y_conv = easier_network(x, FLAGS.regu)
        weights = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'EasyNet')
        print("")
        for w in weights:
            shp = w.get_shape().as_list()
            print("- {} shape:{} size:{}".format(w.name, shp, np.prod(shp)))
        print("")
        reg_ws = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES, 'EasyNet')
        for w in reg_ws:
            shp = w.get_shape().as_list()
            print("- {} shape:{} size:{}".format(w.name, shp, np.prod(shp)))
        print("")

        # Make the loss function `loss_fn` with regularization.
        cross_entropy = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))
        loss_fn = cross_entropy + tf.reduce_sum(reg_ws)
        train_step = tf.train.AdamOptimizer(1e-4).minimize(loss_fn)

To get this to work you need to follow the MNIST tutorial mentioned above and import the relevant libraries, but it is a nice exercise for learning TensorFlow, and it is easy to see how the regularization affects the output. If you pass a regularization value as an argument, you can see the following:

    - EasyNet/fully_connected/weights:0 shape:[784, 200] size:156800
    - EasyNet/fully_connected/biases:0 shape:[200] size:200
    - EasyNet/fully_connected_1/weights:0 shape:[200, 200] size:40000
    - EasyNet/fully_connected_1/biases:0 shape:[200] size:200
    - EasyNet/fully_connected_2/weights:0 shape:[200, 10] size:2000
    - EasyNet/fully_connected_2/biases:0 shape:[10] size:10

    - EasyNet/fully_connected/kernel/Regularizer/l2_regularizer:0 shape:[] size:1.0
    - EasyNet/fully_connected_1/kernel/Regularizer/l2_regularizer:0 shape:[] size:1.0
    - EasyNet/fully_connected_2/kernel/Regularizer/l2_regularizer:0 shape:[] size:1.0

Notice that the regularization part gives you three items, one scalar for each layer that was given a weights regularizer.

With regularization values of 0, 0.0001, 0.01, and 1.0, I get test accuracies of 0.9468, 0.9476, 0.9183, and 0.1135, respectively, which shows the danger of large regularization terms.
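The mechanism behind that accuracy collapse can be seen on a toy problem with a single weight. The sketch below is not the MNIST experiment: it runs gradient descent on the hypothetical objective (w - target)^2 + reg * w^2, whose exact minimizer is target / (1 + reg), using the same four regularization values. The function name fit_weight, the learning rate, and the step count are assumptions chosen for this illustration, and the penalty here omits the 1/2 factor for simplicity.

```python
def fit_weight(target, reg, lr=0.1, steps=500):
    """Gradient descent on (w - target)**2 + reg * w**2 for one weight."""
    w = 0.0
    for _ in range(steps):
        grad = 2.0 * (w - target) + 2.0 * reg * w
        w -= lr * grad
    return w


for reg in (0.0, 0.0001, 0.01, 1.0):
    # The fitted weight shrinks toward 0 as reg grows.
    print(reg, fit_weight(target=1.0, reg=reg))
```

With reg = 1.0 the fitted weight ends up near half of the value the data asks for, which is the same kind of pull toward zero that prevents the network above from fitting the digits.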

I'll give a simple, correct answer, since I did not find one. You need two simple steps; the rest is done by TensorFlow magic:

Add regularizers when creating variables or layers:

    tf.layers.dense(x, kernel_regularizer=tf.contrib.layers.l2_regularizer(0.001))
    # or
    tf.get_variable('a', regularizer=tf.contrib.layers.l2_regularizer(0.001))

Add the regularization term when defining the loss:

    loss = ordinary_loss + tf.losses.get_regularization_loss()

If anyone is still looking, I'd like to add that in tf.keras you can add weight regularization by passing it as an argument to your layers. Here is an example of adding L2 regularization, taken wholesale from the TensorFlow Keras tutorials site:

    model = keras.models.Sequential([
        keras.layers.Dense(16, kernel_regularizer=keras.regularizers.l2(0.001),
                           activation=tf.nn.relu, input_shape=(NUM_WORDS,)),
        keras.layers.Dense(16, kernel_regularizer=keras.regularizers.l2(0.001),
                           activation=tf.nn.relu),
        keras.layers.Dense(1, activation=tf.nn.sigmoid)
    ])

As far as I know, there is no need to add the regularization losses manually with this method.

Reference: https://www.tensorflow.org/tutorials/keras/overfit_and_underfit#add_weight_regularization

If you have a CNN, you can do the following:

In your model function:

    conv = tf.layers.conv2d(inputs=input_layer,
                            filters=32,
                            kernel_size=[3, 3],
                            kernel_initializer=tf.contrib.layers.xavier_initializer(),
                            kernel_regularizer=tf.contrib.layers.l2_regularizer(1e-5),
                            padding="same",
                            activation=None)
    ...

In your loss function:

    onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=num_classes)
    loss = tf.losses.softmax_cross_entropy(onehot_labels=onehot_labels, logits=logits)
    regularization_losses = tf.losses.get_regularization_losses()
    loss = tf.add_n([loss] + regularization_losses)

The entries in tf.GraphKeys.REGULARIZATION_LOSSES are not added to the loss automatically, but there is a simple way to add them:

    reg_loss = tf.losses.get_regularization_loss()
    total_loss = loss + reg_loss

tf.losses.get_regularization_loss() uses tf.add_n to sum the entries of tf.GraphKeys.REGULARIZATION_LOSSES element-wise. tf.GraphKeys.REGULARIZATION_LOSSES will typically be a list of scalars computed by regularizer functions. It gets its entries from calls to tf.get_variable that have the regularizer parameter specified. You can also add to that collection manually. That is useful when using tf.Variable, and also when specifying activity regularizers or other custom regularizers. For instance:

    # This will add an activity regularizer on y to the regloss collection
    regularizer = tf.contrib.layers.l2_regularizer(0.1)
    y = tf.nn.sigmoid(x)
    act_reg = regularizer(y)
    tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, act_reg)

(In this example it would presumably be more effective to regularize x, since y really flattens out for large x.)
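The flattening can be checked numerically without TensorFlow. The sketch below compares an L2 penalty on the sigmoid output with the same penalty on its input, assuming the penalty is scale * sum(v^2) / 2. The helpers sigmoid and l2_penalty are written for this illustration only.

```python
import math


def sigmoid(x):
    """Logistic function, the scalar counterpart of tf.nn.sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def l2_penalty(values, scale):
    """L2 penalty: scale times half the sum of squares."""
    return scale * sum(v * v for v in values) / 2.0


for x in (1.0, 5.0, 20.0):
    on_activation = l2_penalty([sigmoid(x)], 0.1)  # bounded, since sigmoid < 1
    on_preactivation = l2_penalty([x], 0.1)        # grows with x squared
    print(x, on_activation, on_preactivation)
```

Because the sigmoid output never exceeds 1, the penalty on y stays below scale / 2 per element however large x becomes, so it exerts almost no pressure on large pre-activations.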

I tested tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES) and tf.losses.get_regularization_loss() with a single l2_regularizer in the graph, and found that they return the same value. Judging by the magnitude of the value, I guess that reg_constant is already accounted for in it, through the scale parameter passed to tf.contrib.layers.l2_regularizer.

    cross_entropy = tf.losses.softmax_cross_entropy(
        logits=logits, onehot_labels=labels)
    l2_loss = weight_decay * tf.add_n(
        [tf.nn.l2_loss(tf.cast(v, tf.float32)) for v in tf.trainable_variables()])
    loss = cross_entropy + l2_loss