TensorFlow: alternative to numpy.repeat()


I want to compare the predicted values yp of my neural network pairwise, so in my old numpy implementation I was using:

    idx = np.repeat(np.arange(len(yp)), len(yp))
    jdx = np.tile(np.arange(len(yp)), len(yp))
    s = yp[idx] - yp[jdx]

This basically creates an indexing mesh that I then use: idx = [0,0,0,1,1,1,...] while jdx = [0,1,2,0,1,2,...]. I don't know if there is a simpler way to do it...

Anyway, TensorFlow has a tf.tile(), but it seems to lack a tf.repeat().

    idx = np.repeat(np.arange(n), n)
    v2 = v[idx]

And I get the error:

    TypeError: Bad slice index [ 0 0 0 ..., 215 215 215] of type <type 'numpy.ndarray'>

Using a TensorFlow constant for the indexing does not work either:

    idx = tf.constant(np.repeat(np.arange(n), n))
    v2 = v[idx]

    TypeError: Bad slice index Tensor("Const:0", shape=TensorShape([Dimension(46656)]), dtype=int64) of type <class 'tensorflow.python.framework.ops.Tensor'>

The idea is to convert my RankNet implementation to TensorFlow.

Although many clean, working solutions have been given, they all seem to rely on producing the set of indices from scratch on every iteration.

While the cost of producing these nodes is usually not significant during training, it can be significant if you use your model for inference.

Repeating a tf.range (as in your example) has come up several times, so I built the following function creator. Given the maximum number of times something will be repeated and the maximum number of things that will need repeating, it returns a function that produces the same values as np.repeat(np.arange(len(multiples)), multiples).

    import tensorflow as tf
    import numpy as np

    def numpy_style_repeat_1d_creator(max_multiple=100, max_to_repeat=10000):
        board_num_lookup_ary = np.repeat(
            np.arange(max_to_repeat),
            np.full([max_to_repeat], max_multiple))
        board_num_lookup_ary = board_num_lookup_ary.reshape(max_to_repeat, max_multiple)

        def fn_to_return(multiples):
            board_num_lookup_tensor = tf.constant(board_num_lookup_ary, dtype=tf.int32)
            casted_multiples = tf.cast(multiples, dtype=tf.int32)
            padded_multiples = tf.pad(
                casted_multiples,
                [[0, max_to_repeat - tf.shape(multiples)[0]]])
            return tf.boolean_mask(
                board_num_lookup_tensor,
                tf.sequence_mask(padded_multiples, maxlen=max_multiple))

        return fn_to_return

    # Here's an example of how it can be used
    with tf.Session() as sess:
        repeater = numpy_style_repeat_1d_creator(5, 4)
        multiples = tf.constant([4, 1, 3])
        repeated_values = repeater(multiples)
        print(sess.run(repeated_values))

The general idea is to store a repeated tensor and then mask it, but it may help to see it visually (this is for the example given above):

    In the example above the following Tensor is produced:
    [[0,0,0,0,0],
     [1,1,1,1,1],
     [2,2,2,2,2],
     [3,3,3,3,3]]

    For multiples [4,1,3] it will collect the non-X values:
    [[0,0,0,0,X],
     [1,X,X,X,X],
     [2,2,2,X,X],
     [X,X,X,X,X]]

    resulting in:
    [0,0,0,0,1,2,2,2]

tl;dr: To avoid producing the indices every time (which can be costly), pre-repeat everything once and then mask that tensor on each call.
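The store-then-mask idea can also be tried out without a TensorFlow session. The following is only a minimal NumPy sketch of the same technique, not the answer's code: the names make_repeater, lookup and mask are hypothetical, and the comparison against the column index stands in for tf.sequence_mask.

```python
import numpy as np

def make_repeater(max_multiple, max_to_repeat):
    # Precompute once: row i holds the value i repeated max_multiple times.
    lookup = np.repeat(np.arange(max_to_repeat), max_multiple)
    lookup = lookup.reshape(max_to_repeat, max_multiple)

    def repeat(multiples):
        multiples = np.asarray(multiples)
        # Pad the counts with zeros so there is one count per lookup row.
        padded = np.zeros(max_to_repeat, dtype=int)
        padded[:len(multiples)] = multiples
        # mask[i, j] is True for the first padded[i] columns of row i.
        mask = np.arange(max_multiple)[None, :] < padded[:, None]
        # Boolean indexing collects the kept values in row-major order.
        return lookup[mask]

    return repeat

repeater = make_repeater(max_multiple=5, max_to_repeat=4)
result = repeater([4, 1, 3])
print(result)  # [0 0 0 0 1 2 2 2]
```

The lookup table is built only when make_repeater is called; each later call does nothing more than pad, compare and mask.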

According to the TF API documentation, tf.keras.backend.repeat_elements() does the same job as np.repeat(). For example,

    x = tf.constant([1, 3, 3, 1], dtype=tf.float32)
    rep_x = tf.keras.backend.repeat_elements(x, 5, axis=0)
    # result: [1. 1. 1. 1. 1. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 1. 1. 1. 1. 1.]

In case anyone is interested in a 2D method for copying matrices, I think this could work:

    TF_obj = tf.zeros([128, 128])
    tf.tile(tf.expand_dims(TF_obj, 2), [1, 1, 2])
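To see what the expand-then-tile combination produces, here is a small NumPy sketch of the same two steps. The 2 x 3 array a is a hypothetical stand-in for the 128 x 128 tensor, and np.expand_dims / np.tile are used as assumed analogues of the TensorFlow calls.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)  # small stand-in for the 128 x 128 matrix

# Add a trailing axis of length 1, then copy the matrix twice along it.
stacked = np.tile(np.expand_dims(a, 2), (1, 1, 2))

print(stacked.shape)  # (2, 3, 2)
```

Each slice stacked[:, :, k] is a full copy of the original matrix.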

It seems your question is so popular that people refer to it on the TF tracker. Unfortunately, at the time of writing the same function is not yet implemented in TF.

You can implement it by combining tf.tile, tf.reshape and tf.squeeze. Here is a way to convert the examples from the np.repeat documentation:

    import numpy as np
    import tensorflow as tf

    x = [[1, 2], [3, 4]]
    print(np.repeat(3, 4))
    print(np.repeat(x, 2))
    print(np.repeat(x, 3, axis=1))

    x = tf.constant([[1, 2], [3, 4]])
    with tf.Session() as sess:
        print(sess.run(tf.tile([3], [4])))
        print(sess.run(tf.squeeze(tf.reshape(tf.tile(tf.reshape(x, (-1, 1)), (1, 2)), (1, -1)))))
        print(sess.run(tf.reshape(tf.tile(tf.reshape(x, (-1, 1)), (1, 3)), (2, -1))))

For the last case, where the number of repetitions differs for each element, you will most likely need loops.
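The reshape, tile, reshape pattern used above can be checked against np.repeat directly. This is a NumPy-only sketch, assuming np.tile and reshape behave like their TensorFlow counterparts for these shapes; the names flat and along_axis1 are introduced here for illustration.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])

# Flattened repeat: one element per row, copy across 2 columns, flatten.
flat = np.tile(x.reshape(-1, 1), (1, 2)).reshape(-1)

# Repeat along axis 1: same trick with 3 copies, then restore the row count.
along_axis1 = np.tile(x.reshape(-1, 1), (1, 3)).reshape(x.shape[0], -1)

print(flat)         # [1 1 2 2 3 3 4 4]
print(along_axis1)  # [[1 1 1 2 2 2]
                    #  [3 3 3 4 4 4]]
```

Both results match np.repeat(x, 2) and np.repeat(x, 3, axis=1) respectively.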

You can achieve the effect of np.repeat() using a combination of tf.tile() and tf.reshape():

    idx = tf.range(len(yp))
    idx = tf.reshape(idx, [-1, 1])    # Convert to a len(yp) x 1 matrix.
    idx = tf.tile(idx, [1, len(yp)])  # Create multiple columns.
    idx = tf.reshape(idx, [-1])       # Convert back to a vector.

You can simply compute jdx using tf.tile():

    jdx = tf.range(len(yp))
    jdx = tf.tile(jdx, [len(yp)])

For the indexing, you could try using tf.gather() to extract non-contiguous slices from the yp tensor:

    s = tf.gather(yp, idx) - tf.gather(yp, jdx)
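The index construction above can be mirrored in NumPy to confirm that it reproduces the pairwise differences from the question. The sketch below uses a hypothetical three-element yp and also shows a broadcasting variant; the broadcasting form is not part of the answer and is included only as an assumption-labelled alternative that avoids building idx and jdx at all.

```python
import numpy as np

yp = np.array([0.5, 2.0, -1.0])  # hypothetical predictions
n = len(yp)

# Same steps as the answer: column vector, tile across columns, flatten.
idx = np.tile(np.arange(n).reshape(-1, 1), (1, n)).reshape(-1)
jdx = np.tile(np.arange(n), n)
s = yp[idx] - yp[jdx]

# Alternative (not from the answer): entry [i, j] is yp[i] - yp[j],
# and flattening in row-major order gives the same layout as s.
s_broadcast = (yp[:, None] - yp[None, :]).reshape(-1)

print(idx)  # [0 0 0 1 1 1 2 2 2]
print(jdx)  # [0 1 2 0 1 2 0 1 2]
```

Here s and s_broadcast hold the same n * n values in the same order.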

You can simulate the missing tf.repeat by stacking the value with itself using tf.stack:

    value = np.arange(len(yp))  # what to repeat
    repeat_count = len(yp)      # how many times
    # Result has shape [len(yp), repeat_count]; tf.reshape(repeated, [-1]) flattens it.
    repeated = tf.stack([value for i in range(repeat_count)], axis=1)

I recommend using this only for small repeat counts.

A relatively fast implementation was recently added with the RaggedTensor utilities in 1.13, but it is not part of the officially exported API. You can still use it, but there is a chance it will disappear.

    from tensorflow.python.ops.ragged.ragged_util import repeat

From the source code:

    # This op is intended to exactly match the semantics of numpy.repeat, with
    # one exception: numpy.repeat has special (and somewhat non-intuitive) behavior
    # when axis is not specified. Rather than implement that special behavior, we
    # simply make `axis` be a required argument.
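Later TensorFlow releases expose a public tf.repeat, which makes the private import unnecessary where it is available. The sketch below assumes TensorFlow 2.x with eager execution (so there is no Session); the variable names are illustrative only.

```python
import tensorflow as tf

# Uniform count: every element repeated 3 times (the idx pattern from the question).
idx = tf.repeat(tf.range(3), repeats=3)

# Per-element counts, matching np.repeat(np.arange(3), [4, 1, 3]).
per_element = tf.repeat(tf.constant([0, 1, 2]), repeats=[4, 1, 3])

# Repeating whole rows by passing an axis.
rows = tf.repeat(tf.constant([[0, 1], [2, 3]]), repeats=2, axis=0)

print(idx.numpy())          # [0 0 0 1 1 1 2 2 2]
print(per_element.numpy())  # [0 0 0 0 1 2 2 2]
```

On older versions without tf.repeat, the tile and reshape approaches from the other answers remain the fallback.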

For 1-d tensors only, I wrote this function:

    def tf_repeat(y, repeat_num):
        return tf.reshape(tf.tile(tf.expand_dims(y, axis=-1), [1, repeat_num]), [-1])

    import numpy as np
    import tensorflow as tf
    import itertools

    x = np.arange(6).reshape(3, 2)
    x = tf.convert_to_tensor(x)

    N = 3           # number of repetitions
    K = x.shape[0]  # number of rows, here 3

    # Row i of x appears at positions i, i + K, i + 2K, ... in the tiled tensor.
    starts = list(range(0, N * K, K))
    order = [[start + i for start in starts] for i in range(K)]
    order = list(itertools.chain.from_iterable(order))
    x_rep = tf.gather(tf.tile(x, [N, 1]), order)

From the input:

    [[0, 1], [2, 3], [4, 5]]

this produces:

    [[0, 1], [0, 1], [0, 1], [2, 3], [2, 3], [2, 3], [4, 5], [4, 5], [4, 5]]

If you want:

    [[0, 1], [2, 3], [4, 5], [0, 1], [2, 3], [4, 5], [0, 1], [2, 3], [4, 5]]

simply use tf.tile(x, [N, 1]).
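The reordering step can be verified in NumPy: after tiling, copy c of row i sits at position c * K + i, so gathering those positions row by row yields the np.repeat layout. This is only a sketch with NumPy indexing standing in for tf.gather; the vectorized construction of order is an assumption of this example, not the answer's code.

```python
import numpy as np

x = np.arange(6).reshape(3, 2)
N = 3           # number of repetitions
K = x.shape[0]  # number of rows

tiled = np.tile(x, (N, 1))  # rows in order x0, x1, x2, x0, x1, x2, ...

# order[i, c] = c * K + i, flattened so all copies of row i are adjacent.
order = (np.arange(K)[:, None] + K * np.arange(N)[None, :]).reshape(-1)
x_rep = tiled[order]

print(order)  # [0 3 6 1 4 7 2 5 8]
```

The gathered result equals np.repeat(x, N, axis=0), while tiled on its own is the second layout shown above.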