tensorflow - Parallelizing tf.data.Dataset.from_generator


I have a non-trivial input pipeline that from_generator is perfect for...

    dataset = tf.data.Dataset.from_generator(complex_img_label_generator,
                                             (tf.int32, tf.string))
    dataset = dataset.batch(64)
    iter = dataset.make_one_shot_iterator()
    imgs, labels = iter.get_next()

Here complex_img_label_generator dynamically generates images and returns a numpy array representing an (H, W, 3) image and a simple string label. The processing is not something I can represent as reading from files and tf.image operations.

My question is how to parallelize the generator. How do I get N of these generators running in their own threads?

One thought was to use dataset.map with num_parallel_calls to handle the threading, but map operates on tensors... Another thought was to create multiple generators, each with its own prefetch, and somehow join them, but I can't see how I would join N generator streams.

Are there any canonical examples I could follow?

I am working on a from_indexable for tf.data.Dataset: https://github.com/tensorflow/tensorflow/issues/14448

The advantage of from_indexable is that it can be parallelized, whereas a Python generator cannot be parallelized.

The from_indexable function creates a tf.data.Dataset.range, wraps the indexable in a generalized tf.py_func, and calls map.
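The idea behind this can be illustrated without TensorFlow. The following is only a plain-Python analogy, not the tf.data implementation: because an indexable exposes `__len__` and `__getitem__`, each index can be evaluated independently on a worker thread, which a generator's single sequential stream does not allow. The names `SquaresIndexable` and `parallel_from_indexable` are hypothetical and exist only for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor


class SquaresIndexable:
    """Hypothetical indexable: defines __len__ and __getitem__."""

    def __len__(self):
        return 8

    def __getitem__(self, index):
        return index * index


def parallel_from_indexable(indexable, num_parallel_calls=4):
    """Evaluate indexable[i] for every index on a thread pool, keeping order."""
    indices = range(len(indexable))
    with ThreadPoolExecutor(max_workers=num_parallel_calls) as pool:
        # Executor.map returns results in input order, even though the
        # individual calls may finish out of order.
        return list(pool.map(indexable.__getitem__, indices))


results = parallel_from_indexable(SquaresIndexable())
print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

The range of indices plays the role of the range dataset, and the pool's map plays the role of map with num_parallel_calls.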

For those who want a from_indexable now, here is the library code:

    import tensorflow as tf
    import numpy as np

    from tensorflow.python.framework import tensor_shape
    from tensorflow.python.util import nest


    def py_func_decorator(output_types=None, output_shapes=None, stateful=True, name=None):
        def decorator(func):
            def call(*args):
                nonlocal output_shapes

                flat_output_types = nest.flatten(output_types)
                flat_values = tf.py_func(
                    func,
                    inp=args,
                    Tout=flat_output_types,
                    stateful=stateful, name=name
                )
                if output_shapes is not None:
                    # I am not sure if this is necessary
                    output_shapes = nest.map_structure_up_to(
                        output_types, tensor_shape.as_shape, output_shapes)
                    flattened_shapes = nest.flatten_up_to(output_types, output_shapes)
                    for ret_t, shape in zip(flat_values, flattened_shapes):
                        ret_t.set_shape(shape)
                return nest.pack_sequence_as(output_types, flat_values)
            return call
        return decorator


    def from_indexable(iterator, output_types, output_shapes=None,
                       num_parallel_calls=None, stateful=True, name=None):
        ds = tf.data.Dataset.range(len(iterator))

        @py_func_decorator(output_types, output_shapes, stateful=stateful, name=name)
        def index_to_entry(index):
            return iterator[index]

        return ds.map(index_to_entry, num_parallel_calls=num_parallel_calls)

and here is an example (note: from_indexable has a num_parallel_calls argument):

    class PyDataSet:
        def __len__(self):
            return 20

        def __getitem__(self, item):
            return np.random.normal(size=(item+1, 10))


    ds = from_indexable(PyDataSet(), output_types=tf.float64, output_shapes=[None, 10])
    it = ds.make_one_shot_iterator()
    entry = it.get_next()
    with tf.Session() as sess:
        print(sess.run(entry).shape)
        print(sess.run(entry).shape)

Update, June 10, 2018: since https://github.com/tensorflow/tensorflow/pull/15121 has been merged, the code for from_indexable simplifies to:

    import tensorflow as tf


    def py_func_decorator(output_types=None, output_shapes=None, stateful=True, name=None):
        def decorator(func):
            def call(*args, **kwargs):
                return tf.contrib.framework.py_func(
                    func=func,
                    args=args, kwargs=kwargs,
                    output_types=output_types, output_shapes=output_shapes,
                    stateful=stateful, name=name
                )
            return call
        return decorator


    def from_indexable(iterator, output_types, output_shapes=None,
                       num_parallel_calls=None, stateful=True, name=None):
        ds = tf.data.Dataset.range(len(iterator))

        @py_func_decorator(output_types, output_shapes, stateful=stateful, name=name)
        def index_to_entry(index):
            return iterator[index]

        return ds.map(index_to_entry, num_parallel_calls=num_parallel_calls)

Limiting the work done in the generator to a minimum and parallelizing the expensive processing using a map is sensible.

Alternatively, you can "join" multiple generators using parallel_interleave as follows:

    def generator(n):
        # returns n-th generator function
        ...

    def dataset(n):
        # output_types is omitted in this sketch
        return tf.data.Dataset.from_generator(generator(n))

    # where N is the number of generators you use
    ds = tf.data.Dataset.range(N).apply(
        tf.contrib.data.parallel_interleave(dataset, cycle_length=N))
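As a rough mental model of joining N generators that each run in their own thread, here is a plain-Python sketch using a shared queue. It is an analogy only and does not show how parallel_interleave is implemented; `make_generator` and `merge_generators` are hypothetical names. In this sketch the order across generators is not deterministic, while the order within each generator is preserved.

```python
import queue
import threading


def make_generator(n):
    """Hypothetical n-th generator function: yields three tagged items."""
    def gen():
        for i in range(3):
            yield (n, i)
    return gen


def merge_generators(generator_fns, buffer_size=16):
    """Run each generator in its own thread and yield items as they arrive."""
    out = queue.Queue(maxsize=buffer_size)
    done = object()  # sentinel put on the queue when one generator finishes

    def worker(gen_fn):
        try:
            for item in gen_fn():
                out.put(item)
        finally:
            out.put(done)

    threads = [threading.Thread(target=worker, args=(fn,), daemon=True)
               for fn in generator_fns]
    for t in threads:
        t.start()

    remaining = len(threads)
    while remaining:
        item = out.get()
        if item is done:
            remaining -= 1
        else:
            yield item


N = 4
merged = list(merge_generators([make_generator(n) for n in range(N)]))
print(len(merged))  # 12
```

The bounded queue acts like a small prefetch buffer: producers block when it is full, so a slow consumer throttles the generators.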

It turns out I can use Dataset.map if I make the generator super lightweight (only generating metadata) and then move the actual heavy lifting into a stateless function. This way I can parallelize just the heavy part with .map using a py_func.

It works, but it feels a bit clumsy... It would be great to be able to just add num_parallel_calls to from_generator :)

    def pure_numpy_and_pil_complex_calculation(metadata, label):
        # some complex pil and numpy work nothing to do with tf
        ...

    dataset = tf.data.Dataset.from_generator(lightweight_generator,
                                             output_types=(tf.string,   # metadata
                                                           tf.string))  # label

    def wrapped_complex_calculation(metadata, label):
        return tf.py_func(func=pure_numpy_and_pil_complex_calculation,
                          inp=(metadata, label),
                          Tout=(tf.uint8,    # (H,W,3) img
                                tf.string))  # label

    dataset = dataset.map(wrapped_complex_calculation, num_parallel_calls=8)
    dataset = dataset.batch(64)
    iter = dataset.make_one_shot_iterator()
    imgs, labels = iter.get_next()
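The split used above (a cheap sequential generator feeding an expensive stateless function that runs in parallel) can also be seen in a TensorFlow-free sketch. This is an assumption-laden illustration, not the tf.data mechanism: the generator, the fake "image", and the thread-name bookkeeping are all hypothetical stand-ins.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def lightweight_generator():
    """Hypothetical cheap generator: yields only (metadata, label) pairs."""
    for i in range(6):
        yield i, "label-%d" % i


def heavy_calculation(item):
    """Stand-in for the expensive, stateless step."""
    metadata, label = item
    image = [[metadata] * 3 for _ in range(2)]  # fake 2x3 "image"
    # Record which thread did the work, to show it ran on the pool.
    return image, label, threading.current_thread().name


with ThreadPoolExecutor(max_workers=3, thread_name_prefix="map") as pool:
    # The generator is consumed sequentially; only heavy_calculation
    # runs on the pool's worker threads.
    processed = list(pool.map(heavy_calculation, lightweight_generator()))

labels = [label for _, label, _ in processed]
print(labels)
```

Because the heavy function keeps no state between calls, the calls can run in any order on any worker, and the results still come back in generator order.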