


Python: flatten a list of dictionaries

List of dictionaries:

data = [
    {
        'a': {'l': 'Apple', 'b': 'Milk', 'd': 'Meatball'},
        'b': {'favourite': 'coke', 'dislike': 'juice'},
    },
    {
        'a': {'l': 'Apple1', 'b': 'Milk1', 'd': 'Meatball2'},
        'b': {'favourite': 'coke2', 'dislike': 'juice3'},
    },
    ...
]

I need to merge all the nested dictionaries in each element to get the expected output:

[{'d': 'Meatball', 'b': 'Milk', 'l': 'Apple', 'dislike': 'juice', 'favourite': 'coke'},
 {'d': 'Meatball2', 'b': 'Milk1', 'l': 'Apple1', 'dislike': 'juice3', 'favourite': 'coke2'}]

I tried a nested list comprehension, but I cannot merge the dicts together:

L = [y for x in data for y in x.values()]
print(L)

[{'d': 'Meatball', 'b': 'Milk', 'l': 'Apple'},
 {'dislike': 'juice', 'favourite': 'coke'},
 {'d': 'Meatball2', 'b': 'Milk1', 'l': 'Apple1'},
 {'dislike': 'juice3', 'favourite': 'coke2'}]

I am looking for the fastest solution.


You can do this with two nested loops, using dict.update() to merge the inner dictionaries into a temporary dictionary and appending that dictionary to the result at the end of each outer iteration:

L = []
for d in data:
    temp = {}
    for key in d:
        temp.update(d[key])
    L.append(temp)

# timeit ~1.4
print(L)

Which outputs:

[{'l': 'Apple', 'b': 'Milk', 'd': 'Meatball', 'favourite': 'coke', 'dislike': 'juice'},
 {'l': 'Apple1', 'b': 'Milk1', 'd': 'Meatball2', 'favourite': 'coke2', 'dislike': 'juice3'}]
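One property of the loop-and-update approach worth knowing: if two inner dictionaries share a key, the value from the later one wins. The sketch below wraps the loop in a function and shows this on a made-up record; the name flatten_records and the sample data are illustrative assumptions, not part of the original answer.

```python
def flatten_records(records):
    """Merge the inner dictionaries of each record into one flat dictionary."""
    flattened = []
    for record in records:
        merged = {}
        for inner in record.values():
            # update() overwrites keys that are already present in merged
            merged.update(inner)
        flattened.append(merged)
    return flattened


# Hypothetical record in which both inner dictionaries define the key 'b'.
sample = [{'a': {'l': 'Apple', 'b': 'Milk'}, 'b': {'b': 'Water', 'dislike': 'juice'}}]

result = flatten_records(sample)
print(result)  # [{'l': 'Apple', 'b': 'Water', 'dislike': 'juice'}]
```

With the data from the question this does not matter, because the inner dictionaries have disjoint keys.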


You can use functools.reduce together with a simple list comprehension to flatten the list of dicts:

>>> from functools import reduce
>>> data = [{'b': {'dislike': 'juice', 'favourite': 'coke'}, 'a': {'l': 'Apple', 'b': 'Milk', 'd': 'Meatball'}}, {'b': {'dislike': 'juice3', 'favourite': 'coke2'}, 'a': {'l': 'Apple1', 'b': 'Milk1', 'd': 'Meatball2'}}]
>>> [reduce(lambda x,y: {**x,**y},d.values()) for d in data]
[{'dislike': 'juice', 'l': 'Apple', 'd': 'Meatball', 'b': 'Milk', 'favourite': 'coke'}, {'dislike': 'juice3', 'l': 'Apple1', 'd': 'Meatball2', 'b': 'Milk1', 'favourite': 'coke2'}]
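A caveat with the reduce call as written: reduce raises TypeError on an empty sequence when no initial value is given, so an element of data with no inner dictionaries would fail. A minimal sketch of one way to guard against that, assuming empty records can occur (the helper name merge_inner and the empty record are hypothetical):

```python
from functools import reduce


def merge_inner(record):
    # The third argument {} is the initial accumulator. It is an addition to
    # the original answer so that an empty record yields {} instead of raising.
    return reduce(lambda acc, inner: {**acc, **inner}, record.values(), {})


data = [
    {'a': {'l': 'Apple'}, 'b': {'favourite': 'coke'}},
    {},  # hypothetical empty record
]

result = [merge_inner(d) for d in data]
print(result)  # [{'l': 'Apple', 'favourite': 'coke'}, {}]
```

The initial value also means the result is always a new dictionary, even when a record holds a single inner dictionary.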

The benchmark timing is as follows:

>>> import timeit
>>> setup = """
from functools import reduce
data = [{'b': {'dislike': 'juice', 'favourite': 'coke'}, 'a': {'l': 'Apple', 'b': 'Milk', 'd': 'Meatball'}}, {'b': {'dislike': 'juice3', 'favourite': 'coke2'}, 'a': {'l': 'Apple1', 'b': 'Milk1', 'd': 'Meatball2'}}]
"""
>>> min(timeit.Timer("[reduce(lambda x,y: {**x,**y},d.values()) for d in data]",setup=setup).repeat(3,1000000))
1.525032774952706

Benchmark timings of the other answers on my machine:

>>> setup = """
data = [{'b': {'dislike': 'juice', 'favourite': 'coke'}, 'a': {'l': 'Apple', 'b': 'Milk', 'd': 'Meatball'}}, {'b': {'dislike': 'juice3', 'favourite': 'coke2'}, 'a': {'l': 'Apple1', 'b': 'Milk1', 'd': 'Meatball2'}}]
"""
>>> min(timeit.Timer("[{k: v for x in d.values() for k, v in x.items()} for d in data]",setup=setup).repeat(3,1000000))
2.2488374650129117
>>> min(timeit.Timer("[{k: x[k] for x in d.values() for k in x} for d in data]",setup=setup).repeat(3,1000000))
1.8990078769857064
>>> code = """
L = []
for d in data:
    temp = {}
    for key in d:
        temp.update(d[key])
    L.append(temp)
"""
>>> min(timeit.Timer(code,setup=setup).repeat(3,1000000))
1.4258553800173104
>>> setup = """
from itertools import chain
data = [{'b': {'dislike': 'juice', 'favourite': 'coke'}, 'a': {'l': 'Apple', 'b': 'Milk', 'd': 'Meatball'}}, {'b': {'dislike': 'juice3', 'favourite': 'coke2'}, 'a': {'l': 'Apple1', 'b': 'Milk1', 'd': 'Meatball2'}}]
"""
>>> min(timeit.Timer("[dict(chain(*map(dict.items, d.values()))) for d in data]",setup=setup).repeat(3,1000000))
3.774383604992181
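To repeat this comparison on your own machine, the candidates can be collected in one script that first checks they agree and then times each one. This is only a sketch: the function names are made up for illustration, timeit is given callables rather than statement strings (which adds a little call overhead to every measurement), and the iteration count is much smaller than the 1,000,000 used above, so the absolute numbers will not match the ones quoted here.

```python
import timeit
from functools import reduce
from itertools import chain

data = [
    {'b': {'dislike': 'juice', 'favourite': 'coke'},
     'a': {'l': 'Apple', 'b': 'Milk', 'd': 'Meatball'}},
    {'b': {'dislike': 'juice3', 'favourite': 'coke2'},
     'a': {'l': 'Apple1', 'b': 'Milk1', 'd': 'Meatball2'}},
]


def loop_update():
    result = []
    for d in data:
        temp = {}
        for key in d:
            temp.update(d[key])
        result.append(temp)
    return result


def reduce_merge():
    return [reduce(lambda x, y: {**x, **y}, d.values()) for d in data]


def comprehension_items():
    return [{k: v for x in d.values() for k, v in x.items()} for d in data]


def chain_items():
    return [dict(chain(*map(dict.items, d.values()))) for d in data]


candidates = [loop_update, reduce_merge, comprehension_items, chain_items]

# Sanity check: every approach must produce the same flattened list.
expected = loop_update()
assert all(func() == expected for func in candidates)

# Best of 3 runs of 20,000 calls each, as in the min(...repeat(...)) idiom above.
timings = {
    func.__name__: min(timeit.repeat(func, repeat=3, number=20000))
    for func in candidates
}

for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name:22s} {seconds:.4f} s")
```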


You can do the following, using itertools.chain:

>>> from itertools import chain
>>> # timeit: ~3.40
>>> [dict(chain(*map(dict.items, d.values()))) for d in data]
[{'l': 'Apple', 'b': 'Milk', 'd': 'Meatball', 'favourite': 'coke', 'dislike': 'juice'}, {'l': 'Apple1', 'b': 'Milk1', 'dislike': 'juice3', 'favourite': 'coke2', 'd': 'Meatball2'}]

Using chain, map and * makes this expression a shorthand for the following nested comprehension, which actually performs better on my system (Python 3.5.2) and is not much longer:

# timeit: ~2.04
[{k: v for x in d.values() for k, v in x.items()} for d in data]

# Or, not using items, but lookup by key
# timeit: ~1.67
[{k: x[k] for x in d.values() for k in x} for d in data]
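To see why the chain expression and the nested comprehension are equivalent, it may help to unpack the one-liner for a single record. A minimal sketch, assuming Python 3.7+ so that dictionaries keep insertion order; the variable names are illustrative, and chain.from_iterable is shown only as a standard-library alternative to unpacking with *.

```python
from itertools import chain

record = {
    'a': {'l': 'Apple', 'b': 'Milk', 'd': 'Meatball'},
    'b': {'favourite': 'coke', 'dislike': 'juice'},
}

# Step 1: map(dict.items, ...) yields one items view per inner dictionary.
item_views = list(map(dict.items, record.values()))

# Step 2: chain(*item_views) concatenates those views into one stream of pairs.
pairs = list(chain(*item_views))
print(pairs)
# [('l', 'Apple'), ('b', 'Milk'), ('d', 'Meatball'), ('favourite', 'coke'), ('dislike', 'juice')]

# Step 3: dict(...) builds the flat dictionary from the (key, value) pairs.
merged = dict(pairs)

# The nested comprehension walks the same pairs in the same order.
merged_comprehension = {k: v for x in record.values() for k, v in x.items()}

# chain.from_iterable avoids the * unpacking but produces the same pairs.
merged_from_iterable = dict(chain.from_iterable(map(dict.items, record.values())))
```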

Note:

RoadRunner's loop-and-update approach outperforms both of these one-liners, with a timeit of ~1.37.


If your nested dictionaries only ever have the keys 'a' and 'b', I suggest the following solution, which I find fast and very easy to read:

L = [x['a'] for x in data]
b = [x['b'] for x in data]

for i in range(len(L)):
    L[i].update(b[i])

# timeit ~1.4
print(L)
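Note that the snippet above collects references to the inner 'a' dictionaries and then updates them in place, so the original data is modified as a side effect. If data must stay untouched, one possible variant (a sketch, not the answerer's code, and not timed here) builds a fresh dictionary per element with dict unpacking:

```python
data = [
    {'a': {'l': 'Apple', 'b': 'Milk', 'd': 'Meatball'},
     'b': {'favourite': 'coke', 'dislike': 'juice'}},
    {'a': {'l': 'Apple1', 'b': 'Milk1', 'd': 'Meatball2'},
     'b': {'favourite': 'coke2', 'dislike': 'juice3'}},
]

# {**x['a'], **x['b']} creates a new dictionary, leaving x['a'] and x['b'] as they were.
flattened = [{**x['a'], **x['b']} for x in data]
print(flattened)

# The inner 'a' dictionary still has only its original three keys.
print(data[0]['a'])  # {'l': 'Apple', 'b': 'Milk', 'd': 'Meatball'}
```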