comprehension - python dictionary methods
Python - Flatten a list of dictionaries (4)
List of dictionaries:
data = [{
         'a': {'l': 'Apple',
               'b': 'Milk',
               'd': 'Meatball'},
         'b': {'favourite': 'coke',
               'dislike': 'juice'}
        },
        {
         'a': {'l': 'Apple1',
               'b': 'Milk1',
               'd': 'Meatball2'},
         'b': {'favourite': 'coke2',
               'dislike': 'juice3'}
        }, ...
       ]
I need to merge all the nested dictionaries in each element to get the expected output:
[{'d': 'Meatball', 'b': 'Milk', 'l': 'Apple', 'dislike': 'juice', 'favourite': 'coke'},
 {'d': 'Meatball2', 'b': 'Milk1', 'l': 'Apple1', 'dislike': 'juice3', 'favourite': 'coke2'}]
I tried a nested list comprehension, but I can't merge the dicts together:
L = [y for x in data for y in x.values()]
print(L)
[{'d': 'Meatball', 'b': 'Milk', 'l': 'Apple'},
 {'dislike': 'juice', 'favourite': 'coke'},
 {'d': 'Meatball2', 'b': 'Milk1', 'l': 'Apple1'},
 {'dislike': 'juice3', 'favourite': 'coke2'}]
I am looking for the fastest solution.
You can do this with two nested loops, using dict.update() to merge the inner dictionaries into a temporary dictionary and appending it to the result at the end of each outer iteration:
L = []
for d in data:
    temp = {}
    for key in d:
        temp.update(d[key])
    L.append(temp)

# timeit ~1.4
print(L)
Which outputs:
[{'l': 'Apple', 'b': 'Milk', 'd': 'Meatball', 'favourite': 'coke', 'dislike': 'juice'}, {'l': 'Apple1', 'b': 'Milk1', 'd': 'Meatball2', 'favourite': 'coke2', 'dislike': 'juice3'}]
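One behaviour of dict.update() worth keeping in mind with this approach: if two inner dictionaries share a key, the value from the dictionary visited later overwrites the earlier one. A minimal sketch of that, where the function name `flatten_with_update` and the sample record are hypothetical and only for illustration:

```python
def flatten_with_update(records):
    """Merge the inner dicts of each record into one flat dict."""
    flat = []
    for record in records:
        merged = {}
        for inner in record.values():
            # update() overwrites existing keys, so later inner dicts win
            merged.update(inner)
        flat.append(merged)
    return flat


# Hypothetical sample: both inner dicts define the key 'b'
sample = [{'a': {'l': 'Apple', 'b': 'Milk'}, 'b': {'b': 'Water'}}]
result = flatten_with_update(sample)
print(result)  # [{'l': 'Apple', 'b': 'Water'}]
```

The question's data has no such overlap between inner dictionaries, so nothing is lost there.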
You can use functools.reduce along with a simple list comprehension to flatten the list of dicts:
>>> from functools import reduce
>>> data = [{'b': {'dislike': 'juice', 'favourite': 'coke'}, 'a': {'l': 'Apple', 'b': 'Milk', 'd': 'Meatball'}}, {'b': {'dislike': 'juice3', 'favourite': 'coke2'}, 'a': {'l': 'Apple1', 'b': 'Milk1', 'd': 'Meatball2'}}]
>>> [reduce(lambda x, y: {**x, **y}, d.values()) for d in data]
[{'dislike': 'juice', 'l': 'Apple', 'd': 'Meatball', 'b': 'Milk', 'favourite': 'coke'}, {'dislike': 'juice3', 'l': 'Apple1', 'd': 'Meatball2', 'b': 'Milk1', 'favourite': 'coke2'}]
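Without an initial value, reduce raises a TypeError when an element of the list has no inner dictionaries at all. If that can happen in your data, passing an empty dict as the third argument avoids it. A small sketch, assuming Python 3.5+ for the `{**x, **y}` syntax; `merge_inner` and `records` are hypothetical names used only here:

```python
from functools import reduce


def merge_inner(record):
    # The third argument ({}) is the initial accumulator, so an empty
    # record yields {} instead of raising TypeError.
    return reduce(lambda acc, inner: {**acc, **inner}, record.values(), {})


records = [
    {'a': {'l': 'Apple', 'b': 'Milk'}, 'b': {'favourite': 'coke'}},
    {},  # hypothetical edge case: no inner dictionaries
]
flat = [merge_inner(r) for r in records]
print(flat)  # [{'l': 'Apple', 'b': 'Milk', 'favourite': 'coke'}, {}]
```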
The benchmark timing is as follows:
>>> import timeit
>>> setup = """
from functools import reduce
data = [{'b': {'dislike': 'juice', 'favourite': 'coke'}, 'a': {'l': 'Apple', 'b': 'Milk', 'd': 'Meatball'}}, {'b': {'dislike': 'juice3', 'favourite': 'coke2'}, 'a': {'l': 'Apple1', 'b': 'Milk1', 'd': 'Meatball2'}}]
"""
>>> min(timeit.Timer("[reduce(lambda x,y: {**x,**y},d.values()) for d in data]",setup=setup).repeat(3,1000000))
1.525032774952706
Benchmark timings of the other answers on my machine:
>>> setup = """
data = [{'b': {'dislike': 'juice', 'favourite': 'coke'}, 'a': {'l': 'Apple', 'b': 'Milk', 'd': 'Meatball'}}, {'b': {'dislike': 'juice3', 'favourite': 'coke2'}, 'a': {'l': 'Apple1', 'b': 'Milk1', 'd': 'Meatball2'}}]
"""
>>> min(timeit.Timer("[{k: v for x in d.values() for k, v in x.items()} for d in data]",setup=setup).repeat(3,1000000))
2.2488374650129117
>>> min(timeit.Timer("[{k: x[k] for x in d.values() for k in x} for d in data]",setup=setup).repeat(3,1000000))
1.8990078769857064
>>> code = """
L = []
for d in data:
    temp = {}
    for key in d:
        temp.update(d[key])
    L.append(temp)
"""
>>> min(timeit.Timer(code,setup=setup).repeat(3,1000000))
1.4258553800173104
>>> setup = """
from itertools import chain
data = [{'b': {'dislike': 'juice', 'favourite': 'coke'}, 'a': {'l': 'Apple', 'b': 'Milk', 'd': 'Meatball'}}, {'b': {'dislike': 'juice3', 'favourite': 'coke2'}, 'a': {'l': 'Apple1', 'b': 'Milk1', 'd': 'Meatball2'}}]
"""
>>> min(timeit.Timer("[dict(chain(*map(dict.items, d.values()))) for d in data]",setup=setup).repeat(3,1000000))
3.774383604992181
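These numbers depend on the machine and the Python version, so it may be worth rerunning them yourself. The following is a rough harness, not a rigorous benchmark: the `candidates` dictionary and the helper `with_update` are names made up for this sketch, each approach is wrapped in a function (which adds call overhead, so absolute values will not match the figures above), and the run count is reduced to 10000 to keep it quick. It requires Python 3.6+ for the f-string.

```python
import timeit
from functools import reduce
from itertools import chain

data = [
    {'a': {'l': 'Apple', 'b': 'Milk', 'd': 'Meatball'},
     'b': {'favourite': 'coke', 'dislike': 'juice'}},
    {'a': {'l': 'Apple1', 'b': 'Milk1', 'd': 'Meatball2'},
     'b': {'favourite': 'coke2', 'dislike': 'juice3'}},
]


def with_update(data):
    # Nested loops with dict.update()
    out = []
    for d in data:
        temp = {}
        for key in d:
            temp.update(d[key])
        out.append(temp)
    return out


candidates = {
    'update loop': with_update,
    'reduce': lambda data: [
        reduce(lambda x, y: {**x, **y}, d.values()) for d in data],
    'items comprehension': lambda data: [
        {k: v for x in d.values() for k, v in x.items()} for d in data],
    'key lookup comprehension': lambda data: [
        {k: x[k] for x in d.values() for k in x} for d in data],
    'chain': lambda data: [
        dict(chain(*map(dict.items, d.values()))) for d in data],
}

expected = with_update(data)
timings = {}
for name, func in candidates.items():
    # Check that every approach produces the same result before timing it
    assert func(data) == expected
    timings[name] = min(
        timeit.repeat(lambda: func(data), repeat=3, number=10000))

# Print the fastest approach first
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f'{name:26s} {seconds:.4f} s per 10000 runs')
```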
You can do the following, using itertools.chain:
>>> from itertools import chain
# timeit: ~3.40
>>> [dict(chain(*map(dict.items, d.values()))) for d in data]
[{'l': 'Apple',
  'b': 'Milk',
  'd': 'Meatball',
  'favourite': 'coke',
  'dislike': 'juice'},
 {'l': 'Apple1',
  'b': 'Milk1',
  'dislike': 'juice3',
  'favourite': 'coke2',
  'd': 'Meatball2'}]
Using chain, map, and * makes this expression a shorthand for the following nested comprehension, which actually performs better on my system (Python 3.5.2) and is not much longer:
# timeit: ~2.04
[{k: v for x in d.values() for k, v in x.items()} for d in data]
# Or, not using items, but lookup by key
# timeit: ~1.67
[{k: x[k] for x in d.values() for k in x} for d in data]
Note:
RoadRunner's loop-and-update approach outperforms both of these one-liners, at timeit: ~1.37
If your nested dictionaries only have the keys 'a' and 'b', I suggest the following solution, which I find fast and very easy to read:
L = [x['a'] for x in data]
b = [x['b'] for x in data]
for i in range(len(L)):
    L[i].update(b[i])

# timeit ~1.4
print(L)
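Be aware that the loop above calls update() on the very dictionaries stored in data, so each x['a'] in the original list ends up modified. If the input should stay untouched, one possible variation builds new dictionaries instead. This is a sketch assuming Python 3.5+ for the `{**a, **b}` syntax, with `records` as a hypothetical stand-in for data; it was not part of the timings above.

```python
records = [
    {'a': {'l': 'Apple', 'b': 'Milk', 'd': 'Meatball'},
     'b': {'favourite': 'coke', 'dislike': 'juice'}},
]

# Build a new dict per record; the inner dicts are only read, never changed
flat = [{**x['a'], **x['b']} for x in records]

print(flat)
print(records[0]['a'])  # still only the original three keys
```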