Getting the difference (delta) between two lists of dictionaries
If you want the difference computed recursively, I have written a package for Python: https://github.com/seperman/deepdiff
Installation
Install from PyPI:
pip install deepdiff
Example usage
Importing
>>> from deepdiff import DeepDiff
>>> from pprint import pprint
>>> from __future__ import print_function # In case running on Python 2
Same object returns empty
>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = t1
>>> print(DeepDiff(t1, t2))
{}
Type of an item has changed
>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = {1:1, 2:"2", 3:3}
>>> pprint(DeepDiff(t1, t2), indent=2)
{ 'type_changes': { 'root[2]': { 'newtype': <class 'str'>,
                                 'newvalue': '2',
                                 'oldtype': <class 'int'>,
                                 'oldvalue': 2}}}
Value of an item has changed
>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = {1:1, 2:4, 3:3}
>>> pprint(DeepDiff(t1, t2), indent=2)
{'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}
Item added and/or removed
>>> t1 = {1:1, 2:2, 3:3, 4:4}
>>> t2 = {1:1, 2:4, 3:3, 5:5, 6:6}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff)
{'dic_item_added': ['root[5]', 'root[6]'],
 'dic_item_removed': ['root[4]'],
 'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}
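For flat (non-nested) dictionaries, the three categories in the output above can be approximated with plain set operations on the keys. This is a minimal standard-library sketch for Python 3, not DeepDiff's implementation; the function name `shallow_dict_diff` is hypothetical, and it assumes the keys can be sorted against each other.

```python
def shallow_dict_diff(old, new):
    """Compare two flat dicts and return (added, removed, changed)."""
    # Keys present only in the new dict.
    added = sorted(new.keys() - old.keys())
    # Keys present only in the old dict.
    removed = sorted(old.keys() - new.keys())
    # Keys present in both dicts whose values differ, as (old, new) pairs.
    changed = {k: (old[k], new[k])
               for k in old.keys() & new.keys()
               if old[k] != new[k]}
    return added, removed, changed


t1 = {1: 1, 2: 2, 3: 3, 4: 4}
t2 = {1: 1, 2: 4, 3: 3, 5: 5, 6: 6}
added, removed, changed = shallow_dict_diff(t1, t2)
print(added, removed, changed)  # [5, 6] [4] {2: (2, 4)}
```

Unlike the package, this sketch does not descend into nested values; a nested dict or list that differs is simply reported as one changed value.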
String difference
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world"}}
>>> t2 = {1:1, 2:4, 3:3, 4:{"a":"hello", "b":"world!"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'values_changed': { 'root[2]': {'newvalue': 4, 'oldvalue': 2},
                      "root[4]['b']": { 'newvalue': 'world!',
                                        'oldvalue': 'world'}}}
String difference 2
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world!\nGoodbye!\n1\n2\nEnd"}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n1\n2\nEnd"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'values_changed': { "root[4]['b']": { 'diff': '--- \n'
                                                '+++ \n'
                                                '@@ -1,5 +1,4 @@\n'
                                                '-world!\n'
                                                '-Goodbye!\n'
                                                '+world\n'
                                                ' 1\n'
                                                ' 2\n'
                                                ' End',
                                        'newvalue': 'world\n1\n2\nEnd',
                                        'oldvalue': 'world!\n'
                                                    'Goodbye!\n'
                                                    '1\n'
                                                    '2\n'
                                                    'End'}}}
>>>
>>> print (ddiff['values_changed']["root[4]['b']"]["diff"])
--- 
+++ 
@@ -1,5 +1,4 @@
-world!
-Goodbye!
+world
 1
 2
 End
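The diff text above has the shape of a unified diff. A similar result can be produced for two plain strings with the standard library's `difflib.unified_diff`; whether DeepDiff uses `difflib` internally is an assumption here, so treat this only as a sketch of the technique.

```python
import difflib

old = "world!\nGoodbye!\n1\n2\nEnd"
new = "world\n1\n2\nEnd"

# Compare line by line. lineterm="" keeps difflib from appending newlines
# to the header lines, since splitlines() already stripped them from the text.
diff_lines = list(
    difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm="")
)
print("\n".join(diff_lines))
```

Lines starting with `-` exist only in the old string, lines starting with `+` only in the new one, and lines starting with a space are unchanged context.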
Type change
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n\n\nEnd"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'type_changes': { "root[4]['b']": { 'newtype': <class 'str'>,
                                      'newvalue': 'world\n\n\nEnd',
                                      'oldtype': <class 'list'>,
                                      'oldvalue': [1, 2, 3]}}}
List difference
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3, 4]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{'iterable_item_removed': {"root[4]['b'][2]": 3, "root[4]['b'][3]": 4}}
List difference 2:
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'iterable_item_added': {"root[4]['b'][3]": 3},
  'values_changed': { "root[4]['b'][1]": {'newvalue': 3, 'oldvalue': 2},
                      "root[4]['b'][2]": {'newvalue': 2, 'oldvalue': 3}}}
List difference ignoring order or duplicates (with the same dictionaries as above):
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
>>> ddiff = DeepDiff(t1, t2, ignore_order=True)
>>> print (ddiff)
{}
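To see why the result is empty, it may help to compare the two inner lists with plain Python. This is only an illustration of "ignoring order or duplicates" using the standard library and says nothing about how DeepDiff implements `ignore_order`; it assumes the list items are hashable.

```python
from collections import Counter

a = [1, 2, 3]
b = [1, 3, 2, 3]

# Sets discard both order and duplicates, so the two lists compare equal.
same_ignoring_order_and_duplicates = set(a) == set(b)

# Counter discards order but keeps counts, so the extra 3 is detected.
same_ignoring_order_only = Counter(a) == Counter(b)
extra_in_b = Counter(b) - Counter(a)

print(same_ignoring_order_and_duplicates)  # True
print(same_ignoring_order_only)            # False
print(extra_in_b)                          # Counter({3: 1})
```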
List that contains a dictionary:
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:1, 2:2}]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:3}]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'dic_item_removed': ["root[4]['b'][2][2]"],
  'values_changed': {"root[4]['b'][2][1]": {'newvalue': 3, 'oldvalue': 1}}}
Sets:
>>> t1 = {1, 2, 8}
>>> t2 = {1, 2, 3, 5}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (DeepDiff(t1, t2))
{'set_item_added': ['root[3]', 'root[5]'], 'set_item_removed': ['root[8]']}
Named tuples:
>>> from collections import namedtuple
>>> Point = namedtuple('Point', ['x', 'y'])
>>> t1 = Point(x=11, y=22)
>>> t2 = Point(x=11, y=23)
>>> pprint (DeepDiff(t1, t2))
{'values_changed': {'root.y': {'newvalue': 23, 'oldvalue': 22}}}
Custom objects:
>>> class ClassA(object):
...     a = 1
...     def __init__(self, b):
...         self.b = b
...
>>> t1 = ClassA(1)
>>> t2 = ClassA(2)
>>>
>>> pprint(DeepDiff(t1, t2))
{'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}
Object attribute added:
>>> t2.c = "new attribute"
>>> pprint(DeepDiff(t1, t2))
{'attribute_added': ['root.c'],
 'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}
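For simple objects, a comparable attribute-level comparison can be sketched with `vars()`, which returns the instance's `__dict__`. This is a hedged illustration rather than the package's approach; the helper name `instance_attr_diff` is hypothetical, and the sketch ignores class attributes (such as `a` above) and objects that define `__slots__`.

```python
class ClassA(object):
    a = 1

    def __init__(self, b):
        self.b = b


def instance_attr_diff(old, new):
    """Return (added, changed) for the instance attributes of two objects."""
    old_attrs, new_attrs = vars(old), vars(new)
    # Attribute names that exist only on the new object.
    added = sorted(set(new_attrs) - set(old_attrs))
    # Shared attribute names whose values differ, as (old, new) pairs.
    changed = {name: (old_attrs[name], new_attrs[name])
               for name in set(old_attrs) & set(new_attrs)
               if old_attrs[name] != new_attrs[name]}
    return added, changed


t1 = ClassA(1)
t2 = ClassA(2)
t2.c = "new attribute"
added, changed = instance_attr_diff(t1, t2)
print(added, changed)  # ['c'] {'b': (1, 2)}
```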
I have the following Python data structures:
data1 = [{'name': u'String 1'}, {'name': u'String 2'}]
data2 = [{'name': u'String 1'}, {'name': u'String 2'}, {'name': u'String 3'}]
I am looking for the best way to get the delta between the two lists. Is there anything in Python that is as convenient as the JavaScript Underscore.js library (_.difference)?
How about this:
>>> [x for x in data2 if x not in data1]
[{'name': u'String 3'}]
Edit:
If you need the symmetric difference you can use:
>>> [x for x in data1 + data2 if x not in data1 or x not in data2]
or
>>> [x for x in data1 if x not in data2] + [y for y in data2 if y not in data1]
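If you want something closer to calling `_.difference`, the comprehensions above can be wrapped in small helpers. The function names here are my own choice for illustration, not part of any library.

```python
def difference(first, second):
    """Items of `first` that do not appear in `second`, in original order."""
    return [item for item in first if item not in second]


def symmetric_difference(first, second):
    """Items that appear in exactly one of the two lists."""
    return difference(first, second) + difference(second, first)


data1 = [{'name': 'String 1'}, {'name': 'String 2'}]
data2 = [{'name': 'String 1'}, {'name': 'String 2'}, {'name': 'String 3'}]

print(difference(data2, data1))            # [{'name': 'String 3'}]
print(difference(data1, data2))            # []
print(symmetric_difference(data1, data2))  # [{'name': 'String 3'}]
```

Each `not in` test scans the other list, so the cost grows with `len(first) * len(second)`. That is fine for short lists but can become slow for large ones.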
One more edit
You can also use sets:
>>> from functools import reduce
>>> # list() is needed on Python 3, where dict.items() returns a view that does not support "+"
>>> s1 = set(reduce(lambda x, y: x + y, [list(x.items()) for x in data1]))
>>> s2 = set(reduce(lambda x, y: x + y, [list(x.items()) for x in data2]))
>>> s2.difference(s1)
>>> s2.symmetric_difference(s1)
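Note that the sets above pool the `(key, value)` pairs of all dictionaries together, so `[{'a': 1, 'b': 2}]` and `[{'a': 1}, {'b': 2}]` would look identical. A variant that keeps each dictionary as one unit is to use a `frozenset` of its items as a hashable fingerprint. This is a sketch that assumes every value in the dictionaries is hashable; `fast_difference` is a hypothetical name.

```python
def fast_difference(first, second):
    """Dicts of `first` that have no equal dict in `second`.

    Assumes every value in the dicts is hashable.
    """
    # One hashable fingerprint per dictionary, so each dict stays a unit.
    seen = {frozenset(d.items()) for d in second}
    return [d for d in first if frozenset(d.items()) not in seen]


data1 = [{'name': 'String 1'}, {'name': 'String 2'}]
data2 = [{'name': 'String 1'}, {'name': 'String 2'}, {'name': 'String 3'}]

delta = fast_difference(data2, data1)
print(delta)  # [{'name': 'String 3'}]
```

Because membership tests against a set take roughly constant time, this avoids rescanning the second list for every item.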
Use itertools.filterfalse:
import itertools
# The parentheses let the expression continue onto the next line
r = (list(itertools.filterfalse(lambda x: x in data1, data2))
     + list(itertools.filterfalse(lambda x: x in data2, data1)))
assert r == [{'name': 'String 3'}]
data1 = [{'name': u'String 1'}, {'name': u'String 2'}]
data2 = [{'name': u'String 1'}, {'name': u'String 2'}, {'name': u'String 3'}]
# Set of names that appear in data2 but not in data1
delta = list({dict2['name'] for dict2 in data2} -
             {dict1['name'] for dict1 in data1})
delta_dict = [{'name': value} for value in delta]
print(delta_dict)
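The snippet above rebuilds each result from the name alone, so any other fields are lost and the original order is not guaranteed. A variation that keeps the whole dictionaries in order might look like this; `delta_by_key` is a hypothetical helper, and the `size` field is invented purely to show that extra fields survive.

```python
def delta_by_key(old, new, key):
    """Dicts in `new` whose `key` value does not occur in any dict of `old`."""
    seen = {d[key] for d in old}
    return [d for d in new if d[key] not in seen]


data1 = [{'name': 'String 1'}, {'name': 'String 2'}]
# 'size' is a made-up extra field, used only for illustration.
data2 = [{'name': 'String 1'}, {'name': 'String 2'},
         {'name': 'String 3', 'size': 30}]

delta = delta_by_key(data1, data2, 'name')
print(delta)  # [{'name': 'String 3', 'size': 30}]
```

This assumes every dictionary contains the key and that its values are hashable.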