python - Convert dictionaries with lists of values into a DataFrame
pandas dictionary (4)
IIUC, you can do this (d1, d2 and d3 stand for the three dictionaries):
pd.concat([pd.DataFrame(d).stack() for d in (d1,d2,d3)], axis=1)
Output:
       0                   1                   2
0 MOB  1  MOB_L001_R1_001.gz  MOB_L001_R2_001.gz
  ASP  1  ASP_L001_R1_001.gz  ASP_L001_R2_001.gz
  YIP  1  YIP_L001_R1_001.gz  YIP_L001_R2_001.gz
1 MOB  2  MOB_L002_R1_001.gz  MOB_L002_R2_001.gz
  ASP  2  ASP_L002_R1_001.gz  ASP_L002_R2_001.gz
  YIP  2  YIP_L002_R1_001.gz  YIP_L002_R2_001.gz
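The one-liner above leaves the keys in the index and the columns numbered 0 to 2. A minimal sketch of how it might be finished to reach the col1 to col4 layout asked for in the question; the names `dicts`, `stacked`, `result` and the index-level name `pos` are illustrative choices, not part of the original answer.

```python
import pandas as pd

# The three dictionaries from the question.
dictionary_col2 = {"MOB": [1, 2], "ASP": [1, 2], "YIP": [1, 2]}
dictionary_col3 = {
    "MOB": ["MOB_L001_R1_001.gz", "MOB_L002_R1_001.gz"],
    "ASP": ["ASP_L001_R1_001.gz", "ASP_L002_R1_001.gz"],
    "YIP": ["YIP_L001_R1_001.gz", "YIP_L002_R1_001.gz"],
}
dictionary_col4 = {
    "MOB": ["MOB_L001_R2_001.gz", "MOB_L002_R2_001.gz"],
    "ASP": ["ASP_L001_R2_001.gz", "ASP_L002_R2_001.gz"],
    "YIP": ["YIP_L001_R2_001.gz", "YIP_L002_R2_001.gz"],
}
dicts = (dictionary_col2, dictionary_col3, dictionary_col4)

# stack() turns each frame into a Series indexed by (list position, key);
# concat(axis=1) then lines the three Series up side by side.
stacked = pd.concat([pd.DataFrame(d).stack() for d in dicts], axis=1)
stacked.columns = ["col2", "col3", "col4"]

# Name the two index levels (names are assumptions made for this sketch),
# move the key level into a regular column, and drop the position level.
stacked.index.names = ["pos", "col1"]
result = stacked.reset_index(level="col1").reset_index(drop=True)

print(result)
```

The rows come out grouped by list position (all first elements, then all second elements), so a sort on col1 may still be needed if the keys should appear together.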
Say I have three dictionaries:
dictionary_col2
{'MOB': [1, 2], 'ASP': [1, 2], 'YIP': [1, 2]}
dictionary_col3
{'MOB': ['MOB_L001_R1_001.gz',
         'MOB_L002_R1_001.gz'],
 'ASP': ['ASP_L001_R1_001.gz',
         'ASP_L002_R1_001.gz'],
 'YIP': ['YIP_L001_R1_001.gz',
         'YIP_L002_R1_001.gz']}
dictionary_col4
{'MOB': ['MOB_L001_R2_001.gz',
         'MOB_L002_R2_001.gz'],
 'ASP': ['ASP_L001_R2_001.gz',
         'ASP_L002_R2_001.gz'],
 'YIP': ['YIP_L001_R2_001.gz',
         'YIP_L002_R2_001.gz']}
I want to convert the dictionaries above into a DataFrame. I tried the following:
df = pd.DataFrame([dictionary_col2, dictionary_col3, dictionary_col4])
The DataFrame df looks like this:
                                        ASP                                       MOB                                       YIP
0                                    [1, 2]                                    [1, 2]                                    [1, 2]
1  [ASP_L001_R1_001.gz, ASP_L002_R1_001.gz]  [MOB_L001_R1_001.gz, MOB_L002_R1_001.gz]  [YIP_L001_R1_001.gz, YIP_L002_R1_001.gz]
2  [ASP_L001_R2_001.gz, ASP_L002_R2_001.gz]  [MOB_L001_R2_001.gz, MOB_L002_R2_001.gz]  [YIP_L001_R2_001.gz, YIP_L002_R2_001.gz]
My goal is a DataFrame with the following columns:
col1  col2  col3                col4
MOB   1     MOB_L001_R1_001.gz  MOB_L001_R2_001.gz
MOB   2     MOB_L002_R1_001.gz  MOB_L002_R2_001.gz
ASP   1     ASP_L001_R1_001.gz  ASP_L001_R2_001.gz
ASP   2     ASP_L002_R1_001.gz  ASP_L002_R2_001.gz
YIP   1     YIP_L001_R1_001.gz  YIP_L001_R2_001.gz
YIP   2     YIP_L002_R1_001.gz  YIP_L002_R2_001.gz
Any help or suggestions are appreciated!
You can do this with concat and explode (note that explode requires pandas 0.25.0 or later):
pd.concat([pd.Series(x).explode() for x in [d1, d2, d3]], axis=1)
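A sketch of the explode approach on the question's data, finished into the col1 to col4 layout. The `columns` mapping, the `keys` list and the comprehensions that rebuild the data are assumptions made to keep the example short. It also assumes every dictionary has the same keys in the same order and lists of equal length: explode() repeats each key in the index, and pandas can only line up duplicated labels when the indexes are identical.

```python
import pandas as pd

keys = ["MOB", "ASP", "YIP"]

# Rebuild the question's data: one two-element list per key.
columns = {
    "col2": {k: [1, 2] for k in keys},
    "col3": {k: [f"{k}_L00{i}_R1_001.gz" for i in (1, 2)] for k in keys},
    "col4": {k: [f"{k}_L00{i}_R2_001.gz" for i in (1, 2)] for k in keys},
}

# pd.Series(d) stores one list per key; explode() gives every list element
# its own row and repeats the key in the index.
# Passing a dict to concat uses its keys as the column names.
exploded = pd.concat(
    {name: pd.Series(d).explode() for name, d in columns.items()}, axis=1
)

# The keys sit in the unnamed index: name it col1 and make it a column.
result = exploded.rename_axis("col1").reset_index()

print(result)
```

explode() returns object-dtype columns, so col2 may need `astype(int)` if a numeric column is wanted.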
dict_list = [dictionary_col2, dictionary_col3, dictionary_col4]
df = pd.concat([pd.DataFrame.from_dict(x, orient='index').unstack() for x in dict_list], axis=1)
Output:
>>> df
       0                   1                   2
0 MOB  1  MOB_L001_R1_001.gz  MOB_L001_R2_001.gz
  ASP  1  ASP_L001_R1_001.gz  ASP_L001_R2_001.gz
  YIP  1  YIP_L001_R1_001.gz  YIP_L001_R2_001.gz
1 MOB  2  MOB_L002_R1_001.gz  MOB_L002_R2_001.gz
  ASP  2  ASP_L002_R1_001.gz  ASP_L002_R2_001.gz
  YIP  2  YIP_L002_R1_001.gz  YIP_L002_R2_001.gz
pd.DataFrame({'col2': pd.DataFrame(dictionary_col2).unstack(),
              'col3': pd.DataFrame(dictionary_col3).unstack(),
              'col4': pd.DataFrame(dictionary_col4).unstack()}).reset_index(level=0)
returns
   level_0  col2                col3                col4
0      ASP     1  ASP_L001_R1_001.gz  ASP_L001_R2_001.gz
1      ASP     2  ASP_L002_R1_001.gz  ASP_L002_R2_001.gz
0      MOB     1  MOB_L001_R1_001.gz  MOB_L001_R2_001.gz
1      MOB     2  MOB_L002_R1_001.gz  MOB_L002_R2_001.gz
0      YIP     1  YIP_L001_R1_001.gz  YIP_L001_R2_001.gz
1      YIP     2  YIP_L002_R1_001.gz  YIP_L002_R2_001.gz
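The result above still has a level_0 column, a repeated 0/1 index, and keys in alphabetical order. One possible way to tidy it into the layout from the question, sketched under assumptions: the `keys` list, the `columns` mapping, the index-level name `pos` and the use of an ordered categorical for sorting are all choices made for this example, not part of the original answer.

```python
import pandas as pd

keys = ["MOB", "ASP", "YIP"]  # key order used in the question

# Rebuild the question's data: one two-element list per key.
columns = {
    "col2": {k: [1, 2] for k in keys},
    "col3": {k: [f"{k}_L00{i}_R1_001.gz" for i in (1, 2)] for k in keys},
    "col4": {k: [f"{k}_L00{i}_R2_001.gz" for i in (1, 2)] for k in keys},
}

# unstack() on a frame with a flat index returns a Series indexed by
# (key, list position); the dict keys become the column names.
frame = pd.DataFrame(
    {name: pd.DataFrame(d).unstack() for name, d in columns.items()}
)

# Name both index levels and turn them into regular columns.
tidy = frame.rename_axis(index=["col1", "pos"]).reset_index()

# An ordered categorical makes sort_values follow the question's key order
# instead of alphabetical order.
tidy["col1"] = pd.Categorical(tidy["col1"], categories=keys, ordered=True)
result = (
    tidy.sort_values(["col1", "pos"])
    .drop(columns="pos")
    .reset_index(drop=True)
)
result["col1"] = result["col1"].astype(str)  # back to plain strings

print(result)
```

Sorting on an explicit key order keeps the output independent of how a given Python or pandas version happens to order the dictionary keys.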