python pandas dictionary

python - Convert dictionaries with a list of values into a DataFrame




IIUC, you can do this (with d1, d2, d3 standing for the three dictionaries):

pd.concat([pd.DataFrame(d).stack() for d in (d1,d2,d3)], axis=1)

Output:

       0                   1                   2
0 MOB  1  MOB_L001_R1_001.gz  MOB_L001_R2_001.gz
  ASP  1  ASP_L001_R1_001.gz  ASP_L001_R2_001.gz
  YIP  1  YIP_L001_R1_001.gz  YIP_L001_R2_001.gz
1 MOB  2  MOB_L002_R1_001.gz  MOB_L002_R2_001.gz
  ASP  2  ASP_L002_R1_001.gz  ASP_L002_R2_001.gz
  YIP  2  YIP_L002_R1_001.gz  YIP_L002_R2_001.gz
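For anyone who wants to run this end to end, here is a minimal sketch of the same stack idea using the dictionaries from the question. The level names `pos` and `col1` and the use of `keys=` to label the columns are my own additions for illustration, not part of the answer above.

```python
import pandas as pd

dictionary_col2 = {'MOB': [1, 2], 'ASP': [1, 2], 'YIP': [1, 2]}
dictionary_col3 = {'MOB': ['MOB_L001_R1_001.gz', 'MOB_L002_R1_001.gz'],
                   'ASP': ['ASP_L001_R1_001.gz', 'ASP_L002_R1_001.gz'],
                   'YIP': ['YIP_L001_R1_001.gz', 'YIP_L002_R1_001.gz']}
dictionary_col4 = {'MOB': ['MOB_L001_R2_001.gz', 'MOB_L002_R2_001.gz'],
                   'ASP': ['ASP_L001_R2_001.gz', 'ASP_L002_R2_001.gz'],
                   'YIP': ['YIP_L001_R2_001.gz', 'YIP_L002_R2_001.gz']}

# stack() turns each 2x3 frame into a Series indexed by (list position, key).
stacked = [pd.DataFrame(d).stack()
           for d in (dictionary_col2, dictionary_col3, dictionary_col4)]

# Put the three Series side by side; keys= names the resulting columns.
df = pd.concat(stacked, axis=1, keys=['col2', 'col3', 'col4'])

# Name the index levels (names are illustrative), move the key level into
# a regular column called col1, and drop the list-position level.
df = (df.rename_axis(['pos', 'col1'])
        .reset_index('col1')
        .reset_index(drop=True))

print(df)
```

Note that the rows here are ordered by list position first (all the `1` rows, then all the `2` rows), as in the output shown above, rather than grouped by key.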

Say I have three dictionaries:

dictionary_col2 = {'MOB': [1, 2], 'ASP': [1, 2], 'YIP': [1, 2]}

dictionary_col3 = {'MOB': ['MOB_L001_R1_001.gz', 'MOB_L002_R1_001.gz'], 'ASP': ['ASP_L001_R1_001.gz', 'ASP_L002_R1_001.gz'], 'YIP': ['YIP_L001_R1_001.gz', 'YIP_L002_R1_001.gz']}

dictionary_col4 = {'MOB': ['MOB_L001_R2_001.gz', 'MOB_L002_R2_001.gz'], 'ASP': ['ASP_L001_R2_001.gz', 'ASP_L002_R2_001.gz'], 'YIP': ['YIP_L001_R2_001.gz', 'YIP_L002_R2_001.gz']}

I want to convert the dictionaries above into a DataFrame. I have tried the following:

df = pd.DataFrame([dictionary_col2, dictionary_col3, dictionary_col4])

The DataFrame df looks like this:

                                        ASP                                       MOB                                       YIP
0                                    [1, 2]                                    [1, 2]                                    [1, 2]
1  [ASP_L001_R1_001.gz, ASP_L002_R1_001.gz]  [MOB_L001_R1_001.gz, MOB_L002_R1_001.gz]  [YIP_L001_R1_001.gz, YIP_L002_R1_001.gz]
2  [ASP_L001_R2_001.gz, ASP_L002_R2_001.gz]  [MOB_L001_R2_001.gz, MOB_L002_R2_001.gz]  [YIP_L001_R2_001.gz, YIP_L002_R2_001.gz]

My goal is a DataFrame with the following columns:

col1  col2  col3                col4
MOB   1     MOB_L001_R1_001.gz  MOB_L001_R2_001.gz
MOB   2     MOB_L002_R1_001.gz  MOB_L002_R2_001.gz
ASP   1     ASP_L001_R1_001.gz  ASP_L001_R2_001.gz
ASP   2     ASP_L002_R1_001.gz  ASP_L002_R2_001.gz
YIP   1     YIP_L001_R1_001.gz  YIP_L001_R2_001.gz
YIP   2     YIP_L002_R1_001.gz  YIP_L002_R2_001.gz

Any help or suggestions are appreciated!


You can also do this with concat plus explode; note that explode is only available from pandas 0.25.0 onward:

pd.concat([pd.Series(x).explode() for x in [d1,d2]],axis=1)
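A runnable sketch of the explode route with all three dictionaries from the question, assuming pandas 0.25.0 or later. The `keys=` labels and the `col1` index name are my own choices for illustration. It relies on the three exploded Series having identical indexes, which is what lets `concat` line them up even though the labels repeat.

```python
import pandas as pd

dictionary_col2 = {'MOB': [1, 2], 'ASP': [1, 2], 'YIP': [1, 2]}
dictionary_col3 = {'MOB': ['MOB_L001_R1_001.gz', 'MOB_L002_R1_001.gz'],
                   'ASP': ['ASP_L001_R1_001.gz', 'ASP_L002_R1_001.gz'],
                   'YIP': ['YIP_L001_R1_001.gz', 'YIP_L002_R1_001.gz']}
dictionary_col4 = {'MOB': ['MOB_L001_R2_001.gz', 'MOB_L002_R2_001.gz'],
                   'ASP': ['ASP_L001_R2_001.gz', 'ASP_L002_R2_001.gz'],
                   'YIP': ['YIP_L001_R2_001.gz', 'YIP_L002_R2_001.gz']}

# Series(d) holds one list per key; explode() gives each list element
# its own row and repeats the key as the index label.
exploded = [pd.Series(d).explode()
            for d in (dictionary_col2, dictionary_col3, dictionary_col4)]

# All three Series share the same (repeated) index, so they align row by row.
df = pd.concat(exploded, axis=1, keys=['col2', 'col3', 'col4'])

# Turn the key index into a regular column named col1.
df = df.rename_axis('col1').reset_index()

print(df)
```

Here the rows come out grouped by key (MOB, MOB, ASP, ASP, YIP, YIP), which matches the layout asked for in the question. The exploded columns have object dtype, so `col2` may need an explicit `astype(int)` if a numeric column is required.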


dict_list = [dictionary_col2, dictionary_col3, dictionary_col4]
df = pd.concat([pd.DataFrame.from_dict(x, orient='index').unstack() for x in dict_list], axis=1)

Output:

>>> df
       0                   1                   2
0 MOB  1  MOB_L001_R1_001.gz  MOB_L001_R2_001.gz
  ASP  1  ASP_L001_R1_001.gz  ASP_L001_R2_001.gz
  YIP  1  YIP_L001_R1_001.gz  YIP_L001_R2_001.gz
1 MOB  2  MOB_L002_R1_001.gz  MOB_L002_R2_001.gz
  ASP  2  ASP_L002_R1_001.gz  ASP_L002_R2_001.gz
  YIP  2  YIP_L002_R1_001.gz  YIP_L002_R2_001.gz


pd.DataFrame({'col2': pd.DataFrame(col2).unstack(), 'col3': pd.DataFrame(col3).unstack(), 'col4': pd.DataFrame(col4).unstack()}).reset_index(level=0)

returns

  level_0  col2                col3                col4
0     ASP     1  ASP_L001_R1_001.gz  ASP_L001_R2_001.gz
1     ASP     2  ASP_L002_R1_001.gz  ASP_L002_R2_001.gz
0     MOB     1  MOB_L001_R1_001.gz  MOB_L001_R2_001.gz
1     MOB     2  MOB_L002_R1_001.gz  MOB_L002_R2_001.gz
0     YIP     1  YIP_L001_R1_001.gz  YIP_L001_R2_001.gz
1     YIP     2  YIP_L002_R1_001.gz  YIP_L002_R2_001.gz
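If you want the key column to be called `col1` instead of `level_0`, and a clean 0..5 index, the unstack approach can be wrapped up roughly like this. `dicts_to_frame`, `key_name` and the `pos` level name are hypothetical names I made up for this sketch; they are not pandas API.

```python
import pandas as pd

dictionary_col2 = {'MOB': [1, 2], 'ASP': [1, 2], 'YIP': [1, 2]}
dictionary_col3 = {'MOB': ['MOB_L001_R1_001.gz', 'MOB_L002_R1_001.gz'],
                   'ASP': ['ASP_L001_R1_001.gz', 'ASP_L002_R1_001.gz'],
                   'YIP': ['YIP_L001_R1_001.gz', 'YIP_L002_R1_001.gz']}
dictionary_col4 = {'MOB': ['MOB_L001_R2_001.gz', 'MOB_L002_R2_001.gz'],
                   'ASP': ['ASP_L001_R2_001.gz', 'ASP_L002_R2_001.gz'],
                   'YIP': ['YIP_L001_R2_001.gz', 'YIP_L002_R2_001.gz']}


def dicts_to_frame(named_dicts, key_name='col1'):
    """Hypothetical helper: one row per (key, list position), one column per dict."""
    # DataFrame(d).unstack() gives a Series indexed by (key, list position).
    columns = {name: pd.DataFrame(d).unstack()
               for name, d in named_dicts.items()}
    df = pd.DataFrame(columns)
    # Name both index levels so the key level can be pulled out by name.
    df = df.rename_axis([key_name, 'pos'])
    # Key level becomes a regular column; the position level is dropped.
    return df.reset_index(level=key_name).reset_index(drop=True)


df = dicts_to_frame({'col2': dictionary_col2,
                     'col3': dictionary_col3,
                     'col4': dictionary_col4})
print(df)
```

The rows for each key stay together, in list order. The order of the keys themselves can depend on the Python and pandas versions (the output above shows them sorted alphabetically), so sort explicitly if a particular order matters.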