Splitting a column of dictionaries into separate columns



  • I have a column:

    all_rel
    {'Иван': 0.358, 'Михаил':0.25, 'Илья': 0.456}
    {'Иван': 0.698, 'Михаил':0.125, 'Илья': 0.426}
    {'Иван': 0.568, 'Михаил':0.145, 'Илья': 0.464}
    {'Иван': 0.698, 'Михаил':0.125, 'Илья': 0.426}

    I need to split the data into several columns, like this:

    all_rel                                         Иван   Михаил  Илья
    {'Иван': 0.358, 'Михаил':0.25, 'Илья': 0.456}   0.358  0.25    0.456
    {'Иван': 0.698, 'Михаил':0.125, 'Илья': 0.426}  0.698  0.125   0.426
    {'Иван': 0.568, 'Михаил':0.145, 'Илья': 0.464}  0.568  0.145   0.464
    {'Иван': 0.698, 'Михаил':0.125, 'Илья': 0.426}  0.698  0.125   0.426

    I don't know in advance which keys the dictionaries contain or how many there are, but there is a dictionary with all the possible names:

    name_dict = {'Иван': 0, 'Михаил':0, 'Илья': 0,'Алексей': 0, 'Андрей': 0}

    My code:

    
    for i in range(srt.shape[0]):
        name_dict.update(ast.literal_eval(srt.all_rel[i]))
        srt.rel_dict[i] = name_dict.copy()
    
    for key in srt.rel_dict[i]:
        srt[key] = ' '
    

    for i in range(srt.shape[0]):
        for key in srt.rel_dict[i]:
            srt[key][i] = srt.rel_dict[i][key]

    srt is the name of the DataFrame.

    The final table may also contain columns for the other names from the reference dictionary, with 0 in every row.

    Is there any way to do this differently?



  • There is an interesting answer on the English Stack Overflow: https://stackoverflow.com/a/55355928/8324991. In your case:

    import pandas as pd
    

    df = pd.DataFrame({'all_rel':[
    {'Иван': 0.358, 'Михаил':0.25, 'Илья': 0.456},
    {'Иван': 0.698, 'Михаил':0.125, 'Илья': 0.426},
    {'Иван': 0.568, 'Михаил':0.145, 'Илья': 0.464},
    {'Иван': 0.698, 'Михаил':0.125, 'Илья': 0.426},
    ]})

    pd.json_normalize(df['all_rel'])

    Output:

        Иван  Михаил   Илья
    0  0.358   0.250  0.456
    1  0.698   0.125  0.426
    2  0.568   0.145  0.464
    3  0.698   0.125  0.426

    The linked answer describes other interesting options, but this one is fast and simple.
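
    For comparison, here is one alternative that avoids json_normalize by building the frame directly from the list of dictionaries. This is only a sketch and not necessarily one of the options from the linked answer; the name expanded is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'all_rel': [
    {'Иван': 0.358, 'Михаил': 0.25, 'Илья': 0.456},
    {'Иван': 0.698, 'Михаил': 0.125, 'Илья': 0.426},
]})

# Each dictionary becomes one row; its keys become the column names.
# Reusing df.index keeps the rows aligned with the original frame.
expanded = pd.DataFrame(df['all_rel'].tolist(), index=df.index)
print(expanded)
```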

    The converted data can be joined back to the main DataFrame by index:

    df = df.drop(columns='all_rel').join(pd.json_normalize(df['all_rel']))

    Or, as the esteemed MaxU suggests:

    df = df.join(pd.json_normalize(df.pop('all_rel')))
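
    If, as in the question, the dictionaries are stored as strings and every name from name_dict should get its own column, the same idea can be extended. A minimal sketch, assuming the strings are valid Python literals and that missing names should be filled with 0; the names parsed, expanded and result are made up for illustration.

```python
import ast
import pandas as pd

# Reference dictionary with all possible names (from the question).
name_dict = {'Иван': 0, 'Михаил': 0, 'Илья': 0, 'Алексей': 0, 'Андрей': 0}

# Assumption: all_rel holds string representations of dictionaries.
srt = pd.DataFrame({'all_rel': [
    "{'Иван': 0.358, 'Михаил':0.25, 'Илья': 0.456}",
    "{'Иван': 0.698, 'Михаил':0.125, 'Илья': 0.426}",
]})

# Turn each string into a real dictionary.
parsed = srt['all_rel'].apply(ast.literal_eval)

# Expand the dictionaries into columns.
expanded = pd.json_normalize(parsed.tolist())

# Add a column for every known name and replace missing values with 0.
expanded = expanded.reindex(columns=list(name_dict)).fillna(0)

# Align with the original index before joining.
expanded.index = srt.index
result = srt.join(expanded)
print(result)
```

    Unlike the loop in the question, this does not carry values over from one row to the next, because each row is parsed independently.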


