Use dictionary data to append data to pandas dataframe

Tags:

python

dictionary

pandas

I have a dataframe and two separate dictionaries. Both dictionaries have the same keys but different values. In dict_1, the values are lists of unique ids that correspond to the id column of the dataframe df. I want to use the two dictionaries, matching on the ids from dict_1, to append the values of dict_2 to df.

Example of dataframe df:

col_1    col_2    id   col_3
 100      500     a1    478
 785      400     a1    490
 ...      ...     a1    ...
 ...      ...     a2    ...
 ...      ...     a2    ...
 ...      ...     a2    ...
 ...      ...     a3    ...
 ...      ...     a3    ...
 ...      ...     a3    ...
 ...      ...     a4    ...
 ...      ...     a4    ...
 ...      ...     a4    ...

Example of dict_1:

1:['a1', 'a3'],
2:['a2', 'a4'],
3:[...],
4:[...],
5:[...],
.

Example of dict_2:

1:[0, 1],
2:[1, 1],
3:[...],
4:[...],
5:[...],
.

I'm trying to append the data from dict_2 to the main df, using the ids from dict_1. In other words, I want to add the 2 values (or n values) from each list in dict_2 as 2 columns (or n columns) of df.

Resultant df:

col_1    col_2    id   col_3   new_col_1   new_col_2 
 100      500     a1    478        0           1
 785      400     a1    490        0           1
 ...      ...     a1    ...        0           1
 ...      ...     a2    ...        1           1
 ...      ...     a2    ...        1           1
 ...      ...     a2    ...        1           1
 ...      ...     a3    ...        0           1
 ...      ...     a3    ...        0           1
 ...      ...     a3    ...        0           1
 ...      ...     a4    ...        1           1
 ...      ...     a4    ...        1           1
 ...      ...     a4    ...        1           1

asked May 26 '20 16:05

user9996043

4 Answers

IIUC, the keys in your two dictionaries are aligned. One way is to create a dataframe with an id column containing the values in dict_1 and 2 columns (2 in this case, but there can be more) built from the values in dict_2, aligned on the same key. Then merge on id to bring the result back into df.

import pandas as pd

# the two dictionaries; dict_2 has an extra element in the list for key 2
# to show this works for any number of columns
dict_1 = {1: ['a1', 'a3'], 2: ['a2', 'a4']}
dict_2 = {1: [0, 1], 2: [1, 1, 2]}

# create a dataframe from dict_2 (there may be a simpler way)
df_2 = pd.concat([pd.Series(vals, name=key)
                  for key, vals in dict_2.items()], axis=1).T
print(df_2)  # the index holds the keys; the columns become the future new_col_x
     0    1    2
1  0.0  1.0  NaN
2  1.0  1.0  2.0

# explode the lists in dict_1, then concat the result with df_2;
# printed here only to show the intermediate result
print(pd.concat([pd.Series(dict_1, name='id').explode(), df_2], axis=1))
   id    0    1    2
1  a1  0.0  1.0  NaN
1  a3  0.0  1.0  NaN
2  a2  1.0  1.0  2.0
2  a4  1.0  1.0  2.0

# reuse the concat above, rename the integer columns, and merge into df
df = df.merge(pd.concat([pd.Series(dict_1, name='id').explode(), df_2], axis=1)
                .rename(columns=lambda x: f'new_col_{x+1}'
                                          if isinstance(x, int) else x),
              on='id', how='left')

and you get

print (df)
   col_1 col_2  id col_3  new_col_1  new_col_2  new_col_3
0    100   500  a1   478        0.0        1.0        NaN
1    785   400  a1   490        0.0        1.0        NaN
2    ...   ...  a1   ...        0.0        1.0        NaN
3    ...   ...  a2   ...        1.0        1.0        2.0
4    ...   ...  a2   ...        1.0        1.0        2.0
5    ...   ...  a2   ...        1.0        1.0        2.0
6    ...   ...  a3   ...        0.0        1.0        NaN
7    ...   ...  a3   ...        0.0        1.0        NaN
8    ...   ...  a3   ...        0.0        1.0        NaN
9    ...   ...  a4   ...        1.0        1.0        2.0
10   ...   ...  a4   ...        1.0        1.0        2.0
11   ...   ...  a4   ...        1.0        1.0        2.0
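The new columns above come out as floats because the shorter list is padded with NaN. If integer columns are preferred, one option is pandas' nullable Int64 dtype. This is only a sketch that rebuilds df_2 as in the answer; the name df_2_int is an assumption for illustration.

```python
import pandas as pd

dict_2 = {1: [0, 1], 2: [1, 1, 2]}

# Same construction as in the answer: one Series per key, then transpose.
df_2 = pd.concat([pd.Series(vals, name=key)
                  for key, vals in dict_2.items()], axis=1).T

# NaN padding forces float64; the nullable Int64 dtype keeps integers
# and stores the gaps as <NA> instead.
df_2_int = df_2.astype("Int64")
print(df_2_int)
```

The cast relies on every non-missing value being a whole number, which holds for the sample data here.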

answered Oct 08 '22 11:10

Ben.T

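Let us try explode with map. This maps each id in df to its key in dict_1:

s = pd.Series(dict_1).explode().reset_index()
s.columns = [1, 2]  # column 1 holds the dict keys, column 2 the exploded ids
df['new_1'] = df.id.map(dict(zip(s[2], s[1])))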

#s=pd.Series(dict_2).explode().reset_index()
#s.columns=[1,2]
#df['new_2']=df.id.map(dict(zip(s[2],s[1])))
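One possible way to carry the explode-and-map idea through to the n-column result is sketched below. It is not part of the original answer: the sample df and the names id_to_key, keys and new_cols are assumptions for illustration, and it assumes every id in df appears somewhere in dict_1.

```python
import pandas as pd

# Hypothetical sample data shaped like the question's example.
df = pd.DataFrame({
    "col_1": [100, 785, 10, 20],
    "id": ["a1", "a1", "a2", "a3"],
})
dict_1 = {1: ["a1", "a3"], 2: ["a2", "a4"]}
dict_2 = {1: [0, 1], 2: [1, 1]}

# Explode dict_1 so each id sits on its own row, indexed by its key,
# then invert that into an id -> key lookup.
s = pd.Series(dict_1).explode()
id_to_key = dict(zip(s.values, s.index))

# Map each row's id to its key, then look up that key's list in dict_2.
keys = df["id"].map(id_to_key)
new_cols = pd.DataFrame(keys.map(dict_2).tolist(), index=df.index)
new_cols.columns = [f"new_col_{i + 1}" for i in new_cols.columns]

result = df.join(new_cols)
print(result)
```

Because the lookup goes through map rather than merge, the row order and index of df are left untouched.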

answered Oct 08 '22 11:10

BENY

Assume the lists in dict_2 hold n values and you want to construct n new columns in df, for example:

dict_2 = {1: [0, 1], 2: [1, 1, 6, 9]}

Use a dict comprehension to construct a new dictionary from dict_2 and dict_1, keyed by id, and build a new dataframe from it with orient='index'. Chain rename and add_prefix to name the columns. Finally, merge it back into df with left_on='id', right_index=True.

key_dict = {x: v for k, v in dict_2.items() for x in dict_1[k]}

df_add = (pd.DataFrame.from_dict(key_dict, orient='index')
                      .rename(lambda x: int(x)+1, axis=1).add_prefix('newcol_'))
    
df_final = df.merge(df_add, left_on='id', right_index=True)

Out[33]:
   col_1 col_2  id col_3  newcol_1  newcol_2  newcol_3  newcol_4
0    100   500  a1   478         0         1       NaN       NaN
1    785   400  a1   490         0         1       NaN       NaN
2    ...   ...  a1   ...         0         1       NaN       NaN
3    ...   ...  a2   ...         1         1       6.0       9.0
4    ...   ...  a2   ...         1         1       6.0       9.0
5    ...   ...  a2   ...         1         1       6.0       9.0
6    ...   ...  a3   ...         0         1       NaN       NaN
7    ...   ...  a3   ...         0         1       NaN       NaN
8    ...   ...  a3   ...         0         1       NaN       NaN
9    ...   ...  a4   ...         1         1       6.0       9.0
10   ...   ...  a4   ...         1         1       6.0       9.0
11   ...   ...  a4   ...         1         1       6.0       9.0
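Note that merge defaults to how='inner', so any row of df whose id is missing from dict_1 would be dropped. The sketch below contrasts that with how='left'; the small df and the id 'zz' are made up for illustration.

```python
import pandas as pd

# Hypothetical df; 'zz' does not appear in dict_1.
df = pd.DataFrame({"id": ["a1", "a2", "zz"], "col_1": [100, 785, 5]})
dict_1 = {1: ["a1", "a3"], 2: ["a2", "a4"]}
dict_2 = {1: [0, 1], 2: [1, 1, 6, 9]}

# Same lookup table as in the answer: one row per id, one column per value.
key_dict = {x: v for k, v in dict_2.items() for x in dict_1[k]}
df_add = (pd.DataFrame.from_dict(key_dict, orient="index")
            .rename(lambda x: int(x) + 1, axis=1)
            .add_prefix("newcol_"))

# Default inner join: the 'zz' row disappears.
inner = df.merge(df_add, left_on="id", right_index=True)

# Left join: the 'zz' row is kept and its new columns are NaN.
left = df.merge(df_add, left_on="id", right_index=True, how="left")
print(left)
```

Which behaviour is wanted depends on whether unmatched ids should survive in the final dataframe.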

answered Oct 08 '22 11:10

Andy L.

Construct a DataFrame that combines both of the dicts along the keys. Use the DataFrame.from_dict constructor and pandas will deal with the alignment on keys.

Then use wide_to_long to reshape it so that each 'id' in dict_1 gets linked with all of the columns in dict_2. Then this is a simple merge to join back to the original.

Sample Data

dict_1 = {1: ['a1', 'a3'], 2: ['a2', 'a4']}
dict_2 = {1: [0, 1], 2: [1, 1, 2]}

Code

df1 = pd.concat([pd.DataFrame.from_dict(dict_1, orient='index').add_prefix('id'),
                 pd.DataFrame.from_dict(dict_2, orient='index').add_prefix('new_col')], axis=1)
#  id0 id1  new_col0  new_col1  new_col2
#1  a1  a3         0         1       NaN
#2  a2  a4         1         1       2.0

df1 = (pd.wide_to_long(df1, i=[x for x in df1.columns if 'new_col' in x],
                       j='will_drop', stubnames=['id'])
         .reset_index().drop(columns='will_drop'))
#   new_col0  new_col1  new_col2  id
#0         0         1       NaN  a1
#1         0         1       NaN  a3
#2         1         1       2.0  a2
#3         1         1       2.0  a4

df = df.merge(df1, how='left')

   col_1 col_2  id col_3  new_col0  new_col1  new_col2
0    100   500  a1   478         0         1       NaN
1    785   400  a1   490         0         1       NaN
2    ...   ...  a1   ...         0         1       NaN
3    ...   ...  a2   ...         1         1       2.0
4    ...   ...  a2   ...         1         1       2.0
5    ...   ...  a2   ...         1         1       2.0
6    ...   ...  a3   ...         0         1       NaN
7    ...   ...  a3   ...         0         1       NaN
8    ...   ...  a3   ...         0         1       NaN
9    ...   ...  a4   ...         1         1       2.0
10   ...   ...  a4   ...         1         1       2.0
11   ...   ...  a4   ...         1         1       2.0
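The final merge above joins on every column the two frames share, which here is only id. As a hedged variation, the key can be named explicitly and merge's validate option used to check that the lookup table holds each id at most once. The df and df1 below are small hypothetical stand-ins for the frames in the answer.

```python
import pandas as pd

# Hypothetical stand-ins for the answer's df and the reshaped lookup table df1.
df = pd.DataFrame({"col_1": [100, 785, 7], "id": ["a1", "a1", "a2"]})
df1 = pd.DataFrame({"new_col0": [0, 1], "new_col1": [1, 1], "id": ["a1", "a2"]})

# Name the join key and assert a many-to-one relationship:
# many rows of df may share an id, but df1 must list each id once.
merged = df.merge(df1, on="id", how="left", validate="m:1")
print(merged)
```

If df1 contained a repeated id, pandas would raise a MergeError instead of silently duplicating rows of df.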

answered Oct 08 '22 11:10

ALollz
