I have a dataframe df and 2 separate dictionaries, dict_1 and dict_2. Both dictionaries have the same keys but different values. dict_1 has key-value pairs where the values are unique ids that correspond to the id column of the dataframe df. I want to use the 2 dictionaries and the unique ids from dict_1 to append the values of dict_2 to the dataframe df.
Example of dataframe df:
col_1 col_2 id col_3
100 500 a1 478
785 400 a1 490
... ... a1 ...
... ... a2 ...
... ... a2 ...
... ... a2 ...
... ... a3 ...
... ... a3 ...
... ... a3 ...
... ... a4 ...
... ... a4 ...
... ... a4 ...
Example of dict_1:
{1: ['a1', 'a3'],
 2: ['a2', 'a4'],
 3: [...],
 4: [...],
 5: [...]}
Example of dict_2:
{1: [0, 1],
 2: [1, 1],
 3: [...],
 4: [...],
 5: [...]}
I'm trying to append the data from dict_2 to the main df, using the ids from dict_1. In other words, I want to add the 2 values (or n values) from each list in dict_2 as 2 columns (or n columns) of df.
Resultant df:
col_1 col_2 id col_3 new_col_1 new_col_2
100 500 a1 478 0 1
785 400 a1 490 0 1
... ... a1 ... 0 1
... ... a2 ... 1 1
... ... a2 ... 1 1
... ... a2 ... 1 1
... ... a3 ... 0 1
... ... a3 ... 0 1
... ... a3 ... 0 1
... ... a4 ... 1 1
... ... a4 ... 1 1
... ... a4 ... 1 1
IIUC, the keys in your two dictionaries are aligned. One way is to create a dataframe with a column id containing the values in dict_1 and 2 columns (in this case, but there can be more) from the values in dict_2, aligned on the same key. Then use merge on id to get the result back in df.
import pandas as pd

# the two dictionaries; note that in dict_2 I added an element to the list for key 2
# to show it works for any number of columns
dict_1 = {1: ['a1', 'a3'], 2: ['a2', 'a4']}
dict_2 = {1: [0, 1], 2: [1, 1, 2]}
# create a dataframe from dict_2; there might be an easier way, but I can't find it
df_2 = pd.concat([pd.Series(vals, name=key)
                  for key, vals in dict_2.items()], axis=1).T
print(df_2)  # the index holds the keys, and the columns are the future new_col_x
0 1 2
1 0.0 1.0 NaN
2 1.0 1.0 2.0
# concat with dict_1 once the values in its lists have been exploded;
# this is just a print to see what it's doing
print(pd.concat([pd.Series(dict_1, name='id').explode(), df_2], axis=1))
id 0 1 2
1 a1 0.0 1.0 NaN
1 a3 0.0 1.0 NaN
2 a2 1.0 1.0 2.0
2 a4 1.0 1.0 2.0
# reuse the previous concat, rename the new columns, and merge into df
df = df.merge(pd.concat([pd.Series(dict_1, name='id').explode(), df_2], axis=1)
                .rename(columns=lambda x: f'new_col_{x+1}'
                                          if isinstance(x, int) else x),
              on='id', how='left')
and you get
print (df)
col_1 col_2 id col_3 new_col_1 new_col_2 new_col_3
0 100 500 a1 478 0.0 1.0 NaN
1 785 400 a1 490 0.0 1.0 NaN
2 ... ... a1 ... 0.0 1.0 NaN
3 ... ... a2 ... 1.0 1.0 2.0
4 ... ... a2 ... 1.0 1.0 2.0
5 ... ... a2 ... 1.0 1.0 2.0
6 ... ... a3 ... 0.0 1.0 NaN
7 ... ... a3 ... 0.0 1.0 NaN
8 ... ... a3 ... 0.0 1.0 NaN
9 ... ... a4 ... 1.0 1.0 2.0
10 ... ... a4 ... 1.0 1.0 2.0
11 ... ... a4 ... 1.0 1.0 2.0
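For anyone who wants to run the idea end to end, here is a minimal sketch of the same "build a lookup table keyed by id, then merge" approach. It is a variation rather than the exact code above: the sample rows of df are made up (the question elides most of them), the names values, ids and lookup are only illustrative, and it uses DataFrame.join on the shared key index in place of the axis=1 concat.

```python
import pandas as pd

# Hypothetical sample rows standing in for the elided data in the question.
df = pd.DataFrame({
    "col_1": [100, 785, 300, 410],
    "col_2": [500, 400, 250, 120],
    "id": ["a1", "a1", "a2", "a3"],
    "col_3": [478, 490, 333, 222],
})
dict_1 = {1: ["a1", "a3"], 2: ["a2", "a4"]}
dict_2 = {1: [0, 1], 2: [1, 1, 2]}

# One row per key, one column per list position; shorter lists are padded with NaN.
values = pd.DataFrame.from_dict(dict_2, orient="index")
values.columns = [f"new_col_{i + 1}" for i in range(values.shape[1])]

# One row per id, indexed by the dictionary key that id was listed under.
ids = pd.Series(dict_1, name="id").explode().to_frame()

# Line the two up on the shared key index, then attach the result to df by id.
lookup = ids.join(values)
result = df.merge(lookup, on="id", how="left")
print(result)
```

With how='left', rows of df whose id appears in neither dictionary are kept and simply get NaN in the new columns.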
Let us try explode with map:
s=pd.Series(dict_1).explode().reset_index()
s.columns=[1,2]
df['new_1']=df.id.map(dict(zip(s[2],s[1])))
#s=pd.Series(dict_2).explode().reset_index()
#s.columns=[1,2]
#df['new_2']=df.id.map(dict(zip(s[2],s[1])))
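As written, the snippet above maps each id back to its dictionary key. One possible way to carry that through to the values of dict_2 is sketched below. This is an assumption about the intent, not the answerer's code: the sample df (only an id column) and the names id_to_key, keys and position are hypothetical.

```python
import pandas as pd

# Hypothetical sample frame; only the id column matters for this sketch.
df = pd.DataFrame({"id": ["a1", "a1", "a2", "a3", "a4"]})
dict_1 = {1: ["a1", "a3"], 2: ["a2", "a4"]}
dict_2 = {1: [0, 1], 2: [1, 1]}

# Invert dict_1 so that each id points back to the key it was listed under.
s = pd.Series(dict_1).explode()
id_to_key = dict(zip(s, s.index))

# Map every row's id to its key, then map the key to the i-th value in dict_2.
keys = df["id"].map(id_to_key)
n_cols = max(len(v) for v in dict_2.values())
for i in range(n_cols):
    # None is used where a list has no i-th element (not the case in this sample).
    position = {k: (v[i] if i < len(v) else None) for k, v in dict_2.items()}
    df[f"new_{i + 1}"] = keys.map(position)

print(df)
```

This adds one column per list position using map alone, with no merge.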
Assume you have n values in the lists of dict_2 and want to construct n new columns in df, such as
dict_2 = {1: [0, 1], 2: [1, 1, 6, 9]}
Use a dict comprehension to construct a new dictionary from dict_2 and dict_1, and use it to construct a new dataframe with orient='index'. Chain rename and add_prefix to name the new columns. Finally, merge it back to df with the options left_on='id', right_index=True.
key_dict = {x: v for k, v in dict_2.items() for x in dict_1[k]}
df_add = (pd.DataFrame.from_dict(key_dict, orient='index')
            .rename(lambda x: int(x)+1, axis=1).add_prefix('newcol_'))
df_final = df.merge(df_add, left_on='id', right_index=True)
Out[33]:
col_1 col_2 id col_3 newcol_1 newcol_2 newcol_3 newcol_4
0 100 500 a1 478 0 1 NaN NaN
1 785 400 a1 490 0 1 NaN NaN
2 ... ... a1 ... 0 1 NaN NaN
3 ... ... a2 ... 1 1 6.0 9.0
4 ... ... a2 ... 1 1 6.0 9.0
5 ... ... a2 ... 1 1 6.0 9.0
6 ... ... a3 ... 0 1 NaN NaN
7 ... ... a3 ... 0 1 NaN NaN
8 ... ... a3 ... 0 1 NaN NaN
9 ... ... a4 ... 1 1 6.0 9.0
10 ... ... a4 ... 1 1 6.0 9.0
11 ... ... a4 ... 1 1 6.0 9.0
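One thing to watch with this merge: DataFrame.merge defaults to how='inner', so any row of df whose id is missing from dict_1 would be dropped. The sketch below shows the difference on made-up data. The sample df and the id 'a9' are hypothetical, and the columns are named with a list comprehension instead of the rename/add_prefix chain.

```python
import pandas as pd

# Hypothetical sample frame; 'a9' is deliberately absent from dict_1.
df = pd.DataFrame({"id": ["a1", "a2", "a9"], "col_1": [100, 785, 300]})
dict_1 = {1: ["a1", "a3"], 2: ["a2", "a4"]}
dict_2 = {1: [0, 1], 2: [1, 1, 6, 9]}

# Re-key the values of dict_2 by id instead of by the shared dictionary key.
key_dict = {x: v for k, v in dict_2.items() for x in dict_1[k]}
df_add = pd.DataFrame.from_dict(key_dict, orient="index")
df_add.columns = [f"newcol_{i + 1}" for i in range(df_add.shape[1])]

# The default inner merge drops the 'a9' row; how='left' keeps it with NaN values.
inner = df.merge(df_add, left_on="id", right_index=True)
left = df.merge(df_add, left_on="id", right_index=True, how="left")
print(inner)
print(left)
```

If every id in df is guaranteed to appear in dict_1, the two merges give the same rows.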
Construct a DataFrame that combines both of the dicts along the keys. Use the DataFrame.from_dict constructor and pandas will deal with the alignment on keys.
Then use wide_to_long to reshape it so that each 'id' in dict_1 gets linked with all of the columns in dict_2. After that, it is a simple merge to join back to the original.
dict_1 = {1: ['a1', 'a3'], 2: ['a2', 'a4']}
dict_2 = {1: [0, 1], 2: [1, 1, 2]}
df1 = pd.concat([pd.DataFrame.from_dict(dict_1, orient='index').add_prefix('id'),
                 pd.DataFrame.from_dict(dict_2, orient='index').add_prefix('new_col')], axis=1)
# id0 id1 new_col0 new_col1 new_col2
#1 a1 a3 0 1 NaN
#2 a2 a4 1 1 2.0
df1 = (pd.wide_to_long(df1, i=[x for x in df1.columns if 'new_col' in x],
                       j='will_drop', stubnames=['id'])
         .reset_index().drop(columns='will_drop'))
# new_col0 new_col1 new_col2 id
#0 0 1 NaN a1
#1 0 1 NaN a3
#2 1 1 2.0 a2
#3 1 1 2.0 a4
df = df.merge(df1, how='left')
col_1 col_2 id col_3 new_col0 new_col1 new_col2
0 100 500 a1 478 0 1 NaN
1 785 400 a1 490 0 1 NaN
2 ... ... a1 ... 0 1 NaN
3 ... ... a2 ... 1 1 2.0
4 ... ... a2 ... 1 1 2.0
5 ... ... a2 ... 1 1 2.0
6 ... ... a3 ... 0 1 NaN
7 ... ... a3 ... 0 1 NaN
8 ... ... a3 ... 0 1 NaN
9 ... ... a4 ... 1 1 2.0
10 ... ... a4 ... 1 1 2.0
11 ... ... a4 ... 1 1 2.0