Pandas concat dictionary to dataframe

I have an existing dataframe and I'm trying to concatenate a dictionary to it, where the lists in the dictionary have a different length from the dataframe:

>>> df
         A        B        C
0  0.46324  0.32425  0.42194
1  0.10596  0.35910  0.21004
2  0.69209  0.12951  0.50186
3  0.04901  0.31203  0.11035
4  0.43104  0.62413  0.20567
5  0.43412  0.13720  0.11052
6  0.14512  0.10532  0.05310

and

test = {"One": [0.23413, 0.19235, 0.51221], "Two": [0.01293, 0.12235, 0.63291]}

I'm trying to add test to df while changing the keys to "D" and "E". I've had a look at

http://pandas.pydata.org/pandas-docs/stable/merging.html and http://pandas.pydata.org/pandas-docs/stable/generated/pandas.concat.html

which seem to indicate that I should be able to concatenate the dictionary to the dataframe.

I've tried:

pd.concat([df, test], axis=1, ignore_index=True, keys=["D", "E"])
pd.concat([df, test], axis=1, ignore_index=True)

but I'm not having any luck. The result I'm trying to achieve is:

df
         A        B        C        D        E
0  0.46324  0.32425  0.42194  0.23413  0.01293  
1  0.10596  0.35910  0.21004  0.19235  0.12235
2  0.69209  0.12951  0.50186  0.51221  0.63291
3  0.04901  0.31203  0.11035      NaN      NaN
4  0.43104  0.62413  0.20567      NaN      NaN 
5  0.43412  0.13720  0.11052      NaN      NaN
6  0.14512  0.10532  0.05310      NaN      NaN
asked Apr 01 '16 21:04 by Lukasz


2 Answers

One way to do that is to convert the dictionary to a dataframe, rename its columns, and join it:

df.join(pd.DataFrame(test).rename(columns={'One':'D','Two':'E'}))

          A       B       C       D       E
0   0.46324 0.32425 0.42194 0.23413 0.01293
1   0.10596 0.35910 0.21004 0.19235 0.12235
2   0.69209 0.12951 0.50186 0.51221 0.63291
3   0.04901 0.31203 0.11035     NaN     NaN
4   0.43104 0.62413 0.20567     NaN     NaN
5   0.43412 0.13720 0.11052     NaN     NaN
6   0.14512 0.10532 0.05310     NaN     NaN

This works because join aligns the two frames on their index. As @Alexander correctly mentioned, the number of rows being concatenated should ideally match; otherwise, as in your case, the missing rows are filled with NaN.
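For reference, here is a self-contained sketch of the join approach using the data from the question. The names `extra` and `result` are illustrative only and do not appear in the original post. It assumes both frames use the default integer index, which is what lets rows 3 to 6 pick up NaN.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "A": [0.46324, 0.10596, 0.69209, 0.04901, 0.43104, 0.43412, 0.14512],
    "B": [0.32425, 0.35910, 0.12951, 0.31203, 0.62413, 0.13720, 0.10532],
    "C": [0.42194, 0.21004, 0.50186, 0.11035, 0.20567, 0.11052, 0.05310],
})
test = {"One": [0.23413, 0.19235, 0.51221], "Two": [0.01293, 0.12235, 0.63291]}

# Turn the dict into a 3-row frame and rename its columns by key.
extra = pd.DataFrame(test).rename(columns={"One": "D", "Two": "E"})

# join() is a left join on the index by default, so all 7 rows of df
# are kept and index labels missing from `extra` become NaN.
result = df.join(extra)
print(result)
```

Because the rename is keyed on the dictionary keys, the result does not depend on the order in which the dictionary yields them.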

answered Oct 13 '22 21:10 by Sergey Bushmanov


Assuming you want to add them as rows:

>>> pd.concat([df, pd.DataFrame(list(test.values()), columns=df.columns)], ignore_index=True)
         A        B        C
0  0.46324  0.32425  0.42194
1  0.10596  0.35910  0.21004
2  0.69209  0.12951  0.50186
3  0.04901  0.31203  0.11035
4  0.43104  0.62413  0.20567
5  0.43412  0.13720  0.11052
6  0.14512  0.10532  0.05310
7  0.01293  0.12235  0.63291
8  0.23413  0.19235  0.51221
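A minimal, self-contained sketch of the row-wise variant follows. The names `new_rows` and `stacked` are illustrative. It only works here because each list in test happens to have three values, matching the three columns of df. On Python 3.7 and later the rows follow the dictionary's insertion order, so "One" is appended before "Two", which differs from the output above.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "A": [0.46324, 0.10596, 0.69209, 0.04901, 0.43104, 0.43412, 0.14512],
    "B": [0.32425, 0.35910, 0.12951, 0.31203, 0.62413, 0.13720, 0.10532],
    "C": [0.42194, 0.21004, 0.50186, 0.11035, 0.20567, 0.11052, 0.05310],
})
test = {"One": [0.23413, 0.19235, 0.51221], "Two": [0.01293, 0.12235, 0.63291]}

# Each dict value becomes one row; list() makes the dict view an
# ordinary list of lists before it reaches the DataFrame constructor.
new_rows = pd.DataFrame(list(test.values()), columns=df.columns)

# Stack the frames vertically and renumber the index 0..8.
stacked = pd.concat([df, new_rows], ignore_index=True)
print(stacked)
```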

If added as new columns:

df_new = pd.concat([df, pd.DataFrame(list(test.values())).T], ignore_index=True, axis=1)
df_new.columns = \
    df.columns.tolist() + [{'One': 'D', 'Two': 'E'}.get(k) for k in test.keys()]

>>> df_new
         A        B        C        E        D
0  0.46324  0.32425  0.42194  0.01293  0.23413
1  0.10596  0.35910  0.21004  0.12235  0.19235
2  0.69209  0.12951  0.50186  0.63291  0.51221
3  0.04901  0.31203  0.11035      NaN      NaN
4  0.43104  0.62413  0.20567      NaN      NaN
5  0.43412  0.13720  0.11052      NaN      NaN
6  0.14512  0.10532  0.05310      NaN      NaN

Before Python 3.7, key order was not guaranteed in dictionaries (e.g. test), which is why E appears before D above. The new column names therefore need to be mapped to the keys rather than assigned by position.
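One way to sidestep the ordering question is to rename by key before concatenating, so the column labels travel with the data. This is a sketch under the assumption that both frames use the default integer index; `name_map` and `extra` are hypothetical names introduced for illustration.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "A": [0.46324, 0.10596, 0.69209, 0.04901, 0.43104, 0.43412, 0.14512],
    "B": [0.32425, 0.35910, 0.12951, 0.31203, 0.62413, 0.13720, 0.10532],
    "C": [0.42194, 0.21004, 0.50186, 0.11035, 0.20567, 0.11052, 0.05310],
})
test = {"One": [0.23413, 0.19235, 0.51221], "Two": [0.01293, 0.12235, 0.63291]}

# Explicit key -> column-name mapping (hypothetical helper name).
name_map = {"One": "D", "Two": "E"}

# Build the frame from the dict so each key labels its own column,
# then rename by key instead of by position.
extra = pd.DataFrame(test).rename(columns=name_map)

# axis=1 aligns on the index with an outer join, so the shorter
# frame is padded with NaN for rows 3 to 6.
df_new = pd.concat([df, extra], axis=1)
print(df_new)
```

Here ignore_index is left out on purpose: with axis=1 it would replace the column labels with 0, 1, 2, ... and discard the renamed keys.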

answered Oct 13 '22 20:10 by Alexander