I have a dataframe called ref(first dataframe) with columns c1, c2 ,c3 and c4.
ref= pd.DataFrame([[1,3,.3,7],[0,4,.5,4.5],[2,5,.6,3]], columns=['c1','c2','c3','c4'])
print(ref)
c1 c2 c3 c4
0 1 3 0.3 7.0
1 0 4 0.5 4.5
2 2 5 0.6 3.0
I wanted to create a new column i.e, c5 ( second dataframe) that has all the values from columns c1,c2,c3 and c4.
I tried concat, merge columns but i cannot get it work.
Please let me know if you have a solutions?
To combine the values of all the column and append them into a single column, we will use apply() method inside which we will write our expression to do the same. Whenever we want to perform some operation on the entire DataFrame, we use apply() method.
To merge two pandas DataFrames on multiple columns use pandas. merge() method. merge() is considered more versatile and flexible and we also have the same method in DataFrame.
To merge two Pandas DataFrame with common column, use the merge() function and set the ON parameter as the column name.
groupby() can take the list of columns to group by multiple columns and use the aggregate functions to apply single or multiple aggregations at the same time.
You can use unstack
for creating Series
from DataFrame
and then concat
to original:
print (pd.concat([ref, ref.unstack().reset_index(drop=True).rename('c5')], axis=1))
c1 c2 c3 c4 c5
0 1.0 3.0 0.3 7.0 1.0
1 0.0 4.0 0.5 4.5 0.0
2 2.0 5.0 0.6 3.0 2.0
3 NaN NaN NaN NaN 3.0
4 NaN NaN NaN NaN 4.0
5 NaN NaN NaN NaN 5.0
6 NaN NaN NaN NaN 0.3
7 NaN NaN NaN NaN 0.5
8 NaN NaN NaN NaN 0.6
9 NaN NaN NaN NaN 7.0
10 NaN NaN NaN NaN 4.5
11 NaN NaN NaN NaN 3.0
Alternative solution for creating Series
is convert df
to numpy array
by values
and then reshape by ravel
:
print (pd.concat([ref, pd.Series(ref.values.ravel('F'), name='c5')], axis=1))
c1 c2 c3 c4 c5
0 1.0 3.0 0.3 7.0 1.0
1 0.0 4.0 0.5 4.5 0.0
2 2.0 5.0 0.6 3.0 2.0
3 NaN NaN NaN NaN 3.0
4 NaN NaN NaN NaN 4.0
5 NaN NaN NaN NaN 5.0
6 NaN NaN NaN NaN 0.3
7 NaN NaN NaN NaN 0.5
8 NaN NaN NaN NaN 0.6
9 NaN NaN NaN NaN 7.0
10 NaN NaN NaN NaN 4.5
11 NaN NaN NaN NaN 3.0
using join
+ ravel('F')
ref.join(pd.Series(ref.values.ravel('F')).to_frame('c5'), how='right')
using join
+ T.ravel()
ref.join(pd.Series(ref.values.T.ravel()).to_frame('c5'), how='right')
pd.concat
+ T.stack()
+ rename
pd.concat([ref, ref.T.stack().reset_index(drop=True).rename('c5')], axis=1)
way too many transposes + append
ref.T.append(ref.T.stack().reset_index(drop=True).rename('c5')).T
combine_first
+ ravel('F')
<--- my favorite
ref.combine_first(pd.Series(ref.values.ravel('F')).to_frame('c5'))
All yield
c1 c2 c3 c4 c5
0 1.0 3.0 0.3 7.0 1.0
1 0.0 4.0 0.5 4.5 0.0
2 2.0 5.0 0.6 3.0 2.0
3 NaN NaN NaN NaN 3.0
4 NaN NaN NaN NaN 4.0
5 NaN NaN NaN NaN 5.0
6 NaN NaN NaN NaN 0.3
7 NaN NaN NaN NaN 0.5
8 NaN NaN NaN NaN 0.6
9 NaN NaN NaN NaN 7.0
10 NaN NaN NaN NaN 4.5
11 NaN NaN NaN NaN 3.0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With