I have a pandas DataFrame (df) with columns ['key','col1','col2','col3'] and a pandas Series (sr) whose index holds the same values as the 'key' column of the DataFrame. I want to add the Series to the DataFrame as a new column called col4, matched on 'key'. I have the following code:
for index, row in segmention.iterrows():
    df[df['key']==row['key']]['col4']=sr.loc[row['key']]
The code is very slow. I assume there is a more efficient way to do this. Could you please help?
Two commands can be used here: the DataFrame join method or the pandas concat function. Both can combine columns from different objects into one DataFrame, aligning them by index.
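As a rough illustration of the join and concat routes, here is a minimal sketch. The data is the sample used later in this thread, and the names `joined` and `combined` are only placeholders, not anything from the original posts.

```python
import pandas as pd

df = pd.DataFrame({'col1': [4, 5, 6],
                   'col2': [7, 8, 9],
                   'col3': [1, 3, 5],
                   'key': list('ABC')})
sr = pd.Series([4, 1, 2], index=list('BCA'))

# join() needs a named Series; on='key' matches df['key'] against sr's index
joined = df.join(sr.rename('col4'), on='key')
print(joined)

# concat() aligns on the index, so 'key' has to become the index first
combined = pd.concat([df.set_index('key'), sr.rename('col4')], axis=1)
print(combined)
```

Both calls return a new DataFrame and leave df untouched, which may or may not be what you want compared with assigning a column in place.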
DataFrame.append() appended rows of another DataFrame, Series, dictionary, or a list of these to a DataFrame; it never added columns. It was deprecated in pandas 1.4 and removed in pandas 2.0 in favor of pandas.concat.
You can simply do:
df['col4'] = sr
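If I haven't misunderstood. Note that this assignment aligns sr with the index of df, not with the 'key' column, so it gives the intended result only when the index of df holds the key values.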
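Whether the direct assignment works depends on what the index of df looks like. The sketch below assumes a df with the default 0, 1, 2 index and a sr indexed by key, as in the sample further down. The names `direct` and `indexed` are made up for the illustration.

```python
import pandas as pd

df = pd.DataFrame({'key': list('ABC'), 'col1': [4, 5, 6]})  # default index 0, 1, 2
sr = pd.Series([4, 1, 2], index=list('BCA'))

direct = df.copy()
direct['col4'] = sr                  # aligned on 0, 1, 2: no label of sr matches
print(direct['col4'].isna().all())   # True, the whole column is NaN

indexed = df.set_index('key')
indexed['col4'] = sr                 # aligned on A, B, C
print(indexed['col4'].tolist())      # [2, 4, 1]
```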
Use map, as EdChum mentioned:
df['col4'] = df['key'].map(sr)
print (df)
   col1  col2  col3 key  col4
0     4     7     1   A     2
1     5     8     3   B     4
2     6     9     5   C     1
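For comparison with the loop in the question, here is a small sketch of what probably goes wrong there besides speed: indexing with a boolean mask returns a copy, so the chained assignment most likely never reaches the original frame. The names `looped`, `mapped`, and `extra` are invented for this example, and the warning filter is only there to keep the output quiet.

```python
import warnings
import pandas as pd

df = pd.DataFrame({'key': list('ABC'), 'col1': [4, 5, 6]})
sr = pd.Series([4, 1, 2], index=list('BCA'))

looped = df.copy()
with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # pandas warns about chained assignment here
    for _, row in looped.iterrows():
        # looped[mask] builds a temporary copy, so the write is lost
        looped[looped['key'] == row['key']]['col4'] = sr.loc[row['key']]
print('col4' in looped.columns)      # False

mapped = df.copy()
mapped['col4'] = mapped['key'].map(sr)   # one vectorized lookup by key
print(mapped['col4'].tolist())           # [2, 4, 1]

# keys that are absent from sr come back as NaN instead of raising
extra = pd.Series(['A', 'Z']).map(sr)
print(extra.isna().tolist())             # [False, True]
```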
Or assign after set_index:
df = df.set_index('key')
df['col4'] = sr
print (df)
     col1  col2  col3  col4
key
A       4     7     1     2
B       5     8     3     4
C       6     9     5     1
If you don't need to align the data in the Series by key, use the following (note the difference: 2,4,1 vs 4,1,2):
df['col4'] = sr.values
print (df)
   col1  col2  col3 key  col4
0     4     7     1   A     4
1     5     8     3   B     1
2     6     9     5   C     2
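To make the difference between positional and key-based assignment concrete, here is a small sketch on the same assumed sample data. The variable names are illustrative only, and the reindex call is just one way to reproduce the key-aligned order.

```python
import pandas as pd

df = pd.DataFrame({'key': list('ABC'), 'col1': [4, 5, 6]})
sr = pd.Series([4, 1, 2], index=list('BCA'))

positional = sr.values                      # labels ignored, order as stored in sr
by_key = sr.reindex(df['key']).to_numpy()   # reordered to follow df['key']
print(positional.tolist())                  # [4, 1, 2]
print(by_key.tolist())                      # [2, 4, 1]

# positional assignment also requires exactly one value per row
try:
    df['col4'] = sr.values[:2]
    length_error = False
except ValueError:
    length_error = True
print(length_error)                         # True
```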
Sample:
import pandas as pd

df = pd.DataFrame({'col1':[4,5,6],
                   'col2':[7,8,9],
                   'col3':[1,3,5],
                   'key':list('ABC')})
print (df)
   col1  col2  col3 key
0     4     7     1   A
1     5     8     3   B
2     6     9     5   C
sr = pd.Series([4,1,2], index=list('BCA'))
print (sr)
B    4
C    1
A    2
dtype: int64
df['col4'] = df['key'].map(sr)
print (df)
   col1  col2  col3 key  col4
0     4     7     1   A     2
1     5     8     3   B     4
2     6     9     5   C     1
df = df.set_index('key')
df['col4'] = sr
print (df)
     col1  col2  col3  col4
key
A       4     7     1     2
B       5     8     3     4
C       6     9     5     1