I have a DataFrame similar to this one. How can I add a new column that lists the names of the other rows sharing the same value in one of the columns? For example:
I have this:
name building
a blue
b white
c blue
d red
e blue
f red
How can I get this?
name building in_building_with
a blue [c, e]
b white []
c blue [a, e]
d red [f]
e blue [a, c]
f red [d]
Using the apply() method: if you need to apply a function over an existing column to compute values that will be added as a new column of the same DataFrame, pandas.DataFrame.apply() should do the trick.
You can use the assign() method to add a new column to the end of a pandas DataFrame: df = df.assign(col_name=[value1, value2, value3, ...])
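As a rough illustration of the two generic tips above, the following minimal sketch adds one column with apply() and one with assign(). The small DataFrame and the column names building_upper and name_length are made up for the example and are not part of the question.

```python
import pandas as pd

# Hypothetical sample data, shaped like the question's DataFrame.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "building": ["blue", "white", "blue"],
})

# apply(): compute a value per element of an existing column and
# store the result as a new column (on a copy, to leave df untouched).
with_apply = df.copy()
with_apply["building_upper"] = with_apply["building"].apply(str.upper)

# assign(): returns a new DataFrame with the extra column appended;
# the original df is not modified.
with_assign = df.assign(name_length=df["name"].str.len())

print(with_apply)
print(with_assign)
```

Note that assign() does not change df in place, so the result has to be bound to a name.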
This is the only approach (probably the worst one) I can think of:
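# Map each building to a dict of {row index: name} for the rows in that building
r = df.groupby('building')['name'].agg(dict)
# For each row, collect the names in its building, skipping the row's own index (x.name)
df['in_building_with'] = df.apply(lambda x: [r[x['building']][i] for i in (r[x['building']].keys()-[x.name])], axis=1)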
df:
name building in_building_with
0 a blue [c, e]
1 b white []
2 c blue [a, e]
3 d red [f]
4 e blue [a, c]
5 f red [d]
Approach: r maps each building to a dict of {row index: name}:
building
blue {0: 'a', 2: 'c', 4: 'e'}
red {3: 'd', 5: 'f'}
white {1: 'b'}
dtype: object
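Then, for each row x, r[x['building']].keys()-[x.name] yields the index labels of the other rows in the same building, and the list comprehension looks up their names.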
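The same idea can also be sketched without a row-wise apply(), by grouping the names into lists and filtering out each row's own name. This is only one possible variant, not the answer's exact method, and it assumes the values in name are unique (a repeated name would be dropped from its own row's list). The helper name names_by_building is introduced here for illustration.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e", "f"],
    "building": ["blue", "white", "blue", "red", "blue", "red"],
})

# Series indexed by building; each value is the list of names in that
# building, in order of appearance.
names_by_building = df.groupby("building")["name"].agg(list)

# For every row, keep the names from the same building except the row's own.
# Assumption: names are unique, so comparing by name is safe.
df["in_building_with"] = [
    [other for other in names_by_building[building] if other != name]
    for name, building in zip(df["name"], df["building"])
]

print(df)
```

Because the lists are built in row order, the names in each result keep the order in which they appear in the DataFrame.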
If order is not important, you could do:
# create groups
groups = df.groupby('building').transform(dict.fromkeys).squeeze()
# remove value from each group
df['in_building_with'] = [list(group.keys() - (e,)) for e, group in zip(df['name'], groups)]
print(df)
Output
name building in_building_with
0 a blue [e, c]
1 b white []
2 c blue [e, a]
3 d red [f]
4 e blue [a, c]
5 f red [d]
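As the output above shows, set differences do not guarantee an order. If a reproducible order is wanted while keeping a set-based approach, one option is to sort each result. The sketch below is a variation on the idea rather than the answer's own code, and names_by_building is a name chosen for this example.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e", "f"],
    "building": ["blue", "white", "blue", "red", "blue", "red"],
})

# One set of names per building.
names_by_building = df.groupby("building")["name"].agg(set)

# Remove the row's own name from its building's set, then sort so the
# result no longer depends on set iteration order.
df["in_building_with"] = [
    sorted(names_by_building[building] - {name})
    for name, building in zip(df["name"], df["building"])
]

print(df)
```

Sorting gives alphabetical order here, which happens to match the row order in this example but would not in general.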