
Create new column with data that has same column

I have a DataFrame similar to this. How can I add a new column that lists the names of the other rows sharing the same value in one of the columns? For example:

Have this:

  name  building 
  a     blue
  b     white
  c     blue
  d     red
  e     blue
  f     red

How to get this?

  name  building  in_building_with
  a     blue      [c, e]
  b     white     []
  c     blue      [a, e]
  d     red       [f]
  e     blue      [a, c]
  f     red       [d]

cvakodobro asked Dec 07 '20 12:12



2 Answers

This is the only approach I can think of (and probably not the best one):

r = df.groupby('building')['name'].agg(dict)
df['in_building_with'] = df.apply(
    lambda x: [r[x['building']][i] for i in (r[x['building']].keys() - [x.name])],
    axis=1,
)

df:

  name building in_building_with
0    a     blue           [c, e]
1    b    white               []
2    c     blue           [a, e]
3    d      red              [f]
4    e     blue           [a, c]
5    f      red              [d]

Approach:

  1. For each building, make a dictionary that maps the row indices where that building occurs to the names at those rows.

building
blue     {0: 'a', 2: 'c', 4: 'e'}
red              {3: 'd', 5: 'f'}
white                    {1: 'b'}
dtype: object

  2. Subtract the current row's index (x.name) from that dictionary's keys, since you only want the other rows in the same building.

r[x['building']].keys()-[x.name]

  3. Get the values at the remaining indices and make them into a list.
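
The three steps above can also be written out as an explicit loop, which may be easier to follow than the one-line lambda. This is only a sketch of the same idea, not the answer's exact code: the names `index_to_name`, `members` and `other_rows` are made up for illustration, and the `sorted` call is an added assumption to keep the result order deterministic (a set difference does not guarantee order).

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e", "f"],
    "building": ["blue", "white", "blue", "red", "blue", "red"],
})

# Step 1: for each building, map row index -> name.
index_to_name = {
    building: group.to_dict()
    for building, group in df.groupby("building")["name"]
}

result = []
for row_label, building in df["building"].items():
    members = index_to_name[building]
    # Step 2: drop the current row's index from the building's row indices.
    other_rows = sorted(members.keys() - {row_label})
    # Step 3: look up the names at the remaining indices.
    result.append([members[i] for i in other_rows])

df["in_building_with"] = result
print(df)
```

On the sample data this produces the same lists as the desired output in the question.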

Pygirl answered Sep 24 '22 04:09


If order is not important, you could do:

# create groups
groups = df.groupby('building').transform(dict.fromkeys).squeeze()

# remove value from each group
df['in_building_with'] = [list(group.keys() - (e,)) for e, group in zip(df['name'], groups)]

print(df)

Output

  name building in_building_with
0    a     blue           [e, c]
1    b    white               []
2    c     blue           [e, a]
3    d      red              [f]
4    e     blue           [a, c]
5    f      red              [d]
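
If the order does matter, one possible variant is to collect the names per building as lists rather than sets and filter out the current row's name. This is a sketch under the assumption that names are unique within a building (a duplicated name would be dropped from its own row's list as well); `names_by_building` is a name introduced here for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e", "f"],
    "building": ["blue", "white", "blue", "red", "blue", "red"],
})

# Collect every name per building as a list, keeping the original row order.
names_by_building = df.groupby("building")["name"].agg(list)

# For each row, keep the other names from the same building.
# Assumes names are unique within a building.
df["in_building_with"] = [
    [other for other in names_by_building[building] if other != name]
    for name, building in zip(df["name"], df["building"])
]

print(df)
```

Because lists are used throughout, the names in each row appear in the same order as in the original DataFrame.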

Dani Mesejo answered Sep 25 '22 04:09