more than one value with comma in dataframe pandas

Question

colum 1, colum2 a,b,c 30 b,c,f 40 a,g,z 50 . . . Using above dataframe with col1,2 I'd like to have dataframe as below dataframe with col3, 4. Additionally, col1 consists of values with commas. col4 consists of sum of col2 following col3. column3, column4 a 80 b 70 c 70 f 40 g 50 z 50

jezrael · Accepted Answer

Use:

df = (df.set_index('colum2')['colum1']
        .str.split(',', expand=True)
        .stack()
        .reset_index(name='column3')
        .groupby('column3', as_index=False)['colum2']
        .sum()
        .rename(columns={'colum2':'column4'})
      )
print (df)
  column3  column4
0       a       80
1       b       70
2       c       70
3       f       40
4       g       50
5       z       50

Explanation:

First set_index by column colum2
Create DataFrame by split
Reshape by stack
Create index by columns by reset_index
groupby and aggregate sum
Last rename column if necessary

Another solution:

from itertools import chain

a = df['colum1'].str.split(',')
lens = a.str.len()

df = pd.DataFrame({
    'column3' : list(chain.from_iterable(a)), 
    'column4' : df['colum2'].repeat(lens)
}).groupby('column3', as_index=False)['column4'].sum()

print (df)
  column3  column4
0       a       80
1       b       70
2       c       70
3       f       40
4       g       50
5       z       50

Explanation:

Create lists by split
Get lengths of lsits by len
Last repeat columns and flatten colum1
groupby and aggregate sum

more than one value with comma in dataframe pandas

Tags:

python

pandas

Ken Kim

1 Answers

jezrael

Recent Activity

Donate For Us

more than one value with comma in dataframe pandas

Tags:

python

pandas

Ken Kim

1 Answers

jezrael

Related questions

Recent Activity

Donate For Us