Groupby to create new columns

Question

Starting from a dataframe, I want to build a new dataframe in which each name appears once and its repeated values are spread across extra columns, but I don't know in advance how many columns I will need:

pd.DataFrame([["John","guitar"],["Michael","football"],["Andrew","running"],["John","dancing"],["Andrew","cars"]])

and I want:

pd.DataFrame([["John","guitar","dancing"],["Michael","football",None],["Andrew","running","cars"]])

without knowing how many columns I should create at the start.

yatu · Accepted Answer

import pandas as pd
df = pd.DataFrame([["John","guitar"],["Michael","football"],["Andrew","running"],["John","dancing"],["Andrew","cars"]], columns = ['person','hobby'])

You can group by person and take the unique values of hobby for each group. Then use .apply(pd.Series) to expand the resulting arrays into columns:

df.groupby('person').hobby.unique().apply(pd.Series).reset_index()
    person         0        1
0   Andrew   running     cars
1     John    guitar  dancing
2  Michael  football      NaN
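
To make the one-liner above easier to follow, here is a minimal sketch that splits it into its three steps. The intermediate names per_person, wide and result are only illustrative and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    [["John", "guitar"], ["Michael", "football"], ["Andrew", "running"],
     ["John", "dancing"], ["Andrew", "cars"]],
    columns=["person", "hobby"],
)

# Step 1: one array of distinct hobbies per person (groups are sorted by key).
per_person = df.groupby("person")["hobby"].unique()

# Step 2: turn each array into a row; shorter arrays are padded with NaN,
# so the number of columns follows the person with the most hobbies.
wide = per_person.apply(pd.Series)

# Step 3: move 'person' from the index back into a regular column.
result = wide.reset_index()
print(result)
```

Because the column count is derived from the longest group, nothing has to be known about it up front.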

For a large dataframe, try this more efficient alternative:

hobbies = df.groupby('person').hobby.unique()
df = pd.DataFrame(hobbies.values.tolist(), index=hobbies.index).reset_index()

This does essentially the same thing, but it builds the frame in a single constructor call instead of calling pd.Series once per row.
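
As a rough, self-contained sketch of the constructor-based approach: the version below converts each group to a plain list explicitly (a small variation on .values.tolist()) and, as an optional extra not in the original answer, renames the numbered columns. The hobby_1, hobby_2 naming scheme is an assumption chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [["John", "guitar"], ["Michael", "football"], ["Andrew", "running"],
     ["John", "dancing"], ["Andrew", "cars"]],
    columns=["person", "hobby"],
)

hobbies = df.groupby("person")["hobby"].unique()

# One plain list per person; the DataFrame constructor pads the shorter
# lists with missing values, so no per-row pd.Series call is needed.
rows = [list(h) for h in hobbies]
wide = pd.DataFrame(rows, index=hobbies.index)

# Optional: readable names instead of 0, 1, ... (hypothetical naming scheme).
wide.columns = [f"hobby_{i + 1}" for i in range(wide.shape[1])]

result = wide.reset_index()
print(result)
```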

Tags:

python

pandas

group-by

pandas-groupby

FFL75

1 Answers

yatu
