I have a table like this
user company company2 company3 company4
1 Mac Lenovo Hp null
2 Mac MSI Sony
And using pandas I would like it to be
user company
1 Mac
1 Lenovo
1 Hp
2 Mac
and so on Here I tried it but didnt work with pandas pivot.
dataframe = pd.read_csv('data.csv')
dataframe.fillna(value='', inplace=True)
#dataframe.pivot(index='user', columns='company')
Above code doesnt work and gives error.
Now to melt or unpivot this data you can use the pandas pd.melt () function. id_vars – All the columns passed to id_vars will remain as it is. Value_vars – All the columns passed to value_vars gets melt down (unpivot). By default it will unpivot all the columns that is not specified in the id_vars column.
I want to "unpivot" this data from a wide format to a long format using the pandas melt () method. On the df DataFrame, we'll call the melt () method and set the following arguments: After this "unpivot", we can easily calculate the minimum (essentially the fastest) 0-60 time that we'd publish in a final report.
Key Terms: unpivot, melt, python, pandas In pandas, we can "unpivot" a DataFrame - turn it from a wide format - many columns - to a long format - few columns but many rows. We can accomplish this with the pandas melt () method. This can be helpful for further analysis of our new unpivoted DataFrame.
We can use pandas melt function to convert this wide data frame to a data frame in long form. In this example, we would like to keep both continent and country as columns, so we specify that using ‘id_vars’ argument.
you can use pd.melt method:
In [211]: pd.melt(df, id_vars='user', value_vars=df.columns.drop('user').tolist())
Out[211]:
user variable value
0 1 company Mac
1 2 company Mac
2 1 company2 Lenovo
3 2 company2 MSI
4 1 company3 Hp
5 2 company3 Sony
6 1 company4 null
7 2 company4 NaN
or
In [213]: pd.melt(df,
id_vars='user', value_vars=df.columns.drop('user').tolist(),
value_name='Company') \
.drop('variable',1)
Out[213]:
user Company
0 1 Mac
1 2 Mac
2 1 Lenovo
3 2 MSI
4 1 Hp
5 2 Sony
6 1 null
7 2 NaN
UPDATE: dropping NaN's and sorting resulting DF by user
:
In [218]: pd.melt(df,
...: id_vars='user', value_vars=df.columns.drop('user').tolist(),
...: value_name='Company') \
...: .drop('variable',1) \
...: .dropna() \
...: .sort_values('user')
...:
Out[218]:
user Company
0 1 Mac
2 1 Lenovo
4 1 Hp
6 1 null
1 2 Mac
3 2 MSI
5 2 Sony
PS if you want to get rid of null
values - use df.replace('null', np.nan)
instead of df
:
In [219]: pd.melt(df.replace('null', np.nan),
...: id_vars='user', value_vars=df.columns.drop('user').tolist(),
...: value_name='Company') \
...: .drop('variable',1) \
...: .dropna() \
...: .sort_values('user')
...:
Out[219]:
user Company
0 1 Mac
2 1 Lenovo
4 1 Hp
1 2 Mac
3 2 MSI
5 2 Sony
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With