Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Change table to tall format using panda (UNPIVOT)

I have a table like this

   user         company company2 company3 company4
    1           Mac     Lenovo    Hp      null              
    2           Mac       MSI     Sony                          

And using pandas I would like it to be

     user    company
     1          Mac
     1          Lenovo
     1          Hp
     2         Mac

and so on Here I tried it but didnt work with pandas pivot.

dataframe = pd.read_csv('data.csv')
dataframe.fillna(value='', inplace=True)
#dataframe.pivot(index='user', columns='company')

Above code doesnt work and gives error.

like image 486
Aurora Avatar asked Apr 14 '17 19:04

Aurora


People also ask

How to UNPIVOT data in pandas PD?

Now to melt or unpivot this data you can use the pandas pd.melt () function. id_vars – All the columns passed to id_vars will remain as it is. Value_vars – All the columns passed to value_vars gets melt down (unpivot). By default it will unpivot all the columns that is not specified in the id_vars column.

How do I UNPIVOT from a wide format to a long format?

I want to "unpivot" this data from a wide format to a long format using the pandas melt () method. On the df DataFrame, we'll call the melt () method and set the following arguments: After this "unpivot", we can easily calculate the minimum (essentially the fastest) 0-60 time that we'd publish in a final report.

How do you UNPIVOT a Dataframe in Python?

Key Terms: unpivot, melt, python, pandas In pandas, we can "unpivot" a DataFrame - turn it from a wide format - many columns - to a long format - few columns but many rows. We can accomplish this with the pandas melt () method. This can be helpful for further analysis of our new unpivoted DataFrame.

How to convert wide data frame to long form in pandas?

We can use pandas melt function to convert this wide data frame to a data frame in long form. In this example, we would like to keep both continent and country as columns, so we specify that using ‘id_vars’ argument.


1 Answers

you can use pd.melt method:

In [211]: pd.melt(df, id_vars='user', value_vars=df.columns.drop('user').tolist())
Out[211]:
   user  variable   value
0     1   company     Mac
1     2   company     Mac
2     1  company2  Lenovo
3     2  company2     MSI
4     1  company3      Hp
5     2  company3    Sony
6     1  company4    null
7     2  company4     NaN

or

In [213]: pd.melt(df,
                  id_vars='user', value_vars=df.columns.drop('user').tolist(),
                  value_name='Company') \
            .drop('variable',1)
Out[213]:
   user Company
0     1     Mac
1     2     Mac
2     1  Lenovo
3     2     MSI
4     1      Hp
5     2    Sony
6     1    null
7     2     NaN

UPDATE: dropping NaN's and sorting resulting DF by user:

In [218]: pd.melt(df,
     ...:         id_vars='user', value_vars=df.columns.drop('user').tolist(),
     ...:         value_name='Company') \
     ...:   .drop('variable',1) \
     ...:   .dropna() \
     ...:   .sort_values('user')
     ...:
Out[218]:
   user Company
0     1     Mac
2     1  Lenovo
4     1      Hp
6     1    null
1     2     Mac
3     2     MSI
5     2    Sony

PS if you want to get rid of null values - use df.replace('null', np.nan) instead of df:

In [219]: pd.melt(df.replace('null', np.nan),
     ...:         id_vars='user', value_vars=df.columns.drop('user').tolist(),
     ...:         value_name='Company') \
     ...:   .drop('variable',1) \
     ...:   .dropna() \
     ...:   .sort_values('user')
     ...:
Out[219]:
   user Company
0     1     Mac
2     1  Lenovo
4     1      Hp
1     2     Mac
3     2     MSI
5     2    Sony
like image 83
MaxU - stop WAR against UA Avatar answered Sep 28 '22 12:09

MaxU - stop WAR against UA