
How to select columns from groupby object in pandas?

Tags: python, pandas


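Set as_index=False in the groupby call so that the grouping keys stay as regular columns instead of becoming the index:

import pandas as pd
df = pd.DataFrame({"a": [1, 1, 3], "b": [4, 5.5, 6], "c": [7, 8, 9], "name": ["hello", "hello", "foo"]})
df.groupby(["a", "name"], as_index=False).median()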
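To make the effect visible, here is a minimal sketch using the same data. The variable names (result, columns, names) are only illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, 1, 3],
    "b": [4, 5.5, 6],
    "c": [7, 8, 9],
    "name": ["hello", "hello", "foo"],
})

# With as_index=False the group keys "a" and "name" remain ordinary columns
result = df.groupby(["a", "name"], as_index=False).median()

columns = list(result.columns)    # group keys first, then aggregated columns
names = result["name"].tolist()   # "name" can be selected like any other column

print(columns)
print(names)
```

Because name is a real column here, no index handling is needed afterwards.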

After a groupby, the grouping keys become levels of the result's index; they are no longer columns, so you need to read them from the index. In this case name is level 1:

df.groupby(["a", "name"]).median().index.get_level_values(1)

Out[2]:

Index([u'hello', u'foo'], dtype=object)

You can also pass the level name, which is more readable than an integer position:

df.groupby(["a", "name"]).median().index.get_level_values('name')

You can convert the index values to a list by calling tolist():

df.groupby(["a", "name"]).median().index.get_level_values(1).tolist()

Out[5]:

['hello', 'foo']
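Putting the index-based approach together, a small self-contained sketch might look like the following. It assumes the same example data; the variable names are hypothetical and chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, 1, 3],
    "b": [4, 5.5, 6],
    "c": [7, 8, 9],
    "name": ["hello", "hello", "foo"],
})

grouped = df.groupby(["a", "name"]).median()

# The group keys are index levels, not columns
level_names = list(grouped.index.names)
remaining_columns = list(grouped.columns)

# The same level can be read by name or by position
by_label = grouped.index.get_level_values("name").tolist()
by_position = grouped.index.get_level_values(1).tolist()

print(level_names, remaining_columns)
print(by_label)
```

Checking grouped.index.names first is a quick way to find out which level name or position to pass.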

You can also call reset_index() on the groupby result to get back a DataFrame in which name is an ordinary column again.

import pandas as pd
df = pd.DataFrame({"a":[1,1,3], "b":[4,5.5,6], "c":[7,8,9], "name":["hello","hello","foo"]})
df_grouped = df.groupby(["a", "name"]).median().reset_index()
df_grouped.name
 0    hello
 1      foo
 Name: name, dtype: object

If you aggregate a single column, the result is a Series with a MultiIndex. You can wrap it in pd.DataFrame and then call reset_index() (calling reset_index() on the Series directly also returns a DataFrame).
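A short sketch of the single-column case, assuming we aggregate only column b from the example data. The variable names are illustrative, not from the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, 1, 3],
    "b": [4, 5.5, 6],
    "c": [7, 8, 9],
    "name": ["hello", "hello", "foo"],
})

# Selecting one column before aggregating yields a Series with a MultiIndex
series = df.groupby(["a", "name"])["b"].median()

# Either route turns the index levels back into columns
via_constructor = pd.DataFrame(series).reset_index()
via_series = series.reset_index()

print(type(series).__name__, series.index.nlevels)
print(list(via_series.columns))
```

Both routes should give the same frame, so the shorter series.reset_index() is usually enough.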


Using reset_index() after the group by will do the trick:

import pandas as pd

# c is kept numeric so that median() can aggregate it
df = pd.DataFrame({'a': [1, 1, 3],
                   'b': [4.0, 5.5, 6.0],
                   'c': [7, 8, 9],
                   'name': ['hello', 'hello', 'foo']})
df.groupby(['a', 'name']).median().reset_index().name

Here is the result:

 0    hello
 1      foo
 Name: name, dtype: object

If you want just the values, append .values (with the same df as above):

df.groupby(['a', 'name']).median().reset_index().name.values

Note that .values returns a NumPy array, not a Python list; call .tolist() instead if you need a list. The code above returns:

array(['hello', 'foo'], dtype=object)
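To make the array-versus-list distinction concrete, here is a minimal sketch. It assumes numeric values in column c, and it uses to_numpy(), which is the accessor the pandas documentation recommends in place of .values; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 3],
                   'b': [4.0, 5.5, 6.0],
                   'c': [7, 8, 9],
                   'name': ['hello', 'hello', 'foo']})

names = df.groupby(['a', 'name']).median().reset_index()['name']

as_array = names.to_numpy()   # NumPy array, like .values
as_list = names.tolist()      # plain Python list

print(isinstance(as_array, np.ndarray))
print(as_list)
```

Use tolist() when the result is passed to code that expects a real list, for example for JSON serialization.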