
How to select columns from groupby object in pandas?




People also ask

How do you get columns in Groupby?

You can call reset_index() on your groupby result to get back a DataFrame in which the name column is accessible again. If you aggregate a single column, the result is a Series with a MultiIndex; wrap it in pd.DataFrame and then call reset_index().

How do you get Groupby rows in pandas?

You can collect the rows of each group into a list: call pandas.DataFrame.groupby() on the column of interest, select the column you want from the grouped object, and then call apply(list) to get one list per group.
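As a minimal sketch of the list-per-group approach described above, assuming a small DataFrame like the one used in the answers below (the variable names b_by_name and lists are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, 1, 3],
    "b": [4, 5.5, 6],
    "name": ["hello", "hello", "foo"],
})

# Group by "name", select column "b", and collect each group's values in a list.
b_by_name = df.groupby("name")["b"].apply(list)

# Convert the resulting Series (indexed by group key) to a plain dict.
lists = b_by_name.to_dict()
print(lists)
```

The result is a Series indexed by the group key, so to_dict() is just one convenient way to inspect it.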

What can you do with a Groupby object?

pandas' groupby() splits the data into groups so that you can run a computation, such as an aggregation, on each group separately.

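Set as_index=False in the groupby call so that the grouping keys stay ordinary columns instead of becoming the index:

import pandas
df = pandas.DataFrame({"a": [1, 1, 3], "b": [4, 5.5, 6], "c": [7, 8, 9], "name": ["hello", "hello", "foo"]})
df.groupby(["a", "name"], as_index=False).median()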
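To see what as_index=False changes, here is a short sketch that inspects the result; the variable names result, columns and names are assumptions made for illustration:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 3], "b": [4, 5.5, 6], "c": [7, 8, 9],
                   "name": ["hello", "hello", "foo"]})

# With as_index=False the grouping keys are kept as regular columns.
result = df.groupby(["a", "name"], as_index=False).median()

columns = list(result.columns)    # grouping keys come first
names = result["name"].tolist()   # "name" can be selected like any other column
print(columns)
print(names)
```

Because name is a column here, no index lookup is needed to read it back.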

After a groupby, the grouping keys become levels of the result's index rather than columns, so you need to read them from the index. Here, name is level 1:

df.groupby(["a", "name"]).median().index.get_level_values(1)


Index([u'hello', u'foo'], dtype=object)

You can also pass the level name, which is more readable than an integer position:

df.groupby(["a", "name"]).median().index.get_level_values('name')

You can convert the index values to a list by calling tolist()

df.groupby(["a", "name"]).median().index.get_level_values(1).tolist()


['hello', 'foo']

You can also call reset_index() on your groupby result to get back a DataFrame in which name is an ordinary column again.

import pandas as pd
df = pd.DataFrame({"a":[1,1,3], "b":[4,5.5,6], "c":[7,8,9], "name":["hello","hello","foo"]})
df_grouped = df.groupby(["a", "name"]).median().reset_index()
df_grouped["name"]

0    hello
1      foo
Name: name, dtype: object

If you aggregate a single column, the result is a Series with a MultiIndex; you can wrap it in pd.DataFrame and then call reset_index().
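The single-column case can be sketched as follows, reusing the same example data; medians and flat are illustrative names, not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 3], "b": [4, 5.5, 6], "c": [7, 8, 9],
                   "name": ["hello", "hello", "foo"]})

# Aggregating one selected column returns a Series indexed by (a, name).
medians = df.groupby(["a", "name"])["b"].median()

# Wrap it in a DataFrame, then move the index levels back into columns.
flat = pd.DataFrame(medians).reset_index()
print(flat)
```

Calling medians.reset_index() directly on the Series should give the same DataFrame, so the pd.DataFrame wrapper is optional.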

Using reset_index() after the group by will do the trick:

df = pd.DataFrame({'a': [1, 1, 3],
                   'b': [4.0, 5.5, 6.0],
                   'c': ['7L', '8L', '9L'],
                   'name': ['hello', 'hello', 'foo']})
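# numeric_only=True skips the string column 'c'; pandas 2.0+ raises a TypeError without it
df.groupby(['a', 'name']).median(numeric_only=True).reset_index().name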

Here is the result:

0    hello
1      foo
Name: name, dtype: object

If you want just the values, append .values:

df = pd.DataFrame({'a': [1, 1, 3],
                   'b': [4.0, 5.5, 6.0],
                   'c': ['7L', '8L', '9L'],
                   'name': ['hello', 'hello', 'foo']})

# numeric_only=True skips the string column 'c'; pandas 2.0+ raises a TypeError without it
df.groupby(['a', 'name']).median(numeric_only=True).reset_index().name.values

Note that .values returns a NumPy array, not a Python list, holding the values of the name column; call .tolist() instead if you need a plain list. The code above returns:

array(['hello', 'foo'], dtype=object)
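For comparison, here is a small sketch of getting the same values as an array and as a plain list. It assumes the DataFrame from this answer and uses to_numpy(), which is the accessor the pandas documentation recommends over .values; as_array and as_list are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 3],
                   'b': [4.0, 5.5, 6.0],
                   'c': ['7L', '8L', '9L'],
                   'name': ['hello', 'hello', 'foo']})

# numeric_only=True leaves the string column 'c' out of the median.
names = df.groupby(['a', 'name']).median(numeric_only=True).reset_index()['name']

as_array = names.to_numpy()  # NumPy array
as_list = names.tolist()     # plain Python list
print(type(as_array).__name__, as_list)
```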