
Pandas 'DataFrame' object has no attribute 'unique'

I'm building pivot tables in pandas. To count distinct observations in the groupby step, I pass aggfunc={"person":{lambda x: len(x.unique())}}, which raises the following error: 'DataFrame' object has no attribute 'unique'. Any ideas how to fix it?

jwzinserl asked Mar 24 '15 23:03




4 Answers

One easy way to get the unique combinations of values across more than one column of a DataFrame is the following:

unique_A_B_combos = df[['A', 'B']].value_counts().index.values
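If you would rather keep the combinations as a DataFrame, in order of first appearance, drop_duplicates() is an alternative. A minimal sketch, assuming a small made-up frame with columns A and B (the data below is illustrative only, not from the question):

```python
import pandas as pd

# Hypothetical sample data: columns A and B with one repeated pair.
df = pd.DataFrame({'A': [1, 1, 2, 2], 'B': ['x', 'x', 'y', 'z']})

# Keep one row per distinct (A, B) pair, in order of first appearance.
combos = df[['A', 'B']].drop_duplicates().reset_index(drop=True)

# Number of distinct combinations.
n_combos = len(combos)

# The same combinations as a plain list of tuples.
as_tuples = list(combos.itertuples(index=False, name=None))
print(as_tuples)
```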
emilaz answered Sep 20 '22 15:09


DataFrames do not have that method; columns in DataFrames do:

df['A'].unique()

Or, to get the names with the number of observations (using the DataFrame given by closedloop):

>>> df.groupby('person').person.count()
person
0         2
1         3
Name: person, dtype: int64
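
For the distinct count inside a pivot table, one option is to aggregate only the person column and let pandas compute the number of unique values. This is a sketch, not necessarily what the asker's code looked like: the columns c0 and c1 and the sample data are made up for illustration.

```python
import pandas as pd

# Hypothetical data: 'person' is the column whose distinct values we count.
df = pd.DataFrame({
    'c0':     [0, 0, 0, 1, 1, 1],
    'c1':     [0, 0, 1, 0, 1, 1],
    'person': ['a', 'b', 'a', 'c', 'c', 'c'],
})

# Restrict the aggregation to the 'person' column via values=...,
# and count its distinct values per (c0, c1) cell.
table = pd.pivot_table(df, index='c0', columns='c1',
                       values='person', aggfunc='nunique')
print(table)

# The same idea without a pivot: distinct persons per c0 group.
per_group = df.groupby('c0')['person'].nunique()
print(per_group)
```

Here the cell (c0=0, c1=0) holds two distinct persons, while (c0=1, c1=1) holds one even though it has two rows.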
Alexander answered Sep 17 '22 15:09


Rather than removing duplicates during the pivot table process, use the df.drop_duplicates() function to selectively drop duplicates.

For example, if you are pivoting with index='c0' and columns='c1', this simple step yields the correct counts.

In this example the 5th row is a duplicate of the 4th (ignoring the non-pivoted c_other column):

import pandas as pd
data = {'c0':[0,1,0,1,1], 'c1':[0,0,1,1,1], 'person':[0,0,1,1,1], 'c_other':[1,2,3,4,5]}
df = pd.DataFrame(data)
df2 = df.drop_duplicates(subset=['c0','c1','person'])
pd.pivot_table(df2, index='c0',columns='c1',values='person', aggfunc='count')

This correctly outputs

c1  0  1
c0      
0   1  1
1   1  1
closedloop answered Sep 19 '22 15:09


df[['col1', 'col2']].nunique()

Try this instead of a separate function. It returns the number of distinct values in each of the selected columns.
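
A short sketch of what that call returns, using made-up data (col1 and col2 are placeholder names). Note that the counts are per column; the number of distinct combinations is a separate quantity.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({'col1': [1, 1, 2, 2], 'col2': ['x', 'x', 'x', 'y']})

# Series indexed by column name: distinct values in each column separately.
counts = df[['col1', 'col2']].nunique()
print(counts)

# Distinct (col1, col2) pairs, for comparison.
n_pairs = len(df[['col1', 'col2']].drop_duplicates())
print(n_pairs)
```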

Амир Джанибеков answered Sep 16 '22 15:09