Pandas grouby and transform('count') gives placement error - works fine on smaller dataset

Question

Really simple task in Pandas is throwing an error I don't understand. With a simple dataset like this:

test=pd.DataFrame([[1,3],[1,6],[2,4],[3,9],[3,2]],columns=['a','b'])

I can execute the following to count the number of times a value appears in the 'a' column of test.

test['count']=test.groupby('a').transform('count')

This yields:

>>> test
       a  b  count
    0  1  3      2
    1  1  6      2
    2  2  4      1
    3  3  9      2
    4  3  2      2

Perfect. But with my real data, this doesn't work. Here's a tiny snippet of my data that works for reproducing the problem:

newtest=pd.DataFrame([['010010201001000','001','0220','AL','0'],['010010201001001','001','0220','AL','0'],['010010201001002','001','0220','AL','0'],['010010201001003','001','0160','AL','0'],['010010201001004','001','0160','AL','0']],columns=['BlockID','CountyFP','District','state_x','HD'])
newtest['blocks']=newtest.groupby(['CountyFP','District','state_x']).transform('count')

Trying that gives me this error:

ValueError: Wrong number of items passed 2, placement implies 1

I don't really see what makes my 'real' example any different from the play set, and googling this error produces other examples of the error but it's still unclear to me exactly why it is being produced here.

More confusing still, if I just execute the righthand side of the above code, it works fine - produces newtest with counts in every column. So it's like the assignment is what is giving it problems.

EdChum · Accepted Answer

You didn't select any columns to perform the aggregation on so it did it on the remaining columns which are 2, if you select one of the columns then you get the desired result:

In [6]:
newtest['blocks'] = newtest.groupby(['CountyFP','District','state_x'])['BlockID'].transform('count')
newtest

Out[6]:
           BlockID CountyFP District state_x HD  blocks
0  010010201001000      001     0220      AL  0       3
1  010010201001001      001     0220      AL  0       3
2  010010201001002      001     0220      AL  0       3
3  010010201001003      001     0160      AL  0       2
4  010010201001004      001     0160      AL  0       2

output of your attempt:

In [9]:
newtest.groupby(['CountyFP','District','state_x']).transform('count')

Out[9]:
   BlockID  HD
0        3   3
1        3   3
2        3   3
3        2   2
4        2   2

You can see that it generates 2 columns as these are the remaining columns hence the error message you observed.

Pandas grouby and transform('count') gives placement error - works fine on smaller dataset

Tags:

python

pandas

AustinC

1 Answers

EdChum

Recent Activity

Donate For Us

Pandas grouby and transform('count') gives placement error - works fine on smaller dataset

Tags:

python

pandas

AustinC

1 Answers

EdChum

Related questions

Recent Activity

Donate For Us