How to convert pandas dataframe so that index is the unique set of values and data is the count of each value?

Question

I have a dataframe of answers to multiple choice questions, formatted like so:

      Sex  Qu1  Qu2  Qu3
Name
Bob     M    1    2    1
John    M    3    3    5
Alex    M    4    1    2
Jen     F    3    2    4
Mary    F    4    3    4

The data is a rating from 1 to 5 for each of the 3 multiple choice questions. I want to rearrange the data so that the index is range(1,6), where 1='bad', 2='poor', 3='ok', 4='good', 5='excellent', the columns stay the same, and the data is the count of occurrences of each value (excluding the Sex column). This is basically a histogram with fixed bin sizes and an x-axis labeled with strings. I like the output of df.plot() much better than df.hist() for this, but I can't figure out how to rearrange the table to give me a histogram of the data. Also, how do you change the x-labels to strings?

Wes McKinney · Accepted Answer

Series.value_counts gives you the histogram you're looking for:

In [9]: df['Qu1'].value_counts()
Out[9]: 
4    2
3    2
1    1

So, apply this function to each of those 3 columns:

In [13]: table = df[['Qu1', 'Qu2', 'Qu3']].apply(lambda x: x.value_counts())

In [14]: table
Out[14]: 
   Qu1  Qu2  Qu3
1    1    1    1
2  NaN    2    1
3    2    2  NaN
4    2  NaN    2
5  NaN  NaN    1

In [15]: table = table.fillna(0)

In [16]: table
Out[16]: 
   Qu1  Qu2  Qu3
1    1    1    1
2    0    2    1
3    2    2    0
4    2    0    2
5    0    0    1
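
For anyone who wants to reproduce the session above, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample in the question; passing pd.Series.value_counts directly to apply is assumed to be equivalent to the lambda used above, and the final astype(int) is an extra step (not in the session) that turns the filled counts back into integers.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "Sex": ["M", "M", "M", "F", "F"],
        "Qu1": [1, 3, 4, 3, 4],
        "Qu2": [2, 3, 1, 2, 3],
        "Qu3": [1, 5, 2, 4, 4],
    },
    index=pd.Index(["Bob", "John", "Alex", "Jen", "Mary"], name="Name"),
)

# Count each rating per question column. A rating that never occurs
# in a given column shows up as NaN in that column.
counts = df[["Qu1", "Qu2", "Qu3"]].apply(pd.Series.value_counts)

# Fill the NaN gaps with 0, sort the ratings, and restore integer dtype
# (the NaN values force the intermediate result to float).
table = counts.fillna(0).sort_index().astype(int)
print(table)
```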

You can rearrange the rows with table.reindex or table.loc[some_array] (table.ix in older pandas versions; .ix has since been removed).
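
One case where reindex matters: if no respondent picks a particular rating in any column, that rating is missing from the counts altogether. A rough sketch of forcing the index to be exactly 1 to 5 follows; the answers frame is made-up data for illustration, not the data from the question.

```python
import pandas as pd

# Hypothetical answers in which nobody chose 2 or 5, so those ratings
# never appear in the value counts on their own.
answers = pd.DataFrame({"Qu1": [1, 3, 4, 3, 4], "Qu2": [1, 3, 1, 4, 3]})

counts = answers.apply(pd.Series.value_counts)

# Reindex to the full rating scale; rows for unseen ratings are created
# as NaN, then filled with 0 and converted back to integers.
table = counts.reindex(list(range(1, 6))).fillna(0).astype(int)
print(table)
```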

To convert the index labels to strings, use table.rename:

In [17]: table.rename(index=str)
Out[17]: 
   Qu1  Qu2  Qu3
1    1    1    1
2    0    2    1
3    2    2    0
4    2    0    2
5    0    0    1

In [18]: table.rename(index=str).index[0]
Out[18]: '1'
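
rename also accepts a dict, which covers the 'bad' to 'excellent' labels asked for in the question. A small sketch, with the counts table from above typed in by hand; the names labels and labeled are just illustrative.

```python
import pandas as pd

# The filled counts table from the session above, entered by hand.
table = pd.DataFrame(
    {"Qu1": [1, 0, 2, 2, 0], "Qu2": [1, 2, 2, 0, 0], "Qu3": [1, 1, 0, 2, 1]},
    index=range(1, 6),
)

# Mapping from numeric rating to the label given in the question.
labels = {1: "bad", 2: "poor", 3: "ok", 4: "good", 5: "excellent"}

# rename returns a new DataFrame; the original table is left unchanged.
labeled = table.rename(index=labels)
print(labeled)
```

With matplotlib installed, labeled.plot(kind="bar") should then use these strings as the x-axis tick labels.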

Tags: python, pandas
