
How to assign value_counts output to a DataFrame

Tags: python, pandas

I am trying to assign the output of value_counts to a new DataFrame. My code follows.

import pandas as pd
import glob


df = pd.concat((pd.read_csv(f, names=['date','bill_id','sponsor_id']) for f in glob.glob('/home/jayaramdas/anaconda3/df/s11?_s_b')))


column_list = ['date', 'bill_id']

df = df.set_index(column_list, drop = True)
df = df['sponsor_id'].value_counts()

df.columns=['sponsor', 'num_bills']
print (df)

The value_counts output is not being assigned the column headers I specified ('sponsor', 'num_bills'). I'm getting the following output when I print the head:

1036    426
791     408
1332    401
1828    388
136     335
Name: sponsor_id, dtype: int64
Collective Action asked Mar 09 '16 13:03



1 Answer

Your column lengths don't match. You read 3 columns from the CSV and set 2 of them as the index, then called value_counts, which produces a Series with the column values as the index and the counts as the values. You need to call reset_index to turn that Series into a DataFrame, and then overwrite the column names:

df = df.reset_index()
df.columns=['sponsor', 'num_bills']

Example:

In [276]:
df = pd.DataFrame({'col_name':['a','a','a','b','b']})
df

Out[276]:
  col_name
0        a
1        a
2        a
3        b
4        b

In [277]:
df['col_name'].value_counts()

Out[277]:
a    3
b    2
Name: col_name, dtype: int64

In [278]:    
type(df['col_name'].value_counts())

Out[278]:
pandas.core.series.Series

In [279]:
df = df['col_name'].value_counts().reset_index()
df.columns = ['col_name', 'count']
df

Out[279]:
  col_name  count
0        a      3
1        b      2
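
As a possible variation on the above, the index and the values can be named in the same step that converts the Series to a DataFrame, using rename_axis and the name argument of reset_index. This is only a sketch: the small in-memory frame below is a hypothetical stand-in for the question's CSV data, and the target names 'sponsor' and 'num_bills' are taken from the question.

```python
import pandas as pd

# Hypothetical stand-in for the concatenated CSV data in the question.
df = pd.DataFrame({
    'date': ['2016-01-01'] * 6,
    'bill_id': ['b1', 'b2', 'b3', 'b4', 'b5', 'b6'],
    'sponsor_id': [1036, 1036, 1036, 791, 791, 1332],
})

# value_counts returns a Series: the index holds the sponsor ids,
# the values hold how often each id occurs.
counts = df['sponsor_id'].value_counts()

# Name the index, then move it into a column and name the count column.
result = counts.rename_axis('sponsor').reset_index(name='num_bills')
print(result)
```

Naming both columns explicitly this way avoids depending on the default names that reset_index would otherwise pick, which can differ between pandas versions.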
EdChum answered Oct 28 '22 14:10