I am trying to assign the output of value_counts() to a new DataFrame with named columns. My code follows.
import pandas as pd
import glob
df = pd.concat((pd.read_csv(f, names=['date','bill_id','sponsor_id']) for f in glob.glob('/home/jayaramdas/anaconda3/df/s11?_s_b')))
column_list = ['date', 'bill_id']
df = df.set_index(column_list, drop = True)
df = df['sponsor_id'].value_counts()
df.columns=['sponsor', 'num_bills']
print (df)
The result of value_counts() is not picking up the column headers I specified ('sponsor', 'num_bills'). This is the start of the output I get from print:
1036 426
791 408
1332 401
1828 388
136 335
Name: sponsor_id, dtype: int64
You can also call .value_counts() on the DataFrame itself. In that case it counts the number of rows for every unique combination of values across all columns. That may be more detail than you want, so it is often better to subset the DataFrame down to a few columns first.
Remember that value_counts() is also defined on Series, and selecting a single column from a DataFrame gives you a Series; that is why it can be called on a column. For example, it can count how many times each City appears in a dataset.
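As a rough illustration of the difference between the two calls, here is a minimal sketch. The toy DataFrame and its column names (city, year) are made up for this example, and DataFrame.value_counts assumes pandas 1.1 or newer.

```python
import pandas as pd

# Hypothetical toy data; the column names are invented for illustration.
toy = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Rome", "Oslo"],
    "year": [2020, 2020, 2020, 2021],
})

# Series.value_counts: one count per distinct value in a single column.
city_counts = toy["city"].value_counts()

# DataFrame.value_counts (pandas >= 1.1): one count per distinct row,
# restricted here to the two columns of interest.
pair_counts = toy[["city", "year"]].value_counts()

print(city_counts)
print(pair_counts)
```

Both results are Series: the first is indexed by city, the second by (city, year) pairs.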
Note that DataFrame.count() is a different function: it counts the non-null values in each column (or in each row with axis=1) rather than the occurrences of each value.
value_counts() applies whatever parameters it is given and returns a Series holding the count of each unique value, here the unique integers in the column.
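Your column names don't match what you actually have. You read 3 columns from the CSV and then set the index to 2 of them. You then called value_counts, which produces a Series with the unique column values as the index and the counts as the values. A Series has no column headers to rename, so assigning to df.columns does not change what is printed. You need to call reset_index to turn the Series into a two-column DataFrame
and then overwrite the column names: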
df = df.reset_index()
df.columns=['sponsor', 'num_bills']
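To see the fix end to end without the original CSV files, here is a minimal sketch on made-up rows. The dates, bill ids, and the name bills are placeholders; only the column layout follows the question.

```python
import pandas as pd

# Made-up rows standing in for the CSV files in the question.
bills = pd.DataFrame({
    "date": ["2009-01-06", "2009-01-06", "2009-01-07", "2009-01-08"],
    "bill_id": ["s1", "s2", "s3", "s4"],
    "sponsor_id": [1036, 1036, 791, 1036],
})

# Series: the index holds the sponsor ids, the values hold the counts.
counts = bills["sponsor_id"].value_counts()

# reset_index moves the index into a regular column, giving a DataFrame.
result = counts.reset_index()

# Now there are two real columns to rename.
result.columns = ["sponsor", "num_bills"]
print(result)
```

Setting the index on date and bill_id beforehand, as in the question, is not needed for the count itself.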
Example:
In [276]:
df = pd.DataFrame({'col_name':['a','a','a','b','b']})
df

Out[276]:
  col_name
0        a
1        a
2        a
3        b
4        b

In [277]:
df['col_name'].value_counts()

Out[277]:
a    3
b    2
Name: col_name, dtype: int64

In [278]:
type(df['col_name'].value_counts())

Out[278]:
pandas.core.series.Series

In [279]:
df = df['col_name'].value_counts().reset_index()
df.columns = ['col_name', 'count']
df

Out[279]:
  col_name  count
0        a      3
1        b      2
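As an alternative to overwriting df.columns afterwards, the names can be set while the Series is converted. This is a sketch of a different route, not the approach shown in the example: it uses rename_axis to name the index and the name argument of Series.reset_index to name the counts column. The variable out is just a placeholder.

```python
import pandas as pd

df = pd.DataFrame({"col_name": ["a", "a", "a", "b", "b"]})

out = (
    df["col_name"]
    .value_counts()                  # Series of counts, indexed by value
    .rename_axis("sponsor")          # name the index before it becomes a column
    .reset_index(name="num_bills")   # name the counts column explicitly
)
print(out)
```

Because both names are given explicitly, the result does not depend on what default names a particular pandas version assigns to the value_counts output.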