I'm trying to use the value_counts() function from Python's pandas package to find the frequency of items in a column. This works and outputs the following:
57 1811
62 630
71 613
53 217
59 185
68 88
52 70
Name: hospitalized, dtype: int64
Here the left column is the item and the right column is its frequency in the original column.
From there, I wanted to access the first column of items and iterate through that in a for loop. I want to be able to access the item of each row and check if it is equal to another value. If this is true, I want to be able to access the second column and divide it by another number.
My big issue is accessing the first column from the .value_counts() output. Is it possible to access this column and if so, how? The columns aren't named anything specific (since it's just the value_counts() output) so I'm unsure how to access them.
Return a Series containing counts of unique values. The resulting object will be in descending order so that the first element is the most frequently-occurring element.
You can use the loc and iloc indexers to access columns in a pandas DataFrame. For example, to retrieve a column named Grades, use loc and specify the name of the column.
Accessing the First Element: the first element is at position 0, so it can be accessed by giving that position in the Series. We can use either 0 or the custom index label to fetch the value.
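For a value_counts() result specifically, here is a minimal sketch of fetching the first element by position and by label. The sample data is made up. One caveat as I understand it: when the index labels are themselves integers (as in the question), a bare counts[0] is treated as a label lookup rather than a position, so iloc and index[0] are the safer choices.

```python
import pandas as pd

# Made-up sample data; the column name mirrors the one in the question.
df = pd.DataFrame({"hospitalized": [57, 57, 57, 62, 62, 71]})
counts = df["hospitalized"].value_counts()

first_item = counts.index[0]   # most frequent item (the "left column")
first_count = counts.iloc[0]   # its frequency (the "right column"), by position
by_label = counts.loc[57]      # the same frequency, looked up by label

print(first_item, first_count, by_label)
```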
Use pandas' items() (called iteritems() in older versions; it was removed in pandas 2.0):
import pandas as pd
df = pd.DataFrame({'mycolumn': [1,2,2,2,3,3,4]})
for val, cnt in df.mycolumn.value_counts().items():
    print('value', val, 'was found', cnt, 'times')
value 2 was found 3 times
value 3 was found 2 times
value 4 was found 1 times
value 1 was found 1 times
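To tie this back to the question (compare each item to another value, then divide its count), a minimal sketch could look like the following. The names target and divisor and the sample data are hypothetical placeholders, not something from the original post.

```python
import pandas as pd

# Made-up sample data; the column name mirrors the one in the question.
df = pd.DataFrame({"hospitalized": [57, 57, 57, 62, 62, 71]})
counts = df["hospitalized"].value_counts()

target = 62    # hypothetical: the value each item is compared against
divisor = 4    # hypothetical: the number the matching count is divided by

result = None
for val, cnt in counts.items():
    # val is the item (left column), cnt is its frequency (right column)
    if val == target:
        result = cnt / divisor

print(result)
```

If only one item is of interest, the loop is not strictly needed: counts.get(target) returns the count for that label, or None when the label is absent.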
value_counts returns a pandas Series:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.choice(list("abc"), size=10), columns=["X"])
df["X"].value_counts()
Out[243]:
c 4
b 3
a 3
Name: X, dtype: int64
For the array of individual values, you can use the index of the Series:
vl_list = df["X"].value_counts().index
Index(['c', 'b', 'a'], dtype='object')
It is of type "Index" but you can iterate over it:
for idx in vl_list:
    print(idx)
c
b
a
Or, if you want a NumPy array instead, you can use df["X"].value_counts().index.values
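As a rough sketch of the conversions mentioned above, the index can be turned into a NumPy array or a plain Python list. I use to_numpy() here as an alternative to .values. The sample data is fixed (not random) so the result is predictable.

```python
import pandas as pd

# Fixed, made-up data: 'c' appears 4 times, 'b' 3 times, 'a' twice.
df = pd.DataFrame({"X": list("ccccbbbaa")})
counts = df["X"].value_counts()

values_array = counts.index.to_numpy()  # items as a NumPy array
values_list = counts.index.tolist()     # items as a plain Python list
counts_list = counts.tolist()           # frequencies, in the same order

print(values_list, counts_list)
```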
You can access the first column by using .keys() or .index, as below:
df.column_name.value_counts().keys()
df.column_name.value_counts().index
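A small sketch showing that both give the same labels, and how a membership test on them can replace an explicit loop. The data, target and divisor are made up for illustration.

```python
import pandas as pd

# Made-up sample data.
df = pd.DataFrame({"column_name": [1, 2, 2, 2, 3, 3]})
counts = df.column_name.value_counts()

# keys() on a Series returns its index, so the two are interchangeable.
same_labels = counts.keys().equals(counts.index)

target = 3     # hypothetical value to look for
divisor = 2    # hypothetical number to divide the count by

ratio = None
if target in counts.index:
    ratio = counts.loc[target] / divisor  # count of 3 is 2, so 2 / 2

print(same_labels, ratio)
```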