Analog to Pandas Series.value_counts() in R? [duplicate]

Tags: pandas, r, dplyr

In Python, one can get the count of each distinct value in a column by calling Series.value_counts():

import pandas as pd

df = pd.DataFrame()
df['x'] = ['a','b','b','c','c','d']
df['y'] = list(range(1,7))

df['x'].value_counts()

c    2
b    2
a    1
d    1
Name: x, dtype: int64

In R, I have to chain three separate dplyr verbs to get the same result:

library(dplyr)

df <- tibble(x=c('a','b','b','c','c','d'), y=1:6)

df %>% group_by(x) %>% summarise(n=n()) %>% arrange(desc(n))

x   n
b   2
c   2
a   1
d   1

Is there a shorter / more idiomatic way of doing this in R? Or am I better off writing a custom function?

asked Feb 27 '20 at 20:02 by max



1 Answer

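The tidyverse has dplyr::count(), which is shorthand for group_by() followed by summarise(n = n()). Passing sort = TRUE orders the result by descending count, so it replaces the arrange(desc(n)) step as well.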

library(dplyr)

df <- tibble(x=c('a','b','b','c','c','d'), y=1:6)

dplyr::count(df, x, sort = TRUE)

# A tibble: 4 x 2
  x         n
  <chr> <int>
1 b         2
2 c         2
3 a         1
4 d         1
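
For comparison on the pandas side, here is a rough sketch of how the same relationship looks in Python: an explicit group, count, sort chain next to the value_counts() shortcut. The names explicit and shortcut, and the column name n (chosen to match the output of count() above), are illustrative and not part of the original question. The rename_axis() call is only there so the result has the same column names across pandas versions.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({"x": ["a", "b", "b", "c", "c", "d"], "y": list(range(1, 7))})

# Explicit chain, roughly mirroring
# group_by(x) %>% summarise(n = n()) %>% arrange(desc(n)).
explicit = (
    df.groupby("x")
    .size()                          # number of rows per distinct x
    .sort_values(ascending=False)    # largest counts first
    .reset_index(name="n")           # two-column frame: x, n
)

# Shortcut: value_counts() already counts and sorts in descending order.
shortcut = df["x"].value_counts().rename_axis("x").reset_index(name="n")

print(explicit)
print(shortcut)
```

Both frames hold the same counts sorted from largest to smallest; the relative order of tied values (b and c, or a and d) is not guaranteed to match between the two.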

answered Sep 27 '22 at 16:09 by rpolicastro