
R - number of unique values in a column of data frame


For a data frame df, I need to find the number of unique values in some_col. I tried the following:

length(unique(df["some_col"]))

but this does not give the expected result. However, length(unique(some_vector)) works on a vector and gives the expected result.

Some preceding steps, from when the df is created:

df <- read.csv(file, header=T)
typeof(df) #=> "list"
typeof(unique(df["some_col"])) #=> "list"
length(unique(df["some_col"])) #=> 1
user3206440 asked Jan 28 '17 06:01



2 Answers

Try [[ instead of [. Single brackets ([) return a list (in fact a data.frame), while double brackets ([[) return a vector.

df <- data.frame(some_col = c(1, 2, 3, 4),
                 another_col = c(4, 5, 6, 7))

length(unique(df[["some_col"]]))
#[1] 4

class(df[["some_col"]])
#[1] "numeric"

class(df["some_col"])
#[1] "data.frame"

You're getting a value of 1 because the list is of length 1 (1 column), even though that 1 element contains several values.
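To make the difference concrete, here is a minimal sketch. The sample data frame is made up for illustration and contains duplicates so that the counts differ:

```r
# Hypothetical sample data with repeated values (not from the question)
df <- data.frame(some_col = c(1, 2, 2, 3, 3, 3),
                 another_col = c("a", "b", "b", "c", "c", "d"))

sub_df <- df["some_col"]         # single bracket: still a data.frame
n_cols <- length(sub_df)         # length() of a data.frame counts its columns
n_rows <- nrow(unique(sub_df))   # unique() on a data.frame drops duplicate rows

vec <- df[["some_col"]]          # double bracket: a plain numeric vector
n_vec <- length(unique(vec))     # counts the distinct values

print(c(columns = n_cols, unique_rows = n_rows, unique_values = n_vec))
```

If you want to stay with single brackets, nrow(unique(df["some_col"])) is one way to get the count, since unique() on a data.frame works row by row.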

rosscova answered Sep 20 '22 02:09


You need to use:

length(unique(unlist(df[c("some_col")]))) 

When you select a column with df[c("some_col")] or df["some_col"], R returns it as a list (a one-column data frame). unlist() converts that into a vector, which is easy to work with. When you select the column with df$some_col, R returns the data as a vector directly.
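As a quick check, the three ways of getting at the column should agree on the count. This is only a sketch, and the sample values are invented for illustration:

```r
# Hypothetical sample data with repeated values (not from the question)
df <- data.frame(some_col = c(10, 20, 10, 30, 20))

n_dollar  <- length(unique(df$some_col))             # $ returns a vector
n_bracket <- length(unique(df[["some_col"]]))        # [[ returns a vector
n_unlist  <- length(unique(unlist(df["some_col"])))  # [ returns a list; unlist() flattens it

print(c(n_dollar, n_bracket, n_unlist))
```

Note that unlist() flattens everything it is given, so if more than one column is selected it pools the values of all those columns into one vector.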

Mandar answered Sep 19 '22 02:09