For a data frame df, I need to find the unique values for some_col. I tried the following:
length(unique(df["some_col"]))
but this is not giving the expected results. However length(unique(some_vector))
works on a vector and gives the expected results.
Here are the preceding steps in which df is created:
df <- read.csv(file, header=T)
typeof(df) #=> "list"
typeof(unique(df["some_col"])) #=> "list"
length(unique(df["some_col"])) #=> 1
To find the unique values in a column of a data frame, use the unique() function in R. It returns the distinct values with duplicates dropped and leaves the original data unchanged, which makes it a common first step in exploratory data analysis.
In pandas (Python, not R), you can use the nunique() method to count the number of unique values in a DataFrame.
Try with [[ instead of [. [ returns a list (a data.frame in fact), [[ returns a vector.
df <- data.frame(
  some_col = c(1,2,3,4),
  another_col = c(4,5,6,7)
)
length(unique(df[["some_col"]]))
#[1] 4
class( df[["some_col"]] )
#[1] "numeric"
class( df["some_col"] )
#[1] "data.frame"
You're getting a value of 1 because the list is of length 1 (1 column), even though that 1 element contains several values.
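To make the point about length concrete, here is a minimal sketch on made-up data (the values in df below are an assumption for illustration, not the asker's file). It shows that length() on a one-column data frame counts columns, while nrow() on the de-duplicated data frame, or length() on the extracted vector, counts distinct values.

```r
# Hypothetical data containing duplicates (not the asker's CSV)
df <- data.frame(some_col = c(1, 2, 2, 3, 3, 3), another_col = 1:6)

# unique() on a data frame drops duplicate rows but keeps the data frame class
sub_df <- unique(df["some_col"])

n_cols   <- length(sub_df)                    # number of columns in sub_df
n_rows   <- nrow(sub_df)                      # number of distinct rows in sub_df
n_unique <- length(unique(df[["some_col"]]))  # number of distinct values in the vector

print(c(n_cols = n_cols, n_rows = n_rows, n_unique = n_unique))
```

If you want to stay with single brackets, nrow(unique(df["some_col"])) should give the same count as the [[ version.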
You need to use
length(unique(unlist(df[c("some_col")])))
When you select a column with df[c("some_col")] or df["some_col"], it is returned as a list (a one-column data frame). unlist() converts it into a vector that you can work with easily. When you select a column with df$some_col, it is returned as a vector directly.
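As a rough comparison of the three ways of pulling a column out, the sketch below uses a small made-up data frame (the values are an assumption for illustration). Passing use.names = FALSE to unlist() is an addition here; it only stops unlist() from attaching element names and does not affect the count.

```r
# Hypothetical example data
df <- data.frame(some_col = c("a", "b", "a", "c"), stringsAsFactors = FALSE)

via_dollar <- df$some_col                                # plain vector
via_double <- df[["some_col"]]                           # plain vector
via_unlist <- unlist(df["some_col"], use.names = FALSE)  # list flattened to a vector

# Each approach should report the same number of distinct values
counts <- c(
  dollar = length(unique(via_dollar)),
  double = length(unique(via_double)),
  unlist = length(unique(via_unlist))
)
print(counts)
```

For a single column, df$some_col or df[["some_col"]] is the more direct choice; unlist() is mainly useful when the column has already been selected with single brackets.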