I would like to see a list of all the possible values, without repetition, in a column of a data frame. Something like:
as.set(series["begin_year"][,1])
for the column "begin_year", although as.set doesn't exist.
Use unique() [or levels(), if the column is a factor].
Here's a reproducible example:
dat <- OrchardSprays
dat$rowpos
unique(dat$rowpos)
dat$treatment
unique(dat$treatment)
levels(dat$treatment)
EDIT: Note that levels() returns every level defined for the factor, even levels that no longer occur in the data. Consider:
dat2 <- subset(dat, treatment != "A")
unique(dat2$treatment)
# [1] D E B H G F C
# Levels: A B C D E F G H
levels(dat2$treatment)
# [1] "A" "B" "C" "D" "E" "F" "G" "H"
You can get rid of the unused levels with droplevels():
dat2$treatment <- droplevels(dat2$treatment)
levels(dat2$treatment)
# [1] "B" "C" "D" "E" "F" "G" "H"
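Applied to the column from the question, the same approach might look like the sketch below. The data frame `series` here is a hypothetical stand-in with made-up values, since the original data isn't shown; only the column name "begin_year" comes from the question.

```r
# Hypothetical stand-in for the asker's data frame; the values are invented.
series <- data.frame(begin_year = c(1999, 2001, 1999, 2003, 2001))

# Distinct values, in order of first appearance
unique(series$begin_year)
# [1] 1999 2001 2003

# Equivalent to the indexing used in the question
unique(series["begin_year"][, 1])
# [1] 1999 2001 2003

# Sorted, in case the order of appearance is not the order you want
sort(unique(series$begin_year))
# [1] 1999 2001 2003
```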
The unique function should do this. There are also a few other set-related functions, union, intersect, setdiff, setequal and is.element, all documented on the help(union) page.
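As a quick illustration of those set functions, here is a minimal sketch on two small vectors. The vectors `a` and `b` are made up for the example.

```r
# Two illustrative vectors (not from the original post)
a <- c(1, 2, 3, 4)
b <- c(3, 4, 5)

union(a, b)       # values in either vector:  1 2 3 4 5
intersect(a, b)   # values in both vectors:   3 4
setdiff(a, b)     # values in a but not in b: 1 2
setequal(a, b)    # same set of values?       FALSE
is.element(3, b)  # is 3 among b's values?    TRUE
```

These functions treat their arguments as sets, so duplicates in the inputs are dropped from the results of union, intersect and setdiff.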