
When subsetting in R is it necessary to include `which` or can I just put a logical test?

Tags: r, subset

Say I have a data frame df and want to subset it based on the value of column a.

df <- data.frame(a = 1:4, b = 5:8)
df

Is it necessary to include a which function in the brackets or can I just include the logical test?

df[df$a == "2",]
#  a b
#2 2 6
df[which(df$a == "2"),]
#  a b
#2 2 6

It seems to work the same either way... I was getting some strange results in a large data frame (i.e., getting empty rows returned as well as the correct ones) but once I cleaned the environment and reran my script it worked fine.

Andrew Jackson asked Apr 29 '17 04:04




1 Answer

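df$a == "2" returns a logical vector, while which(df$a == "2") returns the integer indices of the elements that are TRUE. The two give the same result unless the vector contains missing values. Comparing NA with anything yields NA, and indexing with a logical vector that contains NA returns an NA element for each such position (an all-NA row in the case of a data frame). which drops NA along with FALSE, so only the real matches come back. That is a likely cause of the empty rows you were seeing.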

For example:

x <- c(1, NA, 2, 10)

x[x == 2]
# [1] NA  2
x[which(x == 2)]
# [1] 2
x == 2
# [1] FALSE    NA  TRUE FALSE
which(x == 2)
# [1] 3
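
The same thing happens with a data frame, which is closer to the situation in the question. Below is a minimal sketch, assuming a made-up data frame with one NA in column a (the values and the names by_logical, by_which and by_guard are only for illustration, not taken from the original data):

```r
# Hypothetical data frame with a missing value in column a
df <- data.frame(a = c(1, NA, 2, 4), b = 5:8)

# Logical indexing: the NA produced by the test comes back as an all-NA row
by_logical <- df[df$a == 2, ]

# which() drops NA (and FALSE) positions, so only the real match remains
by_which <- df[which(df$a == 2), ]

# Alternative without which(): explicitly treat NA as "no match"
by_guard <- df[!is.na(df$a) & df$a == 2, ]

print(by_logical)        # two rows: one all-NA row plus the matching row
print(by_which)          # one row: a = 2, b = 7
print(nrow(by_logical))  # 2
print(nrow(by_guard))    # 1
```

So if the column may contain NA, either wrap the test in which() or guard it with !is.na(). subset(df, a == 2) is another option, since it also treats NA as no match.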
eipi10 answered Oct 25 '22 04:10