Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Similar function to SQL 'WHERE' clause in R

Tags:

sql

r

I have a data set assigned to a variable named 'temps', which have columns 'date', 'temperature', 'country'.
I want to do something like this, which I can do in SQL

SELECT * FROM temps WHERE country != 'mycountry'

How can I do similar selection in R?

like image 941
Lasitha Yapa Avatar asked Jun 12 '16 03:06

Lasitha Yapa


2 Answers

We can use similar syntax in base R

temps[temps$country != "mycountry",]

Benchmarks

set.seed(24)
temps1 <- data.frame(country = sample(LETTERS, 1e7, replace=TRUE),
                  val = rnorm(1e7))
system.time(temps1[!temps1$country %in% "A",])
#  user  system elapsed 
#   0.92    0.11    1.04 
system.time(temps1[temps1$country != "A",])
#   user  system elapsed 
#   0.70    0.17    0.88 

If we are using package solutions

library(sqldf)
system.time(sqldf("SELECT * FROM temps1 WHERE country != 'A'"))
#   user  system elapsed 
# 12.78    0.37   13.15 

library(data.table)
system.time(setDT(temps1, key = 'country')[!("A")])
#   user  system elapsed 
#  0.62    0.19    0.37 
like image 109
akrun Avatar answered Sep 29 '22 23:09

akrun


This should do it.

temps2 <- temps[!temps$country %in% "mycountry",]
like image 30
milan Avatar answered Sep 29 '22 23:09

milan