
R: Is there a "Un-Character" Command in R?

I am working with the R programming language.

I have the following dataset:

v <- c(1,2,3,4,5,6,7,8,9,10)

var_1 <- as.factor(sample(v, 10000, replace=TRUE, prob=c(0.1,0.1,0.1,0.1,0.1, 0.1,0.1,0.1,0.1,0.1)))

var_2 <- as.factor(sample(v, 10000, replace=TRUE, prob=c(0.1,0.1,0.1,0.1,0.1, 0.1,0.1,0.1,0.1,0.1)))

var_3 <- as.factor(sample(v, 10000, replace=TRUE, prob=c(0.1,0.1,0.1,0.1,0.1, 0.1,0.1,0.1,0.1,0.1)))

var_4 <- as.factor(sample(v, 10000, replace=TRUE, prob=c(0.1,0.1,0.1,0.1,0.1, 0.1,0.1,0.1,0.1,0.1)))

var_5 <- as.factor(sample(v, 10000, replace=TRUE, prob=c(0.1,0.1,0.1,0.1,0.1, 0.1,0.1,0.1,0.1,0.1)))

my_data = data.frame(var_1, var_2, var_3, var_4, var_5)

I also have another dataset of "conditions" that will be used for querying this data frame:

conditions = data.frame(cond_1 = c("1,3,4", "4,5,6"), cond_2 = c("5,6", "7,8,9"))

My Question: I tried to run the following command to select rows from "my_data" based on the first row of "conditions" - but this returns an empty result:

my_data[my_data$var_1 %in% unlist(conditions[1,1]) &
            my_data$var_2 %in% unlist(conditions[1,2]), ]

[1] var_1 var_2 var_3 var_4 var_5
<0 rows> (or 0-length row.names)

I tried to look more into this by "inspecting" these conditions:

class(conditions[1,1])
[1] "character"

This makes me think that the "unlist()" command is not working because the conditions themselves are a "character" instead of a "list".

Is there an equivalent command that can be used here that plays the same role as the "unlist()" command so that the above statement can be run?

In general, I am trying to produce the same results as I would have gotten from this code - but keeping the format I was using above:

my_data[my_data$var_1 %in% c("1", "3", "4") &
            my_data$var_2 %in% c("5", "6"), ]

Thanks!

Reference: Selecting Rows of Data Based on Multiple Conditions

asked Nov 27 '25 04:11 by stats_noob


1 Answer

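Up front, "1,3,4" != "1". conditions[1,1] is the single string "1,3,4", and unlist() returns a character vector unchanged, so %in% compares every value against that whole string and nothing matches. Split the strings first using strsplit(., ",").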
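As a small illustration of the point above (a sketch on a single string, not on the full data frame), the following shows what unlist() and strsplit() each return for one condition. The names cond and pieces are made up for this example.

```r
# One condition, stored the same way as conditions[1, 1]
cond <- "1,3,4"

# unlist() has nothing to flatten here: the input is already an atomic vector
identical(unlist(cond), cond)    # TRUE

# so %in% compares against the whole string "1,3,4"
"1" %in% unlist(cond)            # FALSE

# strsplit() returns a list with one element per input string;
# [[1]] extracts the character vector for the first (only) string
pieces <- strsplit(cond, ",")[[1]]
pieces                           # "1" "3" "4"
"1" %in% pieces                  # TRUE

# %in% also works when the left-hand side is a factor,
# because factors are matched by their labels
factor(c(1, 2, 3)) %in% pieces   # TRUE FALSE TRUE
```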

expected <- my_data[my_data$var_1 %in% c("1", "3", "4") & my_data$var_2 %in% c("5", "6"), ]
head(expected)
#     var_1 var_2 var_3 var_4 var_5
# 18      3     6     2     2     9
# 129     3     5     3     2     8
# 133     4     5     6     5     8
# 186     1     6     6    10    10
# 204     4     6     4     2     6
# 207     1     5     3     2     9

out <- my_data[do.call(`&`, 
  Map(`%in%`,
      lapply(my_data[,1:2], as.character), 
      lapply(conditions, function(z) strsplit(z, ",")[[1]]))),]
head(out)
#     var_1 var_2 var_3 var_4 var_5
# 18      3     6     2     2     9
# 129     3     5     3     2     8
# 133     4     5     6     5     8
# 186     1     6     6    10    10
# 204     4     6     4     2     6
# 207     1     5     3     2     9

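Edit: update for new conditions. `&` accepts only two arguments, so do.call(`&`, ...) fails once there are three condition columns; change do.call to Reduce, which applies `&` pairwise across any number of them: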

conditions = data.frame(cond_1 = c("1,3,4", "4,5,6"), cond_2 = c("5,6", "7,8,9"), cond_3 = c("4,6", "9"))
out <- my_data[Reduce(`&`,
  Map(`%in%`,
      lapply(my_data[,1:3], as.character),
      lapply(conditions, function(z) strsplit(z, ",")[[1]]))),]
head(out)
#     var_1 var_2 var_3 var_4 var_5
# 133     4     5     6     5     8
# 186     1     6     6    10    10
# 204     4     6     4     2     6
# 232     1     5     6     5     8
# 332     3     6     6     5    10
# 338     1     5     6     3     6
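Both snippets above use only the first row of conditions, because of the [[1]] after strsplit(). If a different row is needed, one possible way to wrap the same idea is sketched below. The function filter_by_condition_row and the toy data are hypothetical, not part of the original answer, and the sketch assumes that column k of conditions applies to column k of the data.

```r
# Hypothetical helper: keep the rows of `data` that satisfy row `i` of `conditions`.
# Assumption: the k-th column of `conditions` constrains the k-th column of `data`.
filter_by_condition_row <- function(data, conditions, i = 1) {
  k <- seq_len(ncol(conditions))
  # split each comma-separated string in row i into its allowed values
  allowed <- lapply(conditions[i, , drop = FALSE],
                    function(z) strsplit(as.character(z), ",", fixed = TRUE)[[1]])
  # one logical vector per column, combined pairwise with `&`
  keep <- Reduce(`&`, Map(`%in%`, lapply(data[k], as.character), allowed))
  data[keep, , drop = FALSE]
}

# Small deterministic data set so the result can be checked by hand
toy <- data.frame(
  var_1 = factor(c(1, 3, 4, 5, 4)),
  var_2 = factor(c(5, 6, 7, 8, 2)),
  var_3 = factor(c(4, 1, 9, 9, 9))
)
conditions <- data.frame(cond_1 = c("1,3,4", "4,5,6"),
                         cond_2 = c("5,6", "7,8,9"),
                         cond_3 = c("4,6", "9"))

filter_by_condition_row(toy, conditions, 1)  # keeps row 1 only
filter_by_condition_row(toy, conditions, 2)  # keeps rows 3 and 4
```

Passing i lets the same call be reused for each row of conditions, for example inside lapply(seq_len(nrow(conditions)), ...).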
answered Nov 29 '25 19:11 by r2evans


