
Finding the Column Index for a Specific Value

Tags: string, r

I am having a brain cramp. Below is a toy dataset:

df <- data.frame(
        id = 1:6,
        v1 = c("a", "a", "c", NA, "g", "h"),
        v2 = c("z", "y", "a", NA, "a", "g"),
        stringsAsFactors = FALSE)

I have a specific value that I want to find across a defined set of columns, and I want to identify the position of the column that holds it. The columns I am searching are character vectors, and the trick is that the value might not exist in a given row. In addition, missing values (NA) are present in the dataset.

Assuming I knew how to do this, the variable position indicates the values I would like returned.

> df
  id   v1   v2 position
1  1    a    z        1
2  2    a    y        1
3  3    c    a        2
4  4 <NA> <NA>       99
5  5    g    a        2
6  6    h    g       99

The general rule is that I want to find the position of value "a", and if it is not located or if v1 is missing, then I want 99 returned.

In this instance, I am searching across v1 and v2, but in reality, I have 10 different variables. It is also worth noting that the value I am searching for can only exist once across the 10 variables.

What is the best way to generate this recode?

Many thanks in advance.

Btibert3 asked Dec 22 '22 17:12


2 Answers

Use match:

> df$position <- apply(df, 1, function(x) match("a", x[-1], nomatch = 99))
> df
  id   v1   v2 position
1  1    a    z        1
2  2    a    y        1
3  3    c    a        2
4  4 <NA> <NA>       99
5  5    g    a        2
6  6    h    g       99
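
A variation on the same idea, offered as a sketch rather than a definitive version: selecting the search columns by name instead of dropping the first column with x[-1] keeps the call the same when the real data has 10 variables, and re-running it will not scan a previously added position column. The `search_cols` vector is an assumed name, not something from the question.

```r
# Toy data from the question
df <- data.frame(
  id = 1:6,
  v1 = c("a", "a", "c", NA, "g", "h"),
  v2 = c("z", "y", "a", NA, "a", "g"),
  stringsAsFactors = FALSE
)

# Assumed name: the columns to search, in the order positions are counted
search_cols <- c("v1", "v2")

# For each row, match() returns the index of the first exact "a",
# or the nomatch value when the row has no "a" (including all-NA rows)
df$position <- apply(df[search_cols], 1,
                     function(x) match("a", x, nomatch = 99))
```

With 10 variables, only `search_cols` would need to change.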

Prasad Chalasani answered Jan 12 '23 02:01


Firstly, drop the first column:

df <- df[, -1]

Then, do something like this (disclaimer: I'm feeling terribly sleepy*):

( df$result <- unlist(lapply(apply(df, 1, grep, pattern = "a"), function(x) ifelse(length(x) == 0, 99, x))) )
    v1   v2 result
1    a    z      1
2    a    y      1
3    c    a      2
4 <NA> <NA>     99
5    g    a      2
6    h    g     99

* sleepy = code is not vectorised
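
For comparison, here is a minimal sketch of what a vectorised version might look like. It starts again from the original data frame (with id still present) and relies on the question's guarantee that the value appears at most once per row. The names `search_cols`, `hits`, and `position` are illustrative only.

```r
# Toy data from the question
df <- data.frame(
  id = 1:6,
  v1 = c("a", "a", "c", NA, "g", "h"),
  v2 = c("z", "y", "a", NA, "a", "g"),
  stringsAsFactors = FALSE
)

search_cols <- c("v1", "v2")

# Logical matrix: TRUE where a cell equals "a"; comparisons with NA give NA
hits <- df[search_cols] == "a"
hits[is.na(hits)] <- FALSE

# max.col() gives the column index of the row maximum (the TRUE cell);
# rows with no TRUE at all are overwritten with 99
position <- ifelse(rowSums(hits) > 0,
                   max.col(hits + 0, ties.method = "first"),
                   99)
```

No row-wise apply() is involved here; the comparison, rowSums(), and max.col() each work on the whole matrix at once.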

EDIT (slightly different solution, I still feel sleepy):

df$result <- rapply(apply(df, 1, grep, pattern = "a"), function(x) ifelse(length(x) == 0, 99, x))
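
One caveat worth checking against real data: grep() treats its pattern as a regular expression, so pattern = "a" also matches longer strings that merely contain an "a". The toy dataset only has single-letter values, so the difference does not show up there. A small sketch with made-up values (`vals` is hypothetical):

```r
vals <- c("banana", "a", NA)

# Unanchored regex: every element containing "a" matches
loose <- grep("a", vals)

# Anchored regex: only the exact string "a" matches
anchored <- grep("^a$", vals)

# match() compares whole values and involves no regex
exact <- match("a", vals)
```

Here `loose` picks up both "banana" and "a", while `anchored` and `exact` point only at the second element. If the real columns can hold longer strings, pattern = "^a$" may be the safer choice.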

aL3xa answered Jan 12 '23 00:01