grep on any cell in a data.frame

A simple "is there a better way" question. I want to find out whether any cell in a data.frame contains the substring I'm looking for:

d=data.frame(V1=c("xxx","yyy","zzz"), V2=c(NA,"ewruinwe",NA))
grepl("ruin",d[2,2])  #TRUE
grepl("ruin",d)  #FALSE FALSE
any(grepl("ruin",as.character(as.matrix(d))))   #TRUE

The last line does what I want, but it looks so ugly I'm wondering if I'm missing something simpler.

Background: d is one of the elements in t=readHTMLTable(url) (XML package). I was using the d[2,2] approach to check for an error message, and just discovered that the website sometimes adds another row to the HTML table, pushing the error message I was looking for into a different cell.

UPDATE: It seems the two choices (thanks to mathematical.coffee and Roman Luštrik) are:

any(grepl("ruin",as.matrix(d)))
any(apply(d, 2, function(x) grepl("ruin", x)))
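For reuse, the first option can be wrapped in a small function. This is only a sketch: the name any_cell_matches is made up for illustration, and the extra arguments are simply passed on to grepl (for example fixed=TRUE to treat the pattern as a literal string rather than a regular expression).

```r
# Hypothetical helper (name invented for this example): returns TRUE if any
# cell of the data frame contains `pattern`. Extra arguments go to grepl().
any_cell_matches <- function(df, pattern, ...) {
  # as.matrix() flattens the data frame into one matrix of cells,
  # which grepl() then scans as a single vector
  any(grepl(pattern, as.matrix(df), ...))
}

d <- data.frame(V1 = c("xxx", "yyy", "zzz"), V2 = c(NA, "ewruinwe", NA))

any_cell_matches(d, "ruin")                 # TRUE
any_cell_matches(d, "ruin.")                # TRUE: "." is a regex wildcard
any_cell_matches(d, "ruin.", fixed = TRUE)  # FALSE: no literal "ruin." present
```

NA cells are not a problem here, because grepl returns FALSE for a missing value.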
Darren Cook asked Jan 25 '12 08:01


1 Answer

What about this?

d=data.frame(V1=c("xxx","yyy","zzz"), V2=c(NA,"ewruinwe",NA))
apply(d, c(1,2), function(x) grepl("ruin", x))
        V1    V2
[1,] FALSE FALSE
[2,] FALSE  TRUE
[3,] FALSE FALSE

As noted in the comments, a margin of 2 gives the same result here as c(1,2). To reduce the result to a single boolean value:

any(apply(d, 2, function(x) grepl("ruin", x)))
[1] TRUE
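If you also need to know which cell matched (for instance because the error message moves between rows), the same logical matrix can be passed to which with arr.ind=TRUE. A minimal sketch, where hits is just an illustrative variable name:

```r
d <- data.frame(V1 = c("xxx", "yyy", "zzz"), V2 = c(NA, "ewruinwe", NA))

# Logical matrix with the same shape as d: TRUE where the cell matches
hits <- apply(d, 2, function(x) grepl("ruin", x))

# Row and column index of every matching cell (here: row 2, column 2)
idx <- which(hits, arr.ind = TRUE)
idx

# Pull out the matching cell contents using those indices
as.matrix(d)[idx]
```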
Roman Luštrik answered Oct 04 '22 14:10