I have a data frame like X, where the column genes is a factor. I want to remove all rows where genes is empty, so in X I want to remove row 4. Is there a way to do this for a large data frame?
X
  names     values    genes
1     A  0.2876113   EEF1A1
2     B  0.6681894    GAPDH
3     C  0.1375420  SLC35E2
4     D -1.9063386
5     E -0.4949905    RPS28
Desired result:
X
  names     values    genes
1     A  0.2876113   EEF1A1
2     B  0.6681894    GAPDH
3     C  0.1375420  SLC35E2
5     E -0.4949905    RPS28
Thank you all!
It's not completely obvious from your question what the empty values are, but you should be able to adapt the solution below (here I assume the 'empty' values are empty strings):
toBeRemoved<-which(X$genes=="")
X<-X[-toBeRemoved,]
@Nick Sabbe provided a great answer, but it has one caveat:
Using -which(...) is a neat trick to (sometimes) speed up the subsetting operation when there are only a few elements to remove.
...But if there are no elements to remove, it fails!
So, if X$genes does not contain any empty strings, which() will return an empty integer vector. Negating that is still an empty vector, and X[integer(0), ] returns a data.frame with no rows!
toBeRemoved <- which(X$genes == "")
if (length(toBeRemoved) > 0) { # MUST check for 0-length
  X <- X[-toBeRemoved, ]
}
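To see why the guard matters, here is a minimal sketch using a hypothetical data frame (made up for illustration, not the asker's data) in which no gene name is empty. The names unguarded and guarded are only there to compare the two outcomes side by side:

```r
# Hypothetical data frame in which no gene name is empty
X <- data.frame(
  names  = c("A", "B", "C"),
  values = c(0.2876113, 0.6681894, 0.1375420),
  genes  = factor(c("EEF1A1", "GAPDH", "SLC35E2"))
)

toBeRemoved <- which(X$genes == "")   # nothing matches, so this is integer(0)

unguarded <- X[-toBeRemoved, ]        # -integer(0) is still integer(0)
nrow(unguarded)                       # 0: every row was dropped

guarded <- X
if (length(toBeRemoved) > 0) {        # skipped here, because nothing matched
  guarded <- guarded[-toBeRemoved, ]
}
nrow(guarded)                         # 3: nothing was removed
```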
Or, if the speed gain isn't important, simply:
X<-X[X$genes!="",]
Or, as @nullglob pointed out,
subset(X, genes != "")
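As a rough check, the two one-liners can be run on the data from the question, assuming (as above) that the empty entry is the empty string "". Since genes is a factor, the filtered result still carries the unused "" level; droplevels() is one way to discard it if that matters for later steps. The names byIndex, bySubset and cleaned are just illustrative:

```r
# Data from the question; the empty gene name is assumed to be ""
X <- data.frame(
  names  = c("A", "B", "C", "D", "E"),
  values = c(0.2876113, 0.6681894, 0.1375420, -1.9063386, -0.4949905),
  genes  = factor(c("EEF1A1", "GAPDH", "SLC35E2", "", "RPS28"))
)

byIndex  <- X[X$genes != "", ]       # logical indexing
bySubset <- subset(X, genes != "")   # same filter via subset()

nrow(byIndex)    # 4 rows remain (row 4 is gone)
nrow(bySubset)   # 4 rows remain

# Filtering rows does not drop unused factor levels:
levels(byIndex$genes)                # still includes ""

# droplevels() removes levels that no longer occur in the data
cleaned <- droplevels(byIndex)
levels(cleaned$genes)                # "" is no longer a level
```

One difference worth knowing: if genes contains NA values, X[X$genes != "", ] returns a row of NAs for each of them, whereas subset() drops those rows.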