Filter row names based on string length [duplicate]

Tags:

r

I want to filter rows whose row names are longer than 35 or shorter than 10 characters. I was looking at the nchar function.

                                    79_CGTACG.collapsed.gz 80_ACAGTG.collapsed.gz
CACCCGCACGTATAGACGGACA                                   0                      0
GTGCTGATGTCCTTGGCAGGCTTCGGCCGTCCGGC                      0                      0
CGTGGAACCTG                                              0                      0
TAATGGTCATTAG                                            2                      1
GGCGATGCGGGATGAACCGAAC                                   0                      0
AAGGATGT                                                 0                      0
user2300940 asked Mar 09 '16 20:03



1 Answer

I think your idea to use nchar() is good. It can be applied to rownames() and combined with logical subsetting of the data frame:

df1[nchar(rownames(df1)) > 35 | nchar(rownames(df1)) < 10,]
#         X79_CGTACG.collapsed.gz X80_ACAGTG.collapsed.gz
#AAGGATGT                       0                       0
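
The same filter can be sketched with the name lengths computed only once, which also makes it easy to take the complement if the goal is to drop those rows rather than keep them. This is a minimal sketch, not part of the original answer: the helper names len, outside and inside are made up for illustration, and the data frame is rebuilt from the table in the question.

```r
# Example data rebuilt from the table in the question
df1 <- data.frame(
  X79_CGTACG.collapsed.gz = c(0L, 0L, 0L, 2L, 0L, 0L),
  X80_ACAGTG.collapsed.gz = c(0L, 0L, 0L, 1L, 0L, 0L),
  row.names = c("CACCCGCACGTATAGACGGACA",
                "GTGCTGATGTCCTTGGCAGGCTTCGGCCGTCCGGC",
                "CGTGGAACCTG", "TAATGGTCATTAG",
                "GGCGATGCGGGATGAACCGAAC", "AAGGATGT")
)

# Length of every row name, computed once (hypothetical helper name)
len <- nchar(rownames(df1))

# Rows whose names are longer than 35 or shorter than 10 characters
outside <- df1[len > 35 | len < 10, , drop = FALSE]

# Complement: rows whose names are 10 to 35 characters long
inside <- df1[len >= 10 & len <= 35, , drop = FALSE]
```

The drop = FALSE argument is an extra precaution: it keeps the result a data frame even if the input had only a single column.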

data

df1 <- structure(list(X79_CGTACG.collapsed.gz = c(0L, 0L, 0L, 2L, 0L, 0L),
                      X80_ACAGTG.collapsed.gz = c(0L, 0L, 0L, 1L, 0L, 0L)),
                 .Names = c("X79_CGTACG.collapsed.gz", "X80_ACAGTG.collapsed.gz"),
                 class = "data.frame",
                 row.names = c("CACCCGCACGTATAGACGGACA",
                               "GTGCTGATGTCCTTGGCAGGCTTCGGCCGTCCGGC",
                               "CGTGGAACCTG", "TAATGGTCATTAG",
                               "GGCGATGCGGGATGAACCGAAC", "AAGGATGT"))
RHertel answered Oct 31 '22 09:10