 

list.files pattern argument in R, extended regular expression use

Tags: regex, r

I run

  dir.create('./junk_data')
  file.create(paste('./junk_data/QWE',01:12,01:31,2005:2015,'.3',sep=''))
  file.create(paste('./junk_data/RTY',01:12,01:31,2005:2015,'.3',sep=''))

and want to list all the files that begin with QWE and end with 2011.3. I tried

list.files('./junk_data/',pattern='QWE....2011.3',full.names=T)

and

list.files('./junk_data/',pattern='QWE....2011.3',full.names=T,perl=T)

but I guess '.' doesn't mean what I think it does, as I get none of the files I want.

I tried a few tutorials on regex, but no joy.

Yoda asked Jan 16 '13 14:01



1 Answer

As Arun showed in his example, a dot usually means "match any character", so to match a literal dot you need to escape it. In an R string the escape is written \\. because the backslash itself has to be doubled. The easiest way to create the pattern is with glob2rx, which treats * as a wildcard and matches other characters as though they are fixed.

glob2rx("QWE*2011.3")   #"^QWE.*2011\\.3$"
list.files("./junk_data/", pattern = glob2rx("QWE*2011.3"), full.names = TRUE)
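
To see what the generated pattern does without touching the file system, here is a minimal sketch that runs it through grep on a character vector. The file names are made up for illustration (they only resemble the ones in the question), and the pattern is typed by hand to match the glob2rx output shown above. It also runs the pattern from the question, which is not anchored, asks for exactly four characters between QWE and 2011, and lets its last dot match any character.

```r
# Hypothetical file names, shaped like the ones created in the question.
files <- c("QWE772011.3", "QWE6182011.3", "QWE12312011.3",
           "QWE1231201103", "QWE112005.3", "RTY772011.3")

# Same pattern that glob2rx("QWE*2011.3") is shown to return above:
# ^ anchors the start, .* matches any run of characters,
# \\. matches a literal dot, $ anchors the end.
pattern <- "^QWE.*2011\\.3$"
matched <- grep(pattern, files, value = TRUE)
print(matched)

# The pattern from the question: each of the four dots must consume
# exactly one character, and the unescaped dot before 3 matches anything.
strict <- grep("QWE....2011.3", files, value = TRUE)
print(strict)
```

With these names, matched holds the three QWE files that end in 2011.3, whatever sits between the prefix and the year. strict skips the names with only two or three characters after QWE and wrongly picks up QWE1231201103, where the unescaped dot matched a 0.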
Richie Cotton answered Sep 30 '22 17:09