
Regular expression in R. Capture specific field

Tags: regex, r

I have the following vector in R:

x <- c("id: capture this , something: the useless chunk , otherstuff: useless , more stuff")

I want to extract the string "capture this". I used this regular expression, built with the rex package:

library(rex)
r <- rex(
  start,
  anything,
  "id: ",
  capture(anything),
  " , ", 
  anything
)
r
# > r
# > ^.*id: (.*) , .*
re_matches(x,r)

But this is what I got:

> re_matches(x,r)
                                                                  1
1 capture this , something: the useless chunk , otherstuff: useless

It captures what I want, but also most of the rest of the string. I just want the "capture this" field. I get the same result when I pass the same regular expression to gsub:

gsub("^.*id: (.*) , .*", "\\1", x)

My R version:

R version 3.1.3 (2015-03-09) -- "Smooth Sidewalk"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

My Ubuntu version:

No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.2 LTS
Release: 14.04
Codename: trusty

Joseah asked Apr 23 '26 23:04


1 Answer

Are you working with YAML? If so, you might find the yaml package useful:

x <- c("id: capture this , something: the useless chunk , otherstuff: useless , more: stuff")

yaml::yaml.load(gsub(' , ', '\n', x))$id
# [1] "capture this"

Note that I had to add a colon after "more" in the last field to get the above to work, since every field must be a key: value pair. The nice thing about this solution is that you can extract any part of the string by its key.
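If you would rather not depend on a package, the same key-based lookup can be sketched in base R. This is only an illustration, and it assumes that every field has the form "key: value" and that fields are separated by " , " (as in the modified string above). The names fields, keys, values and parsed are made up for the example.

```r
x <- "id: capture this , something: the useless chunk , otherstuff: useless , more: stuff"

# Split the string into "key: value" fields on the literal " , " separator.
fields <- strsplit(x, " , ", fixed = TRUE)[[1]]

# Cut each field at its first ": " to get the key and the value.
keys   <- sub(": .*$", "", fields)
values <- sub("^[^:]*: ", "", fields)

# Named character vector, so each value can be looked up by its key.
parsed <- setNames(values, keys)

parsed[["id"]]
# [1] "capture this"
```

A field without a colon (like "more stuff" in the original string) would end up as its own key and value here, so it is worth checking the input first.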

This next one uses your original example string and doesn't need a package. The ? after .* makes the capture lazy, so it stops at the first " ," instead of the last:

x <- c("id: capture this , something: the useless chunk , otherstuff: useless , more stuff")

gsub('id: (.*?) ,.*', '\\1', x)
# [1] "capture this"
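
The reason your original pattern over-captures is that (.*) is greedy: it grabs as much as it can and only backs off to the last " , " in the string. Besides the lazy quantifier, here are two other ways to stop at the first separator, sketched under the assumption that the value you want never contains a comma. The variable names are just for illustration.

```r
x <- "id: capture this , something: the useless chunk , otherstuff: useless , more stuff"

# Greedy capture: (.*) extends to the LAST " , " in the string.
greedy <- sub("^.*id: (.*) , .*", "\\1", x)
greedy
# [1] "capture this , something: the useless chunk , otherstuff: useless"

# Negated character class: the capture cannot run past the first comma.
by_class <- sub("^.*id: ([^,]*) ,.*", "\\1", x)
by_class
# [1] "capture this"

# Extract the match directly with PCRE lookarounds instead of replacing.
by_lookaround <- regmatches(x, regexpr("(?<=id: )[^,]*?(?= ,)", x, perl = TRUE))
by_lookaround
# [1] "capture this"
```

I used sub rather than gsub because only one replacement is needed. The regmatches version returns character(0) when there is no match, whereas sub returns the input unchanged, which may or may not be what you want.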

rawr answered Apr 27 '26 16:04


