
Extract number after a certain word

Tags: regex, r

I am trying to build a regex expression to extract a 6 digit number (positive or negative) after a certain string, namely 'LogL='.

It comes from text output from certain software.

   7 LogL=-3695.47     S2=  9.0808       1891 df    2.263     0.2565    
   9 LogL= 2456.30     S2=  1.2789       1785 df    1.244     0.1354    

I tried the following in R:

txt <- "   9 LogL= 2456.30     S2=  1.2789       1785 df    1.244     0.1354   "
as.numeric(unlist(strsplit(sub(".*LogL=*", "", txt), " "))[1])

This doesn't work for positive numbers, and I imagine it's a very crude/ugly way of going about it. I tried experimenting on regex101.com.

Related Stack Overflow questions I tried: (1) (2) (3)

I am kind of lost and can't seem to understand regex expressions. I am sure this is a piece of cake. Help?

tstev asked Nov 30 '22 16:11


2 Answers

I'd use a look-behind regex:

txt <- "   7 LogL=-3695.47     S2=  9.0808       1891 df    2.263     0.2565    
           9 LogL= 2456.30     S2=  1.2789       1785 df    1.244     0.1354   "
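# Look-behind for "LogL=", then match optional whitespace, any minus signs, and a run of digits and dots
pattern <- "(?<=LogL\\=)\\s*\\-*[0-9.]+"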
m <- gregexpr(pattern, txt, perl = TRUE)
as.numeric(unlist(regmatches(txt, m)))
# [1] -3695.47  2456.30
Roland answered Dec 05 '22 13:12


Try

LogL=\s*(-?\d+(?:\.\d+)?)

It matches your text (LogL) and an equals sign, followed by any amount of whitespace. Then it captures:

  • an optional -
  • digits, at least one
  • and optionally, a . followed by at least one digit.

Check it here at regex101.
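
For what it's worth, here is a minimal sketch of how that pattern could be applied in R. It assumes each line of the software output is one element of a character vector; the names txt, m and logl are only illustrative. The backslashes are doubled because the pattern sits inside an R string, and regexec() is used (rather than gregexpr()) because it also returns the capture group.

```r
# Two sample lines copied from the question, one per vector element.
txt <- c(
  "   7 LogL=-3695.47     S2=  9.0808       1891 df    2.263     0.2565    ",
  "   9 LogL= 2456.30     S2=  1.2789       1785 df    1.244     0.1354    "
)

# The pattern from this answer, with backslashes doubled for an R string.
pattern <- "LogL=\\s*(-?\\d+(?:\\.\\d+)?)"

# For each element, regmatches() on a regexec() result gives the full match
# followed by the text of each capture group.
m <- regmatches(txt, regexec(pattern, txt, perl = TRUE))

# The second entry is the captured number; it is NA where nothing matched.
logl <- as.numeric(vapply(m, function(x) x[2], character(1)))
logl
```

Because the whitespace after the equals sign is matched outside the capture group, the captured text is just the number itself, for positive and negative values alike.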

SamWhan answered Dec 05 '22 13:12