
What is the equivalent of the LEFT plus FIND function in R?

Tags: regex, r, excel

I am trying to extract the first few characters from a column in a data frame. What I need is everything up to the first ",".

Data:

texts
12/5/15, 11:49 - thanks, take care
12/5/15, 11:51 - cool

What I need is

texts                                   date
12/5/15, 11:49 - thanks, take care     12/5/15
12/5/15, 11:51 - cool                  12/5/15

I tried using these, but they returned everything without the commas

df$date <- sub(", ", "", df$date, fixed = TRUE)

 and 

df$date <- gsub( ".,","", df$texts) 

Excel equivalent

=LEFT(A1, FIND(",",A1,1)-1)
Anubhav Dikshit asked Dec 12 '25 03:12


1 Answer

You can use sub:

sub('(^.*?),.*', '\\1', df$texts)
# [1] "12/5/15" "12/5/15"

The pattern matches

  • the start of the line ^, followed by any character . repeated zero or more times but as few as possible *?, all captured in a group ( ... )
  • followed by a comma ,
  • followed by any character repeated zero or more times .*

which together match the whole line. sub then replaces that match with

  • the captured group \\1.
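
To see why the lazy *? matters, here is a small sketch comparing it with the greedy version on the sample strings. The vector x is a stand-in for df$texts (an assumption for illustration), and the last line is a simpler variant that just deletes everything from the first comma onward.

```r
# Sample strings standing in for df$texts (illustrative assumption)
x <- c("12/5/15, 11:49 - thanks, take care",
       "12/5/15, 11:51 - cool")

# Lazy quantifier: the captured group stops at the first comma
lazy <- sub("(^.*?),.*", "\\1", x)

# Greedy quantifier: the captured group runs up to the last comma
greedy <- sub("(^.*),.*", "\\1", x)

# Simpler variant without a capture group: drop the first comma and the rest
dropped <- sub(",.*", "", x)

print(lazy)
print(greedy)
print(dropped)
```

The greedy version keeps "12/5/15, 11:49 - thanks" for the first string because that string contains a second comma; the other two approaches stop at the first comma.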

Other options: substr, strsplit, stringr::str_extract.
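
A rough sketch of two of those options in base R, assuming a character vector x in place of df$texts (wrap a factor column in as.character() first). The substr/regexpr pair is probably the closest match to Excel's LEFT plus FIND. The stringr call is shown only as a comment because it needs that package installed.

```r
# Sample strings standing in for df$texts (illustrative assumption)
x <- c("12/5/15, 11:49 - thanks, take care",
       "12/5/15, 11:51 - cool")

# substr + regexpr: mirrors LEFT(A1, FIND(",", A1, 1) - 1).
# regexpr() gives the position of the first comma, or -1 if there is none,
# in which case substr() returns "" rather than an error.
pos <- regexpr(",", x, fixed = TRUE)
left_find <- substr(x, 1, pos - 1)

# strsplit: split on commas and keep the first piece of each string
first_piece <- vapply(strsplit(x, ",", fixed = TRUE), `[`, character(1), 1)

# With stringr installed (not run here):
# stringr::str_extract(x, "^[^,]*")

print(left_find)
print(first_piece)
```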

If you're planning on using the dates as dates, as.Date (or strptime, if you want the times too) can parse what it needs directly and ignores the rest of the string:

as.Date(df$texts, '%m/%d/%y')  # or '%d/%m/%y', if that's the format
# [1] "2015-12-05" "2015-12-05"
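
For the times as well, a minimal sketch with strptime might look like the following. The month/day order and the UTC time zone are assumptions, so adjust both to the real data; x again stands in for df$texts.

```r
# Sample strings standing in for df$texts (illustrative assumption)
x <- c("12/5/15, 11:49 - thanks, take care",
       "12/5/15, 11:51 - cool")

# Parse date and time; text after the matched format is ignored.
# "%m/%d/%y" (month first) and tz = "UTC" are assumptions.
stamps <- as.POSIXct(strptime(x, "%m/%d/%y, %H:%M", tz = "UTC"))

# Render the parsed values in a fixed layout for inspection
formatted <- format(stamps, "%Y-%m-%d %H:%M")
print(formatted)
```

Converting to POSIXct keeps the result as a plain vector, which is usually easier to store in a data frame column than the list-like POSIXlt that strptime returns.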

Data:

df <- structure(list(texts = structure(1:2, .Label = c("12/5/15, 11:49 - thanks, take care", 
                "12/5/15, 11:51 - cool"), class = "factor")), .Names = "texts", 
                class = "data.frame", row.names = c(NA, -2L))
alistaire answered Dec 13 '25 15:12


