Extract date from a given string in R

Tags:

r

string<-c("Posted 69 months ago (7/4/2011)")
library(gsubfn)
strapplyc(string, "(.*)", simplify = TRUE)

I applied the function above, but nothing happens.

I want to extract only the date part, i.e. 7/4/2011.


asked Apr 14 '17 05:04

1 Answer

Here are six alternatives. The first shows how to fix the code in the question so that it gives the desired answer. The next two are the same as the first except that they use different regular expressions. The fourth shows how to do it with gsub, the fifth breaks the gsub into two sub calls, and the sixth uses read.table.

1) Escape parens The problem is that ( and ) have special meaning in regular expressions, so you must escape them if you want to match them literally. Writing them as "[(]" and "[)]", as we do below (or as "\\(" and "\\)"), matches them literally. The inner parentheses define the capture group, since we don't want that group to include the literal parentheses themselves:

strapplyc(string, "[(](.*)[)]", simplify = TRUE)
## [1] "7/4/2011"
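
For comparison, the same capture-group idea can be sketched in base R without gsubfn, using the "\\(" form of escaping mentioned above. This is an illustrative alternative rather than part of the original answer, and it assumes each string contains exactly one parenthesized date.

```r
string <- "Posted 69 months ago (7/4/2011)"

# "\\(" and "\\)" match literal parentheses; the unescaped pair in the
# middle is the capture group, which "\\1" refers back to.
date_text <- sub(".*\\((.*)\\).*", "\\1", string)
date_text
## [1] "7/4/2011"
```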

2) Match content Another way to do it is to match the data itself rather than the surrounding parentheses. Here "\\d+" matches one or more digits:

strapplyc(string, "\\d+/\\d+/\\d+", simplify = TRUE)
## [1] "7/4/2011"

You could specify the number of digits if you want to be even more specific but it seems unnecessary here if the data looks similar to that in the question.
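
If you do want to pin down the number of digits, a sketch along these lines would work. It uses base R's regexpr and regmatches instead of strapplyc, and the digit counts (one or two for the first two fields, four for the last) are an assumption about the data rather than something stated in the question.

```r
string <- "Posted 69 months ago (7/4/2011)"

# One or two digits, slash, one or two digits, slash, exactly four digits
pattern <- "\\d{1,2}/\\d{1,2}/\\d{4}"
strict_date <- regmatches(string, regexpr(pattern, string))
strict_date
## [1] "7/4/2011"
```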

3) Match 8 or more digits and slashes Given that there are no other sequences of 8 or more characters consisting only of slashes and digits in the rest of the string, we could just pick that out:

strapplyc(string, "[0-9/]{8,}", simplify = TRUE)
## [1] "7/4/2011"
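
The length requirement is what keeps this pattern from matching the 69 earlier in the string, but it also means a shorter date would be missed. A small sketch of that caveat, using base R's regexpr and regmatches and a made-up second string (the vector x is hypothetical, not from the question):

```r
x <- c("Posted 69 months ago (7/4/2011)",
       "Posted 3 months ago (1/2/17)")   # hypothetical short date, 6 characters

# regmatches() drops elements that have no match, so only one value comes back
found <- regmatches(x, regexpr("[0-9/]{8,}", x))
found
## [1] "7/4/2011"
```

If two-digit years can occur in the data, the content-matching pattern in (2) is likely the safer choice.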

4) Remove text before and after Another way of doing it is to remove everything up to the ( and after the ) like this:

gsub(".*[(]|[)].*", "", string)
## [1] "7/4/2011"

5) sub This is the same as (4) except it breaks the gsub into two sub invocations, one removing everything up to ( and the other removing ) onwards. The regular expressions are therefore slightly simpler.

sub(".*\\(", "", sub("\\).*", "", string))
## [1] "7/4/2011"
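
One thing to keep in mind with (4) and (5) is that sub and gsub return the input unchanged when the pattern does not match, so a string without parentheses comes back whole rather than as a missing value. A short sketch, where no_date is a hypothetical input:

```r
no_date <- "Posted just now"   # hypothetical string with no parenthesized date

cleaned <- gsub(".*[(]|[)].*", "", no_date)
cleaned
## [1] "Posted just now"

# One way to flag such cases: check for the parentheses first
has_date <- grepl("[(].*[)]", no_date)
result <- if (has_date) cleaned else NA_character_
result
## [1] NA
```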

6) read.table This solution uses no regular expressions at all. It defines sep and comment.char in read.table so that the second column of the result of read.table is the required date or dates.

read.table(text = string, sep = "(", comment.char = ")", as.is = TRUE)$V2
## [1] "7/4/2011"
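
Because read.table reads its text argument line by line, the same call should handle a vector of several strings, and the extracted text can then be converted to Date objects. The sketch below uses a hypothetical second string and assumes the dates are month/day/year. That reading is consistent with the question's example (69 months before April 2017 is July 2011) but is not stated outright.

```r
strings <- c("Posted 69 months ago (7/4/2011)",
             "Posted 16 months ago (12/25/2015)")  # second string is made up

# Each element is read as one line; V2 holds the text between ( and )
date_text <- read.table(text = strings, sep = "(", comment.char = ")",
                        as.is = TRUE)$V2

# Assumption: the format is month/day/year
dates <- as.Date(date_text, format = "%m/%d/%Y")
format(dates)
## [1] "2011-07-04" "2015-12-25"
```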

Note: You don't need c() to define string, since a single character string is already a vector of length one:

string <- c("Posted 69 months ago (7/4/2011)")
string2 <- "Posted 69 months ago (7/4/2011)"
identical(string, string2)
## [1] TRUE


answered Sep 20 '22 06:09

G. Grothendieck

Tags:

date-formatting

r

Avinash
