R regex: remove times from character string

Question

I am attempting to remove/extract times from a character string. The logic is that I am grabbing things that:

must begin with 0-2 digits
must be followed by a single colon
may optionally be followed by either a colon or a period
if so, that colon or period must be followed by 1 or more digits

Here's an MWE and what I've tried. I'm almost there, but I want "6:33" to be extracted rather than "6:33.", because a colon or period only counts as part of the time when it is followed by 1 or more digits. In this case the period ends the sentence and is not part of the time.

text.var <-  c("R uses 1:5 for 1, 2, 3, 4, 5.", 
    "At 3:00 we'll meet up and leave by 4:30:20.",
    "We'll meet at 6:33.", "He ran it in :22.34.")

pattern <- "\\(?[0-9]{0,2}\\)?\\:\\(?[0-9]{2}\\)?\\(?[:.]{0,1}\\)?\\(?[0-9]{0,}\\)?"

regmatches(text.var, gregexpr(pattern, text.var, perl = TRUE))

## [[1]]
## character(0)
## 
## [[2]]
## [1] "3:00"    "4:30:20"
## 
## [[3]]
## [1] "6:33."
## 
## [[4]]
## [1] ":22.34"

Desired Output

## [[1]]
## character(0)
## 
## [[2]]
## [1] "3:00"    "4:30:20"
## 
## [[3]]
## [1] "6:33"
## 
## [[4]]
## [1] ":22.34"

hwnd · Accepted Answer

If I understand you correctly, you can use the following to fix your problem.

regmatches(text.var, gregexpr('\\d{0,2}:\\d{2}(?:[:.]\\d+)?', text.var, perl = TRUE))

Explanation:

\d{0,2}   # digits (0-9) (between 0 and 2 times)
:         # ':'
\d{2}     # digits (0-9) (2 times)
(?:       # group, but do not capture (optional):
  [:.]    #   any character of: ':', '.'
  \d+     #   digits (0-9) (1 or more times)
)?        # end of grouping

Note: I removed the escaped parentheses because it is unclear to me why they were used in the first place.
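As a quick check, the sketch below runs this pattern over the question's `text.var` and compares one result with the desired output. The names `time_pattern` and `times` are illustrative and not part of the original answer; the `gsub` call at the end is an assumed follow-up for readers who want to remove the times rather than extract them.

```r
text.var <- c("R uses 1:5 for 1, 2, 3, 4, 5.",
    "At 3:00 we'll meet up and leave by 4:30:20.",
    "We'll meet at 6:33.", "He ran it in :22.34.")

# Backslashes are doubled because the pattern is an R string literal.
time_pattern <- "\\d{0,2}:\\d{2}(?:[:.]\\d+)?"

# One character vector of matches per input string.
times <- regmatches(text.var, gregexpr(time_pattern, text.var, perl = TRUE))

# The sentence-ending period after 6:33 is left out, because the optional
# group only matches when at least one digit follows the separator.
stopifnot(identical(times[[3]], "6:33"))

# To remove the times instead of extracting them:
gsub(time_pattern, "", text.var, perl = TRUE)
```

Making the separator and the digits one optional unit is what keeps a lone trailing period from being matched.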

Federico Piazza · Answer

Is this what you want?

regmatches(text.var, gregexpr("(\\d{0,2}:\\d{2}(?:\\.\\d+)?)", text.var))

Working demo

MATCH 1
1.  [42-46] `3:00`
MATCH 2
1.  [74-78] `4:30`
MATCH 3
1.  [78-81] `:20`
MATCH 4
1.  [104-108]   `6:33`
MATCH 5
1.  [126-132]   `:22.34`
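
The demo output above splits `4:30:20` into `4:30` and `:20`, because the optional group in this pattern accepts only a period. Here is a minimal sketch of the difference, assuming `perl = TRUE` (not in the original call) and dropping the outer capture group, since `regmatches` returns whole matches either way. The variable names are illustrative.

```r
x <- "At 3:00 we'll meet up and leave by 4:30:20."

# Optional group accepts "." only.
period_only <- "\\d{0,2}:\\d{2}(?:\\.\\d+)?"
# Optional group accepts ":" or ".".
colon_or_period <- "\\d{0,2}:\\d{2}(?:[:.]\\d+)?"

split_times <- regmatches(x, gregexpr(period_only, x, perl = TRUE))[[1]]
whole_times <- regmatches(x, gregexpr(colon_or_period, x, perl = TRUE))[[1]]

split_times  # "3:00" "4:30" ":20"
whole_times  # "3:00" "4:30:20"
```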

Tags:

regex

r

Tyler Rinker
