
Remove all dots but first in a string using R

Tags: regex, r, gsub

Some of my numbers contain errors and look like "59.34343.23". I know the first dot is correct, but the second one (and any after it) should be removed. How can I remove those?

I tried using gsub in R:

gsub("(?<=\\..*)\\.", "", "59.34343.23", perl=T)

or

gsub("(?<!^[^.]*)\\.", "", "59.34343.23", perl=T)

However, both calls fail with the error "invalid regular expression", even though the same patterns work in a regex tester. What is my mistake here?

user2246905 asked Dec 31 '25 16:12


1 Answer

You can use

gsub("^([^.]*\\.)|\\.", "\\1", "59.34343.23")             # default (TRE) regex engine
gsub("^([^.]*\\.)|\\.", "\\1", "59.34343.23", perl=TRUE)  # PCRE regex engine
# Both calls return "59.3434323"

See the R demo online and the regex demo.
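Since gsub is vectorized, the same pattern can be applied to a whole column of values. The following is only a minimal sketch: the helper name remove_extra_dots and the sample inputs are made up for illustration and are not part of the original question.

```r
# Hypothetical helper: keep the first dot in each string, drop all later ones.
remove_extra_dots <- function(x) {
  # Same pattern as in the answer; perl = TRUE selects the PCRE engine.
  gsub("^([^.]*\\.)|\\.", "\\1", x, perl = TRUE)
}

# Illustrative inputs: several dots, no dot, leading and trailing dots.
values <- c("59.34343.23", "1.2.3.4", "100", ".5.")
cleaned <- remove_extra_dots(values)
print(cleaned)
# [1] "59.3434323" "1.234"      "100"        ".5"

# The cleaned strings can then be converted to numbers if needed.
print(as.numeric(cleaned))
```

Strings without any dot are returned unchanged because neither alternative matches them.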

Details:

  • ^([^.]*\.) - Capturing group 1 (referred to as \1 in the replacement pattern): zero or more characters other than a dot, matched from the start of the string, followed by a . char (the first dot in the string)
  • | - or
  • \. - any other dot in the string.

The replacement, \1, refers to Group 1, and Group 1 only holds a value when the first alternative matches, that is, the text up to and including the first dot. For that match the replacement puts the same text back, so nothing changes. For every later dot the second alternative matches, Group 1 is empty, and the dot is replaced with an empty string (i.e. the second and all subsequent dots are removed).
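To see the two alternatives at work, the individual matches can be listed with gregexpr and regmatches. This is a small illustrative sketch; the variable names x and m are arbitrary.

```r
x <- "59.34343.23"

# Find every match of the pattern (PCRE engine).
m <- gregexpr("^([^.]*\\.)|\\.", x, perl = TRUE)

# First match: text up to and including the first dot (Group 1 is set).
# Second match: a later dot on its own (Group 1 is empty).
matches <- regmatches(x, m)[[1]]
print(matches)
# [1] "59." "."
```

gsub replaces the first match with itself and the second with an empty string, which gives "59.3434323".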

Wiktor Stribiżew answered Jan 02 '26 06:01


