 

R regex to selectively replace characters only at specific string positions

Tags: regex, r

I'm error checking Canadian postal codes in the format A1A1A1. A common typo is a capital O typed in place of a zero in positions 2, 4 or 6. How can I replace the O with a zero in those positions only?

I'm fairly new to regex, and this one has me stumped. Thanks so much!

Carrie Smith asked Nov 24 '25 01:11


1 Answer

You can do:

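# Sample postal codes, some with a capital O typed in place of a zero
x <- c("A0A0A0", "AOB0C0", "A0BOC0", "A0B0CO", "OOOOOO")

# Replace each uppercase letter followed by O with that letter followed by 0
gsub("([A-Z])O", "\\10", x)
# [1] "A0A0A0" "A0B0C0" "A0B0C0" "A0B0C0" "O0O0O0"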

A bit of explanation:

  • [A-Z] matches any single uppercase letter from A to Z
  • the parentheses in ([A-Z]) capture that letter so it can be referenced as \\1 in the replacement
  • ([A-Z])O matches an uppercase letter followed by a capital O
  • \\1 is the captured letter
  • \\10 is the captured letter followed by a 0 (R only recognizes backreferences \\1 through \\9, so this is not read as group 10)
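The pattern above keys on the character before the O rather than on the position itself. If you want to tie the replacement strictly to positions 2, 4 and 6, one possible sketch is to anchor at the start of the string and capture the first 1, 3 or 5 characters. The helper name fix_postal_o is made up for illustration and is not part of the answer above.

```r
# Hypothetical helper (not from the answer above): replace a capital O with a
# zero only at string positions 2, 4 and 6.
fix_postal_o <- function(x) {
  for (n in c(1, 3, 5)) {
    # "^(.{n})O" captures the first n characters, then matches an O at position n + 1
    pattern <- sprintf("^(.{%d})O", n)
    # \\1 puts the captured prefix back, followed by a literal 0
    x <- sub(pattern, "\\10", x)
  }
  x
}

x <- c("A0A0A0", "AOB0C0", "A0BOC0", "A0B0CO", "OOOOOO")
fixed <- fix_postal_o(x)
print(fixed)
# [1] "A0A0A0" "A0B0C0" "A0B0C0" "A0B0C0" "O0O0O0"
```

Because this matches on position, an O in position 1, 3 or 5 is left untouched, and a lowercase input such as "a0bOc0" becomes "a0b0c0".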
flodel answered Nov 25 '25 17:11


