Is it possible to increment numbers using regex substitution?

Tags: regex

Is it possible to increment numbers using regex substitution? Not using evaluated/function-based substitution, of course.

This question was inspired by another one, where the asker wanted to increment numbers in a text editor. There are probably more text editors that support regex substitution than ones that support full-on scripting, so a regex might be convenient to float around, if one exists.

Also, often I've learned neat things from clever solutions to practically useless problems, so I'm curious.

Assume we're only talking about non-negative decimal integers, i.e. \d+.

  • Is it possible in a single substitution? Or, a finite number of substitutions?

  • If not, is it at least possible given an upper bound, e.g. numbers up to 9999?

Of course it's doable given a while-loop (substituting while matched), but we're going for a loopless solution here.

asked Oct 17 '12 18:10 by slackwing



1 Answer

This question amused me because of an implementation I did earlier. My solution happens to take two substitutions, so I'll post it.

My environment is Solaris. Full example:

echo "0 1 2 3 7 8 9 10 19 99 109 199 909 999 1099 1909" | perl -pe 's/\b([0-9]+)\b/0$1~01234567890/g' | perl -pe 's/\b0(?!9*~)|([0-9])(?=9*~[0-9]*?\1([0-9]))|~[0-9]*/$2/g'

Output:

1 2 3 4 8 9 10 11 20 100 110 200 910 1000 1100 1910

Pulling it apart for explanation:

s/\b([0-9]+)\b/0$1~01234567890/g 

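For each number (#), replace it with 0#~01234567890. The leading 0 is there in case a carry adds a digit (9 becoming 10). The 01234567890 block is a lookup table for incrementing: each digit is followed by its successor, with 9 wrapping around to 0. The tagged text for "9 10" is: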

09~01234567890 010~01234567890 

The individual pieces of the next regex can be described separately; they are joined with alternation (|) so that they run as a single substitution:

s/\b0(?!9*~)/$2/g 

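Matches the leading "0" of every number that does not need the extra digit (that is, every number that is not all 9s) and discards it. This alternative captures nothing, so $2 is empty and the 0 is simply deleted.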

s/([0-9])(?=9*~[0-9]*?\1([0-9]))/$2/g 

(?=...) is a positive lookahead and \1 is a backreference to capture group #1. So this matches every digit that is followed only by 9s (possibly none) up to the '~' mark, i.e. the last digit plus every digit the carry reaches. The lookahead then finds that same digit in the lookup table and captures the digit after it as $2, which becomes the replacement. Thus "09~" becomes "19~" and then "10~" as the regex engine moves through the number.

s/~[0-9]*/$2/g 

This alternative deletes the '~' and the lookup table after it. Again nothing is captured, so $2 is empty.

answered Sep 22 '22 15:09 by DKATyler