Can I match the character directly below another character in a string of arbitrary length?

Tags:

regex

Given a string of arbitrary length, a newline and another string of the same length, is it possible to write a regex that matches the character directly below a given character on the first line?

For example, what single regex pattern could capture the character below X for all these inputs:

........X..  and  .X.........  and  .....X..... etc.
...........       ...........       ...........

It seems to me that you must know the position of X in order to match the character underneath. Manually, I can work out that the pattern

X\.+\n.{8}(.)

captures the character underneath X in this example

........X..
...........

since I know that X is the 9th character on the first line. This, however, doesn't work if X is at any other position, which is the core of the problem.
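
To make the limitation concrete, here is a small check in Python (the choice of Python's re module, the helper name capture_fixed, and the letters on the second line are my assumptions, added only so the captured column is visible):

```python
import re

# The hard-coded pattern from the question: skip 8 characters on line two.
FIXED = re.compile(r"X\.+\n.{8}(.)")

def capture_fixed(text):
    """Return what the fixed-offset pattern captures, or None if no match."""
    m = FIXED.search(text)
    return m.group(1) if m else None

# X is the 9th character: the capture is the letter directly below it.
ninth = capture_fixed("........X..\nabcdefghijk")

# X is the 2nd character: the pattern still matches, but it captures
# column 9 of the second line instead of column 2.
second = capture_fixed(".X.........\nabcdefghijk")

print(ninth, second)
```

So the pattern does not fail outright when X moves; it silently returns the wrong column, which is arguably worse.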

So the question is: is it possible to create a regex pattern that matches the character underneath another character, and what would it look like?

joulsen asked Oct 31 '25 17:10


2 Answers

Assuming you know the length of the first line ahead of time, something like this should work for an X in any position:

/.*?X.{11}(.)/gs

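Replace 11 with your line length (not counting the newline). The s flag is needed so that . can also match the newline between the two lines.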

https://regex101.com/r/HOA9p1/2/
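
If the pattern is built in a host language, the length does not have to be typed in by hand. A minimal sketch in Python, assuming "\n" line endings and lines of equal length (the function name char_below_x is made up for illustration, and the leading .*? is dropped because search() already scans forward):

```python
import re

def char_below_x(text, line_length=None):
    """Capture the character directly below the first X.

    line_length is the length of one line without its newline; when it is
    omitted it is taken from the position of the first newline in text.
    """
    if line_length is None:
        line_length = text.index("\n")  # raises ValueError if no newline
    # After X, exactly line_length characters (newline included, hence
    # DOTALL) separate X from the character in the same column below.
    pattern = re.compile(r"X.{%d}(.)" % line_length, re.DOTALL)
    m = pattern.search(text)
    return m.group(1) if m else None

print(char_below_x("........X..\nabcdefghijk", 11))
print(char_below_x(".X.........\nabcdefghijk"))
print(char_below_x(".....X.....\nabcdefghijk"))
```

This is still the fixed-length approach; the length is merely computed before the regex is compiled.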

Joshua Wade answered Nov 02 '25 13:11


Without knowing the length, it's possible using a technique introduced in Vertical Regex Matching.

^(?:.(?=.*\n(\1?.)))*?X.*\n\1?(.)

Here is a demo at regex101

This works in regex flavors supporting forward references, such as PCRE, .NET and Java (not in JS, and not in Python's built-in re module, which refuses a reference to a group that is still open).
With a possessive \1?+, the match will even fail if the offset of X is beyond the end of the line below.
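
For Python specifically, the standard-library re module, as far as I can tell, rejects this pattern at compile time ("cannot refer to an open group"). The sketch below checks that and shows one possible two-pass fallback; the names compiles_in_stdlib_re and char_below_x_two_pass are hypothetical, and "\n" line endings are assumed:

```python
import re

VERTICAL = r"^(?:.(?=.*\n(\1?.)))*?X.*\n\1?(.)"

def compiles_in_stdlib_re(pattern):
    """Report whether Python's built-in re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

def char_below_x_two_pass(text):
    """Fallback: measure the offset of X first, then skip that many chars."""
    # Pass 1: find the first line containing X and capture what precedes it.
    first = re.search(r"^([^\nX]*)X[^\n]*\n", text, re.MULTILINE)
    if first is None:
        return None
    offset = len(first.group(1))
    # Pass 2: starting right after the newline, skip offset characters.
    # Without DOTALL, . cannot cross a newline, so a line that is too
    # short makes the match fail instead of spilling into later lines.
    second = re.compile(r".{%d}(.)" % offset)
    m = second.match(text, first.end())
    return m.group(1) if m else None

print(compiles_in_stdlib_re(VERTICAL))
print(char_below_x_two_pass(".X.........\nabcdefghijk"))
```

The fallback returns None when the line below is too short, which mirrors the stricter behavior described for the possessive variant.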

How it works: Each time the outer (?: non-capturing group ) is repeated, the capturing group inside the lookahead grows from itself by adding one more character from the following line, until X is matched. \1 is a reference to what the first group captured from the following line, and that is where it is finally consumed, right after the newline. The second group holds the character below X.
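
The bookkeeping can be followed in plain Python. This is not the regex itself, only an emulation of how group 1 grows, assuming two lines of equal length; trace_vertical_match is a made-up name:

```python
def trace_vertical_match(line1, line2):
    """Emulate the growth of capture group 1, one step per repetition.

    Returns (steps, below): steps lists the content of group 1 after each
    repetition of the outer group, below is the character under X
    (None if line1 contains no X).
    """
    grown = ""   # stands in for capture group 1
    steps = []
    for ch in line1:
        if ch == "X":
            break                          # the lazy loop stops at X
        # Lookahead (\1?.): previous capture plus one more char of line2.
        grown = line2[: len(grown) + 1]
        steps.append(grown)
    else:
        return steps, None                 # no X on the first line
    # After the newline, \1? consumes the grown prefix, (.) takes the next.
    below = line2[len(grown): len(grown) + 1] or None
    return steps, below

steps, below = trace_vertical_match("..X..", "abcde")
print(steps, below)
```

With X in the third column the group grows to "a", then "ab", and the character taken after that prefix is "c".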


A variant to replace the character below X with Y by use of another capturing group:
Search for ^((?:.(?=.*\n(\2?.)))*?X.*\n)\2?. and replace with $1$2Y (regex101 demo).
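
In an engine without such references, the same replacement can be approximated with the fixed-length idea from the first answer. A sketch in Python, assuming "\n" line endings and equal-length lines, with the width taken from the first line (replace_below_x is a hypothetical helper, not part of either answer):

```python
import re

def replace_below_x(text, replacement="Y"):
    """Replace the character directly below the first X."""
    width = text.index("\n")  # line length without the newline
    # Group 1 keeps X plus the width characters up to the target column;
    # the final . is the character that gets replaced.
    pattern = re.compile(r"(X.{%d})." % width, re.DOTALL)
    # A function is used as replacement so that backslashes in
    # `replacement` are not interpreted as escape sequences.
    return pattern.sub(lambda m: m.group(1) + replacement, text, count=1)

print(replace_below_x("..X..\nabcde"))
```

If there is no X, or nothing below it, the text is returned unchanged.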

bobble bubble answered Nov 02 '25 13:11


