Using a single replacement operation replace all leading tabs with spaces

Question

In my text I want to replace each leading tab with two spaces but leave the non-leading tabs alone.

For example:

a
	b
		c
	d	e
f		g

("a\n\tb\n\t\tc\n\td\te\nf\t\tg")

should turn into:

a
  b
    c
  d	e
f		g

("a\n  b\n    c\n  d\te\nf\t\tg")

In my case I could do that with multiple replacement operations, repeating as many times as the maximum nesting level or until nothing changes.
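
The multi-pass approach described above might look like the following. This is a minimal sketch, not code from the question: the function name `replace_leading_tabs_multipass` is hypothetical, and it assumes that any spaces already at the start of a line count as indentation.

```python
import re

def replace_leading_tabs_multipass(text):
    """Convert one leading tab per line per pass until a pass changes nothing."""
    while True:
        # ^( *) captures spaces already at the start of the line (including those
        # produced by earlier passes); the tab right after them becomes two spaces.
        text, count = re.subn(r"^( *)\t", r"\1  ", text, flags=re.MULTILINE)
        if count == 0:
            return text

print(replace_leading_tabs_multipass("a\n\tb\n\t\tc\n\td\te\nf\t\tg"))
```

For the sample text this loop runs three passes: two that replace tabs and a final one that finds nothing left to replace.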

But wouldn't it also be possible to do this in a single run?

I tried but didn't manage to come up with a working solution; the best I have come up with so far uses lookarounds:

re.sub(r'(^|(?<=\t))\t', '  ', a, flags=re.MULTILINE)

This makes "only" one wrong replacement (the second tab between f and g).
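
To make the failure concrete, the attempt can be run on the sample text. In this sketch the tabs are written as `\t` escapes, and the variable `a` is assumed to hold the example from above.

```python
import re

a = "a\n\tb\n\t\tc\n\td\te\nf\t\tg"
# A tab matches if it is at the start of a line OR directly follows another tab.
# The lookbehind inspects the original text, so it cannot tell a leading run of
# tabs from a run of tabs in the middle of a line.
result = re.sub(r'(^|(?<=\t))\t', '  ', a, flags=re.MULTILINE)
print(repr(result))
```

The last line comes out as `f`, one tab, two spaces, `g`: the second tab after `f` follows another tab, so the lookbehind accepts it even though it is not a leading tab.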

Now it might be that this is simply impossible to do with a regex in a single run, because the already replaced parts can't be matched again (or rather, the replacement does not happen right away) and you can't really "count" in regex. In that case I would love to see a more detailed explanation of why (as long as this doesn't shift too far into [cs.se] territory).

I am working in Python currently but this could apply to pretty much any similar regex implementation.

Wiktor Stribiżew · Accepted Answer

You can match the run of tabs at the start of each line and pass a lambda to re.sub that returns two spaces multiplied by the length of the match:

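import re

# Sample text from the question, with tabs and newlines written as escapes.
s = "a\n\tb\n\t\tc\n\td\te\nf\t\tg"
# ^\t+ matches the whole run of tabs at the start of a line (re.M makes ^ match
# at every line start); the lambda returns two spaces per matched tab.
print(re.sub(r"^\t+", lambda m: "  " * len(m.group()), s, flags=re.M))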

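
As a quick check, the same idea can be wrapped in a small function and compared against the expected output from the question. The name `expand_leading_tabs` and the `width` parameter are hypothetical additions for illustration, not part of the answer.

```python
import re

def expand_leading_tabs(text, width=2):
    """Replace each tab in a line's leading run of tabs with `width` spaces."""
    # ^\t+ can only match at the start of a line, so tabs later in the line stay.
    return re.sub(
        r"^\t+",
        lambda m: " " * (width * len(m.group())),
        text,
        flags=re.MULTILINE,
    )

source = "a\n\tb\n\t\tc\n\td\te\nf\t\tg"
expected = "a\n  b\n    c\n  d\te\nf\t\tg"
assert expand_leading_tabs(source) == expected
```

Because the callable receives the whole leading run as one match, the "counting" happens in Python rather than in the pattern, which is what lets this work in a single pass.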

Tags:

python

regex

phk
