
Regex to replace spaces with tabs at the start of the line

I'd like to be able to fix a text file's tab/space indentation.

Currently each line has spaces in random locations for some reason.

For example:

space tab if -> tab if

space tab space tab if -> tab tab if

tab tab space if -> tab tab if

etc.

It should not affect anything after the first word, so only the indentation is changed. For example, tab space if space boolean should become tab if space boolean, not tab if tab boolean.

The regex command should keep the correct number of tabs and just remove the spaces. If there are 4 spaces in a row, they should be converted to a tab instead.

Thank you for your help. If you could also explain how your regex works it would be very much appreciated as I'm trying to learn how to do my own regex instead of always asking others to do it.

If you need any more information or details, please ask and I'll respond as quickly as I can.


I can accomplish this for a single case at a time like so:

For spaces first:

Find: space*if
Replace: if

This only works for lines with no tabs and where the first word is if, so I would have to do this for each line's starting word.

Then I would repeat with space*\tif.

Looks like I can match a word without capturing by doing (?:[A-Za-z]), so I can just swap out the if for this and it'll work better.
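One thing worth checking with that idea: a non-capturing group is still part of the match, so whatever it matches gets replaced along with the whitespace. A lookahead, (?=[A-Za-z]), only tests that a letter follows without consuming it. The sketch below illustrates the difference in Python; the function names are made up for the example.

```python
import re

def strip_with_group(line: str) -> str:
    # (?:...) does not capture, but the letter it matches is still
    # part of the match, so it is replaced along with the whitespace.
    return re.sub(r"^ *\t(?:[A-Za-z])", "\t", line)

def strip_with_lookahead(line: str) -> str:
    # (?=...) only asserts that a letter comes next; nothing is consumed,
    # so the first word is left intact.
    return re.sub(r"^ *\t(?=[A-Za-z])", "\t", line)

print(repr(strip_with_group("  \tif x")))      # the "i" of "if" is lost
print(repr(strip_with_lookahead("  \tif x")))  # the word survives
```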

Aequitas asked Oct 09 '15 00:10



1 Answer

You could probably do this in one step, but I'm more partial to simple approaches.
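For comparison, a one-step version might look like the following Python sketch. It is not part of the two-replace approach; it assumes a language where the replacement can be a function, and fix_indent and rebuild are hypothetical names chosen for the example.

```python
import re

def fix_indent(text: str) -> str:
    """Rewrite only the leading whitespace of every line."""
    def rebuild(match: re.Match) -> str:
        indent = match.group(0)
        # Each run of 4 consecutive spaces becomes one tab...
        indent = indent.replace("    ", "\t")
        # ...and any spaces still left over are dropped.
        return indent.replace(" ", "")

    # ^[ \t]+ matches the run of spaces and tabs at the start of each line;
    # anything after the first non-whitespace character is never touched.
    return re.sub(r"^[ \t]+", rebuild, text, flags=re.MULTILINE)

print(repr(fix_indent(" \t \tif a b")))
```

Because the pattern stops at the first character that is not a space or a tab, the space in "if boolean" is left alone.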

Translate each run of 4 spaces to a tab first. The first line is the pattern to match, the second is the replacement.

^(\s*)[ ]{4}(\s*)
$1\t$2

Then replace all remaining single spaces with nothing.

^(\t*)[ ]+
$1

You don't need the square brackets in this case, but without them it's a little hard to be sure that there's a space, even with SO's code formatting.

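The first pattern anchors at the start of the line with ^, then matches any amount of whitespace (including tabs) and stores it in a capture group, referenced later as $1, with (\s*). The middle matches exactly four spaces with [ ]{4}. The last part is a second capture group, $2, in case there are tabs or more spaces on that side, too. Because the pattern is anchored at ^, each pass converts only one run of four spaces per line, so repeat the replace until it finds no more matches.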
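To try the first replace outside an editor, here is a minimal Python sketch. It uses the same pattern, assumes multiline mode so that ^ matches at every line start, and loops until the text stops changing, since the ^ anchor allows only one match per line in each pass. spaces_to_tabs is a hypothetical helper name.

```python
import re

# Same pattern as above; MULTILINE makes ^ match at the start of every line.
FOUR_SPACES = re.compile(r"^(\s*)[ ]{4}(\s*)", re.MULTILINE)

def one_pass(text: str) -> str:
    # Equivalent to a single "Replace All" with $1\t$2.
    return FOUR_SPACES.sub(lambda m: m.group(1) + "\t" + m.group(2), text)

def spaces_to_tabs(text: str) -> str:
    # Repeat the replace until a pass changes nothing.
    while True:
        new_text = one_pass(text)
        if new_text == text:
            return text
        text = new_text

print(repr(one_pass("        if x")))        # one pass converts only one run
print(repr(spaces_to_tabs("        if x")))  # both runs converted
```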

The second pattern is supposed to find all the remaining spaces, so it just looks for 0 or more tabs at the start of the line, puts them in a capture group, and then matches the spaces that follow. The replacement keeps the tabs and drops the spaces. It is also anchored at ^, so a line with spaces in several places (tab space tab space if) needs the replace repeated until nothing matches.
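The second replace can be checked the same way. This Python sketch runs it against the three examples from the question, again assuming multiline mode and repeating until the text is stable; drop_leading_spaces is a hypothetical helper name.

```python
import re

# Same pattern as above; MULTILINE makes ^ match at the start of every line.
LEADING_SPACES = re.compile(r"^(\t*)[ ]+", re.MULTILINE)

def drop_leading_spaces(text: str) -> str:
    # Keep the captured tabs, drop the spaces after them, and repeat
    # because each pass removes only one run of spaces per line.
    while True:
        new_text = LEADING_SPACES.sub(lambda m: m.group(1), text)
        if new_text == text:
            return text
        text = new_text

for sample in [" \tif", " \t \tif", "\t\t if", "\t if boolean"]:
    print(repr(sample), "->", repr(drop_leading_spaces(sample)))
```

The space in "if boolean" survives because the pattern can only match tabs and spaces that start at the beginning of the line.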

Jeremy Fortune answered Sep 29 '22 01:09
