
Is there a more sensible way of writing this regular expression?

Tags: python, regex

The following regular expression is written in the Python dialect:

^(    )*#(\s+\S(.*\S)?)?$

Can anyone see a better way to write this? For those not sure what it says:

  • It matches an entire line.
  • The line starts with any multiple of four spaces (including none).
  • A hash sign (#) follows those spaces.
  • Either nothing or the following comes after the hash sign:
    • At least one whitespace character follows the hash sign.
    • One non-whitespace character comes after those.
    • Either nothing or the following comes next:
      • Any number of characters follow.
      • The last character is a non-whitespace character.
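The bullet list above can be attached to the pattern itself with Python's re.VERBOSE flag. This is only a sketch: the names ORIGINAL, ANNOTATED and SAMPLES are made up for illustration, and because VERBOSE mode ignores bare spaces, the four literal spaces are written as [ ]{4} and the hash sign is escaped.

```python
import re

# The pattern exactly as posted in the question.
ORIGINAL = re.compile(r"^(    )*#(\s+\S(.*\S)?)?$")

# The same pattern, annotated line by line (illustrative rewrite, not the asker's code).
ANNOTATED = re.compile(
    r"""
    ^               # match from the start of the line
    ([ ]{4})*       # any multiple of four spaces, including none
    \#              # the hash sign, escaped so it does not start a comment
    (               # either nothing, or:
        \s+         #   at least one whitespace character
        \S          #   one non-whitespace character
        (.*\S)?     #   optionally any characters, the last one non-whitespace
    )?
    $               # match up to the end of the line
    """,
    re.VERBOSE,
)

# Hypothetical sample lines, mapped to whether they should match.
SAMPLES = {
    "#": True,
    "    #": True,
    "    # comment": True,
    "        # two levels": True,
    "#  x": True,
    "  # two spaces": False,   # indentation is not a multiple of four
    "#comment": False,         # no whitespace after the hash sign
    "# trailing ": False,      # last character is whitespace
}

if __name__ == "__main__":
    for line, expected in SAMPLES.items():
        print(repr(line), bool(ANNOTATED.match(line)), expected)
```

Both compiled patterns keep the same three capturing groups, so code that reads the group numbers should not need to change.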

Can it be simplified any further than this?

^(    )*#(\s.*\S)?$
Noctis Skytower asked Jan 20 '26 00:01



1 Answer

One way to rewrite the regexp for readability, so that nobody has to count consecutive spaces:

^( {4})*#(\s.*\S)?$

As @Noctis put it, this also shortens the compiler's debug output.
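For anyone who wants to look at that debug output, the sketch below captures what re.compile prints when given the re.DEBUG flag. The helper debug_dump and the constant names are hypothetical, and the exact format and length of the dump depend on the Python version, so treat it as a way to compare the two spellings rather than as a fixed result.

```python
import contextlib
import io
import re

SPELLED_OUT = r"^(    )*#(\s.*\S)?$"   # four literal spaces
COMPACT = r"^( {4})*#(\s.*\S)?$"       # one space with a {4} repeat count


def debug_dump(pattern: str) -> str:
    """Return the text that re.DEBUG prints while compiling `pattern`."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        re.compile(pattern, re.DEBUG)
    return buffer.getvalue()


if __name__ == "__main__":
    for name, pattern in (("spelled out", SPELLED_OUT), ("compact", COMPACT)):
        dump = debug_dump(pattern)
        print(f"--- {name}: {len(dump.splitlines())} lines of debug output")
        print(dump)
```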

Procedure to get (\s.*\S)? from (\s+\S(.*\S)?)?

\s+ => \s(\s)*

\S(.*\S)? => \S or \S.*\S => (\S.*)?\S

(\s+\S(.*\S)?)? => (\s(\s)*(\S.*)?\S)? => (\s.*\S)?, because (\s)*(\S.*)? matches any run of characters on the line and so can be replaced by .*
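The derivation can be sanity-checked by brute force. The sketch below assumes a small alphabet (space, tab, hash, one letter) and a maximum length of 7, both chosen arbitrarily, and reports every string on which the three patterns disagree. It is a spot check, not a proof.

```python
import itertools
import re

ORIGINAL = re.compile(r"^(    )*#(\s+\S(.*\S)?)?$")
SIMPLIFIED = re.compile(r"^(    )*#(\s.*\S)?$")
COUNTED = re.compile(r"^( {4})*#(\s.*\S)?$")

ALPHABET = " \t#a"   # assumption: enough variety to exercise \s, \S and the hash
MAX_LENGTH = 7       # assumption: long enough for four spaces, '#', ' ', 'a'


def disagreements():
    """Return every generated line on which the three patterns differ."""
    found = []
    for length in range(MAX_LENGTH + 1):
        for chars in itertools.product(ALPHABET, repeat=length):
            line = "".join(chars)
            verdicts = {
                bool(p.match(line)) for p in (ORIGINAL, SIMPLIFIED, COUNTED)
            }
            if len(verdicts) > 1:
                found.append(line)
    return found


# Caveat: \s can match a newline but . cannot (without re.DOTALL), so the
# patterns may differ on input that is not a single line.
MULTILINE_INPUT = "#\n\nx"

if __name__ == "__main__":
    print("disagreements on single-line input:", disagreements())
    print("original matches multi-line input:", bool(ORIGINAL.match(MULTILINE_INPUT)))
    print("simplified matches multi-line input:", bool(SIMPLIFIED.match(MULTILINE_INPUT)))
```

The replacement of (\s)*(\S.*)? by .* relies on the input being a single line, which is what the question describes. If a string with embedded newlines can reach the pattern, the original and the simplified form are not interchangeable.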

RAM answered Jan 22 '26 17:01



