I'm having to parse a text dump of a spreadsheet. I have a regular expression that correctly parses each line of the data, but it's rather long. It's basically just matching a certain pattern 12 or 13 times.
The pattern I want to repeat is
\s+(\w*\.*\w*);
This is the regular expression (shortened):
^\s+(\w*\.*\w*);\s+(\w*\.*\w*);\s+(\w*\.*\w*);\s+(\w*\.*\w*);\s+(\w*\.*\w*);\s+(\w*\.*\w*);
Is there a way to match a pattern a set number of times without copy-pasting like this? Each of those sections corresponds to a data column, and I need all of them. I'm using Python, by the way. Thanks!
An expression followed by '*' can be repeated any number of times, including zero. An expression followed by '+' can be repeated any number of times, but at least once. An expression followed by '?' may be repeated zero or one times only.
[0-9]? means "zero or one digits, but not two or more". [0-9]* means "zero or more digits (no limit, could be 42 of them)". Note that some languages require floats to be written with a leading 0 before the decimal point.
. (a "dot") matches any character. * means "0 or more instances of the preceding regex token".
Quantifiers specify how many instances of a character, group, or character class must be present in the input for a match to be found.
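As a quick illustration of the quantifiers described above, here is a minimal Python sketch. The helper name `matches` is made up for this example; it only wraps `re.fullmatch` so that each check applies to the whole string.

```python
import re

def matches(pattern, text):
    """Return True if the whole of `text` matches `pattern` (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

# '?' allows zero or one occurrence
print(matches(r"[0-9]?", ""))     # empty string: zero digits
print(matches(r"[0-9]?", "7"))    # one digit
print(matches(r"[0-9]?", "42"))   # two digits: too many for '?'

# '*' allows zero or more, '+' requires at least one
print(matches(r"[0-9]*", "42"))
print(matches(r"[0-9]+", ""))

# '.' matches any single character (by default, anything except a newline)
print(matches(r"a.c", "abc"))
```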
(\s+(\w*\.*\w*);){12}
The {n} is a "repeat n times" quantifier.
If you want "12 - 13" times,
(\s+(\w*\.*\w*);){12,13}
If you want "12+" (12 or more) times,
(\s+(\w*\.*\w*);){12,}
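One caveat worth knowing in Python: when a capturing group is repeated with a quantifier, the match object keeps only the text from the last repetition, so the {12} form checks the shape of the line but does not hand back all twelve columns. The sketch below shows two possible workarounds, assuming a made-up sample line with 12 columns: building the long pattern by string repetition, or pulling the columns out with `re.findall`. The variable names are illustrative only.

```python
import re

# Hypothetical sample line with 12 ';'-terminated columns.
line = "  a1;  b.2;  c3;  d4;  e5;  f6;  g7;  h8;  i9;  j10;  k11;  l12;"

# Quantified group: confirms the line has 12 columns, but the inner
# group only retains the text captured by the final repetition.
quantified = re.compile(r"^(\s+(\w*\.*\w*);){12}$")
m = quantified.match(line)
last_column = m.group(2)

# Option 1: let Python do the copy-pasting by repeating the sub-pattern,
# which yields 12 separate capturing groups.
column = r"\s+(\w*\.*\w*);"
expanded = re.compile("^" + column * 12)
columns = expanded.match(line).groups()

# Option 2: extract every column with findall (no count enforced here;
# combine with the quantified pattern above if the count matters).
columns_via_findall = re.findall(column, line)

print(last_column)
print(columns)
print(columns_via_findall)
```

If the number of columns varies between 12 and 13, the `findall` approach avoids having to build two separate expanded patterns.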