I'm wondering how to find a set of all matches to a given regex with a finite number of matches.
For example:
All of these example you can assume they start with ^
and end with $
`hello?` -> (hell, hello)
`[1-9][0-9]{0,3}` -> (1,2,3 ..., 9998, 9999)
`My (cat|dog) is awesome!` -> (My cat is awesome!, My dog is awesome!)
`1{1,10}` -> (1,11, ..., 111111111, 1111111111)
`1*` -> //error
`1+` -> //error
`(1|11){2}` -> (1,11,111,1111) //notice how it doesn't repeat any of the possibilities
I'd also be interested if there was a way of retrieving count a unique a solutions to the regex or if there is a way to determine if the regex has a finite solutions.
It would be nice if the algorithm could parse any regex, but a powerful enough subset of the regex would be fine.
I'm interested in a PHP solution to this problem, but other languages would also be fine.
EDIT:
I've learned in my Formal Theory class about DFA which can be used to implement regex (and other regular languages). If I could transform the regex into a DFA the solution seems fairly straight forward to me, but that transformation seems rather tricky to me.
EDIT 2:
Thanks for all the suggestions, see my post about the public github project I'm working on to "answer" this question.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With