I assume there's some beautiful Pythonic way to do this, but I haven't quite figured it out yet. Basically I'm looking to create a testing module and would like a nice simple way for users to define a character set to pull from. I could potentially concatenate a list of the various charsets associated with string, but that strikes me as a very unclean solution. Is there any way to get the charset that the regex represents?
Example:
def foo(regex_set):
re.something(re.compile(regex_set))
foo("[a-z]")
>>> abcdefghijklmnopqrstuvwxyz
The compile is of course optional, but in my mind that's what this function would look like.
Paul McGuire, author of Pyparsing, has written an inverse regex parser, with which you could do this:
import invRegex
print(''.join(invRegex.invert('[a-z]')))
# abcdefghijklmnopqrstuvwxyz
If you do not want to install Pyparsing, there is also a regex inverter that uses only modules from the standard library with which you could write:
import inverse_regex
print(''.join(inverse_regex.ipermute('[a-z]')))
# abcdefghijklmnopqrstuvwxyz
Note: neither module can invert all regex patterns.
And there are differences between the two modules:
import invRegex
import inverse_regex
print(repr(''.join(invRegex.invert('.'))))
print(repr(''.join(inverse_regex.ipermute('.'))))
yields
'0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
'0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c'
Here is another difference, this time pyparsing enumerates a larger set of matches:
x = list(invRegex.invert('[a-z][0-9]?.'))
y = list(inverse_regex.ipermute('[a-z][0-9]?.'))
print(len(x))
# 26884
print(len(y))
# 1100
A regex is not needed here. If you want to have users select a character set, let them just pick characters. As I said in my comment, simply listing all the characters and putting checkboxes by them would be sufficent. If you want something that is more compact, or just looks cooler, you could do something like one of these:
Of course, if you actually use this, what you come up with will undoubtedly look better than these (And they will also actually have all the letters in them, not just "A").
If you need, you could include a button to invert the selection, select all, clear selection, save selection, or anything else you need to do.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With