Generate random string from regex character set

I assume there's some beautiful Pythonic way to do this, but I haven't quite figured it out yet. Basically, I'm building a testing module and would like a nice, simple way for users to define a character set to pull from. I could concatenate the various character sets defined in the string module, but that strikes me as a very unclean solution. Is there any way to get the character set that a regex represents?

Example:

import re

def foo(regex_set):
    # re.something is a placeholder for the function I'm looking for
    return re.something(re.compile(regex_set))

foo("[a-z]")
# desired result: abcdefghijklmnopqrstuvwxyz

The compile is of course optional, but in my mind that's what this function would look like.

Slater Victoroff asked Jul 08 '13 19:07


2 Answers

Paul McGuire, author of Pyparsing, has written an inverse regex parser, with which you could do this:

import invRegex
print(''.join(invRegex.invert('[a-z]')))
# abcdefghijklmnopqrstuvwxyz

If you would rather not install Pyparsing, there is also a regex inverter that uses only the standard library, with which you could write:

import inverse_regex
print(''.join(inverse_regex.ipermute('[a-z]')))
# abcdefghijklmnopqrstuvwxyz

Note: neither module can invert all regex patterns.
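
For the single-character case in the question, where the pattern is just a character class, a brute-force alternative avoids both modules: test each candidate character against the compiled pattern. The following is a minimal sketch, not part of either module; the function name charset and the ASCII-only default alphabet are assumptions made for illustration.

```python
import re

def charset(regex_set, alphabet=None):
    """Return the characters of `alphabet` that the pattern matches on their own."""
    if alphabet is None:
        # Assumption: only the 128 ASCII code points are of interest.
        alphabet = map(chr, range(128))
    pattern = re.compile(regex_set)
    # fullmatch ensures the whole one-character string matches, not just a prefix.
    return ''.join(c for c in alphabet if pattern.fullmatch(c))

print(charset('[a-z]'))
# abcdefghijklmnopqrstuvwxyz
```

This only works for patterns that match exactly one character; anything involving sequences or repetition still needs an inverter such as the ones above.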


And there are differences between the two modules:

import invRegex
import inverse_regex
print(repr(''.join(invRegex.invert('.'))))
print(repr(''.join(inverse_regex.ipermute('.'))))

yields

'0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
'0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c'

Here is another difference. This time the Pyparsing-based invRegex enumerates the larger set of matches:

x = list(invRegex.invert('[a-z][0-9]?.'))
y = list(inverse_regex.ipermute('[a-z][0-9]?.'))
print(len(x))
# 26884
print(len(y))
# 1100
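
The first count can be checked by hand. Assuming invRegex expands '.' to the 94 characters shown in its output above, and treating [0-9]? as ten digits plus the empty match, the product of the per-position choices reproduces 26884. A small sketch of that arithmetic (it does not explain the 1100 reported by inverse_regex):

```python
import string

# Alphabet invRegex produced for '.' in the output above:
# digits, then letters, then punctuation.
dot_alphabet = string.digits + string.ascii_letters + string.punctuation

# [a-z]  -> 26 choices
# [0-9]? -> 10 digits plus the empty match = 11 choices
# .      -> len(dot_alphabet) choices
expected = len(string.ascii_lowercase) * (len(string.digits) + 1) * len(dot_alphabet)

print(len(dot_alphabet), expected)
# 94 26884
```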

unutbu answered Oct 31 '22 17:10


A regex is not needed here. If you want users to select a character set, just let them pick characters. As I said in my comment, simply listing all the characters and putting checkboxes by them would be sufficient. If you want something that is more compact, or just looks cooler, you could do something like one of these:

[Image: one way of displaying the letter selection (green = selected)]
[Image: another way of displaying the letter selection (no x = selected)]
[Image: yet another way of displaying the letter selection (black bg = selected)]

Of course, if you actually use this, what you come up with will undoubtedly look better than these (and will actually contain all the letters, not just "A").

If needed, you could include buttons to invert the selection, select all, clear the selection, save the selection, or do anything else you need.
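
Whatever the widget looks like, the state behind it can be a plain set. Here is a minimal sketch of such a model, with no GUI attached; the class name CharSelection, its method names, and the uppercase default alphabet are all hypothetical choices for illustration.

```python
import string

class CharSelection:
    """Tracks which characters of a fixed alphabet are currently selected."""

    def __init__(self, alphabet=string.ascii_uppercase):
        self.alphabet = alphabet
        self.selected = set()

    def toggle(self, char):
        # Like clicking one checkbox: symmetric difference flips membership.
        self.selected ^= {char}

    def select_all(self):
        self.selected = set(self.alphabet)

    def clear(self):
        self.selected = set()

    def invert(self):
        self.selected = set(self.alphabet) - self.selected

    def as_string(self):
        # Keep alphabet order so the resulting charset is stable.
        return ''.join(c for c in self.alphabet if c in self.selected)

sel = CharSelection()
sel.toggle('A')
sel.toggle('C')
sel.invert()
print(sel.as_string())
# BDEFGHIJKLMNOPQRSTUVWXYZ
```

The string returned by as_string() is then the character set the testing module would draw from.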

AJMansfield answered Oct 31 '22 19:10