I'm trying to build a regular expression that places a limit on the input length, but not all characters count equally towards this length. I'll put the rationale at the bottom of the question. As a simple example, let's limit the maximum length to 12 and allow only `a` and `b`, but `b` counts for 3 characters.
Allowed are:

- `aa` (anything less than 12 is fine).
- `aaaaaaaaaaaa` (exactly 12 is fine).
- `aaabaaab` (6 + 2 * 3 = 12, which is fine).
- `abaaaaab` (still 6 + 2 * 3 = 12).

Disallowed is:

- `aaaaaaaaaaaaa` (13 `a`'s).
- `bbbba` (1 + 4 * 3 = 13, which is too much).
- `baaaaaaab` (7 + 2 * 3 = 13, which is too much).

I've made an attempt that gets fairly close:
^(a{0,3}|b){0,4}$
This matches up to 4 clusters, each consisting of 0-3 `a`'s or one `b`.
However, it fails to match my last positive example, `abaaaaab`: that input forces the first cluster to be the single `a` at the beginning and consumes a second cluster for the `b`, which leaves only 2 more clusters for the rest, `aaaaab`. That remainder needs 3 clusters (`aaa`, `aa`, `b`), so it is too long.
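To make the mismatch concrete, here is a small Python sketch that compares the attempted pattern against a plain weighted-length check on the examples above. The names `weighted_length`, `compare`, `WEIGHTS` and `MAX_WEIGHT` are made up for illustration, and Python's `re` module is only standing in for the regex engine Qt would use, so treat it as a sanity check rather than a reproduction of the validator.

```python
import re

MAX_WEIGHT = 12
WEIGHTS = {"a": 1, "b": 3}  # per-character cost from the example

def weighted_length(text):
    """Sum the per-character weights (raises KeyError for other characters)."""
    return sum(WEIGHTS[ch] for ch in text)

# The attempted pattern; fullmatch() supplies the ^...$ anchoring.
ATTEMPT = re.compile(r"(a{0,3}|b){0,4}")

def compare(text):
    """Return (should be accepted, is accepted by the attempted regex)."""
    wanted = weighted_length(text) <= MAX_WEIGHT
    matched = ATTEMPT.fullmatch(text) is not None
    return wanted, matched

examples = ["aa", "aaaaaaaaaaaa", "aaabaaab", "abaaaaab",
            "aaaaaaaaaaaaa", "bbbba", "baaaaaaab"]

for text in examples:
    wanted, matched = compare(text)
    note = "" if wanted == matched else "  <-- mismatch"
    print(f"{text:15} weight={weighted_length(text):2} "
          f"wanted={wanted} matched={matched}{note}")
```

Only `abaaaaab` should be flagged as a mismatch: its weight is 12, but it cannot be split into 4 or fewer clusters.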
Why do I need to do this with a regular expression?
It's for a user interface in Qt via PyQt and QML. The user can type a profile name into a text field. This profile name is URL-encoded (special characters are replaced by %XX) and then saved on the user's file system. We run into problems when the user types a lot of special characters, such as Chinese characters, which then encode to a very long file name. It turns out that at around 17 characters, this file name becomes too long for some file systems. The URL-encoding encodes as UTF-8, which uses up to 4 bytes per character, so a single character can take up to 12 characters in the file name (each byte is percent-encoded as 3 characters).
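As a rough illustration of how the encoded length grows, the sketch below uses Python's `urllib.parse.quote`. This is an assumption: the actual application may use a different encoder or a different set of safe characters, and `encoded_length` is a hypothetical helper name.

```python
from urllib.parse import quote

def encoded_length(name):
    """Length of the name after percent-encoding its UTF-8 bytes.

    safe="" is an assumption so that "/" is encoded as well; ASCII
    letters, digits and "_.-~" are still left unencoded by quote().
    """
    return len(quote(name, safe=""))

for sample in ["a", "é", "名", "😀"]:
    utf8_bytes = len(sample.encode("utf-8"))
    print(f"{sample!r}: {utf8_bytes} UTF-8 byte(s) -> "
          f"{encoded_length(sample)} encoded character(s)")
```

Under these assumptions, each character costs 1 encoded character when it is left as-is, or 3 per UTF-8 byte otherwise (so 6, 9 or 12), and those are the weights the regular expression would have to account for.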
16 characters is too short for profile names. Even some of our default names exceed that. We need a variable limit based on these special characters.
Qt normally allows you to specify a Validator to determine which values are acceptable in a text box. We tried implementing such a validator, but that resulted in a segfault upstream, due to a bug in PyQt. It can't seem to handle custom Validator implementations at the moment. However, PyQt also exposes three built-in validators. Two apply only to numbers. The third is a regex validator that takes a regular expression matching all valid strings. Hence the need for this regular expression.