While I was parsing the Snort regex set I found a very odd character class syntax, like [\x80-t] or [\x01-t\x0B\x0C\x0E-t\x80-t], and I can't figure out (really no clue) what -t means. I don't even know if it's standard PCRE or a sort of Snort extension.
Here are some regular expressions that contain these character classes:
/\x3d\x00\x12\x00..........(.[\x80-t]|...[\x80-t])/smiR
/^To\x3A[^\r\n]+[\x01-t\x0B\x0C\x0E-t\x80-t]/smi
PS: please note that \x80-t is not even a valid range in the standard way because character t is \x74.
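The point about the range being out of order can be checked quickly. This is only a sketch using Python's `re` module, which is not PCRE, so it illustrates the general rule (a range whose start is greater than its end is rejected) rather than Snort's actual behaviour. The helper name `compiles` is made up for this example.

```python
import re

def compiles(pattern: bytes) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# 't' is 0x74, which is below 0x80, so \x80-t is a reversed range.
print(hex(ord("t")))            # 0x74
print(compiles(rb"[\x80-t]"))   # False: reversed range is rejected
print(compiles(rb"[t-\x80]"))   # True: same endpoints in ascending order
```

PCRE reports the same kind of problem as "range out of order in character class", which is why the rules look so strange as written.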
This could refer to a different character encoding, one in which t has a code larger than \x80, so that \x80-t becomes a valid ascending range.
Take EBCDIC, for example (see here for a reference).
(But I, too, have no clue why somebody would want to write it that way.)
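To make the encoding idea concrete, here is a small sketch that looks up the code of t in one EBCDIC variant. It assumes Python's built-in `cp037` codec as a stand-in for "EBCDIC"; nothing in the question says that Snort rules use this or any other EBCDIC code page.

```python
# Code of 't' in ASCII versus one EBCDIC code page (cp037, an assumption).
t_ascii = "t".encode("ascii")[0]
t_ebcdic = "t".encode("cp037")[0]

print(hex(t_ascii))       # 0x74
print(hex(t_ebcdic))      # 0xa3
print(t_ebcdic > 0x80)    # True: \x80-t would be an ascending range here
```

So in such an encoding the range would at least be well-formed, although that alone does not explain why the rules were written this way.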
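For ASCII I have a wild guess: if -t means "up to the next token minus 1" or, when it is the last item in the class, "up to the end of the allowed characters", the second regex would read like this:
To:(one or more characters that are not a newline)(one more character from the class)
Under that reading, [\x01-t\x0B\x0C\x0E-t\x80-t] expands to [\x01-\x0A\x0B\x0C\x0E-\x7F\x80-\xFF]. That is every byte except \x00 and \r, so it is close to [^\r\n] but still allows \n.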
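The guessed rule is easy to apply mechanically, which helps to see exactly which bytes the class would leave out. The sketch below implements only the guess from this answer ("-t" means "up to the next token minus 1", or up to \xFF when it comes last); it is not a documented PCRE or Snort feature, and the function name `expand` and the token format are invented for the example.

```python
def expand(tokens, max_byte=0xFF):
    """Expand (start, open_ended) tokens under the guessed '-t' rule."""
    covered = set()
    for i, (start, open_ended) in enumerate(tokens):
        if not open_ended:
            covered.add(start)          # plain single byte such as \x0B
            continue
        if i + 1 < len(tokens):
            end = tokens[i + 1][0] - 1  # "up to the next token minus 1"
        else:
            end = max_byte              # last item: "up to the end"
        covered.update(range(start, end + 1))
    return covered

# [\x01-t\x0B\x0C\x0E-t\x80-t] written as (start byte, followed by "-t"?)
tokens = [(0x01, True), (0x0B, False), (0x0C, False), (0x0E, True), (0x80, True)]

excluded = sorted(set(range(256)) - expand(tokens))
print([hex(b) for b in excluded])   # ['0x0', '0xd']
```

Only \x00 and \x0D (\r) fall outside the class, so under this guess \n (\x0A) is still inside it.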
If one applies that to (.[Ç-t]|...[Ç-t]), it would match any character above 7-bit ASCII, which could also cover all of Unicode (besides the first 128 characters).
(That being said, I still have no clue why somebody would write it like this, but at least that's a coherent explanation besides "It's a bug".)
Maybe helpful: what do the regexes you posted mean if one writes out the \xYY escapes? In ASCII:
/=\NULL\DEVICE_CONTROL_2\NULL.{10}(.[Ç-t]|...[Ç-t])/smiR
/^To\:[^\r\n]+[\START_OF_HEADING-t\VERTICALTAB\FORMFEED\SHIFTOUT-t\Ç-t]/smi
Looking into \x12 (Device Control 2) could help, because it won't show up in text but may show up in network traffic.
The second regex matches lines that begin with To: (case-insensitive) followed by at least one character that isn't a line feed or carriage return. Since this is a greedy match, I'd expect \r or \n to be the only possible terminating matches in the [\x01-t\x0B\x0C\x0E-t\x80-t] character class. Note: \r is equivalent to \x0D and \n is equivalent to \x0A. Not sure what -t means but let's pretend it was - instead. Then the character class would be [\x01-\x0B\x0C\x0E-\x80-], which is still a bit convoluted but would make a little bit more sense - i.e. allowing a \n as a terminating character but not \r.
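The "pretend it was -" reading can be tried out directly. This is a sketch with Python's `re` on bytes, used here only as a convenient stand-in for PCRE; the class is copied from the paragraph above.

```python
import re

# The class with every "-t" replaced by "-"; the trailing "-" is a literal.
cls = re.compile(rb"[\x01-\x0B\x0C\x0E-\x80-]")

print(bool(cls.fullmatch(b"\n")))    # True:  \x0A lies inside \x01-\x0B
print(bool(cls.fullmatch(b"\r")))    # False: \x0D is skipped
print(bool(cls.fullmatch(b"\x00")))  # False: NUL is below \x01
```

So with this substitution the class accepts \n but not \r as the character after the greedy [^\r\n]+ run.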
This is a very long shot but is there any chance this could be some kind of search-and-replace gone wrong?! (Guess this can probably be quickly discounted if there are other regexes that have normal ranges without the t.)
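One way to test the search-and-replace idea is to count how many ranges in the rule set end in -t and how many are ordinary \xNN-\xNN ranges. The sketch below runs only on the two regexes quoted in the question; the function name `count_ranges` is made up, and for a real check the list would have to be filled from the full Snort rule files.

```python
import re

# Ranges that end in a literal "t" versus ordinary hex-to-hex ranges.
T_RANGE = re.compile(r"\\x[0-9A-Fa-f]{2}-t")
HEX_RANGE = re.compile(r"\\x[0-9A-Fa-f]{2}-\\x[0-9A-Fa-f]{2}")

def count_ranges(rules):
    """Return (number of '-t' ranges, number of normal hex ranges)."""
    t_count = sum(len(T_RANGE.findall(rule)) for rule in rules)
    hex_count = sum(len(HEX_RANGE.findall(rule)) for rule in rules)
    return t_count, hex_count

rules = [
    r"/\x3d\x00\x12\x00..........(.[\x80-t]|...[\x80-t])/smiR",
    r"/^To\x3A[^\r\n]+[\x01-t\x0B\x0C\x0E-t\x80-t]/smi",
]

print(count_ranges(rules))   # (5, 0)
```

If the full rule set also showed zero normal ranges, a replacement gone wrong would look more plausible; a healthy number of normal ranges would speak against it.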