
Regex to identify Reddit usernames

I am making a bot with an option to not post if the username is not a certain user.

Reddit usernames can contain both uppercase and lowercase letters, as well as numbers.

Which regex can be used to identify such a username? The format is /u/USERNAME where username can have letters of both cases and numbers, such as ExaMp13.

I have tried /u/[A-Z][a-z][0-9]

Pokestar Fan asked Dec 10 '22 10:12


2 Answers

Reddit usernames are preceded by /u/, and the valid characters in the name itself are:

  • UPPERCASE
  • lowercase
  • Digits
  • Underscore
  • Hyphen

This regex meets those criteria:

/u/[A-Za-z0-9_-]+
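
As a rough illustration of how this pattern could be used from Python, here is a minimal sketch. The names USERNAME_RE and find_mentions are made up for this example and are not part of any Reddit library.

```python
import re

# Pattern from this answer: "/u/" followed by one or more letters,
# digits, underscores, or hyphens. The hyphen is last in the class,
# so it is treated as a literal character rather than a range.
USERNAME_RE = re.compile(r"/u/[A-Za-z0-9_-]+")


def find_mentions(text):
    """Return every /u/USERNAME substring found in text (hypothetical helper)."""
    return USERNAME_RE.findall(text)


print(find_mentions("Thanks /u/ExaMp13 and /u/some-user_2!"))
# ['/u/ExaMp13', '/u/some-user_2']
```

The trailing "!" is left out of the second match because it is not in the character class.
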

MrGeek answered Dec 21 '22 00:12


Brief

Thanks for updating your post with something you've tried as this gives us an idea of what you may not be understanding (and helps us explain where you went wrong and how to fix it).

Your regex doesn't work because each character class without a quantifier matches exactly one character: it checks for one [A-Z], followed by one [a-z], then by one [0-9]. So after /u/ your regex will only match something like Be1.
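
A quick way to see this is to run the pattern from the question against a few strings. This is only a sketch; the helper name matches_whole is made up for the example.

```python
import re

# The pattern from the question: exactly one uppercase letter,
# one lowercase letter, and one digit after "/u/".
original = re.compile(r"/u/[A-Z][a-z][0-9]")


def matches_whole(pattern, text):
    """Return True only if the entire string matches the pattern."""
    return pattern.fullmatch(text) is not None


print(matches_whole(original, "/u/Be1"))      # True
print(matches_whole(original, "/u/ExaMp13"))  # False: 'a' is not a digit
print(matches_whole(original, "/u/be1"))      # False: 'b' is not uppercase
```
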

Answer

What you should use instead is a single character class such as [a-zA-Z0-9] or \w (which also matches the underscore), followed by a quantifier such as + (one or more).

For your specific problem, you should use \/u\/(\w+) (or simply /u/(\w+) in Python, where / is not a regex delimiter and does not need escaping). You can then check the first capture group against a list of users you do not want to post for.

Both regular expressions match /u/ followed by one or more word characters ([a-zA-Z0-9_]). Note that in Python 3, \w on str patterns also matches Unicode letters and digits unless the re.ASCII flag is set.
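
Checking the capture group against a list could look roughly like the sketch below. BLOCKED_USERS, MENTION_RE, and should_post are hypothetical names chosen for illustration, and the set contents are placeholders; re.ASCII is added here as an assumption so that \w is limited to [a-zA-Z0-9_].

```python
import re

# "/u/" followed by one or more word characters; the parentheses
# capture just the username. re.ASCII restricts \w to [a-zA-Z0-9_].
MENTION_RE = re.compile(r"/u/(\w+)", re.ASCII)

# Hypothetical list of users the bot should not post for.
BLOCKED_USERS = {"ExaMp13", "SomeOtherUser"}


def should_post(text):
    """Return False if text mentions any user in BLOCKED_USERS."""
    for match in MENTION_RE.finditer(text):
        if match.group(1) in BLOCKED_USERS:
            return False
    return True


print(should_post("hello /u/ExaMp13"))   # False
print(should_post("hello /u/Friendly"))  # True
```

Because \w does not match a hyphen, a name like some-user would be captured only as "some"; replacing \w+ with [\w-]+ would accept hyphens as well.
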


ctwheels answered Dec 21 '22 00:12