
What is a regex for Twitter-like names?

I have been coding for a while but never needed regular expressions until recently. I need to write a regular expression that accepts usernames the way Twitter does. Basically, I want to allow only one underscore at a time: a name may contain more than one underscore, but they must not be consecutive. Alphanumeric characters are also allowed, but a name cannot start with a number.

Names such as

  • _myname67
  • myname67
  • my_name
  • _my_67_name_

are valid but

  • 94myname
  • __myname
  • my__name
  • my name

are not valid.

I have played with Rubular and come up with a couple of regexes:

  • /^[^0-9\s+](_?[a-z0-9]+_?)+$/i
  • /^([a-z_?])+$/i

The problem I keep running into is that these also match consecutive underscores.
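To see where they go wrong, it may help to run both patterns over the sample names. The sketch below does that in Ruby; the variable names are made up for this illustration. In the first pattern, the `_?` that ends one repetition can sit right next to the `_?` that starts the next, and the leading `[^0-9\s+]` can itself be an underscore. The second pattern is a single character class, so any run of underscores passes, and it contains no digits at all.

```ruby
# Illustrative names; the two patterns are copied from the question.
first  = /^[^0-9\s+](_?[a-z0-9]+_?)+$/i
second = /^([a-z_?])+$/i

valid   = ["_myname67", "myname67", "my_name", "_my_67_name_"]
invalid = ["94myname", "__myname", "my__name", "my name"]

# Invalid names that each pattern wrongly accepts.
first_false_positives  = invalid.select { |name| name.match?(first) }
second_false_positives = invalid.select { |name| name.match?(second) }

# Valid names that each pattern wrongly rejects.
first_false_negatives  = valid.reject { |name| name.match?(first) }
second_false_negatives = valid.reject { |name| name.match?(second) }

p first_false_positives   # => ["__myname", "my__name"]
p second_false_positives  # => ["__myname", "my__name"]
p first_false_negatives   # => []
p second_false_negatives  # => ["_myname67", "myname67", "_my_67_name_"]
```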

Igbanam asked May 16 '11 17:05



3 Answers

Edited

a = %w[
    _myname67
    myname67
    my_name
    _my_67_name_
    94myname
    __myname
    my__name
    my\ name
    m_yname
]

p a.select{|name| name =~ /\A_?[a-z]_?(?:[a-z0-9]_?)*\z/i}
# => ["_myname67", "myname67", "my_name", "_my_67_name_", "m_yname"]

You should use ( ) only for substrings that you want to capture. (?: ) is used for groupings that you do not want to capture. It is good practice to use it whenever you do not need to refer to that particular substring. It can also make the regex slightly faster, since the engine does not have to record the captured text.
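As a small illustration of the difference, the sketch below matches the same name with a capturing and a non-capturing version of the pattern above. The variable names are invented for this example.

```ruby
name = "my_name"

# Same pattern twice: once with ( ), once with (?: ).
capturing     = /\A_?[a-z]_?([a-z0-9]_?)*\z/i
non_capturing = /\A_?[a-z]_?(?:[a-z0-9]_?)*\z/i

# A repeated capturing group keeps only the text of its last iteration.
with_capture    = capturing.match(name).captures
without_capture = non_capturing.match(name).captures

p with_capture     # => ["e"]
p without_capture  # => []
```

Both versions accept and reject exactly the same names; only the bookkeeping differs.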

sawa answered Oct 09 '22 20:10


Try the following: ^([a-zA-Z](_?[a-zA-Z0-9]+)*_?|_([a-zA-Z0-9]+_?)*)$

I've separated two cases: the name starts with a letter, or it starts with an underscore. If you don't want to allow names consisting of only one symbol, replace the * with +.

maerics's solution has one problem: it doesn't match names that have _ in the second position, such as m_yname.
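A quick way to check this pattern against the names from the question is sketched below in Ruby. It swaps ^ and $ for \A and \z, because in Ruby ^ and $ match at line boundaries rather than at the ends of the string; the variable names are made up for the example.

```ruby
# Pattern from this answer, anchored with \A and \z to test the whole string.
pattern = /\A([a-zA-Z](_?[a-zA-Z0-9]+)*_?|_([a-zA-Z0-9]+_?)*)\z/

names = %w[_myname67 myname67 my_name _my_67_name_ 94myname __myname my__name m_yname]
names << "my name"

accepted = names.select { |name| name.match?(pattern) }
p accepted
# => ["_myname67", "myname67", "my_name", "_my_67_name_", "m_yname"]

# Edge cases that the underscore branch lets through.
edge_cases = ["_", "_94name"].select { |name| name.match?(pattern) }
p edge_cases  # => ["_", "_94name"]
```

Whether a lone underscore, or a digit directly after a leading underscore, should count as valid is not spelled out in the question, so treat those two results as something to decide rather than as a defect.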

Hrant Khachatrian answered Oct 09 '22 21:10


Some things are really hard to express using only regular expressions, and the result is generally write-only (that is, there's no way to read and understand it later). You can use a simpler regexp (like the two you managed to write) and check for double underscores in your Ruby code. It doesn't hurt:

if username =~ /\A[a-z_](_?[a-z0-9]+_?)+\z/i && !username.include?('__') then ...
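A minimal sketch of this two-step idea as a helper method follows. The method name `valid_username?`, the constant `LOOSE_SHAPE`, and the even simpler pattern are assumptions made for the illustration, not part of the answer. One detail worth knowing: `String#count` treats its argument as a set of characters, so `count('__')` counts every underscore rather than adjacent pairs; the sketch tests for the substring with `include?` instead.

```ruby
# Hypothetical helper; the names are made up for this sketch.
# Loose shape: letters, digits and underscores, not starting with a digit.
LOOSE_SHAPE = /\A[a-z_][a-z0-9_]*\z/i

def valid_username?(name)
  # Step 1: check the loose shape with a simple pattern.
  # Step 2: reject any two adjacent underscores in plain Ruby.
  name.match?(LOOSE_SHAPE) && !name.include?("__")
end

p valid_username?("_my_67_name_")  # => true
p valid_username?("my__name")      # => false
p "my_name".count("__")            # => 1
```

Like the other patterns on this page, the sketch accepts a lone underscore; add a length check if that should be rejected.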

Gabriel answered Oct 09 '22 21:10