
Regex match Telegram username and delete whole line in PHP

I want to match a Telegram username in message text and delete the entire line. I've tried this pattern, but the problem is that it matches emails too:

.*(@(?=.{5,64}(?:\s|$))(?![_])(?!.*[_]{2})[a-zA-Z0-9_]+(?<![_.])).*

The pattern should match all of these lines:

Hi @username how are you?

Hi @username.how are you?

😉@username.

And it should not match an email address like this:

Hi email to [email protected]

Asked by Ali Raghebi, Aug 07 '20 19:08


1 Answer

Use

.*\B@(?=\w{5,32}\b)[a-zA-Z0-9]+(?:_[a-zA-Z0-9]+)*.*


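\B before @ means there must be a non-word character or the start of the string right before the @. Because @ is itself a non-word character, \B succeeds only when the character before it is also a non-word character. In an email address the @ follows a word character, so \B fails there and the line is not matched.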
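
To see the effect of \B on the sample lines from the question, here is a minimal PHP sketch. The helper name hasTelegramMention and the constant MENTION_PATTERN are hypothetical, and user@example.com is a stand-in address, since the one in the question is obfuscated.

```php
<?php
// Pattern from the answer, wrapped in / delimiters for preg_match().
const MENTION_PATTERN = '/.*\B@(?=\w{5,32}\b)[a-zA-Z0-9]+(?:_[a-zA-Z0-9]+)*.*/';

// Hypothetical helper: true when the line contains an @username mention.
function hasTelegramMention(string $line): bool
{
    return preg_match(MENTION_PATTERN, $line) === 1;
}

$lines = [
    'Hi @username how are you?',
    'Hi @username.how are you?',
    '😉@username.',
    'Hi email to user@example.com', // stand-in address, not from the question
];

foreach ($lines as $line) {
    // The first three lines report "match"; the email line reports "no match".
    echo (hasTelegramMention($line) ? 'match    ' : 'no match ') . $line . PHP_EOL;
}
```

The email line is rejected because the "r" before the @ is a word character, so the position in front of the @ is a word boundary and \B cannot succeed there.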

EXPLANATION

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  .*                       any character except \n (0 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
  \B                       the boundary between two word chars (\w)
                           or two non-word chars (\W)
--------------------------------------------------------------------------------
  @                        '@'
--------------------------------------------------------------------------------
  (?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
    \w{5,32}                 word characters (a-z, A-Z, 0-9, _)
                             (between 5 and 32 times (matching the
                             most amount possible))
--------------------------------------------------------------------------------
    \b                       the boundary between a word char (\w)
                             and something that is not a word char
--------------------------------------------------------------------------------
  )                        end of look-ahead
--------------------------------------------------------------------------------
  [a-zA-Z0-9]+             any character of: 'a' to 'z', 'A' to 'Z',
                           '0' to '9' (1 or more times (matching the
                           most amount possible))
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (0 or more times
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    _                        '_'
--------------------------------------------------------------------------------
    [a-zA-Z0-9]+             any character of: 'a' to 'z', 'A' to
                             'Z', '0' to '9' (1 or more times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
  )*                       end of grouping
--------------------------------------------------------------------------------
  .*                       any character except \n (0 or more times
                           (matching the most amount possible))

Answered by Ryszard Czech, Nov 14 '22 23:11