
Regex to check if exact string exists including #

Tags: regex, php

This is a new question, as suggested by Asaph in my previous question: Regex to check if exact string exists

I am looking for a way to check whether an exact string match exists in another string, using regex or any better method suggested. I understand that you can tell regex to match a space or any other non-word character at the beginning or end of a string, but I don't know exactly how to set it up.

Search String: #t

Should Match:
String 1: Hello World, Nice to see you! #t
String 2: #T Hello World, Nice to see you!
String 3: Hello World, #t Nice to see you!

Should not Match:
String 1: Hello World, Nice to see you!
String 2: Hello World, Nice to see you! #ta
String 3: #tHello World, Nice to see you!

Edit 2: Added more string samples

Edit 1 for Serg555 and SilentGhost:
Characters allowed in search string:
#[_a-zA-Z0-9]
# is optional.

Requirements: Search String may be at any character position in the Subject. There may or may not be a white-space character before or after it. I do not want it to match if it is part of another string; such as part of a word.

For the sake of this question, I thought this pattern would do it: /\b\#t\b/gi
However, it does not return the results I expected.

I am able to find the exact matches for normal strings (strings where # isn't present) using:

/\b{$search_string}\b/gi

Additional info: this will be used in PHP 5

Asked by Jayrox, May 13 '10 at 15:05



1 Answer

All you need is:

/(?:^|\s)#t\b/i           #t is in the beginning or preceded by space.
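As a quick check, the pattern can be run against the sample strings from the question. This is a minimal sketch rather than part of the original answer: the variable names are illustrative, and preg_match is used because PHP's PCRE functions have no g modifier.

```php
<?php
// Pattern from the answer: "#t" at the start of the subject or after
// whitespace, followed by a word boundary. "i" makes it case-insensitive.
$pattern = '/(?:^|\s)#t\b/i';

// Sample strings copied from the question (array names are illustrative).
$shouldMatch = array(
    'Hello World, Nice to see you! #t',
    '#T Hello World, Nice to see you!',
    'Hello World, #t Nice to see you!',
);
$shouldNotMatch = array(
    'Hello World, Nice to see you!',
    'Hello World, Nice to see you! #ta',
    '#tHello World, Nice to see you!',
);

foreach (array_merge($shouldMatch, $shouldNotMatch) as $subject) {
    // preg_match returns 1 when the pattern matches and 0 when it does not.
    $found = preg_match($pattern, $subject) === 1;
    echo ($found ? 'match:    ' : 'no match: ') . $subject . "\n";
}
```

With these six inputs, the first three should report a match and the last three should not.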

\b matches a word boundary, which is the boundary between a word character and a non-word character. # is a non-word character, so your regex matches only when # is directly preceded by a word character, as in abc#t or ab_#t.
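The word-boundary behavior described here can be seen with a few test subjects. This is only an illustrative sketch: the subjects are made up for the demonstration, and the question's pattern is used without the g modifier, which PHP does not support.

```php
<?php
// The pattern from the question, minus the unsupported "g" modifier.
$questionPattern = '/\b#t\b/i';

// Illustrative subjects: a word character before "#", versus a space
// or the start of the string before "#".
$subjects = array('abc#t', 'ab_#t', 'Hello #t', '#t Hello');

foreach ($subjects as $subject) {
    $found = preg_match($questionPattern, $subject) === 1;
    echo str_pad($subject, 10) . ($found ? 'match' : 'no match') . "\n";
}
```

Only the first two subjects match, because \b before # needs a word character on the left of it.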

Also, # is normally not a special character in regex, so you don't need to escape it.

ETA: Your requirements are rather ambiguous: "There may or may not be a white-space character before or after it. I do not want it to match if it is part of another string; such as part of a word."

  1. No white-space character before or after? So there will be a non-white-space character?
  2. But how is it then separated from other strings? What characters are allowed?

I think you need to give a comprehensive sample of your possible input strings, because as it is, my regex works just fine.
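If the search string varies, one possible way to generalize the pattern is to escape it with preg_quote before building the regex. The sketch below is an assumption, not the asker's confirmed requirement: the function name containsToken is hypothetical, and the boundary rules (start of string or whitespace before, no word character after) are simply carried over from the pattern above.

```php
<?php
// Hypothetical helper: reports whether $needle occurs in $subject as a
// standalone token, i.e. at the start or after whitespace, and not
// followed by another word character. These rules are an assumption.
function containsToken($needle, $subject)
{
    // preg_quote escapes regex metacharacters and the "/" delimiter.
    $quoted = preg_quote($needle, '/');
    $pattern = '/(?:^|\s)' . $quoted . '(?!\w)/i';
    return preg_match($pattern, $subject) === 1;
}

var_dump(containsToken('#t', 'Hello World, #t Nice to see you!'));
var_dump(containsToken('#t', 'Hello World, Nice to see you! #ta'));
var_dump(containsToken('hello', 'Hello World, Nice to see you!'));
```

The negative lookahead (?!\w) is used instead of \b so that the trailing check does not depend on the search string ending in a word character.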

Answered by SilentGhost, Sep 22 '22 at 04:09