
How to use python regex to match words beginning with hash and question mark?

Tags:

regex

This should be easy. The regex below works fine for finding words that begin with specific characters, but I can't get it to match words beginning with a hash or a question mark.

This works and matches words beginning with a:

r = re.compile(r"\b([a])(\w+)\b")

But none of these match. I tried:

r = re.compile(r"\b([#?])(\w+)\b")
r = re.compile(r"\b([\#\?])(\w+)\b")
r = re.compile( r"([#\?][\w]+)?")

I even tried just matching hashes:

r = re.compile( r"([#][\w]+)?")
r = re.compile( r"([/#][\w]+)?")

text = "this is one #tag and this is ?another tag"
items = r.findall(text)

expecting to get:

[('#', 'tag'), ('?', 'another')]
PhoebeB asked Jan 03 '10 10:01


2 Answers

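\b matches the zero-width position between a \w character and a \W character (in either order, or at the start or end of the string next to a \w character). # and ? are \W characters, and in your text each one follows a space, which is also \W, so there is no \b in front of them.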

In other words: remove the first word boundary.

Not:

r = re.compile(r"\b([#?])(\w+)\b")

but

r = re.compile(r"([#?])(\w+)\b")
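To see why the leading \b gets in the way, here is a minimal sketch (the variable names are mine, not from the question). It marks every position where \b matches in a short string, then runs both patterns on the sample text. The lookbehind variant at the end is an assumption about what may be wanted if a # or ? in the middle of a word, as in a#b, should be ignored; it is not part of the fix above.

```python
import re

text = "this is one #tag and this is ?another tag"

# Mark every position where \b matches. There is a boundary between
# "#" and "t", but none between the space and "#" (both are \W).
marked = re.sub(r"\b", "|", "one #tag")

# Original pattern: the leading \b cannot match in front of "#" or "?"
# in this text, because the character before each of them is a space.
with_boundary = re.compile(r"\b([#?])(\w+)\b").findall(text)

# Fixed pattern: leading \b removed.
without_boundary = re.compile(r"([#?])(\w+)\b").findall(text)

# Stricter variant (an assumption, see above): only accept "#" or "?"
# when it is not preceded by a word character.
strict = re.compile(r"(?<!\w)([#?])(\w+)\b").findall("a#b #c")

print(marked)
print(with_boundary)
print(without_boundary)
print(strict)
```

The pattern without the leading \b returns the (marker, word) tuples the question asks for, while the original pattern returns an empty list.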
Bart Kiers answered Oct 01 '22 21:10


You are using Python, so regex should be the last thing that comes to mind. Plain string methods are enough here:

>>> text = "this is one #tag and this is ?another tag"
>>> for word in text.split():
...   if word.startswith(("#", "?")):
...     print(word)
...
#tag
?another
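If the (marker, word) tuples from the question are wanted, the same split-based idea can be sketched roughly as follows. This is only an illustration: unlike \w+, it keeps any trailing punctuation attached to the word (for example "#tag," would give "tag,"), and a bare "#" would give an empty word.

```python
text = "this is one #tag and this is ?another tag"

# Split on whitespace, keep the words that start with "#" or "?",
# and separate the leading marker from the rest of the word.
items = [(word[0], word[1:])
         for word in text.split()
         if word.startswith(("#", "?"))]

print(items)
```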
ghostdog74 answered Oct 01 '22 19:10