How can I detect laughing words in a string?

I'm trying to detect laughing words like "hahahaha" and "lolololol" in a string.

Currently I'm using the following regex:

^((.*?)|)(\b[ha]|\b[lo])(.*?)$

However, this doesn't work for my purposes. It does match laughing words, but it also matches words totally unrelated to laughter, such as 'kill', because it simply looks for any word that contains the letters l, o, h, or a.

How can I detect laughing words (like "hahaha" or "lololol") in a string?

gamehelp16 asked May 09 '13 02:05


4 Answers

In Python, I did it this way:

import re

# Replace each run of "ha"-style laughter with a placeholder token.
result = re.sub(r"\b(?:a{0,2}h{1,2}a{0,2}){2,}h?\b", "<laugh>", "hahahahha! I love laughing")
print(result)

>> <laugh>! I love laughing
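
Since the question asks about detecting laughter rather than replacing it, the same pattern can also be used with re.search. The following is only a minimal sketch: the helper name has_laugh and the choice of re.IGNORECASE are assumptions added for illustration, not part of the answer above.

```python
import re

# Same pattern as above, compiled once. IGNORECASE is an added assumption
# so that "HAHA" is treated like "haha".
LAUGH = re.compile(r"\b(?:a{0,2}h{1,2}a{0,2}){2,}h?\b", re.IGNORECASE)

def has_laugh(text):
    """Return True if text contains at least one "ha"-style laughing word."""
    return LAUGH.search(text) is not None

print(has_laugh("hahahahha! I love laughing"))
print(has_laugh("I will kill a lot"))
```

Note that this pattern only covers the "haha" family; "lol"-style words would need a second alternative.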

Ilyas answered Nov 08 '22 13:11


Try this pattern:

\b(?:a*(?:ha)+h?|(?:l+o+)+l+)\b

Or, better, if your regex flavour supports atomic groups and possessive quantifiers:

\b(?>a*+(?:ha)++h?|(?:l+o+)++l+)\b
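
As a rough illustration, the first pattern can be tried in Python with re.findall. The helper name find_laughs and the sample sentences are made up for this sketch. Python's re module only accepts atomic groups and possessive quantifiers from version 3.11 onward, so the sketch sticks to the first form.

```python
import re

# First pattern from this answer; it uses no atomic groups or possessive
# quantifiers, so it does not depend on Python 3.11+.
LAUGH = re.compile(r"\b(?:a*(?:ha)+h?|(?:l+o+)+l+)\b")

def find_laughs(text):
    """Return every laughing word found in text (illustrative helper)."""
    # The pattern has no capturing groups, so findall returns whole matches.
    return LAUGH.findall(text)

print(find_laughs("hahaha that was great, lololol"))
print(find_laughs("I will kill a lot"))
```

On the first sentence this should report the two laughing words, and on the second nothing at all.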

Casimir et Hippolyte answered Nov 08 '22 13:11


\b(a*ha+h[ha]*|o?l+o+l+[ol]*)\b

Matches:

hahahah
haha
lol
loll
loool
looooool
lolololol
lolololololo
ahaha
aaaahahahahahaha

Does not match:

looo
oool
oooo
llll
ha
l
o
lo
ol
ah
aah
aha
kill
lala
haunt
hauha
louol
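
The two lists above can be checked mechanically. The sketch below runs the pattern against every listed word in Python; the names SHOULD_MATCH, SHOULD_NOT_MATCH, and is_laugh are illustrative, and fullmatch is used on the assumption that each word is tested in isolation.

```python
import re

# Pattern from this answer.
LAUGH = re.compile(r"\b(a*ha+h[ha]*|o?l+o+l+[ol]*)\b")

SHOULD_MATCH = [
    "hahahah", "haha", "lol", "loll", "loool", "looooool",
    "lolololol", "lolololololo", "ahaha", "aaaahahahahahaha",
]
SHOULD_NOT_MATCH = [
    "looo", "oool", "oooo", "llll", "ha", "l", "o", "lo", "ol",
    "ah", "aah", "aha", "kill", "lala", "haunt", "hauha", "louol",
]

def is_laugh(word):
    """Return True if the whole word is matched by the pattern."""
    return LAUGH.fullmatch(word) is not None

# Collect every word whose result disagrees with the lists above.
failures = [w for w in SHOULD_MATCH if not is_laugh(w)]
failures += [w for w in SHOULD_NOT_MATCH if is_laugh(w)]
print(failures)
```

An empty list means the pattern agrees with both lists.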

Patashu answered Nov 08 '22 12:11


The solutions posted may be overly complicated for what you want to do. To keep it simple: if the only things you count as "laughing words" are ha, haha, etc. and lol, lolol, lololol, etc., then the following regular expression will be sufficient:

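\b((ha)+|l(ol)+)\b

This assumes a regex dialect in which \b represents a word boundary, which you seem to be using. The outer group makes both \b anchors apply to both alternatives; without it, the first alternative would also match the "ha" at the start of "haunt".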
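
Because alternation binds more loosely than \b, it can help to compare an ungrouped and a grouped spelling of this pattern side by side. The sketch below does that in Python; the constant names, the matches helper, and the sample text are illustrative only, and non-capturing groups are used purely for convenience.

```python
import re

# Ungrouped: "\b(ha)+" OR "l(ol)+\b", so each anchor guards only one side.
UNGROUPED = re.compile(r"\b(?:ha)+|l(?:ol)+\b")
# Grouped: both anchors apply to both alternatives.
GROUPED = re.compile(r"\b(?:(?:ha)+|l(?:ol)+)\b")

def matches(pattern, text):
    """Return the text of every match of pattern in text."""
    return [m.group(0) for m in pattern.finditer(text)]

text = "haha lol haunt"
print(matches(UNGROUPED, text))
print(matches(GROUPED, text))
```

The ungrouped form picks up the leading "ha" of "haunt", while the grouped form reports only the whole words "haha" and "lol".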

Cairnarvon answered Nov 08 '22 14:11