I have a Python application.
There is a list of 450 prohibited phrases, and there is a message received from the user. I want to check whether this message contains any of the prohibited phrases. What is the fastest way to do that?
Currently I have this code:
message = "sometext"
lista = ["a", "b", "c"]
isContaining = False
for member in lista:
    # str has no contains() method; the in operator does the substring test
    if member in message:
        isContaining = True
        break
Is there a faster way to do that? I need to handle each message (max 500 chars) in less than 1 second.
The built-in any function is made for exactly this:
>>> message = "sometext"
>>> lista = ["a","b","c"]
>>> any(a in message for a in lista)
False
>>> lista = ["a","b","e"]
>>> any(a in message for a in lista)
True
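To make the any approach concrete for the question's numbers, here is a minimal sketch. The function name contains_prohibited, the sample phrases, and the generated 450-entry list are all made up for illustration, and the lower-casing is an assumption that matching should ignore case. The timing is printed rather than quoted, since it depends on the machine.

```python
import timeit

def contains_prohibited(message, phrases):
    """Return True if any phrase occurs as a substring of message.

    Assumes the phrases are already lower-cased.
    """
    lowered = message.lower()
    return any(phrase in lowered for phrase in phrases)

# Hypothetical phrase list, lower-cased once up front.
prohibited = [p.lower() for p in ["bad phrase", "spam offer", "forbidden"]]

print(contains_prohibited("This is a SPAM offer for you", prohibited))  # True
print(contains_prohibited("A perfectly fine message", prohibited))      # False

# Rough timing for the sizes in the question: 450 phrases, 500-char message.
# None of the generated phrases match, so every phrase is checked.
phrases_450 = ["phrase%d" % i for i in range(450)]
long_message = "x" * 500
runs = 1000
elapsed = timeit.timeit(
    lambda: contains_prohibited(long_message, phrases_450), number=runs
)
print("%.6f s per message" % (elapsed / runs))
```

any stops at the first hit, so a matching message is usually checked faster than the no-match case timed here.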
Alternatively you could check the intersection of the sets:
>>> lista = ["a","b","c"]
>>> set(message) & set(lista)
set([])
>>> lista = ["a","b","e"]
>>> set(message) & set(lista)
set(['e'])
>>> set(['test','sentence'])&set(['this','is','my','sentence'])
set(['sentence'])
But the intersection only compares whole elements, so you won't be able to check for subwords:
>>> set(['test','sentence'])&set(['this is my sentence'])
set([])
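If the prohibited items are single words, the set approach can still work by splitting the message into words first. This is only a sketch under that assumption: the helper prohibited_words_in, the \w+ tokenization, and the sample words are illustrative, multi-word phrases are not handled, and the output is shown as Python 3 prints it.

```python
import re

def prohibited_words_in(message, prohibited_words):
    """Return the prohibited words that appear as whole words in message."""
    # Split the message into lower-cased word tokens, then intersect.
    words = set(re.findall(r"\w+", message.lower()))
    return words & prohibited_words

# Hypothetical single-word list, stored as a set for fast lookups.
prohibited_words = {"sentence", "test"}

print(prohibited_words_in("This is my sentence.", prohibited_words))   # {'sentence'}
print(prohibited_words_in("This is my sentences.", prohibited_words))  # set()
```

The second call shows the trade-off: "sentences" is a different token from "sentence", so whole-word matching misses it, whereas a substring test would have found it.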
Using a regex compiled from the list
Keep the memory use and build time of the expression in mind, and compile it in advance.
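import re
lista = [...]
lista_escaped = [re.escape(item) for item in lista]
# Flags go to re.compile(); the second argument of a compiled pattern's
# search() is a start position, not a flags value.
bad_match = re.compile('|'.join(lista_escaped), re.IGNORECASE)
# search() returns a match object (truthy) or None
is_bad = bad_match.search(message)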
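A self-contained version of the regex idea might look like the sketch below. The helper build_matcher and the sample phrases are made up for illustration. Sorting the phrases longest first is an optional extra, not part of the answer above; it only affects which phrase is reported when several start at the same position.

```python
import re

def build_matcher(phrases):
    """Compile one case-insensitive pattern that matches any of the phrases."""
    # re.escape keeps characters such as + or . from being read as regex syntax.
    # Longest first, so the longer phrase wins among alternatives that start
    # at the same position.
    escaped = [re.escape(p) for p in sorted(phrases, key=len, reverse=True)]
    return re.compile("|".join(escaped), re.IGNORECASE)

# Hypothetical phrase list; build the matcher once and reuse it per message.
prohibited = ["bad phrase", "spam offer", "c++"]
matcher = build_matcher(prohibited)

match = matcher.search("I love C++ a lot")
print(match.group(0) if match else None)     # C++
print(matcher.search("nothing wrong here"))  # None
```

An empty phrase list would produce an empty pattern, which matches every message, so that case needs a separate check if it can occur.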