
Profanities in Django comments

Tags: python, django, nlp

Since Django doesn't handle filtering profanities, does anyone have suggestions for an easy way to implement some sort of natural language processing or profanity filtering in Django?

9-bits asked Jan 16 '23 13:01


2 Answers

Django does handle filtering profanities, at least in comments.

From https://docs.djangoproject.com/en/1.4/ref/settings/#profanities-list:

PROFANITIES_LIST

Default: () (Empty tuple)

A tuple of profanities, as strings, that will be forbidden in comments when COMMENTS_ALLOW_PROFANITIES is False.

That said, you'll still need to populate that list. Some links to get started.
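
As a minimal sketch of populating the list, assuming a Django 1.4 project that uses the built-in comments app, the two settings quoted above could be set in settings.py roughly like this. The two words are placeholders, not a recommended list, and the lowercase spelling rests on an assumption about how matching is done, so check it against your Django version.

```python
# settings.py (sketch; the two words below are placeholders)

# Forbid listed words in comments rather than letting them through.
COMMENTS_ALLOW_PROFANITIES = False

# A tuple of strings, as the quoted docs describe. Entries are lowercase
# on the assumption that matching is done against lowercased comment text.
PROFANITIES_LIST = (
    'spam',
    'eggs',
)
```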

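I'd also recommend familiarizing yourself with the Scunthorpe problem, where a filter rejects innocent text because a banned string happens to appear inside a longer word or name.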
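
To make the Scunthorpe problem concrete, here is a small sketch comparing a plain substring check with a whole-word regex. The banned word, the town name "Spamford", and the helper names `naive_match` and `word_match` are all made up for illustration; this is not how any particular filter is implemented.

```python
import re

banned = ['spam']  # hypothetical one-word list


def naive_match(text):
    """Return banned words found anywhere in the text, even inside other words."""
    lowered = text.lower()
    return [word for word in banned if word in lowered]


def word_match(text):
    """Return banned words that occur as whole words only."""
    pattern = re.compile(
        r'\b(?:%s)\b' % '|'.join(map(re.escape, banned)),
        re.IGNORECASE,
    )
    return pattern.findall(text)


print(naive_match('I live in Spamford.'))  # ['spam'], a false positive
print(word_match('I live in Spamford.'))   # [], the town name is left alone
print(word_match('This is spam.'))         # ['spam']
```

The substring check flags the town name, while the word-boundary version only flags the word when it stands alone.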

zackdever answered Jan 21 '23 17:01


Personally, I say don't bother. If you create better filters, people will simply type the words differently.

But here's a simple example:

import re

bad_words = ['spam', 'eggs']

# \b is a word boundary, so only whole words match; this avoids the
# Scunthorpe problem: http://en.wikipedia.org/wiki/Scunthorpe_problem
# re.escape keeps any regex metacharacters in the words literal.
pattern = re.compile(
    r'\b(%s)\b' % '|'.join(re.escape(word) for word in bad_words),
    re.IGNORECASE,
)

some_text = "This text contains some profane words like spam and eggs. But it won't match spammy stuff."
print(some_text)
# This text contains some profane words like spam and eggs. But it won't match spammy stuff.

clean_text = pattern.sub('XXX', some_text)
print(clean_text)
# This text contains some profane words like XXX and XXX. But it won't match spammy stuff.
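
The caveat at the top of this answer can be seen with the same kind of pattern. The sketch below wraps the substitution in a hypothetical helper, `censor`, and feeds it a few made-up obfuscated spellings; only the plain spelling is caught.

```python
import re

bad_words = ['spam', 'eggs']
pattern = re.compile(
    r'\b(?:%s)\b' % '|'.join(map(re.escape, bad_words)),
    re.IGNORECASE,
)


def censor(text):
    """Replace whole-word matches of bad_words with XXX."""
    return pattern.sub('XXX', text)


samples = [
    'plain spam',
    'sp4m with a digit',
    's p a m spaced out',
    'sp.am with punctuation',
]
for sample in samples:
    print(censor(sample))
# plain XXX
# sp4m with a digit
# s p a m spaced out
# sp.am with punctuation
```

Each new spelling would need its own rule, which is why a word list tends to lag behind the people trying to get around it.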

Wolph answered Jan 21 '23 16:01