What’s a good Python profanity filter library? [closed]

Like https://stackoverflow.com/questions/1521646/best-profanity-filter, but for Python — and I’m looking for libraries I can run and control myself locally, as opposed to web services.

(And whilst it’s always great to hear your fundamental objections of principle to profanity filtering, I’m not specifically looking for them here. I know profanity filtering can’t pick up every hurtful thing being said. I know swearing, in the grand scheme of things, isn’t a particularly big issue. I know you need some human input to deal with issues of content. I’d just like to find a good library, and see what use I can make of it.)

935

asked Aug 20 '10 14:08

Paul D. Waite

1 Answer

I didn't find any Python profanity library, so I made one myself.

Parameters

`filterlist`

A list of regular expressions, each matching a forbidden word. Do not use `\b` in them; it is inserted automatically depending on `inside_words`.

Example: `['bad', r'un\w+']`
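To illustrate why the word boundaries should be left out of the patterns, here is a minimal sketch of how a list like the one above can be joined into a single alternation and then wrapped in `\b`. The helper name `build_pattern` is hypothetical and only mirrors what the module's `clean` method does internally.

```python
import re

def build_pattern(filterlist, inside_words=False, ignore_case=True):
    """Join the patterns into one alternation, adding word boundaries
    unless matches inside other words are wanted."""
    # build_pattern is a hypothetical helper, not part of the module.
    template = r'(%s)' if inside_words else r'\b(%s)\b'
    flags = re.IGNORECASE if ignore_case else 0
    return re.compile(template % '|'.join(filterlist), flags)

whole_words = build_pattern(['bad', r'un\w+'])
anywhere = build_pattern(['bad', r'un\w+'], inside_words=True)

text = "Bad ungood badlike"
print(whole_words.findall(text))  # ['Bad', 'ungood']
print(anywhere.findall(text))     # ['Bad', 'ungood', 'bad']
```

If a pattern already contained its own `\b`, it would still apply when `inside_words` is enabled, which is presumably why the boundaries are added centrally.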

`ignore_case`

Default: True

Whether matching ignores letter case.

`replacements`

Default: "$@%-?!"

A string of characters from which the replacement strings will be randomly generated.

Examples: "%&$?!" or "-" etc.
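As a rough sketch of what "randomly generated" means here, the snippet below draws one character at a time from the replacement string. The standalone function name `make_clean_word` is an assumption for illustration; it is modelled on the module's `_make_clean_word` method.

```python
import random

def make_clean_word(length, replacements="$@%-?!"):
    """Build a replacement of the given length from random characters."""
    # Standalone stand-in for the module's _make_clean_word method.
    return ''.join(random.choice(replacements) for _ in range(length))

masked = make_clean_word(5)
print(len(masked))              # 5
print(make_clean_word(4, "-"))  # ----
```

With a single-character string such as "-" the output is deterministic, which can be convenient in tests.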

`complete`

Default: True

Controls whether the entire matched word is replaced or its first and last characters are kept.
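The two modes can be sketched with a small standalone function. The name `mask_word` and the fixed `fill` character are hypothetical, and fully masking words shorter than three characters is an assumption of this sketch, since such words have no interior to hide.

```python
def mask_word(word, complete=True, fill="-"):
    """Mask a word fully, or keep its first and last characters."""
    if complete or len(word) < 3:
        # Assumption: words too short to have an interior are masked fully.
        return fill * len(word)
    return word[0] + fill * (len(word) - 2) + word[-1]

print(mask_word("ungood"))                  # ------
print(mask_word("ungood", complete=False))  # u----d
```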

`inside_words`

Default: False

Controls if words are searched inside other words too. Disabling this

Module source

(examples at the end)

    """Module that provides a class that filters profanities."""

    __author__ = "leoluk"
    __version__ = '0.0.1'

    import random
    import re


    class ProfanitiesFilter(object):
        def __init__(self, filterlist, ignore_case=True, replacements="$@%-?!",
                     complete=True, inside_words=False):
            """
            Inits the profanity filter.

            filterlist -- a list of regular expressions that
            match words that are forbidden
            ignore_case -- ignore capitalization
            replacements -- string with characters to replace the forbidden word
            complete -- completely remove the word or keep the first and last char?
            inside_words -- search inside other words?

            """
            self.badwords = filterlist
            self.ignore_case = ignore_case
            self.replacements = replacements
            self.complete = complete
            self.inside_words = inside_words

        def _make_clean_word(self, length):
            """
            Generates a random replacement string of a given length
            using the chars in self.replacements.

            """
            return ''.join(random.choice(self.replacements)
                           for _ in range(length))

        def __replacer(self, match):
            value = match.group()
            # A one-character match has no separate first and last char,
            # so it is replaced completely instead of being duplicated.
            if self.complete or len(value) < 2:
                return self._make_clean_word(len(value))
            else:
                return value[0] + self._make_clean_word(len(value) - 2) + value[-1]

        def clean(self, text):
            """Cleans a string from profanity."""

            regexp_insidewords = {
                True: r'(%s)',
                False: r'\b(%s)\b',
            }

            regexp = (regexp_insidewords[self.inside_words] %
                      '|'.join(self.badwords))

            r = re.compile(regexp, re.IGNORECASE if self.ignore_case else 0)

            return r.sub(self.__replacer, text)


    if __name__ == '__main__':

        # Raw string so that \w reaches the regex engine unchanged.
        f = ProfanitiesFilter(['bad', r'un\w+'], replacements="-")
        example = "I am doing bad ungood badlike things."

        print(f.clean(example))
        # Prints "I am doing --- ------ badlike things."

        f.inside_words = True
        print(f.clean(example))
        # Prints "I am doing --- ------ ---like things."

        f.complete = False
        print(f.clean(example))
        # Prints "I am doing b-d u----d b-dlike things."
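Because every entry in `filterlist` is treated as a regular expression, plain words that contain characters such as `.` or `+` may need escaping first. A minimal sketch, assuming the literal words come from a hypothetical list `literal_words`, and matching without word boundaries because `\b` does not behave as expected next to punctuation:

```python
import re

# literal_words is a hypothetical list of plain, non-regex words.
literal_words = ["c++", "a.b"]

# re.escape turns regex metacharacters into literals.
escaped = [re.escape(w) for w in literal_words]
pattern = re.compile('(%s)' % '|'.join(escaped), re.IGNORECASE)

cleaned = pattern.sub(lambda m: '-' * len(m.group()),
                      "I like C++ and axb and a.b")
print(cleaned)  # I like --- and axb and ---
```

The escaped strings could equally be passed as the `filterlist` of the class above with `inside_words=True`.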

166

answered Oct 13 '22 17:10

leoluk
