
"bad words" filter [closed]

Not very technical, but... I have to implement a bad words filter in a new site we are developing, so I need a "good" bad words list to feed my DB with. Any hints or directions? Looking around with Google I found this one, and it's a start, but nothing more.

Yes, I know that this kind of filter is easily evaded... but the client's will is the client's will!!! :-)

The site will have to filter out both English and Italian words, but for Italian I can ask my colleagues to help me with a community-built list of "parolacce" :-) - an email will do.

Thanks for any help.

asked Aug 23 '08 19:08 by ila



2 Answers

Beware of clbuttic mistakes.

"Apple made the clbuttic mistake of forcing out their visionary - I mean, look at what NeXT has been up to!"

Hmm. "clbuttic".

Google "clbuttic" - thousands of hits!

There's someone who calls his car 'clbuttic'.

There are "Clbuttic Steam Engine" message boards.

Webster's dictionary - no help.

Hmm. What can this be?

HINT: People who make buttumptions about their regex scripts, will be embarbutted when they repeat this mbuttive mistake.
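The "clbuttic" effect comes from replacing a banned string wherever it appears, even inside innocent words. A minimal PHP sketch of the difference between substring replacement and whole-word replacement might look like this; the one-entry word list and the function names `censor_naive` and `censor_whole_words` are made up for illustration:

```php
<?php
// Hypothetical one-entry word list, chosen only to reproduce the "clbuttic" effect.
$replacements = ['ass' => 'butt'];

// Naive approach: replace the substring wherever it occurs, ignoring case.
function censor_naive(string $text, array $replacements): string
{
    return str_ireplace(array_keys($replacements), array_values($replacements), $text);
}

// More careful approach: only replace whole words, using \b word boundaries.
function censor_whole_words(string $text, array $replacements): string
{
    foreach ($replacements as $bad => $good) {
        // preg_quote keeps any regex metacharacters in the word literal.
        $pattern = '/\b' . preg_quote($bad, '/') . '\b/i';
        $text = preg_replace($pattern, $good, $text);
    }
    return $text;
}

echo censor_naive('a classic assumption', $replacements), "\n";       // mangles both words
echo censor_whole_words('a classic assumption', $replacements), "\n"; // leaves them alone
```

Whole-word matching avoids these false positives, but it also misses deliberately obfuscated spellings, so it is a trade-off rather than a complete fix.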

answered Sep 25 '22 04:09 by AgentConundrum


I didn't see any language specified, but you can use this for PHP. It will generate a regex for each inserted word so that even intentional misspellings (e.g. @ss, i3itch) will also be caught.

<?php
/**
 * @author [email protected]
 **/
if($_GET['act'] == 'do')
{
    $pattern['a'] = '/[a]/'; $replace['a'] = '[a A @]';
    $pattern['b'] = '/[b]/'; $replace['b'] = '[b B I3 l3 i3]';
    $pattern['c'] = '/[c]/'; $replace['c'] = '(?:[c C (]|[k K])';
    $pattern['d'] = '/[d]/'; $replace['d'] = '[d D]';
    $pattern['e'] = '/[e]/'; $replace['e'] = '[e E 3]';
    $pattern['f'] = '/[f]/'; $replace['f'] = '(?:[f F]|[ph pH Ph PH])';
    $pattern['g'] = '/[g]/'; $replace['g'] = '[g G 6]';
    $pattern['h'] = '/[h]/'; $replace['h'] = '[h H]';
    $pattern['i'] = '/[i]/'; $replace['i'] = '[i I l ! 1]';
    $pattern['j'] = '/[j]/'; $replace['j'] = '[j J]';
    $pattern['k'] = '/[k]/'; $replace['k'] = '(?:[c C (]|[k K])';
    $pattern['l'] = '/[l]/'; $replace['l'] = '[l L 1 ! i]';
    $pattern['m'] = '/[m]/'; $replace['m'] = '[m M]';
    $pattern['n'] = '/[n]/'; $replace['n'] = '[n N]';
    $pattern['o'] = '/[o]/'; $replace['o'] = '[o O 0]';
    $pattern['p'] = '/[p]/'; $replace['p'] = '[p P]';
    $pattern['q'] = '/[q]/'; $replace['q'] = '[q Q 9]';
    $pattern['r'] = '/[r]/'; $replace['r'] = '[r R]';
    $pattern['s'] = '/[s]/'; $replace['s'] = '[s S $ 5]';
    $pattern['t'] = '/[t]/'; $replace['t'] = '[t T 7]';
    $pattern['u'] = '/[u]/'; $replace['u'] = '[u U v V]';
    $pattern['v'] = '/[v]/'; $replace['v'] = '[v V u U]';
    $pattern['w'] = '/[w]/'; $replace['w'] = '[w W vv VV]';
    $pattern['x'] = '/[x]/'; $replace['x'] = '[x X]';
    $pattern['y'] = '/[y]/'; $replace['y'] = '[y Y]';
    $pattern['z'] = '/[z]/'; $replace['z'] = '[z Z 2]';
    $word = str_split(strtolower($_POST['word']));
    $i=0;
    while($i < count($word))
    {
        if(!is_numeric($word[$i]))
        {
            if($word[$i] != ' ' || count($word[$i]) < '1')
            {
                $word[$i] = preg_replace($pattern[$word[$i]], $replace[$word[$i]], $word[$i]);
            }
        }
        $i++;
    }
    //$word = "/" . implode('', $word) . "/";
    echo implode('', $word);
}
if($_GET['act'] == 'list')
{
    $link = mysql_connect('localhost', 'username', 'password', '1');
    mysql_select_db('peoples');
    $sql = "SELECT word FROM filters";
    $result = mysql_query($sql, $link);
    $i=0;
    while($i < mysql_num_rows($result))
    {
        echo mysql_result($result, $i, 'word') . "<br />";
        $i++;
    }
    echo '<hr>';
}
?>
<html>
    <head>
        <title>RegEx Generator</title>
    </head>
    <body>
        <form action='badword.php?act=do' method='post'>
            Word: <input type='text' name='word' /><br />
            <input type='submit' value='Generate' />
        </form>
        <a href="badword.php?act=list">List Words</a>
    </body>
</html>
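One caveat with the generator above: a bracket group such as `[b B I3 l3 i3]` is a regex character class, so it matches a single character from the set (including a space) and cannot match a two-character stand-in like `i3`. A hedged sketch of the same idea using alternation groups instead is shown here. The `LEET_MAP` table and the `build_word_pattern` function are illustrative names, the table is deliberately incomplete, and the code assumes PHP 7.4 or later:

```php
<?php
// Illustrative substitution table (an assumption, not a complete leetspeak map).
// Each letter maps to the strings that may stand in for it.
const LEET_MAP = [
    'a' => ['a', '@', '4'],
    'b' => ['b', '8', 'i3', 'l3'],
    'e' => ['e', '3'],
    'i' => ['i', '1', '!', 'l'],
    'o' => ['o', '0'],
    's' => ['s', '$', '5'],
    't' => ['t', '7'],
];

// Build a case-insensitive regex matching $word and its obfuscated variants.
function build_word_pattern(string $word): string
{
    $parts = [];
    foreach (str_split(strtolower($word)) as $char) {
        // Characters without a table entry only match themselves.
        $variants = LEET_MAP[$char] ?? [$char];
        // Escape every variant so characters like "$" stay literal.
        $escaped = array_map(fn($v) => preg_quote($v, '/'), $variants);
        $parts[] = '(?:' . implode('|', $escaped) . ')';
    }
    // Lookarounds reject matches embedded in longer words ("classic"),
    // and unlike \b they still work when a variant starts with "@" or "$".
    return '/(?<![a-z0-9])' . implode('', $parts) . '(?![a-z0-9])/i';
}

$pattern = build_word_pattern('ass');
var_dump(preg_match($pattern, 'you @ss') === 1);   // obfuscated form is caught
var_dump(preg_match($pattern, 'a classic') === 1); // "classic" is not flagged
```

The generated pattern could be stored per word, as the original script intends, or built on the fly from a plain word list. Either way it only catches the substitutions listed in the table.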
answered Sep 25 '22 04:09 by UnkwnTech