Is it irrational to sanitize random character strings for curse words? [closed]

If you are exposing randomly generated strings, or strings with data encoded in them (product keys), is it irrational to sanitize them for curse words, to avoid offending the client in the rare case that an offensive word is generated?

Has anybody ever had a customer get offended by a randomly generated curse word? Has anybody out there ever coded logic to filter them out?

Thanks

Edit

One time, after developing a product key generation system that had customer data encoded into it, we wrote a program as a joke to see what customer input would generate funny words.

Ryu asked Jun 05 '09 15:06



4 Answers

Generate your random strings without vowels, and then you don't have to worry about curse words.

Jeffrey Hines answered Nov 01 '22 19:11


Yes, on the grounds that anyone who would be offended by something they saw in a randomly generated string can think of more things they find offensive than you can sanitize.

Don't optimize for the insane.

annakata answered Nov 01 '22 18:11


Microsoft omits the following from their product keys:

0 1 2 5 A E I O U L N S Z

I omit those from [0-9A-Z]. Once a key is generated, I match it against a list I found of the two-letter combinations most common in English, and regenerate the key if there is a match. For speed, I first edit the list of letter pairs: I cull the pairs that are already prevented because they include a character in the stripped list ('HE' can't exist if the key is generated from a character set that does not include 'E'), and convert some from 'E' to '3', as in 'H3' instead of 'HE', etc. I have also added a few of my own, like 'KK' and 'CK', for edge cases. One could also omit '3' for speed as necessary, although the more characters you omit, the fewer unique keys can be generated.

Probably not a perfect solution, but it's fast enough for my needs and prevents almost all English words from being generated, offensive or not.
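
The approach described in this answer could be sketched roughly as follows in Python. This is only an illustration under assumptions, not the answerer's actual code: `RAW_PAIRS` is a short hypothetical stand-in for the list of common English letter pairs (the real list is not given), the key length of 25 is arbitrary, and all function names are made up for the example.

```python
import secrets
import string

# Characters the answer says it strips from [0-9A-Z].
OMITTED = set("0125AEIOULNSZ")
ALPHABET = "".join(
    c for c in string.digits + string.ascii_uppercase if c not in OMITTED
)

# Hypothetical, abbreviated pair list; the answer's real list is not shown.
RAW_PAIRS = ["TH", "HE", "ER", "ND", "CK", "KK"]


def build_blocked_pairs(raw_pairs, alphabet):
    """Keep only pairs that can actually occur, plus their 'E' -> '3' variants."""
    allowed = set(alphabet)
    blocked = set()
    for pair in raw_pairs:
        for variant in (pair, pair.replace("E", "3")):
            # Pairs containing a stripped character can never be generated,
            # so they are culled to keep the lookup small.
            if all(ch in allowed for ch in variant):
                blocked.add(variant)
    return blocked


BLOCKED_PAIRS = build_blocked_pairs(RAW_PAIRS, ALPHABET)


def contains_blocked_pair(key, blocked=BLOCKED_PAIRS):
    """True if any two adjacent characters of key form a blocked pair."""
    return any(key[i:i + 2] in blocked for i in range(len(key) - 1))


def generate_key(length=25, blocked=BLOCKED_PAIRS):
    """Generate keys from the reduced alphabet until one passes the pair check."""
    while True:
        key = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if not contains_blocked_pair(key, blocked):
            return key
```

With the omissions listed above, 23 characters remain, so a key of length n has at most 23^n possible values instead of 36^n. That is the trade-off the answer mentions when it says that omitting more characters leaves fewer unique keys.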

radiumsoup answered Nov 01 '22 17:11


The simplest solution is to generate from a 'sanitized' alphabet: use a set of characters that cannot possibly form words. One of the other answers suggests hexadecimal, which is an excellent choice; otherwise, drop some critical letters from the alphabet.

Note that just dropping vowels is not going to do the job... it is all too easy to infer them from the remaining consonants.
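
To illustrate the point about inferring vowels, here is a small Python sketch that checks a generated string against the consonant "skeletons" of unwanted words. The word list uses mild placeholders and, like the function names, is purely hypothetical; a real filter would need its own list.

```python
VOWELS = set("AEIOU")

# Mild stand-ins for a real blocklist (hypothetical).
BLOCKLIST = ["DARN", "HECK", "CRUD"]


def skeleton(word):
    """Return the word with its vowels removed, e.g. 'HECK' becomes 'HCK'."""
    return "".join(ch for ch in word.upper() if ch not in VOWELS)


# Consonant-only forms a reader could still recognise.
SKELETONS = {skeleton(w) for w in BLOCKLIST}


def reads_like_blocked_word(key):
    """True if a vowel-free key still contains the skeleton of a blocked word."""
    key = key.upper()
    return any(s in key for s in SKELETONS)
```

A string such as `X7HCKQ` contains no vowels, yet `reads_like_blocked_word` flags it because `HCK` is still easy to read as a word.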

jerryjvl answered Nov 01 '22 18:11