If you are exposing randomly generated strings, or strings with data encoded in them (product keys), is it irrational to sanitize them for curse words to avoid the client possibly getting offended in the rare case that an offensive word is generated?
Has anybody ever had a customer get offended by a randomly generated curse word? Has anybody out there ever coded logic to filter them out?
Thanks
Edit
One time, after developing a product key generation system which had customer data encoded into it, we wrote a program as a joke to see what customer input would generate funny words.
Generate random strings without vowels, and then you don't have to worry about curse words.
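A minimal sketch of the vowel-free approach, assuming uppercase ASCII letters only. The names `CONSONANTS` and `random_consonants` are made up for illustration, and `secrets` is used on the assumption that the strings should be hard to guess.

```python
import secrets
import string

# Assumption: only the five plain vowels are dropped; 'Y' is kept.
VOWELS = set("AEIOU")
CONSONANTS = [c for c in string.ascii_uppercase if c not in VOWELS]


def random_consonants(length=12):
    """Return a random string drawn only from consonants."""
    return "".join(secrets.choice(CONSONANTS) for _ in range(length))


if __name__ == "__main__":
    print(random_consonants())
```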
Yes, on the grounds that anyone who would be offended by something they saw in a randomly generated string can think of more things they find offensive than you can sanitize.
Don't optimize for the insane.
Microsoft omits the following from their product keys:
0 1 2 5 A E I O U L N S Z
I omit those from [0-9A-Z], and once the key is generated, I match it against a list I found of the two-letter combinations most common in English, and regenerate the key if there is a match. For speed, I edit the list of letter pairs by first culling the pairs that are already impossible because they include a character from the omitted list ('HE' can't exist if the key is generated from a character set that does not include 'E'), then converting some from 'E' to '3', as in 'H3' instead of 'HE', etc. I have also added a few of my own, like 'KK' and 'CK', for edge cases. One could also omit '3' for speed as necessary, although the more characters you omit, the fewer unique keys can be generated.
Probably not a perfect solution, but it's fast enough for my needs and prevents almost all English words from being generated, offensive or not.
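The approach described in this answer could be sketched roughly as follows. This is not the answerer's actual code: `BLOCKED_PAIRS` is a tiny hypothetical stand-in for the real list of common English letter pairs, and the 25-character, 5-character-group layout is an assumption chosen for illustration.

```python
import secrets

# Characters the answer says are omitted from product keys.
OMITTED = set("0125AEIOULNSZ")
ALPHABET = [c for c in "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ" if c not in OMITTED]

# Hypothetical, heavily abbreviated blocklist. A real list would hold many
# more common English pairs, already culled of pairs that use an omitted
# character and with some 'E' entries rewritten as '3'.
BLOCKED_PAIRS = {"TH", "H3", "KK", "CK"}


def has_blocked_pair(key):
    """True if any two adjacent characters form a blocked pair."""
    return any(key[i:i + 2] in BLOCKED_PAIRS for i in range(len(key) - 1))


def generate_key(length=25, group=5):
    """Generate keys until one contains no blocked pair, then add dashes."""
    while True:
        raw = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if not has_blocked_pair(raw):
            return "-".join(raw[i:i + group] for i in range(0, length, group))


if __name__ == "__main__":
    print(generate_key())
```

The pair check runs on the raw key before dashes are inserted, so a blocked pair that straddles a group boundary is also rejected.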
The simplest solution is to generate from a 'sanitized' alphabet: use a set of characters that cannot possibly form words. One suggestion in one of the answers is hexadecimal, which is an excellent choice; otherwise, drop some critical letters from the alphabet.
Note that just dropping vowels is not going to do the job: it is all too easy to infer them from the remaining consonants.
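As a rough illustration of the hexadecimal suggestion, the sketch below generates a hex string and estimates how many bits of randomness a given alphabet size and length provide, which is the trade-off when shrinking the alphabet. The function names are hypothetical, and hex digits still include A to F, so a follow-up filter may be wanted if even short strings such as 'BAD' matter.

```python
import math
import secrets


def hex_key(n_chars=24):
    """Return n_chars random uppercase hex digits (n_chars must be even)."""
    if n_chars % 2:
        raise ValueError("n_chars must be even")
    # secrets.token_hex(n) returns 2 * n hex digits.
    return secrets.token_hex(n_chars // 2).upper()


def bits_of_entropy(alphabet_size, length):
    """Bits of randomness in a uniformly random string of the given length."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    print(hex_key())
    print(bits_of_entropy(16, 24))  # hexadecimal alphabet
    print(bits_of_entropy(36, 24))  # full [0-9A-Z] alphabet, for comparison
```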