I am working on an app that will eventually allow users to connect with each other, but first a user will be able to post some public information and I want to block them from posting contact information (mainly email and phone numbers).
Is there an algorithm or approach for iOS or PHP that can detect such information? (Note: a simple regular expression is not enough. I want to catch the common "tricky" ways users display their contact info in public.)
Examples of what I want to block:
Obviously, there are unlimited variations of the above examples and others, so I can't just write a "quick" expression-matching algorithm that covers them all.
I know there probably isn't a 100% perfect approach for this, but I am curious whether something already exists that would be better than building my own from scratch.
For email, I always use this regex:
("([a-zA-Z0-9._%+-]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)")
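As a rough illustration of how that pattern could be applied, here is a minimal Python sketch. The pattern is the one above with the outer `("` and `")` wrapper dropped; the names `EMAIL_RE` and `contains_plain_email` are made up for this example and are not from any library.

```python
import re

# Pattern copied from the answer above, minus the outer (" ... ") wrapper.
# EMAIL_RE and contains_plain_email are hypothetical names for this sketch.
EMAIL_RE = re.compile(
    r"([a-zA-Z0-9._%+-]+)@"
    r"((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))"
    r"([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)"
)


def contains_plain_email(text: str) -> bool:
    """Return True if the text contains a conventionally written address."""
    # search() scans the whole string, so the address may appear anywhere.
    return EMAIL_RE.search(text) is not None
```

This only catches addresses written in the normal `name@host.tld` form; something like "john at example dot com" passes straight through, which is why the extra rules are needed.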
For obfuscated emails, use plain string searches on the lowercased text instead of a regex:
lower = line.tolower
if lower.contains("dot") and lower.contains("com")
or if lower.contains("@") and lower.contains("com")
or if lower.contains("@") and lower.contains("net")
or if lower.contains("mail") and lower.contains("com")
or if lower.contains("gmail") or lower.contains("yahoo") or lower.contains("hotmail") or lower.contains("bing")
As you can see, you are going to have to write quite a few rules.
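The string-search rules above might look something like this in Python. This is only a sketch of the same keyword checks, with the provider names lowercased so they can match the lowercased line; `looks_like_obfuscated_email` is a hypothetical name chosen for the example.

```python
def looks_like_obfuscated_email(line: str) -> bool:
    """Apply the keyword rules from the answer to one line of text."""
    lower = line.lower()

    def has(word: str) -> bool:
        # Plain substring test on the lowercased line.
        return word in lower

    rules = [
        has("dot") and has("com"),
        has("@") and has("com"),
        has("@") and has("net"),
        has("mail") and has("com"),
        # Provider names must be lowercase to match the lowercased line.
        any(has(name) for name in ("gmail", "yahoo", "hotmail", "bing")),
    ]
    return any(rules)
```

Because these are substring tests, they will also flag harmless text: "A polka dot dress, very comfortable" trips the first rule, since "comfortable" contains "com". A flagged post probably deserves a manual review step or a user-facing warning rather than a silent block.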
For phone numbers:
("(?:\b\d{10,11}\b)")
("[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]")
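A small Python sketch of how the two phone patterns could be combined into one check. The patterns are taken from the lines above without the `("` and `")` wrapper; the names `LONG_DIGIT_RUN`, `LOCAL_NUMBER`, and `contains_phone_number` are assumptions made for this example.

```python
import re

# Both patterns come from the answer; the names are hypothetical.
LONG_DIGIT_RUN = re.compile(r"(?:\b\d{10,11}\b)")  # 10 or 11 digits in a row
LOCAL_NUMBER = re.compile(r"[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]")  # ddd-dddd


def contains_phone_number(text: str) -> bool:
    """Return True if either phone pattern matches anywhere in the text."""
    return bool(LONG_DIGIT_RUN.search(text) or LOCAL_NUMBER.search(text))
```

A number written as 415-555-1234 is caught by the second pattern through its trailing `555-1234` part, so no separate pattern for the full dashed form is needed here.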
Then, as with the emails, you will have to use .Contains for the obfuscated forms.
To catch numbers that are spelled out in words, you will need to add every area code in letter form, in variants like:
"twosixfive"
"fourninesix"
as well as:
"two six five"
"four nine six"
as well as:
"two-six-five"
"four-nine-six"
Here is a list of all the area codes: http://en.wikipedia.org/wiki/List_of_NANP_area_codes
There aren't that many; you just have to be willing to take the time to do it.
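Rather than typing every variant by hand, the three spellings could be generated from the digits. The following Python sketch assumes you have already collected the area codes as digit strings (only the two codes used as examples above appear here); `DIGIT_WORDS`, `spelled_variants`, and `contains_spelled_area_code` are hypothetical names for this illustration.

```python
# Maps each digit to its English word; "oh" for 0 is not handled here.
DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}


def spelled_variants(area_code: str) -> list[str]:
    """Return the joined, space-separated and dash-separated spellings."""
    words = [DIGIT_WORDS[digit] for digit in area_code]
    return ["".join(words), " ".join(words), "-".join(words)]


def contains_spelled_area_code(text: str, area_codes: list[str]) -> bool:
    """Return True if any spelled-out variant of any code occurs in the text."""
    lower = text.lower()
    return any(
        variant in lower
        for code in area_codes
        for variant in spelled_variants(code)
    )


# Sample codes only; a real list would come from the page linked above.
SAMPLE_AREA_CODES = ["265", "496"]
```

For example, `spelled_variants("265")` yields `"twosixfive"`, `"two six five"`, and `"two-six-five"`, matching the three forms listed earlier.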