I need to remove all non-alphanumeric characters except spaces and allowed emoticons.
Allowed emoticons are :), :(, :P etc. (the most popular).
I have a string:
$string = 'Hi! Glad # to _ see : you :)';
so I need to process this string and get the following:
$string = 'Hi Glad to see you :)';
Also, please note that emoticons can contain spaces, e.g. : ) instead of :) or : P instead of :P
Does anyone have a function to do this?
If someone helped me it would be so great :)
UPDATE
Thank you very much for your help.
buckley offered a ready solution, but if the string contains emoticons with spaces,
e.g. Hi! Glad # to _ see : you : )
the result is Hi Glad to see you
As you can see, the emoticon : ) was cut off.
I don't "speak" PHP ;) but this does it in JS. Maybe you can convert it.
var sIn = 'Hi! Glad # to _ see : you :)',
    sOut;
// [a-zA-Z0-9\s] instead of \w, because \w would also keep the underscore
sOut = sIn.match(/([a-zA-Z0-9\s]|: ?\)|: ?\(|: ?P)*/g).join('');
It works the other way around from your attempt: it finds all "legal" characters/combinations and joins them together.
Regards
Edit: Updated regex to handle optional spaces in emoticons (as commented earlier).
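The whitelist idea above can be wrapped in a small function. This is only a sketch: the name keepAllowed is made up for illustration, the emoticon list is limited to :) :( :P, and collapsing spaces and rewriting ": )" as ":)" are extra steps assumed from the question's expected output.

```javascript
// Hypothetical helper: keeps letters, digits, whitespace and a small set of
// emoticons (with or without an inner space) and drops everything else.
function keepAllowed(input) {
  // The emoticon alternative comes first so ":" is only kept as part of one.
  var allowed = /: ?[)(P]|[a-zA-Z0-9\s]/g;
  var kept = input.match(allowed) || []; // match() returns null when nothing matches
  return kept
    .join('')
    .replace(/: ([)(P])/g, ':$1') // assumption: normalize ": )" to ":)"
    .replace(/\s+/g, ' ')         // collapse runs of whitespace left by removals
    .trim();
}

console.log(keepAllowed('Hi! Glad # to _ see : you :)'));  // Hi Glad to see you :)
console.log(keepAllowed('Hi! Glad # to _ see : you : )')); // Hi Glad to see you :)
```

Note that any ": P" in ordinary text (for example "Note: Please") would also be treated as an emoticon, so the allowed list may need tightening for real input.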
Ha! This one was interesting
Replace
(?!(:\)|:\(|:P))[^a-zA-Z0-9 ](?<!(:\)|:\(|:P))
With nothing
The idea is that you sandwich the illegal characters with the same regex, once as a negative lookahead and once as a negative lookbehind.
The result will have consecutive spaces in it. AFAIK a regex cannot fix this in one sweep, because it can't look at multiple matches at once.
To eliminate the consecutive spaces, replace \s+
with a single space.
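For comparison with the JS answer, here is a rough sketch of the same lookahead/lookbehind sandwich in JavaScript, followed by the space-collapsing step. The function name stripExceptEmoticons is hypothetical, and the optional space inside each emoticon (: ?) is an addition meant to cover the ": )" case from the question's update. Lookbehind needs an ES2018-capable engine, and the variable-length lookbehind used here may need to be spelled out as separate fixed-length alternatives in other regex flavours.

```javascript
// Hypothetical helper illustrating the lookaround "sandwich".
function stripExceptEmoticons(input) {
  // Remove a character that is not a letter, digit or space, unless an
  // emoticon starts at it (lookahead) or ends at it (lookbehind).
  var illegal = /(?!: ?\)|: ?\(|: ?P)[^a-zA-Z0-9 ](?<!: ?\)|: ?\(|: ?P)/g;
  return input
    .replace(illegal, '')   // first sweep: drop the illegal characters
    .replace(/\s+/g, ' ');  // second sweep: collapse consecutive spaces
}

console.log(stripExceptEmoticons('Hi! Glad # to _ see : you :)'));  // Hi Glad to see you :)
console.log(stripExceptEmoticons('Hi! Glad # to _ see : you : )')); // Hi Glad to see you : )
```

Unlike a normalizing approach, this keeps the spaced emoticon exactly as typed.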