
How to prevent Zalgo text using PHP [duplicate]

I have some problems with Zalgo on my imageboard.

Texts like the one below mess up my imageboard. Is there a way to block these characters, or to "fix" or clean up the text?

Example text Source:

ALL IS LOŚ͖̩͇̗̪̏̈́T ALL I​S LOST the pon̷y he comes he c̶̮omes he comes the ich​or permeates all MY FACE MY FACE ᵒh god no NO NOO̼O​O NΘ stop the an​*̶͑̾̾​̅ͫ͏̙̤g͇̫͛͆̾ͫ̑͆l͖͉̗̩̳̟̍ͫͥͨe̠̅s ͎a̧͈͖r̽̾̈́͒͑e n​ot rè̑ͧ̌aͨl̘̝̙̃ͤ͂̾̆ ZA̡͊͠͝LGΌ ISͮ̂҉̯͈͕̹̘̱ TO͇̹̺ͅƝ̴ȳ̳ TH̘Ë͖́̉ ͠P̯͍̭O̚​N̐Y̡ H̸̡̪̯ͨ͊̽̅̾̎Ȩ̬̩̾͛ͪ̈́̀́͘ ̶̧̨̱̹̭̯ͧ̾ͬC̷̙̲̝͖ͭ̏ͥͮ͟Oͮ͏̮̪̝͍M̲̖͊̒ͪͩͬ̚̚͜Ȇ̴̟̟͙̞ͩ͌͝S̨̥̫͎̭ͯ̿̔̀ͅ

I tried to use this solution:

$cleanMessage = preg_replace("/[^\x20-\xAD\x7F]/", "", $input_lines);

Taken from here: "Remove special characters that mess with formating". But it works only for Latin characters. Can anyone help me?

aftamat4ik asked Mar 05 '26 00:03



1 Answer

This regular expression removes every combining mark (Unicode category M, which covers the diacritics that Zalgo text stacks above and below letters) from the $text variable:

$text = preg_replace("~[\p{M}]~uis","", $text);

If $text contains a character with a combining mark, for example กิ, this regex removes the mark and the resulting $text contains just ก.
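As a rough illustration of the first regex, the sketch below wraps it in a helper and runs it on a few inputs. The function name stripAllMarks is hypothetical and not part of the original answer. It also shows a consequence worth knowing: precomposed letters such as é (U+00E9) are single code points rather than a letter plus a mark, so \p{M} leaves them alone.

```php
<?php
// Hypothetical helper name; it wraps the \p{M} pattern from the answer.
function stripAllMarks(string $text): string
{
    // \p{M} matches any Unicode combining mark. The u modifier makes PCRE
    // treat both pattern and subject as UTF-8.
    $result = preg_replace('~\p{M}~u', '', $text);

    // preg_replace returns null on error (for example, invalid UTF-8 input),
    // so fall back to the untouched input in that case.
    return $result ?? $text;
}

// Thai consonant KO KAI (U+0E01) followed by the combining vowel SARA I (U+0E34).
echo stripAllMarks("\u{0E01}\u{0E34}"), PHP_EOL;      // prints ก

// Decomposed "e" + COMBINING ACUTE ACCENT loses its accent...
var_dump(stripAllMarks("e\u{0301}") === "e");         // bool(true)

// ...while the precomposed é (U+00E9) is not a mark and is left unchanged.
var_dump(stripAllMarks("\u{00E9}") === "\u{00E9}");   // bool(true)
```

Note that this strips legitimate marks too, which is why it damages languages such as Thai that rely on them.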

I then improved this regex so that it filters only the second level of marks:

$text = preg_replace("~(?:[\p{M}]{1})([\p{M}])+?~uis","", $text);

This regex filters only stacked marks: it removes combining marks two at a time, so a letter that carries a single mark keeps it. Use it if you need to filter German or other languages that legitimately use such marks. It transforms this word:

͐̈ͩ̎Zͮ͌ͦ͆ͦͤÃ̉͛̄ͭ̈̚LͫG̉̋͂̉Oͨ͌̋͗!

into this: ZÄLͫGO!
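Because the second regex removes marks in pairs, a run of exactly two marks disappears entirely, while a run of three leaves one behind. If the goal is instead to keep up to a fixed number of marks per character, one possible variation (not part of the original answer) is to capture the first few marks of each run and drop the rest. The function name limitMarks and the $max parameter are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper: keep at most $max combining marks in each run of marks.
function limitMarks(string $text, int $max = 1): string
{
    $max = max(0, $max); // guard against negative limits

    // Match any run of more than $max marks, capture the first $max of them,
    // and replace the whole run with just the captured part.
    $pattern = '~(\p{M}{' . $max . '})\p{M}+~u';
    $result  = preg_replace($pattern, '$1', $text);

    // preg_replace returns null on error (for example, invalid UTF-8 input).
    return $result ?? $text;
}

// "a" with three stacked marks, then "b" with a single diaeresis.
$input = "a\u{0301}\u{0302}\u{0303}b\u{0308}";

// With the default limit of 1, only the first mark on "a" survives
// and the lone mark on "b" is untouched.
var_dump(limitMarks($input) === "a\u{0301}b\u{0308}"); // bool(true)
```

If the intl extension is installed, normalizing the input to NFC with Normalizer::normalize() before filtering may help, since letter-plus-mark pairs that have a precomposed form then no longer count toward the limit.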

I hope the second regex helps.

aftamat4ik answered Mar 07 '26 14:03



