
How to check if a message has a combined character in it?


Example: กิิิิิิิิิิิิิิิิิิิิ ก้้้้้้้้้้้้้้้้้้้้ ก็็็็็็็็็็็็็็็็็็็็ ก็็็็็็็็็็็็็็็็็็็็ กิิิิิิิิิิิิิิิิิิิิ ก้้้้้้้้้้้้้้้้้้้้ ก็็็็็็็็็็็็็็็็็็็็ กิิิิิิิิิิิิิิิิิิิิ ก้้้้้้้้้้้้้้้้้้้้ กิิิิิิิิิิิิิิิิิิิิ ก้้้้้้้้้้้้้้้้้้้้ ก็็็็็็็็็็็็็็็็็็็็ ก็็็็็็็็็็็็็็็็็็็็ กิิิิิิิิิิิิิิิิิิิิ ก้้้้้้้้้้้้้้้้้้้้ ก็็็็็็็็็็็็็็็็็็็็ กิิิิิิิิิิิิิิิิิิิิ ก้้้้้้้้้้้้้้้้้้้้ (or any "zalgo" text)

I haven't been able to figure out a way to check for these. I'm building a kind of antispam filter, and I see no reason to keep these messages: they can lag users' clients and are generally just spam.

What I'm trying to do is

if (getMessage().getRawContent().contains(combined character)) getMessage().delete();

If anyone knows a simple way to check for combined chars please post!

If you are confused about what I am asking, I can explain further and show more examples if needed.

Miss Cartoon asked Apr 18 '17 01:04


1 Answer

There are plenty of cases where one or two consecutive combining characters are perfectly valid text, so flagging every combining character would reject legitimate messages. I would look for four or more of them in a row:

if (getMessage().getRawContent().matches(".*\\p{Mn}{4}.*"))
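
One caveat with the one-liner: `String.matches` has to match the whole string, and `.` does not match line terminators by default, so a multi-line message could slip through. A minimal sketch of an alternative, assuming a precompiled `Pattern` with `find()` is acceptable in your bot; the class and method names here are made up for illustration and are not part of any library:

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class CombiningMarkCheck {
    // Four consecutive nonspacing marks (Unicode category Mn).
    // Any longer run also contains four in a row, so this flags "four or more".
    private static final Pattern MARK_RUN = Pattern.compile("\\p{Mn}{4}");

    // Returns true if the text contains a run of at least four nonspacing marks.
    // find() searches anywhere in the text, so line breaks do not matter.
    static boolean hasExcessiveMarks(String text) {
        return MARK_RUN.matcher(text).find();
    }

    public static void main(String[] args) {
        // "cafe" plus one combining acute accent: ordinary text
        System.out.println(hasExcessiveMarks("cafe\u0301"));                     // false
        // Thai KO KAI followed by four SARA I marks
        System.out.println(hasExcessiveMarks("\u0E01\u0E34\u0E34\u0E34\u0E34")); // true
        // Same kind of run, but on the second line of the message
        System.out.println(hasExcessiveMarks("hi\n\u0E01\u0E49\u0E49\u0E49\u0E49")); // true
    }
}
```

In the bot you would then call something like `hasExcessiveMarks(getMessage().getRawContent())` in place of the `matches` condition. The threshold of four is a judgment call; raise it if legitimate messages get flagged.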

VGR answered Nov 20 '22 04:11