Here is the issue: I have imported about 20000 game descriptions from mochimedia into my database, but there are many foreign games, which I do not want to list.
I came up with this query to find rows where the column contains non-ASCII characters:
SELECT * FROM TABLE WHERE NOT HEX(COLUMN) REGEXP '^([0-7][0-9A-F])*$';
Note that I found this solution here on Stack Overflow, as I am not an expert when it comes to MySQL queries.
However, while this query catches quite a few foreign descriptions, it also returns some perfectly fine English descriptions. What I am looking for is a way to fine-tune the query so that it skips the "okay" ones.
Here are a few returned rows that are "okay", meaning they should not be returned:
Game Boy Jam game that uses game boy restrictions. It’s a western platform game, where you play as a sheriff of the town. Your mission is to capture all the bad bandits in the land and bring them to justice.
and one more
It's hard to be a kitten if you have such a clumsy owner! Yesterday she lost a lot of things in the park and now it's up to you to find them!
Memories of that day can be helpful – you should remember where have you seen that thing last and search there.Map also can be usefull for your task. And finally you can climb up a tree and ask a big cat for a hint – you will see all the events of that day again.
But sometimes it's not enough to just find a lost thing. Some residents of the park may already be using it for themselves – be it mice or ants. In that case you may have to bring them something in exchange for a lost thing – only then you will get it back.
and one last example
Hungry honey bee is a unique fun game. It includes the fun of a platform game, puzzle game, adventure game, role playing game. In this fantasy game, one needs to make honey bee to collect all the flowers in order to win a match. As level progresses new challenges will be introduced with gradually toughness. Overall it’s a complete blend of fun which makes one stick with the game for hours. GOI: Rating 4.5 our of 5
Please remember that I am not a MySQL expert, so I can only guess what the issue is. My guess is that characters such as the ’ in "It’s", or the – and : characters, might cause this.
Maybe someone would be willing to share an optimized query to solve this problem? I have spent quite some time on this, but since I am still a newbie with PHP and definitely not an expert with REGEXP and MySQL queries, it would be nice to get some help here so I can improve my knowledge. Please do not assume that I will understand everything if you just throw it at me, so detailed help would be wonderful.
Thanks for your time reading this.
If you're simply trying to find columns which contain non-ASCII characters, you can use the query below:
SELECT *
FROM table
WHERE column != CONVERT(column USING ASCII);
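To see why the "okay" descriptions are still caught, here is a small Python sketch (an illustration only, not MySQL and not something from the original posts). It assumes the text is stored as UTF-8. The curly apostrophe ’ and the dash – are not ASCII, so any check that simply looks for non-ASCII bytes will flag them, and as far as I can tell the CONVERT comparison above behaves the same way for those rows. The function names and the `ALLOWED` whitelist are made up for this example; adjust the whitelist to whatever punctuation you want to tolerate.

```python
import re

# Same pattern as in the question: every byte, written as two hex digits,
# must start with 0-7 (i.e. be below 0x80) for the text to count as ASCII.
ASCII_HEX = re.compile(r'^([0-7][0-9A-F])*$')


def flagged_by_hex_check(text):
    """Mimic NOT HEX(col) REGEXP '^([0-7][0-9A-F])*$' on UTF-8 text."""
    hex_string = text.encode('utf-8').hex().upper()
    return ASCII_HEX.match(hex_string) is None


def non_ascii_chars(text):
    """List the distinct characters that make a text non-ASCII."""
    return sorted({ch for ch in text if ord(ch) > 127})


# Hypothetical whitelist: typographic punctuation mapped to plain ASCII.
ALLOWED = {
    '\u2019': "'",   # right single quotation mark, as in It’s
    '\u2018': "'",   # left single quotation mark
    '\u201c': '"',   # left double quotation mark
    '\u201d': '"',   # right double quotation mark
    '\u2013': '-',   # en dash
    '\u2014': '-',   # em dash
}


def looks_foreign(text):
    """True if non-ASCII characters remain after replacing the whitelist."""
    cleaned = text.translate(str.maketrans(ALLOWED))
    return not cleaned.isascii()


sample = "It’s a western platform game – you play as a sheriff."
print(flagged_by_hex_check(sample))  # the original check flags this row
print(non_ascii_chars(sample))       # the characters responsible
print(looks_foreign(sample))         # whitelist applied
```

The same idea should carry over to SQL by wrapping the column in nested REPLACE() calls for the whitelisted characters before running the non-ASCII test, though I have not tested that against your data.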