I have Arabic text in a plain-text .sql file. When I open it in any text viewer, it shows like this:
ØØ±Ù اول Ø§Ù„ÙØ¨Ø§Ù‰ انگليسى ØŒ ØØ±Ù اضاÙÙ‡ مثبت
But when I use an HTML document with <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>, it shows properly like this:
حرف اول الفباى انگليسى ، حرف اضافه مثبت
How can I convert it to readable text?
The Arabic text has been encoded to bytes using UTF-8.
You are explicitly telling the HTML document that the bytes are encoded in UTF-8, which is why any HTML viewer will be able to display the text correctly.
However, any other text viewer will not know the bytes are encoded in UTF-8 unless the file starts with a UTF-8 BOM and the viewer supports BOMs. Otherwise, as you are seeing, the viewer may interpret the bytes as Latin-1 or a similar single-byte encoding, so each UTF-8 byte is displayed as a separate character. In that case you have to tell the viewer manually to interpret the bytes as UTF-8. How you do that depends on the particular viewer, and not all viewers offer the option.
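If the viewer cannot be configured, the same fix can be scripted. The following is a minimal Python sketch, not a definitive tool. The function names are hypothetical, and it assumes the wrong interpretation was plain Latin-1. If the garbled text came from Windows-1252 or another code page, the round trip may need that encoding instead and may not be fully reversible.

```python
from pathlib import Path


def read_utf8_text(path):
    """Read a file by decoding its bytes explicitly as UTF-8.

    'utf-8-sig' behaves like UTF-8 but also drops a leading BOM if present.
    """
    return Path(path).read_text(encoding="utf-8-sig")


def write_utf8_with_bom(path, text):
    """Write text as UTF-8 with a leading BOM, which BOM-aware viewers can detect."""
    Path(path).write_text(text, encoding="utf-8-sig")


def repair_latin1_mojibake(garbled):
    """Undo a wrong Latin-1 decode of UTF-8 bytes (assumption: Latin-1 was used).

    Encoding back to Latin-1 recovers the original bytes, which are then
    decoded as UTF-8. Raises an error if the input does not fit this pattern.
    """
    return garbled.encode("latin-1").decode("utf-8")


# Demonstration: simulate the wrong interpretation, then reverse it.
original = "حرف اول"
garbled = original.encode("utf-8").decode("latin-1")
assert repair_latin1_mojibake(garbled) == original
```

When the file on disk still holds the original UTF-8 bytes, as it does here, `read_utf8_text` is all that is needed. `repair_latin1_mojibake` only matters when the garbled characters themselves were copied or saved somewhere.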