I have a file that is ANSI encoded, yet it shows Arabic letters. The text file was generated by some program I have no information about, but it seems there is some kind of internal encoding (if that is the right term, and if it is even possible) that makes the Arabic letters appear.
Is there such a thing? If not, how can the ANSI file show the Arabic letters?
If possible, please explain with Java code.
Edit 01
When I open it in Notepad++ it shows that the page encoding is ANSI. Please check this photo:
http://www.4shared.com/file/221862075/e8705951/text-Windows.html
Edit 02
You can check the file here:
http://www.4shared.com/file/221853641/3fa1af8c/data.html
Most Microsoft Windows text files use "ANSI", "OEM", "Unicode" or "UTF-8" encoding.
An encoding converts a sequence of code points to a sequence of bytes. An encoding is typically used when writing text to a file. To read it back in we have to know how it was encoded and decode it back into memory. A text encoding is basically a file format for text files.
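To make the encode/decode round trip concrete, here is a minimal Java sketch. The class and method names are made up for illustration, and it assumes the JDK in use ships the `windows-1256` charset (full JDK distributions normally do, but it is not one of the charsets every Java platform is required to support).

```java
import java.nio.charset.Charset;

public class EncodingRoundTrip {
    // Encode text into bytes, as happens when text is written to a file.
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    // Decode bytes back into text; the charset must match the one used to encode.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        Charset arabic = Charset.forName("windows-1256");
        Charset western = Charset.forName("windows-1252");

        // Four Arabic letters (seen, lam, alef, meem), written as escapes
        // so the source file's own encoding does not matter.
        String text = "\u0633\u0644\u0627\u0645";

        byte[] bytes = encode(text, arabic);
        System.out.println("bytes: " + bytes.length); // one byte per letter

        // Same charset on both sides: the text survives the round trip.
        System.out.println("round trip ok: " + decode(bytes, arabic).equals(text));

        // Wrong charset: the same bytes come back as accented Latin letters.
        System.out.println("decoded as windows-1252: " + decode(bytes, western));
    }
}
```

The bytes in the file never change; only the charset chosen when decoding determines whether you see Arabic letters or garbage.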
UTF-8 is a variable-length encoding with a minimum of 8 bits per character. Characters with higher code points take 16, 24, or up to 32 bits.
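A small Java sketch of that variable length, using a hypothetical helper `utf8Length` that counts the bytes a string occupies in UTF-8:

```java
import java.nio.charset.StandardCharsets;

public class Utf8Lengths {
    // Number of bytes the given text occupies when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));            // Latin letter: 1 byte (8 bits)
        System.out.println(utf8Length("\u0627"));       // Arabic alef: 2 bytes (16 bits)
        System.out.println(utf8Length("\u20AC"));       // euro sign: 3 bytes (24 bits)
        System.out.println(utf8Length("\uD83D\uDE00")); // emoji U+1F600: 4 bytes (32 bits)
    }
}
```

So an Arabic letter stored as UTF-8 takes two bytes, while a single-byte code page stores it in one.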
How do you know that it's ANSI encoded? If it's not a multi-byte encoding like UTF-8, my guess is that it's encoded using an Arabic code page such as Windows-1256.
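If that guess is right, reading the file in Java only requires naming the charset explicitly. This is a sketch under that assumption: Windows-1256 is a guess, not something confirmed for your file, and `readAs` and the file name `data.txt` are placeholders for illustration.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ReadArabicFile {
    // Read the whole file and decode its bytes with the given charset.
    static String readAs(Path file, Charset charset) throws IOException {
        byte[] raw = Files.readAllBytes(file);
        return new String(raw, charset);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder path; pass the real file as the first argument.
        Path file = Paths.get(args.length > 0 ? args[0] : "data.txt");

        // Assumption: the file was written with the Arabic Windows code page.
        Charset guess = Charset.forName("windows-1256");

        System.out.println(readAs(file, guess));
    }
}
```

Whether the Arabic text then shows up correctly on screen also depends on the console's own encoding and font, so writing the decoded string to a UTF-8 file may be an easier way to check the result.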
You could look at the file in a hex editor, find out which byte values the Arabic characters have, and from that try to work out which encoding / code page it was created with.
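If you don't have a hex editor at hand, the same inspection can be sketched in Java. The names `toHex` and `isValidUtf8` are made up for this example, and the sample bytes in `main` are illustrative, not taken from your file. The idea: dump the byte values, then check whether they form valid UTF-8. If they don't, and each Arabic letter is a single byte of 0x80 or above, a single-byte code page is the likely candidate.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class InspectBytes {
    // Render bytes as space-separated two-digit hex values, like a hex editor row.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // True if the bytes decode as UTF-8 without any malformed sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Illustrative sample: two bytes that are Arabic letters in Windows-1256.
        byte[] sample = {(byte) 0xC7, (byte) 0xE1};
        System.out.println(toHex(sample));
        System.out.println("valid UTF-8: " + isValidUtf8(sample));
    }
}
```

A failed UTF-8 check does not prove the file is Windows-1256; it only rules UTF-8 out. Comparing the dumped values against the published code page tables is still the deciding step.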