
How to make a text file have more than one encoding?

I have a file that is ANSI encoded, yet it displays Arabic letters. The text file was generated by a program I have no information about, but it seems to use some kind of internal encoding (if that is even possible) that makes the Arabic letters appear.

Is there such a thing? If not, how can the ANSI file show the Arabic letters?

If possible, please explain with Java code.


Edit 01

When I open it in Notepad++ it shows that the page encoding is ANSI. Please check this photo:

http://www.4shared.com/file/221862075/e8705951/text-Windows.html


Edit 02

You can download the file from:

http://www.4shared.com/file/221853641/3fa1af8c/data.html

M. A. Kishawy asked Feb 14 '10 12:02




1 Answer

How do you know that it's ANSI encoded? If it's not a multi-byte encoding like UTF-8, my guess would be that it's encoded using an Arabic code page such as Windows-1256.
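Since the question asks for Java, here is a minimal sketch of how the same bytes can read as Arabic or as Latin text depending on the code page used to decode them. The class name `DecodeArabic`, the helper `decode`, and the four sample bytes are made up for illustration. Windows-1256 is only a guess at the file's code page, and the sketch assumes a full JDK where that charset is available.

```java
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;

class DecodeArabic {
    // Assumption: the file uses Windows-1256; the question does not confirm this.
    static final Charset ARABIC = Charset.forName("windows-1256");

    // Interpret the same raw bytes under a chosen charset.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) throws Exception {
        // Use the file given on the command line, or four sample bytes otherwise.
        byte[] raw = args.length > 0
                ? Files.readAllBytes(Paths.get(args[0]))
                : new byte[] {(byte) 0xD3, (byte) 0xE1, (byte) 0xC7, (byte) 0xE3};

        // Same bytes, two interpretations.
        System.out.println("windows-1256: " + decode(raw, ARABIC));
        System.out.println("windows-1252: " + decode(raw, Charset.forName("windows-1252")));
    }
}
```

Under Windows-1256 the sample bytes D3 E1 C7 E3 decode to the Arabic letters seen, lam, alef, meem; under Windows-1252 the very same bytes are accented Latin letters. The file has a single encoding; only the interpretation changes. Whether the Arabic prints correctly also depends on the console's own encoding and font.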

You could look at the file in a hex editor, find out which byte values the Arabic characters have, and use that to work out which encoding / code page it was created with.
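If no hex editor is at hand, the byte values can be inspected from Java as well. The following is a rough sketch, with a hypothetical class `HexPeek` and helper `hexOfNonAscii`, that lists only the bytes outside the ASCII range, since those are the ones that differ between code pages.

```java
import java.nio.file.Files;
import java.nio.file.Paths;

class HexPeek {
    // Return the non-ASCII bytes (0x80 and above) as space-separated hex pairs.
    static String hexOfNonAscii(byte[] raw) {
        StringBuilder sb = new StringBuilder();
        for (byte b : raw) {
            int value = b & 0xFF; // Java bytes are signed; mask to 0..255
            if (value >= 0x80) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(String.format("%02X", value));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java HexPeek.java <file>");
            return;
        }
        byte[] raw = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(hexOfNonAscii(raw));
    }
}
```

As a rule of thumb, if each Arabic letter shows up as one byte (mostly in the C1 to ED range), a single-byte code page such as Windows-1256 is likely. If each letter shows up as a pair starting with D8 or D9, the file is more likely UTF-8.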

Pekka answered Sep 19 '22 00:09