
Javascript Convert ansi to utf8

Tags: jquery, utf-8, ansi

I'm trying to use the jquery.csvToTable plugin to show data from a CSV file on a web page. The CSV file is ANSI-encoded and contains Japanese text, but the web page is encoded as UTF-8, and the JavaScript does not work with the ANSI data. How can I convert the data I receive in $.get(csvFile, function(data) { to UTF-8, or is there another method? Sorry for my bad English, and thanks a lot!

mIRU asked Oct 25 '22 06:10


2 Answers

By the time you're dealing with a string in JavaScript, you're dealing with UTF-16. It's up to the browser to ensure that any data it passes to the JavaScript layer has been transformed correctly.

In this case, because you're using $.get, that means that the ajax layer has to know what it's dealing with. You'll need to ensure that your server is returning the correct charset information in the HTTP response containing the CSV file, so the browser knows what format the character data is in. Once you're doing that, the browser should do any necessary transformation from the original to UTF-16 for JavaScript.

Specifically, if your CSV file is in the character set Windows-1252 (sometimes people call that "ANSI" although it's not correct), this would be the Content-Type header your server should return with the file:

Content-Type: text/csv; charset=windows-1252

...but if your content is Japanese, I wouldn't think it would be in Windows-1252, which is a (very limited) Latin character set. If you're using Windows, it's more likely to be Code Page 932, which would be:

Content-Type: text/csv; charset=Windows-31J

...or if you're using *nix, perhaps EUC-JP:

Content-Type: text/csv; charset=EUC-JP
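As a rough illustration of sending one of the headers above, here is a minimal sketch that assumes the CSV happens to be served by a Node.js process. The question does not say what server is in use, and the helper names `csvContentType` and `createCsvServer` are made up for this example. The file is streamed byte for byte; only the label changes, so the charset passed in must match the encoding the file is actually stored in.

```javascript
const http = require("http");
const fs = require("fs");

// Build the Content-Type value for a CSV file stored in the given charset.
function csvContentType(charset) {
  return `text/csv; charset=${charset}`;
}

// Hypothetical helper: create (but do not start) a server that streams one
// CSV file unchanged, labelled with the charset its bytes are stored in.
function createCsvServer(filePath, charset) {
  return http.createServer((req, res) => {
    res.setHeader("Content-Type", csvContentType(charset));
    const stream = fs.createReadStream(filePath);
    // If the file cannot be read, answer 404 when headers are still unsent.
    stream.on("error", () => {
      if (!res.headersSent) res.statusCode = 404;
      res.end();
    });
    stream.pipe(res);
  });
}
```

Something like `createCsvServer("data.csv", "Windows-31J").listen(8080)` would start it; the file name and port are placeholders. On other servers the same header is normally set through that server's own configuration.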

You can learn more about charsets in this W3C document. (This article by Joel Spolsky is also quite helpful.) Further information about JavaScript's strings can be found in the specification. Basically, each "character" in a JavaScript string is a 16-bit UTF-16 code unit, which means a character requiring two code units (a surrogate pair) shows up as two "characters" in the string. That's not ideal for some texts, particularly East Asian ones, but when it was being defined, RAM was still precious...
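To make the point about UTF-16 concrete, here is a small sketch comparing `length` (which counts 16-bit code units) with the number of code points. The sample strings and the helper names `codeUnits` and `codePoints` are chosen for illustration only. Common Japanese characters sit in the Basic Multilingual Plane and take one code unit each; only rarer characters such as U+20BB7 need two.

```javascript
// Number of UTF-16 code units, which is what String.prototype.length reports.
function codeUnits(s) {
  return s.length;
}

// Number of Unicode code points; string iteration keeps surrogate pairs together.
function codePoints(s) {
  return Array.from(s).length;
}

const bmp = "\u65E5\u672C\u8A9E";       // 日本語: every character is in the BMP
const astral = "\u{20BB7}\u91CE\u5BB6"; // first character is outside the BMP

console.log(codeUnits(bmp), codePoints(bmp));       // 3 3
console.log(codeUnits(astral), codePoints(astral)); // 4 3
console.log(astral.codePointAt(0).toString(16));    // "20bb7"
```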

T.J. Crowder answered Oct 27 '22 11:10


I use these functions:

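// Turn a JavaScript string into a "binary string": one character
// (code 0-255) per UTF-8 byte. Note that escape/unescape are deprecated.
function encode_utf8(s) {
  return unescape(encodeURIComponent(s));
}

// Reverse of encode_utf8: treat each character as one UTF-8 byte and
// decode the sequence back into a normal JavaScript string.
function decode_utf8(s) {
  return decodeURIComponent(escape(s));
}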

Extracted from: http://ecmanaut.blogspot.com.ar/2006/07/encoding-decoding-utf8-in-javascript.html
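Note that these two helpers only translate between a JavaScript string and UTF-8 bytes; they do not decode a legacy Japanese encoding. If the server's header cannot be changed, one different technique (not part of this answer's original code) is to fetch the raw bytes and decode them with the standard `TextDecoder` API. This is a minimal sketch: the names `decodeBytes` and `fetchTextAs` are made up, it assumes an environment with `fetch` and `TextDecoder` support for the `shift_jis` label, and it assumes the file really is Shift_JIS / Code Page 932.

```javascript
// Decode raw bytes using the given encoding label.
// fatal: true makes decode() throw on invalid input instead of
// silently substituting U+FFFD replacement characters.
function decodeBytes(bytes, label) {
  return new TextDecoder(label, { fatal: true }).decode(bytes);
}

// Hypothetical helper: download a file as bytes, then decode it ourselves
// rather than relying on the charset in the response header.
async function fetchTextAs(url, label) {
  const response = await fetch(url);
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  return decodeBytes(await response.arrayBuffer(), label);
}

// The Shift_JIS bytes for 日本語.
const sample = new Uint8Array([0x93, 0xfa, 0x96, 0x7b, 0x8c, 0xea]);
console.log(decodeBytes(sample, "shift_jis")); // 日本語
```

In the question's terms, `fetchTextAs(csvFile, "shift_jis")` would resolve to an ordinary JavaScript string holding the CSV text.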

Walter Otto Krause answered Oct 27 '22 11:10