
How can I open a Windows-1255 encoded file in Node.js?

I have a file in Windows-1255 (Hebrew) encoding, and I'd like to be able to access it in Node.js.

I tried opening the file with fs.readFile, and it gives me a Buffer that I can't do anything with. I tried setting the encoding to Windows-1255, but that wasn't recognized.

I also checked out the windows-1255 package, but I couldn't decode with that, because fs.readFile gives either a Buffer or a UTF-8 string, and the package requires a 1255-encoded string.

How can I read a Windows-1255-encoded file in Node.js?

Scimonster asked Oct 29 '14 09:10



1 Answer

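It seems that using the node-iconv package is the best way. Unfortunately, iconv-lite, which is easier to include in your code, did not seem to implement transcoding for CP1255 when I tried it (its current documentation lists the Windows-125x family as supported, so check the version you have).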

This thread and its answer show a simple example that concisely demonstrates using both of these modules.

Returning to iconv: I had some problems installing it on Debian with an npm prefix, and I submitted an issue to the maintainer here. I managed to work around the issue by running the install with sudo and then using "sudo chown" to give ownership of the installed module back to my user.

I have tested various win-xxxx encodings and code pages that I have access to (Western and Eastern European samples).

However, I could not make it work with CP1255, although it is listed among the supported encodings, because I do not have that specific code page installed locally and the text gets mangled. I tried copying some Hebrew script from this page, but the pasted version was always corrupted. I did not dare install the language on my Windows machine for fear of bricking it, sorry.

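// sample.js
var Iconv = require('iconv').Iconv;
var fs = require('fs');

// Convert a Buffer of CP1255 (Windows-1255) bytes to a JavaScript string.
function decode(content) {
  // //TRANSLIT approximates unmappable characters; //IGNORE skips the rest.
  var iconv = new Iconv('CP1255', 'UTF-8//TRANSLIT//IGNORE');
  var buffer = iconv.convert(content);
  return buffer.toString('utf8');
}

// Read without an encoding so that fs returns the raw bytes as a Buffer.
console.log(decode(fs.readFileSync('sample.txt')));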
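Because the sample above could not be checked against real Hebrew text, here is a minimal sketch that writes a Windows-1255 file from known bytes and decodes it with Node's built-in TextDecoder instead of a native module. It assumes a Node.js build that ships full ICU data (on a reduced-ICU build the TextDecoder constructor throws a RangeError for this label). The file name and the decodeWindows1255 helper are hypothetical names chosen for illustration.

```javascript
// roundtrip.js (hypothetical file name)
const fs = require('fs');
const os = require('os');
const path = require('path');

// Decode a Buffer of Windows-1255 bytes into a JavaScript string.
// Assumption: this Node.js build includes full ICU; otherwise the
// TextDecoder constructor throws a RangeError for 'windows-1255'.
function decodeWindows1255(buffer) {
  return new TextDecoder('windows-1255').decode(buffer);
}

// The Hebrew word "shalom" as raw Windows-1255 bytes
// (shin, lamed, vav, final mem).
const sampleBytes = Buffer.from([0xf9, 0xec, 0xe5, 0xed]);

// Write the bytes to a temporary file, then read them back without an
// encoding so that fs returns a Buffer rather than a wrongly decoded string.
const samplePath = path.join(os.tmpdir(), 'sample-1255.txt');
fs.writeFileSync(samplePath, sampleBytes);
const decoded = decodeWindows1255(fs.readFileSync(samplePath));

console.log(decoded);
```

Building the test file from explicit byte values sidesteps the copy-and-paste corruption described above, since no editor or clipboard ever has to handle the Hebrew text.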

Extra (off topic) explanations for dealing with file encodings, and how to read files through Node.js buffers:

fs.readFile passes a Buffer to its callback by default.

// pass an encoding as the optional second argument to get a string instead
fs.readFile(file, {encoding:'utf8'}, function(error, string) {
    if (error) throw error;
    console.log('raw string:', string);// bytes already decoded as UTF-8
});

OR

// take the raw Buffer and decode the bytes yourself
fs.readFile(file, function(error, buffer) {
    if (error) throw error;
    console.log(buffer.toString('utf8'));// decode the Buffer explicitly
});
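To illustrate why the choice of encoding matters for a non-UTF-8 file, the following sketch decodes four Windows-1255 bytes in two ways using only Buffer. The byte values are an illustrative sample (the Hebrew word "shalom"), not taken from the question's file.

```javascript
// Four Windows-1255 bytes spelling the Hebrew word "shalom".
const bytes = Buffer.from([0xf9, 0xec, 0xe5, 0xed]);

// Interpreting them as UTF-8 is lossy: the invalid sequences are replaced
// with U+FFFD, so the original bytes cannot be recovered from the string.
const asUtf8 = bytes.toString('utf8');

// 'latin1' maps every byte to the code point with the same value, so the
// string is a one-to-one "byte string" and the bytes can be restored.
const asByteString = bytes.toString('latin1');
const recovered = Buffer.from(asByteString, 'latin1');

console.log(asUtf8.includes('\ufffd')); // true
console.log(recovered.equals(bytes));   // true
```

This is why reading such a file with {encoding:'utf8'} damages the content, while reading it as a Buffer keeps the bytes intact for a proper decoder.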
cdanea answered Sep 19 '22 21:09