
nodejs get file character encoding

How can I find out what character encoding a given text file has?

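var fs = require("fs");

var inputFile = "filename.txt";
// Without an encoding argument, readFileSync returns the raw bytes as a Buffer
var file = fs.readFileSync(inputFile);
// some_clever_function is a placeholder for the detection routine I'm looking for
var fileEncoding = some_clever_function(file);
if (fileEncoding !== "utf8") {
    // do something
}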

Thanks

6bytes asked Apr 21 '16 18:04




2 Answers

You can try an external module, such as https://www.npmjs.com/package/detect-character-encoding
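If the only thing you need is the check from the question (is this file UTF-8 or not?), a dependency-free heuristic may be enough. The following is a rough sketch, not a full detector: it looks for a byte-order mark and otherwise tests whether the bytes decode as strict UTF-8. The helper name sniffEncoding and the BOMS table are made up for illustration, and it assumes a Node.js build with the global TextDecoder and ICU support (the default in current releases). It cannot tell legacy encodings such as Latin-1 and Shift-JIS apart, and it may misread a UTF-16LE file whose first character is NUL as UTF-32LE.

```javascript
// Known byte-order marks, longest first so UTF-32LE is tried before UTF-16LE.
const BOMS = [
  { encoding: "utf-32le", bytes: [0xff, 0xfe, 0x00, 0x00] },
  { encoding: "utf-32be", bytes: [0x00, 0x00, 0xfe, 0xff] },
  { encoding: "utf-8", bytes: [0xef, 0xbb, 0xbf] },
  { encoding: "utf-16le", bytes: [0xff, 0xfe] },
  { encoding: "utf-16be", bytes: [0xfe, 0xff] },
];

// Hypothetical helper: returns an encoding label, or null when the buffer
// has no BOM and is not valid UTF-8.
function sniffEncoding(buffer) {
  for (const { encoding, bytes } of BOMS) {
    if (buffer.length >= bytes.length && bytes.every((b, i) => buffer[i] === b)) {
      return encoding;
    }
  }
  try {
    // With fatal: true, decode() throws on malformed UTF-8 instead of
    // silently substituting U+FFFD.
    new TextDecoder("utf-8", { fatal: true }).decode(buffer);
    return "utf-8"; // pure ASCII also ends up here
  } catch (err) {
    return null; // some other encoding; a statistical detector is needed
  }
}
```

With the code from the question, this would be called as sniffEncoding(fs.readFileSync(inputFile)) and compared against "utf-8". A null result only says "not UTF-8", which is where a module like the one above comes in.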

RidgeA answered Sep 23 '22 01:09


The previously mentioned module works for me too. Alternatively, you could have a look at detect-file-encoding-and-language, which I'm using at the moment.

Installation:

$ npm install detect-file-encoding-and-language

Usage:

// index.js

const languageEncoding = require("detect-file-encoding-and-language");

const pathToFile = "/home/username/documents/my-text-file.txt";

languageEncoding(pathToFile).then(fileInfo => console.log(fileInfo));
// Possible result: { language: japanese, encoding: Shift-JIS, confidence: { language: 0.97, encoding: 1 } }

Falaen answered Sep 25 '22 01:09