I'm trying to load a UTF-8 JSON file from disk using node.js (0.10.29) on Windows 8.1. The following is the code that runs:
var http = require('http');
var utils = require('util');
var path = require('path');
var fs = require('fs');
var myconfig;

fs.readFile('./myconfig.json', 'utf8', function (err, data) {
    if (err) {
        console.log("ERROR: Configuration load - " + err);
        throw err;
    } else {
        try {
            myconfig = JSON.parse(data);
            console.log("Configuration loaded successfully");
        } catch (ex) {
            console.log("ERROR: Configuration parse - " + ex);
        }
    }
});
I get the following error when I run this:
SyntaxError: Unexpected token ´╗┐
at Object.parse (native)
...
Now, when I change the file encoding (using Notepad++) to ANSI, it works without a problem.
Any ideas why this is the case? While development is being done on Windows, the final solution will be deployed to a variety of non-Windows servers, and I'm worried that I'll run into issues on the server side if I deploy an ANSI file to Linux, for example.
According to my searches here and via Google the code should work on Windows as I am specifically telling it to expect a UTF-8 file.
Sample config I am reading:
{
"ListenIP4": "10.10.1.1",
"ListenPort": 8080
}
Per "fs.readFileSync(filename, 'utf8') doesn't strip BOM markers #1918", fs.readFile
is working as designed: BOM is not stripped from the header of the UTF-8 file, if it exists. It at the discretion of the developer to handle this.
Possible workarounds (a sketch applying the first one to a synchronous read follows below):
data = data.replace(/^\uFEFF/, '');
per https://github.com/joyent/node/issues/1918#issuecomment-2480359
bomstrip
per https://github.com/joyent/node/issues/1918#issuecomment-38491548
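A minimal sketch of the first workaround applied to a synchronous read (the stripBom helper name is my own, not part of the linked issue):

var fs = require('fs');

// When a UTF-8 file with a BOM is decoded with the 'utf8' encoding, the three
// BOM bytes (EF BB BF) arrive as a single leading U+FEFF character.
function stripBom(text) {
    return text.replace(/^\uFEFF/, '');
}

var raw = fs.readFileSync('./myconfig.json', 'utf8');
var myconfig = JSON.parse(stripBom(raw));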
What you are getting is the byte order mark (BOM) at the head of the UTF-8 file. When JSON.parse
sees this, it gives a syntax error (read: "unexpected character" error). You must strip the byte order mark from the file before passing it to JSON.parse:
fs.readFile('./myconfig.json', 'utf8', function (err, data) {
    // note: with the 'utf8' encoding, data is already a string, so the BOM
    // arrives as a single leading U+FEFF character
    myconfig = JSON.parse(data.replace(/^\uFEFF/, ''));
});
To get this to work without changing the code, I had to change the encoding from "UTF-8" to "UTF-8 without BOM" using Notepad++ (I assume any decent text editor - not Notepad - can choose this encoding type).
This solution meant that the deployment guys could deploy to Unix without a hassle, and I could develop without errors during the reading of the file.
In terms of reading the file, the other result I sometimes got while trying various encoding options was a question mark prepended to the start of the file contents. Naturally, with a question mark or ANSI characters prepended, JSON.parse fails.
Hope this helps someone!
New answer
As I had the same problem with several different formats, I went ahead and made an npm package that tries to read text files and parse them as text, no matter the original format (as the original question was about reading a .json file, it fits perfectly). Files without a BOM and files with an unknown BOM are handled as ASCII/latin1.
https://www.npmjs.com/package/textfilereader
So change the code to
var http = require('http');
var utils = require('util');
var path = require('path');
var fs = require('textfilereader');
var myconfig;

fs.readFile('./myconfig.json', 'utf8', function (err, data) {
    if (err) {
        console.log("ERROR: Configuration load - " + err);
        throw err;
    } else {
        try {
            myconfig = JSON.parse(data);
            console.log("Configuration loaded successfully");
        } catch (ex) {
            console.log("ERROR: Configuration parse - " + ex);
        }
    }
});
Old answer
Ran into this problem today and created a function to take care of it. It should have a very small footprint, and I assume it's better than the accepted replace solution.
function removeBom(input) {
    // BOM signatures, per https://en.wikipedia.org/wiki/Byte_order_mark
    // Note: input here is a string that Node has already decoded, so a UTF-8
    // BOM (bytes EF BB BF) arrives as the single character U+FEFF; the same
    // goes for UTF-16/UTF-32 BOMs when the matching encoding is used.
    // Multi-byte signatures such as UTF-7 (2B 2F 76), UTF-1, UTF-EBCDIC,
    // SCSU, BOCU-1 and GB-18030 can only be detected on the raw Buffer, not
    // on the decoded string (see the Buffer-based sketch below).
    const fc = input.charCodeAt(0).toString(16);
    switch (fc) {
        case 'feff': // BOM decoded from UTF-8, UTF-16 (BE/LE) or UTF-32
        case 'fffe': // byte-swapped BOM left over from a mismatched decode
            return input.slice(1);
        default:
            return input;
    }
}

const fileContents = removeBom(fs.readFileSync(filePath, "utf8"));
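If you need to detect the multi-byte signatures listed above (UTF-8's EF BB BF, UTF-16's FE FF / FF FE, and so on), you have to inspect the raw bytes before decoding. Here is a minimal Buffer-based sketch under that assumption; the stripBomFromBuffer helper and the trimmed signature list are my own, and the final toString('utf8') still assumes the file's content is UTF-8:

const fs = require('fs');

function stripBomFromBuffer(buf) {
    // UTF-8 BOM: EF BB BF
    if (buf.length >= 3 && buf[0] === 0xEF && buf[1] === 0xBB && buf[2] === 0xBF) {
        return buf.slice(3);
    }
    // UTF-16 BE BOM: FE FF
    if (buf.length >= 2 && buf[0] === 0xFE && buf[1] === 0xFF) {
        return buf.slice(2);
    }
    // UTF-16 LE BOM: FF FE (also the first two bytes of a UTF-32 LE BOM)
    if (buf.length >= 2 && buf[0] === 0xFF && buf[1] === 0xFE) {
        return buf.slice(2);
    }
    return buf;
}

// Read without an encoding to get a Buffer, strip the BOM bytes, then decode.
const cleaned = stripBomFromBuffer(fs.readFileSync(filePath));
const myconfig = JSON.parse(cleaned.toString('utf8'));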