Node.js unicode issue with HTTP response body

Question

The response body of HTTP requests made with the native 'http' module shows question marks in place of Unicode characters. Here is the basic snippet I'm running:

var http = require('http');
var google = http.createClient(80, 'www.google.it');
var request = google.request('GET', '/', {
  'host': 'www.google.it'
});
request.end();
request.on('response', function (response) {
  response.setEncoding('utf8');
  response.on('data', function (chunk) {
    console.log(chunk);
  });
});

In the response there is a word that starts with "Pubblicit". Its last letter shows up as a question mark. The word should be Pubblicità, but it is displayed as Pubblicit?.

I have also tried outputting the data using .toString():

console.log(chunk.toString());

or

console.log(chunk.toString('utf8'));

But I'm getting the same results.

Any idea?

Luca Matteis · Accepted Answer

I set response.setEncoding('binary') instead and it works. I don't know why, though.

Reference: http://groups.google.com/group/nodejs/browse_thread/thread/3bd3935b1f42a5f4?pli=1

user943702 · Answer

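The reason may be that, unless the request carries a User-Agent header that Google recognizes as UTF-8 capable, Google responds with an HTML document whose Content-Type charset is ISO-8859-1 (perhaps for old browsers or bots, I don't know). In that case decoding the response buffer as "binary" is correct, because Node's "binary" encoding maps each byte to the character with the same code point, which is exactly what ISO-8859-1 (Latin-1) does.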

If, on the other hand, we decode an ISO-8859-1 buffer as UTF-8, the byte 0xE0 (à) announces the start of a three-byte character. No valid continuation bytes follow it here, so the sequence is malformed and some unexpected character (depending on the environment) is displayed instead.
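To make the byte-level argument concrete, here is a minimal sketch that decodes the same bytes three ways. It assumes a reasonably recent Node.js, where 'binary' is an alias for 'latin1' and malformed UTF-8 is replaced with U+FFFD. The variable names are only for illustration.

```javascript
// Build the bytes a server would send for "Pubblicità" in ISO-8859-1:
// every character is one byte, and "à" is the single byte 0xE0.
const latin1Bytes = Buffer.from('Pubblicit\u00e0', 'latin1');

// Decoding with the matching encoding restores the word.
const asLatin1 = latin1Bytes.toString('latin1');

// 'binary' is an alias for 'latin1', which is why setEncoding('binary') helped.
const asBinary = latin1Bytes.toString('binary');

// Decoding as UTF-8 fails on 0xE0: it opens a 3-byte sequence that never
// completes, so the decoder emits the replacement character U+FFFD.
const asUtf8 = latin1Bytes.toString('utf8');

console.log(asLatin1);
console.log(asBinary === asLatin1);
console.log(asUtf8.endsWith('\uFFFD'));
```

How U+FFFD is rendered depends on the terminal and font; a plain "?" is a common fallback, which would match the output described in the question.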

You could try sending "Mozilla/5.0" as the User-Agent value. Good luck.
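Both suggestions can be combined in one sketch: send a browser-like User-Agent, keep the chunks as raw Buffers, and decode once at the end using the charset the server declares. This assumes a Node.js version where http.get accepts (url, options, callback). The helpers charsetFromContentType, nodeEncodingFor and fetchText are hypothetical names, not part of the 'http' module, and falling back to UTF-8 for any unrecognized charset is a simplification.

```javascript
const http = require('http');

// Hypothetical helper: extract the charset from a Content-Type header value,
// e.g. "text/html; charset=ISO-8859-1" -> "iso-8859-1". Returns null if absent.
function charsetFromContentType(contentType) {
  const match = /charset\s*=\s*"?([^";\s]+)/i.exec(contentType || '');
  return match ? match[1].toLowerCase() : null;
}

// Hypothetical helper: map an HTTP charset name to a Node Buffer encoding.
// Assumption: anything that is not Latin-1 is treated as UTF-8.
function nodeEncodingFor(charset) {
  if (charset === 'iso-8859-1' || charset === 'latin1') return 'latin1';
  return 'utf8';
}

// Fetch a URL and decode the body according to the declared charset.
function fetchText(url, callback) {
  const options = { headers: { 'User-Agent': 'Mozilla/5.0' } };
  http.get(url, options, (response) => {
    const chunks = [];
    // No setEncoding() call here, so each chunk stays a raw Buffer.
    response.on('data', (chunk) => chunks.push(chunk));
    response.on('end', () => {
      const charset = charsetFromContentType(response.headers['content-type']);
      const body = Buffer.concat(chunks).toString(nodeEncodingFor(charset));
      callback(null, body);
    });
  }).on('error', callback);
}
```

A call such as fetchText('http://www.google.it/', function (err, body) { ... }) would then hand back text decoded with whichever charset the response declares. Decoding the concatenated Buffer once at the end also avoids splitting a multi-byte character across two chunks.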

Tags: javascript, node.js, unicode, utf-8, v8
