
Response encoding with node.js "request" module

I am trying to get data from the Bing search API, and since the existing libraries seem to be based on old, discontinued APIs, I thought I'd try it myself using the request library, which appears to be the most common library for this. My code looks like this:

var request        =  require('request'),
    SKEY           =  "myKey...." , 
    ServiceRootURL =  'https://api.datamarket.azure.com/Bing/Search/v1/Composite';

function getBingData(query, top, skip, cb) {
    var params = {
         Sources: "'web'", 
         Query: "'"+query+"'", 
         '$format': "JSON", 
         '$top': top, '$skip': skip
       },
       req = request.get(ServiceRootURL).auth(SKEY, SKEY, false).qs(params);
    request(req, cb)
}

getBingData("bookline.hu", 50, 0, someCallbackWhichParsesTheBody)

Bing returns JSON, and most of the time I can work with it, but if the response body contains a large number of non-ASCII characters, JSON.parse complains that the string is malformed. I tried switching to an ATOM content type, but it made no difference: the XML was invalid as well. Inspecting the response body as available in the request() callback shows that the data is already corrupted at that point.

So I tried the same request with some Python code, and that appears to work fine every time. For reference:

import requests
from requests.auth import HTTPBasicAuth

r = requests.get(
       'https://api.datamarket.azure.com/Bing/Search/v1/Composite?Sources=%27web%27&Query=%27sexy%20cosplay%20girls%27&$format=json', 
        auth=HTTPBasicAuth(SKEY,SKEY))
stuffWithResponse(r.json())

I am unable to reproduce the problem with smaller responses (e.g. by limiting the number of results), and I am unable to identify a single result that causes the issue (by stepping up the offset). My impression is that the response gets read in chunks, transcoded somehow and reassembled incorrectly, so the JSON/ATOM data becomes invalid whenever a multibyte character is split across chunks. That would happen with larger responses but not with small ones.
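To make the suspected failure mode concrete, here is a minimal sketch of how a multibyte character split across two chunks is corrupted when each chunk is decoded on its own, and survives when the raw bytes are joined first. It is not a confirmation of what request does internally. The sample word and the split position are made up for illustration, and Buffer.from assumes a current Node release; on v0.8, new Buffer(text, 'utf8') would be the equivalent.

```javascript
// Hypothetical sample text: "könyv", where 'ö' takes two bytes in UTF-8.
var text = 'k\u00f6nyv';
var bytes = Buffer.from(text, 'utf8');   // 6 bytes: 6b c3 b6 6e 79 76

// Simulate a chunk boundary that falls in the middle of 'ö'.
var first = bytes.slice(0, 2);           // 6b c3  (ends with half a character)
var second = bytes.slice(2);             // b6 6e 79 76

// Decoding each chunk separately turns both halves into U+FFFD.
var decodedSeparately = first.toString('utf8') + second.toString('utf8');

// Joining the raw bytes first and decoding once keeps the character intact.
var decodedOnce = Buffer.concat([first, second]).toString('utf8');

console.log(decodedSeparately === text); // false
console.log(decodedOnce === text);       // true
```

With small responses the boundary is unlikely to land inside a character, which would match the observation that only large responses break.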

Being new to node, I am not sure whether there is something I should be doing (setting the encoding somewhere? Bing returns UTF-8, so this doesn't seem to be needed).

Does anyone have any idea what is going on?

FWIW, I'm on OS X 10.8; node is v0.8.20 installed via MacPorts, and request is v2.14.0 installed via npm.

riffraff asked Feb 22 '13 14:02


1 Answer

I'm not sure about the request library, but the default Node.js one works well for me. It also seems a lot easier to read than your library, and the response does indeed come back in chunks.

http://nodejs.org/api/http.html#http_http_request_options_callback or for https (like your req) http://nodejs.org/api/https.html#https_https_request_options_callback (the same really though)

For the options, a little tip: use url.parse

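var url = require('url');

// Placeholder for the data to send
var params = '{}';

// Split the service URL into the pieces the core http/https modules expect
var dataURL = url.parse(ServiceRootURL);
var post_options = {
    hostname: dataURL.hostname,
    port: dataURL.port || 443,  // ServiceRootURL is https, so default to 443
    path: dataURL.path,
    method: 'GET',
    headers: {
        'Content-Type': 'application/json; charset=utf-8',
        // Content-Length counts bytes, not characters
        'Content-Length': Buffer.byteLength(params)
    }
};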

Obviously, params needs to be the data you want to send.
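Since the response comes back in chunks, one way to avoid corrupting multibyte characters is to keep the chunks as raw Buffers and decode only once, at the end. The helper below is a rough sketch under that assumption; readBody is a hypothetical name, not part of Node or request, and it expects that setEncoding() has not been called on the response (otherwise the chunks arrive as strings).

```javascript
// Hypothetical helper: collect a chunked response and decode it in one go.
// `res` is anything that emits 'data' (Buffer chunks), 'end' and 'error',
// such as the response object passed to the https.request callback.
function readBody(res, cb) {
    var chunks = [];

    // Keep the raw bytes; do not convert each chunk to a string here.
    res.on('data', function (chunk) {
        chunks.push(chunk);
    });

    // Join all bytes first, then decode as UTF-8 exactly once.
    res.on('end', function () {
        cb(null, Buffer.concat(chunks).toString('utf8'));
    });

    res.on('error', function (err) {
        cb(err);
    });
}
```

It could be wired in with something like https.request(post_options, function (res) { readBody(res, cb); }).end(), where cb is whatever callback parses the body with JSON.parse.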

rob_james answered Sep 23 '22 08:09
