
How to manipulate the default value retrieved by the x-ray scraper (Node.js)

This is my code:

var Xray = require('x-ray');  
var x = Xray();
x('http://someurl.com', 'tr td:nth-child(2)', [{  
    text: 'a',
    url: 'a@href'
  }]).write('results.json')

I need to populate the field named "text" with only the first word from each a tag. An example of an a tag's value:

"FirstWord SecondWord ThirdWord"

Actual result: text: FirstWord SecondWord ThirdWord

Desired result: text: FirstWord

I could post-process the results.json file, but I would rather not do it that way.

asked Sep 28 '22 02:09 by cesarluis



2 Answers

You can define your own functions under the filters option, as shown on the official GitHub page:

var Xray = require('x-ray');
var x = Xray({
  filters: {
    trim: function (value) {
      return typeof value === 'string' ? value.trim() : value
    },
    reverse: function (value) {
      return typeof value === 'string' ? value.split('').reverse().join('') : value
    },
    slice: function (value, start , end) {
      return typeof value === 'string' ? value.slice(start, end) : value
    }
  }
});

x('http://mat.io', {
  title: 'title | trim | reverse | slice:2,3'
})(function(err, obj) {
/*
  {
    title: 'oi'
  }
*/
})
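Applied to the question, the same mechanism could be used with a filter that keeps only the first word. The following is a minimal sketch, not a tested scraper: the filter name firstWord is hypothetical, and the x-ray wiring is left in comments because it assumes x-ray is installed and that the selector from the question matches the page.

```javascript
// Hypothetical filter: keep only the first whitespace-separated word.
// Non-string values are returned unchanged, like the filters above.
function firstWord(value) {
  if (typeof value !== 'string') return value;
  // Trim first so leading whitespace does not produce an empty first token.
  return value.trim().split(/\s+/)[0];
}

// Filters object in the shape passed to Xray({ filters: ... }).
var filters = { firstWord: firstWord };

// Assumed wiring into the scraper from the question (requires x-ray):
// var Xray = require('x-ray');
// var x = Xray({ filters: filters });
// x('http://someurl.com', 'tr td:nth-child(2)', [{
//   text: 'a | firstWord',
//   url: 'a@href'
// }]).write('results.json');

console.log(firstWord('FirstWord SecondWord ThirdWord')); // FirstWord
```

Because the filter runs while the result is built, results.json would already contain the shortened text and would not need a separate post-processing pass.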
answered Oct 04 '22 19:10 by Franci



There is a fork of the x-ray library made by cbou. Its custom x-ray API has a prepare function that can change the output:
https://github.com/cbou/x-ray#xrayprepare-str--fn

Example:

function uppercase(str) {
  return str.toUpperCase();
}

xray('mat.io')
.prepare('uppercase', uppercase)
.select('title | uppercase')
.run(function(err, title) {
  // title == MAT.IO
});
answered Oct 04 '22 17:10 by Christian Saiki
