
Proxying requests in Node

Tags: node.js, proxy

I need to be able to serve replicas of arbitrary sites (www.google.com, www.facebook.com, or any other site) through my Node server. I found this library:

https://github.com/nodejitsu/node-http-proxy

And I used the following code when proxying requests:

options = {
  ignorePath: true,
  changeOrigin: false
}

var proxy = httpProxy.createProxyServer({options});

router.get(function(req, res) {
  proxy.web(req, res, { target: req.body.url });
});

However, this configuration causes an error for most sites. Depending on the site, I get an Unknown service error coming from the target URL, or an Invalid host error, or something along those lines. However, when I pass

changeOrigin: true

I get a functioning proxy service, but the user's browser gets redirected to the actual URL of their request, not to mine (so if req.body.url = http://www.google.com, the request will go to http://www.google.com).

How can I make it so my site's url gets shown, but so that I can exactly copy whatever is being displayed? I need to be able to add a few JS files to the request, which I'm doing using another library.

For clarification, here is a summary of the problem:

  1. The user requests a resource that has a url property

  2. This url is in the form of http://www.example.com

  3. My server, running on www.pv.com, needs to be able to direct the user to www.pv.com/http://www.example.com

  4. The HTTP response returned for www.pv.com/http://www.example.com is a full representation of http://www.example.com. I need to be able to add my own JavaScript/HTML files to this response as well.

db2791 asked Apr 03 '17 00:04



2 Answers

Looking at https://stackoverflow.com/a/32704647/1587329, the only difference is that it uses a different target parameter:

var http = require('http');
var httpProxy = require('http-proxy');
var proxy = httpProxy.createProxyServer({});

http.createServer(function(req, res) {
    proxy.web(req, res, { target: 'http://www.google.com' });
}).listen(3000);

This would explain the Invalid host error: you need to pass only the origin (protocol and host) as the target parameter, not the whole URL. Thus, the following might work:

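var options = {
  ignorePath: true,
  changeOrigin: false
};

// Pass the options object directly; {options} would nest it under an "options" key.
var proxy = httpProxy.createProxyServer(options);

// The route path '/' is an assumption; the question omits it.
router.get('/', function(req, res) {
  // req.body.url is a string, so parse it into a URL object first.
  var url = new URL(req.body.url);
  proxy.web(req, res, { target: url.protocol + '//' + url.host });
});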

For the URL object, see the Node.js documentation.
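
The parsing step can also be sketched on its own. This is only a minimal sketch, assuming the requested URL arrives as a string; targetOrigin is a hypothetical helper name, and URL is the WHATWG URL class that Node.js exposes as a global.

```javascript
// Hypothetical helper (not part of http-proxy): reduce a full URL string
// to the protocol-plus-host origin used as the proxy target.
function targetOrigin(rawUrl) {
  // new URL() throws a TypeError on malformed input, so a caller would
  // probably want to catch that and answer with a 400 response.
  var parsed = new URL(rawUrl);
  return parsed.protocol + '//' + parsed.host;
}

console.log(targetOrigin('http://www.google.com/search?q=node'));
// prints "http://www.google.com"
```

The path and query string are dropped here on purpose, since only the origin is meant to go into target.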

serv-inc answered Sep 28 '22 17:09


Use a headless browser to navigate to the requested website, get its HTML, and send that HTML back as the response. One advantage of a headless browser is that it also lets you get the HTML of sites that are rendered with JavaScript. Nightmare.js (a browser automation library built on Electron) is a good choice because Electron is faster than PhantomJS (an alternative). With Nightmare.js you can inject a JavaScript file into the page, as shown in the code snippet below. You may need to tweak the code to add other features. Currently, I am only allowed to add two links, so links to other resources are in the code snippet.

  • To set up an Ubuntu server to run Nightmare.js in headless mode, you have to install xvfb (a virtual framebuffer X server) and its dependencies: https://github.com/segmentio/nightmare/issues/224

apt-get update && apt-get install -y xvfb x11-xkb-utils xfonts-100dpi \
  xfonts-75dpi xfonts-scalable xfonts-cyrillic x11-apps clang \
  libdbus-1-dev libgtk2.0-dev libnotify-dev libgnome-keyring-dev \
  libgconf2-dev libasound2-dev libcap-dev libcups2-dev libxtst-dev \
  libxss1 libnss3-dev gcc-multilib g++-multilib

// example: http://hostname.com/http://www.tutorialspoint.com/articles/how-to-configure-and-install-redis-on-ubuntu-linux
//X server: http://www.linfo.org/x_server.html

var express = require('express')
var Nightmare = require('nightmare')// headless browser
var Xvfb = require('xvfb')// run headless browser using X server
var vo = require('vo')// run generator function
var app = express()
var xvfb = new Xvfb()


app.get('/', function (req, res) {
  res.end('')
})

// start the X server to run nightmare.js headless browser
xvfb.start(function (err, xvfbProcess) {
  if (!err) {
    app.get('/*', function (req, res) {
      var run = function * () {
        var nightmare = new Nightmare({
          show: false,
          maxAuthRetries: 10,
          waitTimeout: 100000,
          electronPath: require('electron'),
          ignoreSslErrors: 'true',
          sslProtocol: 'tlsv1'
        })

        var result = yield nightmare.goto(req.url.toString().substring(1))
        .wait()
        // .inject('js', '/path/to/.js') inject a javascript file to manipulate or inject html
        .evaluate(function () {
          return document.documentElement.outerHTML
        })
        .end()
        return result
      }

      // execute generator function
      vo(run)(function (err, result) {
        if (!err) {
          res.end(result)
        } else {
          console.log(err)
          res.status(500).end()
        }
      })
    })
  }
})

app.listen(8080, '0.0.0.0')
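
Two pieces of the snippet above can be sketched as small standalone helpers: pulling the target URL out of the request path, and adding a script tag to the returned HTML string. This is a rough sketch under stated assumptions, not part of Nightmare.js or Express; targetFromPath, injectScript, and the /inject.js path are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper: extract the embedded target URL from a request path
// such as "/http://www.example.com/page". Returns null when the path does
// not hold an absolute http(s) URL.
function targetFromPath(reqUrl) {
  var candidate = reqUrl.substring(1); // drop the leading "/"
  var parsed;
  try {
    parsed = new URL(candidate);
  } catch (err) {
    return null; // e.g. "/favicon.ico" is not an absolute URL
  }
  // Refuse schemes such as file: so the headless browser only visits web pages.
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    return null;
  }
  return parsed.href;
}

// Hypothetical helper: insert a <script> tag just before the first closing
// </body> tag, or append it when the page has none. scriptSrc is assumed
// to be a trusted value and is not escaped.
function injectScript(html, scriptSrc) {
  var tag = '<script src="' + scriptSrc + '"></script>';
  var closingBody = /<\/body>/i;
  if (!closingBody.test(html)) {
    return html + tag;
  }
  return html.replace(closingBody, function (match) {
    return tag + match;
  });
}

console.log(targetFromPath('/http://www.example.com/page'));
console.log(injectScript('<body><p>hi</p></body>', '/inject.js'));
```

In the handler above, targetFromPath(req.url) could stand in for req.url.toString().substring(1), with a 400 response when it returns null, and injectScript(result, '/inject.js') could be applied before res.end(result).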

Citrudev answered Sep 28 '22 15:09