I need to be able to serve replicas of arbitrary sites (www.google.com, www.facebook.com, or any other site) through my Node server. I found this library:
https://github.com/nodejitsu/node-http-proxy
And I used the following code when proxying requests:
options = {
ignorePath: true,
changeOrigin: false
}
var proxy = httpProxy.createProxyServer({options});
router.get(function(req, res) {
proxy.web(req, res, { target: req.body.url });
});
However, this configuration causes an error for most sites. Depending on the site, I get an Unknown service error from the target URL, or an Invalid host error, or something along those lines. When I pass changeOrigin: true instead, I get a functioning proxy service, but the user's browser is redirected to the actual URL of the request rather than staying on mine (so if req.body.url = http://www.google.com, the browser ends up at http://www.google.com).
How can I make it so my site's url gets shown, but so that I can exactly copy whatever is being displayed? I need to be able to add a few JS files to the request, which I'm doing using another library.
For clarification, here is a summary of the problem:
The user requests a resource that has a url property.
This url is in the form of http://www.example.com.
My server, running on www.pv.com, needs to be able to direct the user to www.pv.com/http://www.example.com.
The HTTP response returned from www.pv.com/http://www.example.com is a full representation of http://www.example.com. I need to be able to add my own JavaScript/HTML files to this response as well.
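The URL mapping in this summary can be sketched as a small helper. This is only an illustration, not part of the original setup: the name extractTargetUrl is hypothetical, and it assumes the target URL is carried verbatim in the request path after the leading slash.

```javascript
// Hypothetical helper: pull the target URL out of a request path such as
// "/http://www.example.com/page?q=1". Returns a URL object, or null when
// the path does not contain an absolute http(s) URL.
function extractTargetUrl(requestPath) {
  // Drop the leading slash(es) that separate the proxy host from the target.
  var raw = requestPath.replace(/^\/+/, '');
  var parsed;
  try {
    parsed = new URL(raw); // throws on relative or malformed input
  } catch (err) {
    return null;
  }
  // Only proxy plain web URLs; reject other schemes such as ftp: or file:.
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    return null;
  }
  return parsed;
}
```

In an Express handler this could be called as extractTargetUrl(req.url), answering with a 400 status when it returns null.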
Overview of Proxying Requests to Another Web Server
You proxy requests based on the URL of the incoming request. The HttpProxyServlet (provided as part of the distribution) takes an HTTP request, redirects it to the proxy URL, and sends the response to the client's browser back through WebLogic Server.
In the standard method, the client sends a request directly to the API endpoint to fetch the desired data. In this case, the Node.js proxy acts as an intermediary between the user and the API: the user makes a request to the proxy, which then forwards it to the API endpoint.
node-http-proxy is an HTTP programmable proxying library that supports websockets. It is suitable for implementing components such as reverse proxies and load balancers.
A reverse proxy is a proxy server that retrieves resources on behalf of a client from one or more servers. The client does not need to know about those servers: it sends a request to the proxy server at a specific URL over HTTP, and the proxy server works out which backend server should handle that request.
Looking at https://stackoverflow.com/a/32704647/1587329, the only difference is that it uses a different target parameter:
var http = require('http');
var httpProxy = require('http-proxy');
var proxy = httpProxy.createProxyServer({});
http.createServer(function(req, res) {
  proxy.web(req, res, { target: 'http://www.google.com' });
}).listen(3000);
This would explain the Invalid host error: you need to pass a host as the target parameter, not the whole URL. Thus, the following might work:
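var httpProxy = require('http-proxy');

var options = {
  ignorePath: true,
  changeOrigin: false
};
// Pass the options object itself; { options } would nest it under an "options" key
var proxy = httpProxy.createProxyServer(options);
// The '/' route path is assumed here; the original call omitted it
router.get('/', function(req, res) {
  // req.body.url is a string, so parse it before reading protocol and host
  var url = new URL(req.body.url);
  proxy.web(req, res, { target: url.protocol + '//' + url.host });
});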
For the URL object, see the NodeJS website.
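The question also reports that with changeOrigin: true the browser ends up on the real site. One possible cause, which is an assumption here rather than something confirmed above, is a redirect from the upstream server whose Location header points at the original host. A minimal sketch of rewriting such a header so it stays under the www.pv.com/http://... scheme follows; rewriteLocation, targetBase and proxyBase are hypothetical names.

```javascript
// Hypothetical helper: turn an upstream Location header into a URL that
// goes back through the proxy, using the "<proxy>/<absolute target URL>"
// scheme described in the question.
function rewriteLocation(location, targetBase, proxyBase) {
  // Resolve relative redirects (e.g. "/login") against the target site.
  var absolute = new URL(location, targetBase);
  // Strip trailing slashes from the proxy base so only one separator is added.
  return proxyBase.replace(/\/+$/, '') + '/' + absolute.href;
}
```

With node-http-proxy, a helper like this could be applied in a proxyRes event listener by reassigning proxyRes.headers.location before the response reaches the browser.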
Use a headless browser to navigate to the website and fetch its HTML, then send that HTML as the response for the requested site. One advantage of a headless browser is that it can return the HTML of pages that are rendered with JavaScript. Nightmare.js, a browser automation library built on Electron, is a good choice; Electron is faster than PhantomJS (an alternative). With Nightmare.js you can inject a JavaScript file into the page, as shown in the code snippet below. You may need to tweak the code to add other features. Currently, I am only allowed to add two links, so links to other resources are in the code snippet.
apt-get update && apt-get install -y xvfb x11-xkb-utils xfonts-100dpi \
  xfonts-75dpi xfonts-scalable xfonts-cyrillic x11-apps clang \
  libdbus-1-dev libgtk2.0-dev libnotify-dev libgnome-keyring-dev \
  libgconf2-dev libasound2-dev libcap-dev libcups2-dev libxtst-dev \
  libxss1 libnss3-dev gcc-multilib g++-multilib
-
// example: http://hostname.com/http://www.tutorialspoint.com/articles/how-to-configure-and-install-redis-on-ubuntu-linux
//X server: http://www.linfo.org/x_server.html
var express = require('express')
var Nightmare = require('nightmare') // headless browser
var Xvfb = require('xvfb') // run headless browser using X server
var vo = require('vo') // run generator function
var app = express()
var xvfb = new Xvfb()
app.get('/', function (req, res) {
  res.end('')
})
// start the X server to run nightmare.js headless browser
xvfb.start(function (err, xvfbProcess) {
  if (!err) {
    app.get('/*', function (req, res) {
      var run = function * () {
        var nightmare = new Nightmare({
          show: false,
          maxAuthRetries: 10,
          waitTimeout: 100000,
          electronPath: require('electron'),
          ignoreSslErrors: 'true',
          sslProtocol: 'tlsv1'
        })
        var result = yield nightmare.goto(req.url.toString().substring(1))
          .wait()
          // .inject('js', '/path/to/.js') inject a javascript file to manipulate or inject html
          .evaluate(function () {
            return document.documentElement.outerHTML
          })
          .end()
        return result
      }
      // execute generator function
      vo(run)(function (err, result) {
        if (!err) {
          res.end(result)
        } else {
          console.log(err)
          res.status(500).end()
        }
      })
    })
  }
})
app.listen(8080, '0.0.0.0')
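If injecting through Nightmare is not convenient, the captured HTML string can also be modified on the server before it is sent. The sketch below is an illustration rather than part of the code above: injectScript and the /inject.js path are hypothetical, and the script URL is assumed to be trusted, since it is not escaped.

```javascript
// Hypothetical helper: add a <script> tag to an HTML string, placing it
// just before the last closing </body> tag when there is one.
function injectScript(html, scriptSrc) {
  var tag = '<script src="' + scriptSrc + '"></script>';
  // Search case-insensitively; lowercasing keeps the indexes aligned.
  var idx = html.toLowerCase().lastIndexOf('</body>');
  if (idx === -1) {
    // No body tag found: append the script at the end instead.
    return html + tag;
  }
  return html.slice(0, idx) + tag + html.slice(idx);
}
```

In the handler above, res.end(result) could then become res.end(injectScript(result, '/inject.js')).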