Dynamic HTTP proxy with Node.js and Redis
Today I experimented with a dynamic HTTP proxy server based on Node.js and Redis. I wanted a simple proxy that could route external requests for different hosts to an internal grid of workers. A worker is just an application server process that listens on a specific IP address and port.
This experiment is part of something bigger: an internal mini-cloud similar to what Heroku does, except its purpose is to serve staging apps or quick deployments for people who aren't familiar with deployment practices.
Concept
Given there are 3 application servers (for Ruby, PHP, and Python apps), each server runs a bunch of apps and sits behind the public domain *.cloud.com. So when a user goes to foo.cloud.com, the request should be proxied to an internal worker address that is stored in a Redis database. Example worker map:
app1.cloud.com => 192.168.1.1:9000
app2.cloud.com => 192.168.1.2:9001
app3.cloud.com => 192.168.1.3:9002
At any time, this map can be altered by controlling software (an admin panel, monitoring, etc.).
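For instance, repointing or removing a host is a single change to the Redis hash (`routes` being the key the proxy reads from):

```
HSET routes app1.cloud.com 192.168.1.5:9000
HDEL routes app3.cloud.com
```

The first command repoints app1.cloud.com to a new worker; the second takes app3.cloud.com out of rotation.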
In most cases I would use nginx as a reverse proxy in front of internal workers. It has a decent set of features, easy configuration, and a small memory footprint. Unfortunately, it does not support dynamic upstream routing out of the box. Essentially, I wanted something like this inside my nginx config:
upstream internal_grid {
dynamic: true;
backend: 192.168.1.1:6379 fail_timeout=0; # redis backend
backend: 192.168.1.2:6379 fail_timeout=0; # backup redis backend
}
After some research I found a few ways to implement what I needed:
- Using the nginx resolver - http://wiki.nginx.org/HttpCoreModule#resolver
- Using a few nginx customizations and plugins - http://openresty.org/#DynamicRoutingBasedOnRedis
- Implementing custom proxy logic with the node.js http-proxy lib - https://github.com/nodejitsu/node-http-proxy
I decided that heavy patching of nginx could be tricky and would take some time to get right. Since this is just a proof-of-concept, non-production server, my choice went to node.js.
Node.js has a decent number of libraries available for developers, and node-http-proxy (made by Nodejitsu) is one of them. The documentation is great and the amount of code to go through is reasonably small. Plus, it supports WebSockets and SSL, which is also great but not that important here.
Dynamic HTTP Proxy
Routing starts when a request hits the server on 127.0.0.1:8000. (In production I'd put a caching server like Varnish in front.) From that point, internal logic figures out a routing target, which is just a hash like {host: 'IP', port: 1234}. Routes are stored in a Redis hash as key-value pairs representing hostname and destination, and are also cached by the proxy server for better performance.
ProxyRouter implementation:
var ProxyRouter = function(options) {
  if (!options.backend) {
    throw new Error("ProxyRouter backend required. Please provide options.backend parameter!");
  }
  this.backend = options.backend;
  this.cache_ttl = (options.cache_ttl || 10) * 1000;
  this.cache = {};
  console.log("ProxyRouter cache TTL is set to " + this.cache_ttl + " ms.");
};

ProxyRouter.prototype.lookup = function(hostname, callback) {
  var self = this;
  if (!this.cache[hostname]) {
    // Cache miss: look the route up in the redis backend
    this.backend.hget('routes', hostname, function(err, data) {
      if (data) {
        // Parse the "IP:port" string into a routing target
        var route = data.split(':');
        var target = {host: route[0], port: parseInt(route[1], 10)};
        // Set cache and expiration
        self.cache[hostname] = target;
        self.expire_route(hostname, self.cache_ttl);
        // Return target
        callback(target);
      }
      else {
        callback(null);
      }
    });
  }
  else {
    callback(this.cache[hostname]);
  }
};

ProxyRouter.prototype.flush = function() {
  this.cache = {};
};

ProxyRouter.prototype.flush_route = function(hostname) {
  delete(this.cache[hostname]);
};

ProxyRouter.prototype.expire_route = function(hostname, ttl) {
  var self = this;
  setTimeout(function() {
    self.flush_route(hostname);
  }, ttl);
};
ProxyRouter turned out to be a pretty simple piece in this puzzle. Now, let's implement the server itself:
var redis = require("redis"),
    http = require('http'),
    httpProxy = require('http-proxy');

var client = redis.createClient();

var proxyRouter = new ProxyRouter({
  backend: redis.createClient(),
  cache_ttl: 5
});

var proxyServer = httpProxy.createServer(function(req, res, proxy) {
  // Buffer incoming request data until the async route lookup completes
  var buffer = httpProxy.buffer(req);
  var hostname = req.headers.host.split(':')[0];

  proxyRouter.lookup(hostname, function(route) {
    if (route) {
      proxy.proxyRequest(req, res, {
        host: route.host,
        port: route.port,
        buffer: buffer
      });
    }
    else {
      try {
        res.writeHead(404);
        res.end();
      }
      catch (er) {
        console.error("res.writeHead/res.end error: %s", er.message);
      }
    }
  });
}).listen(8000);
Start the server:
node server.js
In case you're missing modules:
npm install http-proxy redis hiredis
To test it out, you just need to spin up a few local apps on different ports and define the routing map. I prefer using telnet when dealing with redis:
HSET routes myapp1.cloud.com 127.0.0.1:3000
HSET routes myapp2.cloud.com 127.0.0.1:4000
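With the routes defined, a request through the proxy can be checked with curl by faking the Host header (assuming a worker app is actually listening on 127.0.0.1:3000):

```shell
curl -H "Host: myapp1.cloud.com" http://127.0.0.1:8000/
```

An unknown hostname should come back as a 404 from the proxy itself.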
Once you set up routes, you'll be able to route requests to workers. What does this give you? Well, this is just a simple concept implementation, but with more hacking you could probably add a control layer: spawning new instances of apps, assigning destinations, and checking on process statuses. If a process goes down, it could be respawned automatically, and requests would be routed to the new location.
Keep experimenting.