Canary Deployment with NGINX

August 5, 2015

You have just deployed the first version of your web service, your users like it and can’t stop using it, and now you contemplate deploying an upgraded version of the service such that:

  • There should be no interruption of the service while you deploy and configure the new version, even if this involves stopping and restarting several processes that make up your service.
  • You can run both versions in parallel for a while, each version on a controlled portion of the traffic, to validate that the new version of the service behaves as expected. This is known as canary deployment; it is a close relative of blue-green deployment and uses the same traffic-splitting mechanics as A/B testing.
  • The traffic control mechanism should be configurable without interruption of the service. For example, should the new version appear to misbehave, the traffic can be quickly directed to revert to the old version.
  • The traffic control mechanisms should allow both percentage-based control (e.g., 1% of the public traffic should be directed to the new version), and also client-dependent control, such as based on IP address, or HTTP headers such as User-Agent and Referer (e.g., employees or known testers should be directed to the new version), or a combination thereof.
  • The traffic control mechanisms should ensure “stickiness”: all HTTP requests from a client should end up being serviced by the same version, for a given traffic-control configuration.
  • The separate versions are all using the same database. This is a tall order, because it requires you to design the web app with forward and backward compatibility. This is not as hard as it seems, and is essential for achieving incremental deployment and rollback. We’ll talk about how to achieve this in a future post.

Canary deployment is a very useful technique and there are multiple mechanisms that one can use to implement it. In this post, I describe how to do it if you are using NGINX as your front web server, with the examples specifically for a Django uWSGI-based web service, although a similar configuration can be used for other kinds of web services. In a future post, I will describe a slightly more complex way to achieve the same using HAProxy.
I learned these techniques while developing the Touchstone web service for Conviva Inc. Conviva is a leader in video-quality analytics and optimization, and Touchstone is a proprietary web service that Conviva customers use to test and fine-tune their integration of the Conviva libraries in video players. Since Conviva has customers in many time zones, and a number of automated continuous-testing tools use the Touchstone web service, there is no convenient time to interrupt the service to deploy upgrades.

Let us now describe an example traffic-control configuration that we want to achieve (as shown in the figure below):

  • We want to deploy three separate instances of the web service, each with its own set of static asset files, and its own uWSGI server running possibly different versions of the Django application. We call these versions “alpha” (early-adopter, least-tested version), “beta” (believed to be ready for general availability), and “ga” (hardened, general availability version). Note that all these instances of the Django application use the same database, and all the forward and backward compatibility support for the persisted data is built into the application itself.
  • We identify clients coming from the “employees” and “testers” groups, based on their public IP addresses. We want to send 100% of the traffic from “employees”, and 30% of the traffic from the “testers” group, to the “alpha” instance. The “beta” instance will get the rest of the “testers” traffic and also 1% of the public traffic. Finally, 99% of the public traffic should go to the “ga” instance.

NGINX is a high-performance web server and reverse-proxy server and it has a flexible configuration language to control how it handles the incoming requests. You already know that a good deployment design for a web service should use a real web server for the static assets, and for HTTPS termination, with a reverse-proxy setup to your actual application (called an “upstream” application in this context). NGINX is a very good choice as a web server, and as we show here it turns out that it can do quite a bit more for your deployment than serving static files and terminating HTTPS connections.
We will take advantage of the following features of the NGINX configuration language:

  • NGINX configurations can use variables, which are set for each request based on the contents of the incoming HTTP request. The variables can then be used to compute other variables, and ultimately to control how each request is handled, such as which upstream application it is proxied to or which directory the static files are served from.
  • The configuration directive geo sets the value of a variable based on the IP address from which a request is coming:

    # The "$ip_range" variable is set to either "employees", or "testers",
    # or "public", based on the client's IP address
    geo $ip_range {
        employees; # IP addresses of our office
        testers; # IP address of our testers
        default public; # Everybody else
    }

  • The directive split_clients sets the value of a variable based on a randomized percentage split with customized stickiness. In the example below, the value of the variable $remote_addr (the client IP address) is concatenated with the string “AAA” and the result is hashed into a 32-bit number. The value of the defined variable is set based on where in the 32-bit range the hash value falls:

    # The "$split" variable is set to different percentage ranges
    # sticky by remote_addr (client IP address)
    split_clients "${remote_addr}AAA" $split {
        1% fraction1; # 1% of remote addresses assigned to "fraction1"
        30% fraction2; # 30% of remote addresses assigned to "fraction2"
        * other; # rest to "other"
    }

    This scheme guarantees that two requests with the same IP address will result in the same value for the “$split” variable. This scheme can be adapted by using other variables in place of, or in addition to “$remote_addr”, in the split_clients directive.
    Note that the notation “${remote_addr}AAA” performs string interpolation, computing a string based on the value of the $remote_addr variable concatenated with “AAA”.

  • The map directive can be used to compute the value of a variable conditionally based on other variables. In the example below, the “$instance” variable is set based on a concatenation of the $ip_range and $split variables computed above, using regular expression matches.

    # The "$instance" variable is set based on $ip_range and $split.
    map "${ip_range}_${split}" $instance {
        "~^employees_.*" alpha; # everything from "employees" to "alpha"
        "~^testers_fraction2$" alpha; # 30% from "testers" to "alpha"
        "~^testers_.*" beta; # the rest from "testers" to "beta"
        "~^public_fraction1$" beta; # 1% of the public to "beta"
        default ga; # everything else to "ga"
    }

    The “~” prefix tells “map” to use regular expression matching. The regular-expression clauses are tested in the order in which they appear until one matches, and the “default” value applies when none does.
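
The same three directives can also cover the header-based control mentioned in the requirements. What follows is only a sketch, not the configuration I run: the "InternalTester" User-Agent marker and the variable names $ua_group, $split_ua and $instance_ua are made up for illustration, and $ip_range is assumed to come from a geo block like the one above.

```nginx
# Sketch only: "InternalTester" is a hypothetical User-Agent marker, and
# $ua_group, $split_ua, $instance_ua are illustrative variable names.

# Classify the client by its User-Agent header
map $http_user_agent $ua_group {
    "~InternalTester"  flagged;
    default            plain;
}

# Sticky split keyed on IP address plus User-Agent, instead of IP alone
split_clients "${remote_addr}${http_user_agent}AAA" $split_ua {
    1%   fraction1;
    30%  fraction2;
    *    other;
}

# Combine the IP group, the User-Agent group, and the percentage split
map "${ip_range}_${ua_group}_${split_ua}" $instance_ua {
    "~^employees_"                alpha;  # everything from "employees"
    "~^[^_]+_flagged_"            alpha;  # flagged User-Agent, any IP group
    "~^testers_plain_fraction2$"  alpha;  # 30% of the remaining testers
    "~^testers_"                  beta;   # the rest of the testers
    "~^public_plain_fraction1$"   beta;   # 1% of the public
    default                       ga;     # everything else
}
```

Keying the split on more than the IP address changes what “sticky” means: a client keeps its assignment only as long as every part of the key stays the same, so a browser upgrade that changes the User-Agent string may move that client to a different fraction.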

All we have to do now is to use the value of the $instance variable to decide which instance of the web application to proxy requests to, as shown below:

# uWSGI upstream handlers, separate UNIX sockets for each instance
upstream app_alpha_django {
    server unix:///opt/app_alpha/uwsgi.sock;
}

upstream app_beta_django {
    server unix:///opt/app_beta/uwsgi.sock;
}

upstream app_ga_django {
    server unix:///opt/app_ga/uwsgi.sock;
}

server {
    listen 80;

    location /static {
        # Each instance has its own static files
        root /opt/app_${instance}/static;
    }

    # Define the common parameters for all uwsgi requests
    location / {
        uwsgi_param QUERY_STRING $query_string;
        uwsgi_param REQUEST_METHOD $request_method;
        # … more standard uwsgi parameters …
        # Each instance has its own upstream server
        uwsgi_pass app_${instance}_django;
    }
}
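
When validating a split, it can help to record which instance served each request. One possible way, sketched here with a hypothetical format name ("canary") and a hypothetical log path, is to write the routing variables to the access log:

```nginx
# Sketch only: the format name "canary" and the log path are assumptions.

# In the http context: log the routing variables next to the request
log_format canary '$remote_addr $ip_range $split $instance '
                  '"$request" $status';

# In the server block that proxies to the instances
access_log /var/log/nginx/canary.log canary;
```

Counting log lines per value of $instance then gives a rough check that the observed traffic shares are in line with the configured percentages.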

That’s it! Well, almost. I also use a script to generate the above configuration based on a specification of the different instances, the IP ranges, and the percentages. For example, the script can quickly generate a configuration that forces all traffic to a single instance. Finally, the script tests the NGINX configuration, and only then tells NGINX to load the new configuration (something that NGINX can do without dropping requests):

# Make sure you tell nginx to test the configuration before reloading
sudo /usr/sbin/nginx -t
sudo kill -HUP `cat /var/run/`
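
For illustration, a generated configuration that forces all traffic to a single instance could be as small as the map below. This is my guess at a minimal form, assuming the geo and split_clients blocks are left in place and only the map clauses change:

```nginx
# Rollback sketch: with the regular-expression clauses removed,
# every client falls through to the default instance.
map "${ip_range}_${split}" $instance {
    default ga;
}
```

Because only the map changes, restoring the previous split later is a matter of putting the regular-expression clauses back and reloading.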

I hope this post will help you make the most out of NGINX, and perhaps motivate you to dig into the manual for even more goodies.