High Availability High Performance Web Cache
This material is work-in-progress ... Do not follow instructions here until this notice is removed.
Introduction
This document explains how to use HAProxy and ucarp to provide high performance and high-availability services.
In this document we will use the Squid web cache as the example service. Squid typically uses only a single processor, even on a multi-processor machine. To get increased web-caching performance, it is better to scale the web cache out across multiple (cheap) physical boxes. Although web caching is used as the example service, this document applies to other services, such as mail, web acceleration, etc.
Network Diagram
In the end, we will have an architecture that looks like this:
The workstations all connect to the HAProxy instance at 192.168.1.10, a virtual IP controlled by ucarp. HAProxy runs on only one of the web cache servers at any given time, but any of the web caches can take over as the HAProxy instance.
HAProxy distributes the web traffic across all live web cache servers, which cache the resources from the Internet.
Benefits
- The HAProxy server in the diagram is 'virtual' - it represents the service running on any of the web cache servers
- Each web cache server is configured as a mirror of the others - this simplifies adding additional capacity.
- HAProxy will ignore servers that have failed or been taken offline, and notices when they are returned to service
- This configuration allows individual servers to be upgraded or modified in a "rolling blackout", with no downtime for users.
- Ucarp automatically restarts the HAProxy service on another cache if the server running HAProxy crashes. This is automatic recovery, with typically less than 3 seconds of downtime from the client's perspective.
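The failover described in the last point can be sketched as a pair of hook functions for ucarp's up and down scripts. This is a minimal sketch under stated assumptions, not the configuration this document settles on: the /24 netmask, the /etc/init.d/haproxy init script path, and the helper names run, vip_up and vip_down are all hypothetical. It relies on ucarp passing the interface name as the first argument and the virtual IP as the second.

```shell
#!/bin/sh
# Hypothetical helpers for ucarp's up and down scripts.
# Assumption: ucarp calls the script with the interface as $1 and the virtual IP as $2.

# With DRY_RUN=1 the commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# This node became master: claim the virtual IP, then start HAProxy.
# The /24 netmask and the init script path are assumptions.
vip_up() {
    run ip addr add "$2/24" dev "$1"
    run /etc/init.d/haproxy start
}

# This node lost mastership: stop HAProxy, then release the virtual IP.
vip_down() {
    run /etc/init.d/haproxy stop
    run ip addr del "$2/24" dev "$1"
}
```

Running `DRY_RUN=1 vip_up eth0 192.168.1.10` after sourcing the file prints the commands without touching the network, which is a safe way to review them before wiring the functions into real ucarp scripts.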
Initial Services
The first step in getting high availability is to have more than one server. Do the following on each of cache1-4:
- install squid
apk add squid
- create a minimal /etc/squid/squid.conf
acl all src all
acl localhost src 127.0.0.1/32
acl localnet src 10.0.0.0/8 # RFC1918 possible internal network
acl localnet src 172.16.0.0/12 # RFC1918 possible internal network
acl localnet src 192.168.0.0/16 # RFC1918 possible internal network
http_access allow localnet
http_access allow localhost
http_access deny all
http_port 3129
forwarded_for off
- ensure squid starts on boot, and start it now
rc-update add squid
/etc/init.d/squid start
At this point, you should be able to set your browser to use any of 192.168.1.1[1-4]:3129 as a proxy address and reach the Internet. Because this config file does not use any optimizations, browsing will be slower than normal; this is expected. Any optimization you make to the squid configuration on one server can be applied to every server in the array. The purpose of this example is to show that the service is uniform across the array.
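Instead of pointing a browser at each cache in turn, the same check can be scripted. The following is a rough sketch that assumes curl is installed on the machine running the check; the function names cache_proxies and check_caches and the default test URL http://example.com/ are illustrative, not part of the original setup.

```shell
#!/bin/sh
# Print the proxy URL of each cache server (cache1-4 at 192.168.1.11-14, port 3129).
cache_proxies() {
    for n in 1 2 3 4; do
        echo "http://192.168.1.1${n}:3129"
    done
}

# Fetch a test URL through every cache and report which ones answered.
# The default URL is only an example; pass another one as the first argument.
check_caches() {
    url="${1:-http://example.com/}"
    for proxy in $(cache_proxies); do
        # -f: treat HTTP errors as failure, -m 10: give up after 10 seconds,
        # -x: send the request through the given proxy.
        if curl -s -f -o /dev/null -m 10 -x "$proxy" "$url"; then
            echo "OK   $proxy"
        else
            echo "FAIL $proxy"
        fi
    done
}
```

After sourcing the file, running `check_caches` from a workstation on the local network prints one line per cache. A FAIL line suggests that squid is not running on that server or that the client address is not covered by the localnet ACLs.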