Setting up Explicit Squid Proxy

Squid is a caching proxy for the Web supporting HTTP, HTTPS, FTP, and more. It reduces bandwidth and improves response times by caching and reusing frequently-requested web pages. Squid has extensive access controls and makes a great server accelerator. It is licensed under the GNU GPL.

If you are looking to set up a transparent squid proxy, see this page.

Terminology

client

A client is often thought of as the user of a PC or similar system. More accurately, a client is the application a person uses to access web pages and other resources, together with the OS it runs on.

proxy

A proxy is a device which makes connections on behalf of clients. If we consider a common TCP connection, there is one TCP connection between the client (source) and the proxy, and a separate TCP connection between the proxy and the server (destination). Consider this beautiful diagram:

Client<---------->|PROXY|<------------>Server
           A                   B

Point A is the client-side connection and point B is the server-side connection.

These are separate, distinct connections, so for example the client-side could be encrypted and the server-side in plaintext (unencrypted), or the client-side could use browser user-agent header x and the server-side connection could use browser user-agent header y.

The proxy effectively acts as a server to the client, and as a client to the destination server. Without a proxy, the connection would simply run from client to server. The destination server is often referred to as the 'OCS' or 'Origin Content Server'; this simply means the server hosting the objects that the client requests (for example the web pages that you want).

The above is of course a simplified version of things. Other factors, such as the HTTP version of the client browser, or the existence of the object in cache on the proxy, affect how many server-side connections are created.

explicit forward proxy

An explicit proxy is one which the client is explicitly configured to use, so the client is aware of the existence of the proxy on the network. When the client sends packets to an explicit proxy, they are addressed to the proxy server's listening address and port. Squid listens for explicit traffic on TCP port 3128 by default, although TCP port 8080 is also a common explicit proxy listening port. Explicit proxy deployments are generally forward proxy deployments, where the clients can make use of the caching and optimisation features of the proxy when making outbound requests. An explicit proxy can be involved in authentication of the client. This article discusses this type of proxy deployment.

transparent forward proxy

A transparent proxy, also known as an intercepting proxy, does not require any configuration changes on the client, since traffic is transparently sent to the proxy, usually through traffic redirection by a router. When the client sends packets, they are addressed to the destination server. A transparent proxy is not usually involved in client authentication; a client cannot authenticate to a proxy server that it is not (or should not be) aware of. There are, however, ways around this, which usually involve redirecting the client to a login page (or captive portal).

reverse proxy

A reverse proxy sits in front of a resource such as a web server and answers queries from clients, caching content from the server and optimising connections to it.

cache

A cache is simply an object store. A proxy will usually cache objects (images, html text, downloaded files etc) that are requested by clients, which means storing the objects on the proxy either in RAM or on disk. This has the benefit of a client being able to get the object from the proxy, without having to wait for the proxy to connect out to the destination server (OCS) and download the object again, resulting in a better client experience ("the web pages seem to load faster") and reduced bandwidth (fewer connections made server side, especially if we consider objects that are repeatedly requested).

Caching is influenced by proxy configuration (what to cache) and by numerous HTTP headers (am I allowed to cache this object? How long should I cache it for?) such as 'Expires', 'Cache-Control', 'If-Modified-Since' and 'Last-Modified'. A proxy will usually keep its cache fresh by making requests for cached objects independent of client requests for the objects.
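To get a feel for the headers mentioned above, here is a minimal sketch that pulls the max-age value out of a Cache-Control response header. The function name cache_max_age and the sample headers are made up for illustration, and the pattern is deliberately simplified (it only handles the usual capitalisations of the header name); it is not how squid itself decides freshness.

```shell
# Print the max-age value (in seconds) from a Cache-Control response header
# read on standard input. Prints nothing if no max-age directive is present.
cache_max_age() {
    tr -d '\r' |
        sed -n 's/^[Cc]ache-[Cc]ontrol:.*max-age=\([0-9][0-9]*\).*/\1/p' |
        head -n 1
}

# Sample response headers, made up for illustration.
printf 'HTTP/1.1 200 OK\r\nCache-Control: public, max-age=3600\r\n\r\n' | cache_max_age
```

With real traffic you could feed it the output of a header-only request, for example from curl -sI against a URL of your choice.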

More information

You may also wish to review https://devcentral.f5.com/articles/the-concise-guide-to-proxies which provides further information on various proxy types.

Installation

Install the squid package:

apk add squid

If you wish to use the Alpine Configuration Framework (ACF) front-end for squid, install the acf-squid package:

apk add acf-squid

You can then log on to the device over https://x.x.x.x (replace x.x.x.x with the IP of your server of course) to manage the squid configuration files and stop, start or restart the daemon.

Basic configuration

Config file

The main configuration file is /etc/squid/squid.conf. Lines beginning with a '#' are comments. Squid should already come with a basic working configuration file, but a well-commented example is shown below that will get you up and running quickly. Please change the localnet definition to a more restrictive one that matches your own networks:

## Tested and working on squid 3.3.10-r0 and Alpine 2.7.1 (kernel 3.10.19-0-grsec), 64-bit
## Example rule allowing access from your local networks.
## Adapt to list your (internal) IP networks from where browsing
## should be allowed
acl localnet src 10.0.0.0/8	# RFC1918 possible internal network
acl localnet src 172.16.0.0/12	# RFC1918 possible internal network
acl localnet src 192.168.0.0/16	# RFC1918 possible internal network
## Allow anyone to use the proxy (you should lock this down to client networks only!):
# acl localnet src all
## IPv6 local addresses:
acl localnet src fc00::/7       # RFC 4193 local private network range
acl localnet src fe80::/10      # RFC 4291 link-local (directly plugged) machines

acl SSL_ports port 443
acl Safe_ports port 80		# http
acl Safe_ports port 21		# ftp
acl Safe_ports port 443		# https
acl Safe_ports port 70		# gopher
acl Safe_ports port 210		# wais
acl Safe_ports port 1025-65535	# unregistered ports
acl Safe_ports port 280		# http-mgmt
acl Safe_ports port 488		# gss-http
acl Safe_ports port 591		# filemaker
acl Safe_ports port 777		# multiling http
acl CONNECT method CONNECT
acl QUERY urlpath_regex cgi-bin \? asp aspx jsp

## Prevent caching jsp, cgi-bin etc
cache deny QUERY

## Only allow access to the defined safe ports whitelist
http_access deny !Safe_ports

## Deny CONNECT to other than secure SSL ports
http_access deny CONNECT !SSL_ports

## Only allow cachemgr access from localhost
http_access allow localhost manager
http_access deny manager

## We strongly recommend the following be uncommented to protect innocent
## web applications running on the proxy server who think the only
## one who can access services on "localhost" is a local user
http_access deny to_localhost

##
## INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
##

## Example rule allowing access from your local networks.
## Adapt localnet in the ACL section to list your (internal) IP networks
## from where browsing should be allowed
http_access allow localnet
http_access allow localhost

## And finally deny all other access to this proxy
http_access deny all

## Squid normally listens to port 3128
http_port 3128
## If you have multiple interfaces you can specify to listen on one IP like this:
#http_port 1.2.3.4:3128 

## Uncomment and adjust the following to add a disk cache directory.
## 1024 is the disk space to use for cache in MB, adjust as you see fit!
## Default is no disk cache
#cache_dir ufs /var/cache/squid 1024 16 256
## Better, use 'aufs' cache type, see 
##http://www.squid-cache.org/Doc/config/cache_dir/ for info.
#cache_dir aufs /var/cache/squid 1024 16 256
## Recommended to only change cache type when squid is stopped, and use 'squid -z' to
## ensure cache is (re)created correctly

## Leave coredumps in the first cache dir
#coredump_dir /var/cache/squid

## Where does Squid log to?
#access_log /var/log/squid/access.log
## Use the below to turn off access logging
access_log none
## When logging, web auditors want to see the full uri, even with the query terms
#strip_query_terms off
## Keep 7 days of logs
#logfile_rotate 7

## How much RAM, in MB, to use for cache? Default since squid 3.1 is 256 MB
cache_mem 64 MB

## Maximum size of individual objects to store in cache
maximum_object_size 1 MB

## Amount of data to buffer from server to client 
read_ahead_gap 64 KB

## Use X-Forwarded-For header?
## Some consider this a privacy/security risk so it is often disabled
## However it can be useful to identify misbehaving/problematic clients
#forwarded_for on 
forwarded_for delete 

## Suppress sending squid version information
httpd_suppress_version_string on

## How long to wait when shutting down squid
shutdown_lifetime 30 seconds

## Replace the User Agent header.  Be sure to deny the header first, then replace it :)
#request_header_access User-Agent deny all
#request_header_replace User-Agent Mozilla/5.0 (Windows; MSIE 9.0; Windows NT 9.0; en-US)

## What hostname to display? (defaults to system hostname)
#visible_hostname a_proxy

## Use a different hosts file?
#hosts_file /path/to/file

## Add any of your own refresh_pattern entries above these.
refresh_pattern ^ftp:		1440	20%	10080
refresh_pattern ^gopher:	1440	0%	1440
refresh_pattern -i (/cgi-bin/|\?) 0	0%	0
refresh_pattern .		0	20%	4320
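Because the example above is heavily commented, it can be handy to see only the directives that are actually active. The following is a small sketch for that; the function name active_config is made up here, and the default path is simply the config file location used in this article.

```shell
# Print only the active directives from a squid.conf-style file, dropping
# comment lines and blank lines.
active_config() {
    grep -Ev '^[[:space:]]*(#|$)' "${1:-/etc/squid/squid.conf}"
}
```

Run active_config with no argument to inspect /etc/squid/squid.conf, or pass another path to check a copy you are still editing.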

Note: If you change the squid configuration file, you do not need to restart squid in order to load the changes. Just use this command instead:

squid -k reconfigure
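A slightly more cautious variant is to parse the configuration first and only reload if that succeeds. This is a sketch rather than anything shipped with the squid package; the function name reload_squid is made up for illustration.

```shell
# Reload squid only if the configuration file parses cleanly.
reload_squid() {
    if squid -k parse; then
        squid -k reconfigure
    else
        echo "squid.conf has errors; not reloading" >&2
        return 1
    fi
}
```

Calling reload_squid after an edit leaves the running daemon on its old configuration if the new file does not parse.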

Testing

Start and check squid

Start the squid service:

rc-service squid start

To start squid automatically at boot:

rc-update add squid

Check the squid configuration for errors (squid -k parse performs the same parse without signalling the running daemon):

squid -k check

If there is no feedback, everything is gravy! (that's a good thing).

Check that squid is listening for traffic, using netstat for example:

netstat -tl

You should see a line showing a Local Address and the listening port (in our example config above it is set to 3128). If you don't see this, check the "http_port" directive is set in the config file and has a value. Ensure this port isn't being used by something else on the system.
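The check described above can be sketched as a pair of helper functions. The names squid_ports and check_listening are made up for illustration, and the grep pattern assumes the usual netstat -tln output where the local address column ends in ":port" followed by a space.

```shell
# Print every active (uncommented) http_port value from a squid.conf-style file.
squid_ports() {
    awk '$1 == "http_port" { print $2 }' "${1:-/etc/squid/squid.conf}"
}

# Report whether anything is listening on each configured port.
check_listening() {
    for p in $(squid_ports "$1"); do
        port=${p##*:}    # strip an optional "IP:" prefix such as 1.2.3.4:3128
        if netstat -tln 2>/dev/null | grep -q ":$port "; then
            echo "port $port: listening"
        else
            echo "port $port: NOT listening"
        fi
    done
}
```

Running check_listening with no argument reads /etc/squid/squid.conf. Note that it only shows that something is listening on the port, not that the listener is squid.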

Remember to ensure the squid proxy has valid IP configuration including default gateway etc.

Configure the client

Each application using the proxy will have to be configured to send traffic via the proxy. If we assume that our squid proxy is running on IP address 10.0.0.1, port 3128, we would configure the Firefox browser in the following manner:

Tools>Options>Advanced>Network>Settings...
Select Manual proxy configuration and tick the 'use this proxy server for all protocols' box
Under HTTP Proxy: add the squid listening IP address, 10.0.0.1. In the Port: section add the squid listening port 3128
Click OK to save the changes.

Now browse; you should have internet access via the proxy!

Many Operating Systems allow a system proxy to be set. Firefox can be set to use the system proxy settings:

Tools>Options>Advanced>Network>Settings...
Select Use system proxy settings
Click OK to save the changes.

The system proxy settings themselves vary from system to system but on an Alpine install you can simply run the setup-proxy script.
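For command-line tools, a common convention is to set proxy environment variables, which programs such as curl and GNU wget honour. The sketch below sets them for the current shell session only, reusing the example proxy address 10.0.0.1:3128 from above; the PROXY_URL variable name and the no_proxy list are just illustrative choices.

```shell
# Point command-line tools at the proxy for the current shell session.
PROXY_URL="http://10.0.0.1:3128"
export http_proxy="$PROXY_URL"
export https_proxy="$PROXY_URL"
export ftp_proxy="$PROXY_URL"
# Hosts that should bypass the proxy.
export no_proxy="localhost,127.0.0.1"
```

To make the settings persistent they would need to go into a shell profile file rather than being typed at the prompt.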

It is also possible to configure the browser to use a PAC file. This file is usually hosted on a webserver (which may also be the proxy, but doesn't have to be) and it tells the browser which requests to send to the proxy and which ones to send direct (bypassing the proxy).

The Squid FAQ on configuring browsers offers more information on this topic.
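As a rough illustration of what a PAC file contains, the sketch below writes a minimal one from the shell. The function name write_pac, the output path and the choice to send 10.0.0.0/8 direct are all assumptions for this example; the proxy address is the 10.0.0.1:3128 used earlier. The file content itself is JavaScript, as PAC files always are.

```shell
# Write a minimal PAC file to the path given as $1.
# The quoted 'EOF' stops the shell from expanding anything inside the file.
write_pac() {
    cat > "$1" <<'EOF'
function FindProxyForURL(url, host) {
    // Plain host names and the internal network go direct.
    if (isPlainHostName(host) || isInNet(host, "10.0.0.0", "255.0.0.0")) {
        return "DIRECT";
    }
    // Everything else goes via squid, falling back to direct if it is down.
    return "PROXY 10.0.0.1:3128; DIRECT";
}
EOF
}
```

For example, write_pac ./proxy.pac creates the file in the current directory; it then needs to be published on a web server the clients can reach.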

Logs

If you've set the proxy to take access logs, you can view these to see client requests coming in:

tail -f /var/log/squid/access.log

Use Ctrl-C to exit back to the prompt.
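For a quick summary rather than a live view, the log can be aggregated with awk. This sketch assumes squid's native access.log format, in which the third field is the client address and the fifth is the number of bytes sent; the function name requests_per_client is made up for illustration.

```shell
# Print "client requests bytes" for each client address found in a
# native-format squid access.log (file arguments, or standard input).
requests_per_client() {
    awk '{ hits[$3]++; bytes[$3] += $5 }
         END { for (c in hits) print c, hits[c], bytes[c] }' "$@" | sort
}
```

For example, requests_per_client /var/log/squid/access.log lists each client with its request count and total bytes.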

SSL interception or SSL bumping

The official squid documentation appears to prefer the term SSL interception for transparent squid deployments and SSL bumping for explicit proxy deployments. Nonetheless, both environments use the ssl_bump configuration directive (and some others) in /etc/squid/squid.conf for their configuration. In general usage, SSL interception describes both deployments and that will be the term used here. We are, of course, dealing with an explicit forward proxy configuration here.

Behaviour without SSL interception

Clients behind an explicit proxy use the HTTP 'CONNECT' method to reach HTTPS sites. The initial request to the proxy port is plain HTTP and names the destination server (often termed the Origin Content Server, or OCS). After this the proxy simply acts as a tunnel, and blindly relays the connection without inspecting the traffic.
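To make the difference concrete, the sketch below prints the request head a client would send to an explicit proxy in each case. The function names and the example host are made up for illustration; real browsers add more headers than shown here.

```shell
# Plain HTTP via a proxy: the request line carries the absolute URL.
# $1 = host, $2 = path
http_via_proxy() {
    printf 'GET http://%s%s HTTP/1.1\r\nHost: %s\r\n\r\n' "$1" "$2" "$1"
}

# HTTPS via a proxy: the client first asks the proxy to open a tunnel.
# $1 = host, $2 = port
connect_via_proxy() {
    printf 'CONNECT %s:%s HTTP/1.1\r\nHost: %s:%s\r\n\r\n' "$1" "$2" "$1" "$2"
}

connect_via_proxy www.example.com 443
```

Once the proxy has accepted the CONNECT request, everything else on that connection is the encrypted session between the client and the destination server.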

Behaviour with SSL interception

Using this method, clients still use the CONNECT method, but the encrypted session is established with the proxy, using a certificate presented by the proxy (so it must be a certificate trusted by the client). Thus, the proxy is able to decrypt and view the traffic on the client-side before creating another encrypted connection server-side. This enables the proxy to, in essence, launch a man-in-the-middle 'attack', but also allows it to do all the things it can with plain, unencrypted HTTP traffic, like change the browser User-Agent reported to the server.

Configuration

Add packages

Add the ca-certificates package (required to trust common Certificate Authority (CA) certificates) and the openssl package (to create a self-signed certificate or a CSR). The -U option ensures we update the package list first:

apk -U add ca-certificates openssl

Generate cert/key pair

You obviously don't need to follow both of the next sections. Either generate a self-signed certificate or a CA signed one (you have to pay for the latter) and then amend the squid configuration to enable SSL interception and point it to the key/cert pair generated in these steps.

Generate a self-signed certificate with OpenSSL

The following example command will produce a working cert/key pair, saved to /etc/squid/squid.pem:

openssl req -newkey rsa:4096 -x509 -keyout /etc/squid/squid.pem -out /etc/squid/squid.pem -days 365 -nodes

Then adjust permissions:

chmod 400 /etc/squid/squid.pem

In the above example we save the certificate and key to the same file; they can be saved to separate files if you wish, just adjust paths accordingly.
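The openssl command above prompts for the certificate details interactively. If you would rather script it, something along the following lines should work. This is a sketch only: the function names, the default key size and the example subject are assumptions for illustration, and -subj is used to supply the subject without prompting.

```shell
# Create a combined key/certificate .pem without interactive prompts.
# $1 = output .pem path, $2 = subject, $3 = RSA key size (default 4096)
make_proxy_cert() {
    openssl req -newkey "rsa:${3:-4096}" -x509 -nodes -days 365 \
        -subj "$2" -keyout "$1" -out "$1" 2>/dev/null &&
        chmod 400 "$1"
}

# Show who the certificate was issued to and when it expires.
show_cert() {
    openssl x509 -in "$1" -noout -subject -enddate
}
```

For example, make_proxy_cert /etc/squid/squid.pem "/CN=proxy.example.test" followed by show_cert /etc/squid/squid.pem confirms the subject and expiry date of the new certificate.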

Generate a CSR to get a CA-signed certificate

Create a private key using the syntax openssl genrsa -out <key_path_and_name> <keysize>

For example:

openssl genrsa -out /etc/squid/squid.key 2048

Create the CSR with the syntax openssl req -new -key <key_path_and_name> -out <csr_path_and_name>

For example:

openssl req -new -key /etc/squid/squid.key -out /etc/squid/squid.csr

You then need to supply the CSR (Certificate Signing Request) to your Certificate Authority (CA). Do not send them, or anyone else, your private key. It should remain private!

Some CAs (such as Thawte and Verisign) provide an online CSR checker, so you can ensure the CSR is valid before providing it to them.

Once the CA has received the CSR and done its thing, it should send you back the CA-signed certificate. Request it in .pem format if possible (it's a widely used standard for certs). You then need to copy this back onto the Squid proxy, to /etc/squid/ if you are following the example here.

Remember to amend the squid configuration to point at the correct locations of the private key and CA signed certificate.
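Before pointing squid at the files, it can be worth checking locally that the CSR is well formed and that the certificate you got back actually belongs to your private key. The sketch below does both with standard openssl subcommands; the function names check_csr and key_matches_cert are made up for illustration, and the key check assumes an RSA key as generated above.

```shell
# Verify the self-signature on a CSR. $1 = CSR path
check_csr() {
    openssl req -in "$1" -noout -verify
}

# Succeed only if the RSA private key and the certificate share a modulus.
# $1 = private key path, $2 = certificate path
key_matches_cert() {
    k=$(openssl rsa -in "$1" -noout -modulus 2>/dev/null) || return 1
    c=$(openssl x509 -in "$2" -noout -modulus 2>/dev/null) || return 1
    [ -n "$k" ] && [ "$k" = "$c" ]
}
```

For example, key_matches_cert /etc/squid/squid.key followed by the path of the returned certificate exits with status 0 when the pair belongs together.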

Amend /etc/squid/squid.conf

Next, we need to amend the squid configuration file to use SSL interception. In the below example, we will add a few lines, then amend the http_port directive so that it still serves HTTP requests, but also performs SSL interception on HTTPS connections that are established via the HTTP method CONNECT. You can use a separate http_port for each if you wish, but remember to amend the client configuration to send HTTPS traffic to the alternative port.

## Use the below to avoid proxy-chaining
always_direct allow all
## Always complete the server-side handshake before client-side (recommended)
ssl_bump server-first all
## Allow server side certificate errors such as untrusted certificates, otherwise the connection is closed for such errors
sslproxy_cert_error allow all
## Or maybe deny all server side certificate errors according to your company policy
#sslproxy_cert_error deny all
## Accept certificates that fail verification (should only be needed if using 'sslproxy_cert_error allow all')
sslproxy_flags DONT_VERIFY_PEER

## Modify the http_port directive to perform SSL interception
## Ensure to point to the cert/key created earlier
## Disable SSLv2 because it isn't safe
http_port 3128 ssl-bump cert=/etc/squid/squid.pem key=/etc/squid/squid.pem generate-host-certificates=on options=NO_SSLv2

The decision you're making with the 'sslproxy_cert_error' (and potentially the 'sslproxy_flags') option is either to close the connection when a certificate error is encountered (for example when a self-signed, untrusted certificate is presented), or to pass certificate errors on to the client so that the user can choose whether or not to trust the site and its certificate.

Fix client SSL Warnings

You will need to install the self-signed proxy certificate (in our example we saved it to /etc/squid/squid.pem) on all clients, otherwise they will probably get an SSL error for every domain they visit over HTTPS. Only distribute the certificate part of that file, never the private key. It is important to install the certificate as a Certificate Authority (CA) certificate in the client browser for trust to establish properly. If the proxy certificate was issued by a CA that the clients already trust (for example an internal company CA), this step will likely not be needed. Bear in mind that with generate-host-certificates=on the proxy certificate must itself be allowed to sign certificates, which an ordinary server certificate bought from a public CA is not.

For Internet Explorer, you can likely double-click the .pem certificate and use the Certificate Import Wizard to manually install the certificate to the Trusted Root Certification Authorities certificate store.

For Firefox, use Tools>Options>Advanced>Certificates>View Certificates. Under the Authorities tab, use Import... to add the certificate as a trusted authority. Installing the certificate as any other kind of certificate will result in a poor user experience.

Note: If you see the error "This certificate is already installed as a certificate authority.", be sure to check all locations for the certificate and remove (delete) it wherever found. Firefox likes to install certificates as Server certificates rather than under Authorities as we would like. After removing all traces of the certificate, please be sure to restart Firefox and try importing the certificate again.

Disable SSL interception for certain sites

There may be situations where you wish to disable SSL interception/SSL bumping for certain destinations due to issues with functionality or privacy concerns. As an example, a Windows Dropbox client refused to establish a secure connection because it did not trust the self-signed certificate in use by the proxy. Or, you may wish to allow user privacy to be retained when they are using hotmail.com. In this example we will create an Access Control List (ACL) to prevent SSL interception to *.hotmail.com and *.dropbox.com. Remember that rule order is important: the first match wins! So put more specific rules at the top, more general rules below.

## Disable ssl interception for dropbox.com and hotmail.com (and localhost)
acl no_ssl_interception dstdomain .dropbox.com .hotmail.com
ssl_bump none localhost
ssl_bump none no_ssl_interception
## Add the rest of your ssl-bump rules below
## e.g ssl_bump server-first all
## etc

Advert blocking

There are several methods to achieve this. You could simply create an ACL for known advert domains (see blocking domains for an indication of how to do this). Another option is to use a hosts file specific to squid (i.e. unrelated to the system hosts file), which will direct traffic for known advert/malware sites to the localhost. The connection can then either fail (because there is no web server running at 127.0.0.1:80 to service the requests that are redirected by the hosts file to localhost), in which case squid will display a standard error, or you can have a web server running that will respond with some form of 'advert blocked' page.

Blocking ads will save bandwidth and should improve page load times. As a drawback, some pages may look untidy or odd without advertising in place.

The first thing to do is either create a hosts file yourself or find a pre-configured one such as this one (note that this file is free to use for personal use only, see the full license here).

Whichever method you choose, save the hosts file to the local filesystem, in our example to /etc/squid/hosts.txt

Then, add the hosts_file directive to the squid configuration:

hosts_file /etc/squid/hosts.txt

Remember to reload the configuration/restart the squid service for the changes to take effect.
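If you are building the hosts file yourself from a plain list of advert domains, a small helper can do the conversion. This is a sketch under the assumption that the input has one domain per line with optional '#' comment lines; the function name domains_to_hosts and the input file name are made up for illustration.

```shell
# Turn a list of domains (one per line) into hosts-file entries that point
# at localhost. Blank lines and lines starting with '#' are skipped.
domains_to_hosts() {
    awk 'NF && $1 !~ /^#/ { print "127.0.0.1", $1 }' "$@"
}
```

For example, domains_to_hosts ad-domains.txt > /etc/squid/hosts.txt produces a file in the location used above.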

Blocking domains

If you have a large number of domains you wish to block, instead of adding them directly to the squid configuration file the best option is to create a separate list and reference this in the configuration file. The domain list should have domains listed one per line. There is an example list (warning, this doesn't get updated!) available here or here. We will refer to this list in our example below.

- Create your own, or download a domain list and save it to /etc/squid/porndomains.acl:

wget http://www.ginjachris.co.uk/porndomains.acl -O /etc/squid/porndomains.acl

- Amend the squid configuration file at /etc/squid/squid.conf as follows:

# block porn domains based on a URL filtering list
acl blacklistpr0n dstdomain "/etc/squid/porndomains.acl"
http_access deny blacklistpr0n

- Check the squid configuration for errors:

squid -k check

and if there are none, apply the changes:

squid -k reconfigure

- Done! Domains from the list should now be blocked.

You can of course create your own lists of domains and add blacklists/whitelists to your configuration based on the above example. Each list must have a unique ACL name.
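Lists gathered from several sources tend to contain comments, mixed case, Windows line endings and duplicates. The following sketch tidies such a list into the one-domain-per-line form described above; the function name clean_domain_list and the input file name are assumptions for illustration.

```shell
# Normalise a raw domain list: strip carriage returns and '#' comments,
# lower-case each domain, and remove duplicates. $1 = raw list path
clean_domain_list() {
    tr -d '\r' < "$1" |
        awk '{ sub(/#.*/, "") } NF { print tolower($1) }' |
        sort -u
}
```

For example, clean_domain_list raw-list.txt > /etc/squid/porndomains.acl writes the cleaned list to the path referenced in the configuration above.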

DNS configuration

No additional DNS configuration is required since by default Squid will use the settings in /etc/resolv.conf. You may wish to change this behaviour for your environment or tweak settings to improve performance, as per the below example. It's heavily commented as always for my examples, change to suit your needs.

## Use the DNS servers listed here.  Default is to use servers defined in /etc/resolv.conf
#dns_nameservers 10.0.0.1 192.168.0.1
## Should squid handle single-component names?  Default is disabled
#dns_defnames on
## How many DNS child processes to spawn?  Values shown are defaults
#dns_children 32 startup=1 idle=1
## Enable EDNS.  Default is no ("none").  Size is specified in bytes.
#dns_packet_max none
## How often to retransmit DNS query?  Default is 5 seconds, doubled every time all DNS servers have been tried.
#dns_retransmit_interval 5
## DNS query timeout. If no response is received to a DNS query within this time,
## all DNS servers for the queried domain are assumed to be unavailable.
## Default is 30 seconds
#dns_timeout 30 seconds
## Default is to use IPv6 to connect to sites where available, over IPv4.
## Turning this feature on reverses this and prefers IPv4 connections over IPv6
#dns_v4_first on
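If you do decide to pin the DNS servers in squid.conf, you can start from whatever the system is already using. This sketch reads a resolv.conf-style file and prints a matching dns_nameservers line; the function name dns_nameservers_line is made up for illustration.

```shell
# Print a dns_nameservers directive built from the nameserver entries in a
# resolv.conf-style file. Prints nothing if the file lists no nameservers.
dns_nameservers_line() {
    awk '$1 == "nameserver" { s = s " " $2 }
         END { if (s != "") print "dns_nameservers" s }' "${1:-/etc/resolv.conf}"
}
```

Running dns_nameservers_line with no argument reads /etc/resolv.conf; review the output before pasting it into the configuration.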

More information

Squid configuration directives

Squid man page

Squid Arch linux wiki page
