Setting up Explicit Squid Proxy
Revision as of 14:44, 24 February 2014
Squid is a caching proxy for the Web supporting HTTP, HTTPS, FTP, and more. It reduces bandwidth and improves response times by caching and reusing frequently-requested web pages. Squid has extensive access controls and makes a great server accelerator. It is licensed under the GNU GPL.
If you are looking to set up a transparent squid proxy, see this page
- 1 Terminology
- 2 Installation
- 3 Basic configuration
- 4 SSL interception or SSL bumping
- 4.1 Behaviour without SSL interception
- 4.2 Behaviour with SSL interception
- 4.3 Configuration
- 4.4 Further reading
- 5 Advert blocking
- 6 Blocking domains
- 7 DNS configuration
- 8 More information
Terminology

A client is often thought of as the person using a PC or similar system, but more accurately the client is the application a person uses to access web pages and other resources, together with the OS it runs on.
A proxy is a device which makes connections on behalf of clients. If we consider a common TCP connection, there is one TCP connection between the client (source) and the proxy, and a separate TCP connection between the proxy and the server (destination). Consider this beautiful diagram:
 Client <----------> |PROXY| <------------> Server
            A                     B
Point A is the client-side connection and point B is the server-side connection.
These are separate, distinct connections, so for example the client-side could be encrypted and the server-side in plaintext (unencrypted), or the client-side could use browser user-agent header x and the server-side connection could use browser user-agent header y.
The proxy is effectively acting as a server to the client, and as a client to the server (OCS). Without a proxy, the connection would simply be from client to server. The destination server is often referred to as the 'OCS' or 'Origin Content Server' - this simply means the server hosting the objects that the client requests (for example the web pages that you want).
The above is of course a simplified version of things. Other factors, such as the HTTP version of the client browser, or existence of the object in cache on the proxy, will have impacts on how many server-side connections are created.
explicit forward proxy
An explicit proxy is one in which the client is explicitly configured to use the proxy, and as such is aware of the existence of the proxy on the network. When the client sends packets to an explicit proxy, they are addressed to the proxy server's listening address and port. Squid usually listens for explicit traffic on TCP port 3128, but TCP port 8080 is also a common explicit proxy listening port. AFAIK all explicit proxy deployments are forward proxy deployments, where the clients can make use of the caching and optimisation features of the proxy when making outbound requests. An explicit proxy can be involved in authentication of the client. This article discusses this type of proxy deployment.
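To make this concrete: for a plain HTTP page, an explicitly configured client connects to the proxy's address and port (e.g. 10.0.0.1:3128) and puts the absolute URL in the request line, rather than just a path. A minimal sketch of the request bytes (an illustration, not a capture from a real client):

```shell
# Request line an explicit-proxy client sends for http://example.com/
# Note the full URL in the request line; the TCP connection itself is
# made to the proxy (e.g. 10.0.0.1:3128), not to example.com.
printf 'GET http://example.com/ HTTP/1.1\r\nHost: example.com\r\n\r\n'
```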
transparent forward proxy
A transparent proxy, also known as an intercepting proxy, does not require any configuration changes on the client, since traffic is transparently sent to the proxy, usually through traffic redirection by a router. When the client sends packets, they are addressed to the destination server. A transparent proxy cannot be involved with client authentication; a client cannot authenticate to a proxy server that it is not (or should not be) aware of.
A reverse proxy sits in front of a resource such as a web server and answers queries from clients, caching content from the server and optimising connections to it.
A cache is simply an object store. A proxy will usually cache objects (images, HTML text, downloaded files etc.) that are requested by clients, which means storing the objects on the proxy either in RAM or on disk. This has the benefit of a client being able to get the object from the proxy without having to wait for the proxy to connect out to the destination server (OCS) and download the object again, resulting in a better client experience ("the web pages seem to load faster") and reduced bandwidth (fewer connections made server side, especially if we consider objects that are repeatedly requested).
Caching is influenced by proxy configuration (what to cache) and by numerous HTTP headers (am I allowed to cache this object? How long should I cache it for?) such as 'Expires', 'Cache-Control', 'If-Modified-Since' and 'Last-Modified'. A proxy will usually keep its cache fresh by making requests for cached objects independent of client requests for the objects.
Installation

If you wish to use the Alpine Configuration Framework (ACF) front-end for squid, install the package:
You can then log on to the device at https://x.x.x.x (replace x.x.x.x with the IP of your server of course) and manage the squid configuration files, stop/start/restart the daemon etc.
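The install command itself was not preserved in this copy of the page. ACF modules on Alpine are packaged as acf-&lt;daemon&gt;, so the command is presumably along these lines:

```
apk add acf-squid
```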
Basic configuration

The main configuration file is /etc/squid/squid.conf. Lines beginning with a '#' are comments.
Squid should already come with a basic working configuration file, but an example configuration file is shown below. It will get you up and running quickly and is well commented, but please change the localnet definition to a more restrictive one:
 ## Tested and working on squid 3.3.10-r0 and Alpine 2.7.1 (kernel 3.10.19-0-grsec), 64-bit
 
 ## Example rule allowing access from your local networks.
 ## Adapt to list your (internal) IP networks from where browsing
 ## should be allowed
 acl localnet src 10.0.0.0/8     # RFC1918 possible internal network
 acl localnet src 172.16.0.0/12  # RFC1918 possible internal network
 acl localnet src 192.168.0.0/16 # RFC1918 possible internal network
 ## Allow anyone to use the proxy (you should lock this down to client networks only!):
 # acl localnet src all
 ## IPv6 local addresses:
 acl localnet src fc00::/7  # RFC 4193 local private network range
 acl localnet src fe80::/10 # RFC 4291 link-local (directly plugged) machines
 
 acl SSL_ports port 443
 acl Safe_ports port 80          # http
 acl Safe_ports port 21          # ftp
 acl Safe_ports port 443         # https
 acl Safe_ports port 70          # gopher
 acl Safe_ports port 210         # wais
 acl Safe_ports port 1025-65535  # unregistered ports
 acl Safe_ports port 280         # http-mgmt
 acl Safe_ports port 488         # gss-http
 acl Safe_ports port 591         # filemaker
 acl Safe_ports port 777         # multiling http
 acl CONNECT method CONNECT
 acl QUERY urlpath_regex cgi-bin \? asp aspx jsp
 
 ## Prevent caching jsp, cgi-bin etc
 cache deny QUERY
 
 ## Only allow access to the defined safe ports whitelist
 http_access deny !Safe_ports
 
 ## Deny CONNECT to other than secure SSL ports
 http_access deny CONNECT !SSL_ports
 
 ## Only allow cachemgr access from localhost
 http_access allow localhost manager
 http_access deny manager
 
 ## We strongly recommend the following be uncommented to protect innocent
 ## web applications running on the proxy server who think the only
 ## one who can access services on "localhost" is a local user
 http_access deny to_localhost
 
 ##
 ## INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
 ##
 
 ## Example rule allowing access from your local networks.
 ## Adapt localnet in the ACL section to list your (internal) IP networks
 ## from where browsing should be allowed
 http_access allow localnet
 http_access allow localhost
 
 ## And finally deny all other access to this proxy
 http_access deny all
 
 ## Squid normally listens to port 3128
 http_port 3128
 ## If you have multiple interfaces you can specify to listen on one IP like this:
 #http_port 184.108.40.206:3128
 
 ## Uncomment and adjust the following to add a disk cache directory.
 ## 1024 is the disk space to use for cache in MB, adjust as you see fit!
 ## Default is no disk cache
 #cache_dir ufs /var/cache/squid 1024 16 256
 ## Better, use 'aufs' cache type, see
 ## http://www.squid-cache.org/Doc/config/cache_dir/ for info.
 #cache_dir aufs /var/cache/squid 1024 16 256
 ## Recommended to only change cache type when squid is stopped, and use 'squid -z' to
 ## ensure cache is (re)created correctly
 
 ## Leave coredumps in the first cache dir
 #coredump_dir /var/cache/squid
 
 ## Where does Squid log to?
 #access_log /var/log/squid/access.log
 ## Use the below to turn off access logging
 access_log none
 ## When logging, web auditors want to see the full uri, even with the query terms
 #strip_query_terms off
 ## Keep 7 days of logs
 #logfile_rotate 7
 
 ## How much RAM, in MB, to use for cache? Default since squid 3.1 is 256 MB
 cache_mem 64 MB
 ## Maximum size of individual objects to store in cache
 maximum_object_size 1 MB
 ## Amount of data to buffer from server to client
 read_ahead_gap 64 KB
 
 ## Use X-Forwarded-For header?
 ## Some consider this a privacy/security risk so it is often disabled
 ## However it can be useful to identify misbehaving/problematic clients
 #forwarded_for on
 forwarded_for delete
 
 ## Suppress sending squid version information
 httpd_suppress_version_string on
 
 ## How long to wait when shutting down squid
 shutdown_lifetime 30 seconds
 
 ## Replace the User Agent header. Be sure to deny the header first, then replace it :)
 #request_header_access User-Agent deny all
 #request_header_replace User-Agent Mozilla/5.0 (Windows; MSIE 9.0; Windows NT 9.0; en-US)
 
 ## What hostname to display? (defaults to system hostname)
 #visible_hostname a_proxy
 
 ## Use a different hosts file?
 #hosts_file /path/to/file
 
 ## Add any of your own refresh_pattern entries above these.
 refresh_pattern ^ftp:             1440  20%  10080
 refresh_pattern ^gopher:          1440   0%   1440
 refresh_pattern -i (/cgi-bin/|\?)    0   0%      0
 refresh_pattern .                    0  20%   4320
Note: If you change the squid configuration file, you do not need to restart squid in order to load the changes; just use this command instead:

 squid -k reconfigure
Start and check squid
Start the squid service:
To start squid automatically at boot:
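The service commands themselves were not preserved in this copy; on Alpine, which uses OpenRC, they would presumably be:

```
rc-service squid start   # start the service now
rc-update add squid      # add squid to the default runlevel (start at boot)
```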
Check the squid configuration for errors:

 squid -k check
If there is no feedback, everything is gravy! (that's a good thing).
Check that squid is listening for traffic, using netstat for example:
You should see a line showing a Local Address and the listening port (in our example config above it is set to 3128). If you don't see this, check the "http_port" directive is set in the config file and has a value. Ensure this port isn't being used by something else on the system.
Remember to ensure the squid proxy has valid IP configuration including default gateway etc.
Configure the client
Each application using the proxy will have to be configured to send traffic via the proxy. If we assume that our squid proxy is running on IP address 10.0.0.1, port 3128, we would configure the Firefox browser in the following manner:
- Select Manual proxy configuration and tick the 'use this proxy server for all protocols' box
- Under HTTP Proxy: add the squid listening IP address, 10.0.0.1. In the Port: section add the squid listening port 3128
- Click OK to save the changes.
Now browse: you should have internet access, via the proxy!
Many Operating Systems allow a system proxy to be set. Firefox can be set to use the system proxy settings:
- Select Use system proxy settings
- Click OK to save the changes.
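Relatedly, many command-line clients honour the conventional proxy environment variables rather than a GUI setting; a quick sketch (assuming the squid listener at 10.0.0.1:3128 used throughout this article):

```shell
# Conventional proxy environment variables honoured by tools such as
# curl and wget (assumption: squid is at 10.0.0.1 port 3128)
export http_proxy="http://10.0.0.1:3128"
export https_proxy="http://10.0.0.1:3128"
# Subsequent requests from these tools go via the proxy, e.g.:
# curl -I http://example.com/
```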
It is also possible to configure the browser to use a PAC file. This file is usually hosted on a webserver (which may also be the proxy, but doesn't have to be) and it tells the browser which requests to send to the proxy and which to send directly (bypassing the proxy).
The Squid FAQ on configuring browsers offers more information on this topic.
If you've set the proxy to take access logs, you can view these to see client requests coming in:
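The command itself was not preserved here; a typical way to follow the access log (assuming logging to /var/log/squid/access.log is enabled, i.e. 'access_log none' from the example config has been changed) is:

```
tail -f /var/log/squid/access.log
```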
Use Ctrl-C to exit back to the prompt.
SSL interception or SSL bumping
The official squid documentation appears to prefer the term SSL interception for transparent squid deployments and SSL bumping for explicit proxy deployments. Nonetheless, both environments use the ssl_bump configuration directive (and some others) in
/etc/squid/squid.conf for their configuration.
In common usage, 'SSL interception' describes both deployments, and that is the term used here. We are, of course, dealing with an explicit forward proxy configuration.
Behaviour without SSL interception
Clients behind an explicit proxy use the 'CONNECT' HTTP method. The first connection to the proxy port uses HTTP and specifies the destination server (often termed the Origin Content Server, or OCS). After this the proxy simply acts as a tunnel, and blindly proxies the connection without inspecting the traffic.
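The CONNECT request itself is plain HTTP naming the destination host and port; once the proxy answers '200 Connection established', the client performs its TLS handshake through the resulting tunnel. A sketch of the bytes (illustrative, not a capture):

```shell
# First request a client sends to an explicit proxy for an HTTPS site.
# Everything after the proxy's 200 response is an opaque tunnel to the OCS.
printf 'CONNECT example.com:443 HTTP/1.1\r\nHost: example.com:443\r\n\r\n'
```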
Behaviour with SSL interception
Using this method, clients still use the CONNECT method, but the client uses the certificate from the proxy (so it must be a certificate trusted by the client) to encrypt the traffic. Thus, the proxy is able to decrypt and view the traffic on the client-side before creating another encrypted connection server-side. This enables the proxy to, in essence, launch a man-in-the-middle 'attack', but also allows it to do all the things it can with plain, unencrypted HTTP traffic, like change the browser User-Agent reported to the server.
The -U option ensures we update the package list first:
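The command itself is missing from this copy; given the -U flag mentioned and the OpenSSL steps that follow, it was presumably an apk invocation along these lines (an assumption, not the page's original command):

```
apk add -U openssl   # -U refreshes the package index before installing
```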
Generate cert/key pair
You obviously don't need to follow both of the next sections. Either generate a self-signed certificate or a CA signed one (you have to pay for the latter) and then amend the squid configuration to enable SSL interception and point it to the key/cert pair generated in these steps.
Generate a self-signed certificate with OpenSSL
The following example command will produce a working cert/key pair, saved to /etc/squid/squid.pem:
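The exact command was not preserved in this copy of the page. A typical invocation that produces such a combined key+cert file (assumptions: 2048-bit RSA key, 1-year validity, placeholder subject) is:

```shell
# Generate a self-signed cert and key into one PEM file.
# -nodes leaves the key unencrypted (squid needs to read it unattended);
# -subj avoids interactive prompts (the CN here is a placeholder).
# Move the result to /etc/squid/squid.pem afterwards and restrict its
# permissions, e.g. chown squid:squid and chmod 400.
openssl req -new -newkey rsa:2048 -sha256 -days 365 -nodes -x509 \
    -subj '/CN=squid-proxy' -keyout squid.pem -out squid.pem
```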
Then adjust permissions:
In the above example we save the certificate and key to the same file; they can be saved to separate files if you wish, just adjust paths accordingly.
Generate a CSR to get a CA-signed certificate
Create a private key using the syntax
openssl genrsa -out <key_path_and_name> <keysize>
Create the CSR with the syntax
openssl req -new -key <key_path_and_name> -out <csr_path_and_name>
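A concrete run of the two commands above, with hypothetical file names and a 2048-bit key (the subject CN is a placeholder; your CA will have its own requirements for the CSR fields):

```shell
# 1. Generate a 2048-bit RSA private key (keep this file private!)
openssl genrsa -out squid.key 2048
# 2. Create the CSR from that key; -subj fills the fields non-interactively
openssl req -new -key squid.key -subj '/CN=proxy.example.com' -out squid.csr
```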
You then need to supply the CSR (Certificate Signing Request) to your Certificate Authority (CA). Do not send them, or anyone else, your private key. It should remain private!
Once the CA receives the CSR and does its thing, it should send you back the CA-signed public key. Request it in .pem format if possible (it's a widely used standard for certs). You then need to copy this back onto the squid proxy, to /etc/squid/ if you are following the example here.
Remember to amend the squid configuration to point at the correct locations of the private key and CA signed certificate.
Next, we need to amend the squid configuration file to use SSL interception. In the below example, we will add a few lines, then amend the
http_port directive so that it still serves HTTP requests, but also performs SSL interception on HTTPS connections that are established via the HTTP method CONNECT. You can use a separate
http_port for each if you wish, but remember to amend the client configuration to send HTTPS traffic to the alternative port.
 ## Use the below to avoid proxy-chaining
 always_direct allow all
 
 ## Always complete the server-side handshake before client-side (recommended)
 ssl_bump server-first all
 
 ## Allow server side certificate errors such as untrusted certificates,
 ## otherwise the connection is closed for such errors
 sslproxy_cert_error allow all
 ## Or maybe deny all server side certificate errors according to your company policy
 #sslproxy_cert_error deny all
 
 ## Accept certificates that fail verification
 ## (should only be needed if using 'sslproxy_cert_error allow all')
 sslproxy_flags DONT_VERIFY_PEER
 
 ## Modify the http_port directive to perform SSL interception
 ## Ensure to point to the cert/key created earlier
 ## Disable SSLv2 because it isn't safe
 http_port 3128 ssl-bump cert=/etc/squid/squid.pem key=/etc/squid/squid.pem generate-host-certificates=on options=NO_SSLv2
Fix client SSL Warnings
You will need to install the self-signed proxy certificate (in our example we saved it to /etc/squid/squid.pem) to all clients, otherwise they will probably get an SSL error for every domain they visit over HTTPS. It is important to install the certificate as a Certificate Authority (CA) certificate to the client browser for trust to establish properly. If you are using a CA signed certificate (not a self-signed one) then the browser probably already trusts the certificate and so this step will likely not be needed.
For Internet Explorer, you can likely double-click the .pem certificate and use the Certificate Import Wizard to manually install the certificate to the Trusted Root Certification Authorities certificate store.
For Firefox, use Tools>Options>Advanced>Certificates>View Certificates. Under the Authorities tab, use Import... to add the certificate as a trusted authority. Installing the certificate as any other kind of certificate will result in a poor user experience.
Note: If you see the error This certificate is already installed as a certificate authority. be sure to check all locations for existence of the certificate and remove (delete) it wherever found. Firefox likes to install certificates as Server certificates rather than under Authorities as we would like. Once you have removed all traces of the certificate, be sure to restart Firefox and try importing the certificate again.
Disable SSL interception for certain sites
There may be situations where you wish to disable SSL interception/SSL bumping for certain destinations due to issues with functionality or privacy concerns. As an example, a Windows Dropbox client refused to establish a secure connection because it did not trust the self-signed certificate in use by the proxy. Or, you may wish to allow users to retain their privacy when using hotmail.com. In this example we will create an Access Control List (ACL) to prevent SSL interception to *.hotmail.com and *.dropbox.com. Remember that rule order is important; the first match wins! So put more specific rules at the top and more general rules below.
 ## Disable ssl interception for dropbox.com and hotmail.com (and localhost)
 acl no_ssl_interception dstdomain .dropbox.com .hotmail.com
 ssl_bump none localhost
 ssl_bump none no_ssl_interception
 ## Add the rest of your ssl-bump rules below
 ## e.g ssl_bump server-first all
 ## etc
Advert blocking

There are several methods to achieve this. You could simply create an ACL for known advert domains (see blocking domains for an indication of how to do this). Another option is to use a hosts file specific to squid (i.e. unrelated to the system hosts file), which will direct traffic for known advert/malware sites to localhost. The connection can then either fail (because there is no web server running at 127.0.0.1:80 to service the requests that are redirected by the hosts file to localhost), in which case squid will display a standard error, or you can have a web server running that will respond with some form of 'advert blocked' page.
Blocking ads will save bandwidth and should improve page load times. As a drawback, some pages may look untidy or odd without advertising in place.
Whichever method you choose, save the hosts file to the local filesystem. Then, add the
hosts_file directive to the squid configuration:
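For example, assuming the advert hosts file was saved as /etc/squid/ad-hosts (a hypothetical path; the page's own example path was not preserved), the directive and file contents would look like:

```
## In /etc/squid/squid.conf (the path is a hypothetical example):
hosts_file /etc/squid/ad-hosts

## /etc/squid/ad-hosts itself uses standard hosts-file entries, e.g.:
## 127.0.0.1 ads.example.com
## 127.0.0.1 tracker.example.net
```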
Remember to reload the configuration/restart the squid service for the changes to take effect.
Blocking domains

If you have a large number of domains you wish to block, instead of adding them directly to the squid configuration file the best option is to create a separate list and reference it in the configuration file. The domain list should have one domain per line. There is an example list (warning: this doesn't get updated!) available here: https://dl.dropboxusercontent.com/u/30359454/Squid/porn_domains We will refer to this list in our example.
- Download, or create your own, domain list (for example wget http://dl.dropboxusercontent.com/u/30359454/Squid/porn_domains to download my example list)
- Save it to /etc/squid/porn_domains
- Amend the squid configuration file at /etc/squid/squid.conf as follows:
 # block porn domains based on a URL filtering list
 acl blacklist-domains dstdomain "/etc/squid/porn_domains"
 http_access deny blacklist-domains
- Check the squid configuration for errors:
squid -k check
and if there are none, apply the changes:
squid -k reconfigure
- Done! Domains from the list should now be blocked.
You can of course create your own lists of domains and add blacklists/whitelists to your configuration based on the above example. Each list must have a unique ACL name.
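For instance, a sketch combining a whitelist and a blacklist (all file names here are hypothetical; remember rule order matters, first match wins):

```
## Allow domains on a trusted list, block those on a blocked list
acl whitelist-domains dstdomain "/etc/squid/good_domains"
acl blacklist-domains dstdomain "/etc/squid/bad_domains"
http_access allow whitelist-domains
http_access deny blacklist-domains
```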
DNS configuration

No additional DNS configuration is required, since by default Squid will use the settings in /etc/resolv.conf. You may wish to change this behaviour for your environment, or tweak settings to improve performance, as per the below example. As always for my examples it is heavily commented; change to suit your needs.
 ## Use DNS defined servers. Default is to use servers defined in /etc/resolv.conf
 #dns_nameservers 10.0.0.1 192.168.0.1
 
 ## Should squid handle single-component names? Default is disabled
 #dns_defnames enabled
 
 ## How many DNS child processes to spawn? Values shown are defaults
 #dns_children 32 startup=1 idle=1
 
 ## Enable EDNS. Default is no ("none"). Size is specified in bytes.
 #dns_packet_max none
 
 ## How often to retransmit DNS query? Default is 5 seconds,
 ## doubled every time all DNS servers have been tried.
 #dns_retransmit_interval 5
 
 ## DNS query timeout. If no response is received to a DNS query within this time,
 ## all DNS servers for the queried domain are assumed to be unavailable.
 ## Default is 30 seconds
 #dns_timeout 30 seconds
 
 ## Default is to use IPv6 to connect to sites where available, over IPv4.
 ## Turning this feature on reverses this and prefers IPv4 connections over IPv6
 #dns_v4_first on