Dynamic Multipoint VPN (DMVPN)

Revision as of 09:24, 22 August 2013 by Ncopa (talk | contribs) (→‎ISP Failover: add static route rules)
This material is work-in-progress ...

Do not follow instructions here until this notice is removed.
(Last edited by Ncopa on 22 Aug 2013.)

http://alpinelinux.org/about under Why the Name Alpine? states: [ref?]

The first open-source implementation of Cisco's DMVPN, called OpenNHRP, was written for Alpine Linux.

So the aim of this document is to be the reference Linux DMVPN setup, with all the networking services needed for the clients that will use the DMVPN (DNS, DHCP, firewall, etc.).

Terminology

NBMA: Non-Broadcast Multi-Access network as described in RFC 2332

Hub: the Next Hop Server (NHS) performing the Next Hop Resolution Protocol service within the NBMA cloud.

Spoke: the Next Hop Resolution Protocol Client (NHC) which initiates NHRP requests of various types in order to obtain access to the NHRP service.

Tip: At the time of writing, the minimum recommended Alpine version for building a DMVPN is 2.4.11. Do not use 2.5.x or 2.6.0, because their kernels have in-tunnel IP fragmentation issues. Alpine 2.6.1 or later should be fine.
Note: This document assumes that all Alpine installations run in diskless mode and that the configuration is saved on a USB key.

Extract Certificates

We will use certificates for DMVPN and for OpenVPN (RoadWarrior clients). Here are the general-purpose instructions for extracting certificates from pfx files:

openssl pkcs12 -in cert.pfx -cacerts -nokeys -out cacert.pem
openssl pkcs12 -in cert.pfx -nocerts -nodes -out serverkey.pem
openssl pkcs12 -in cert.pfx -nokeys -clcerts -out cert.pem

Remember to set appropriate permissions for your certificate files:

chmod 600 *.pem *.pfx
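
As a quick sanity check after the chmod, the sketch below lists any certificate or key file whose mode is not exactly 600. The function name list_loose_certs and its directory argument are illustrative assumptions and not part of any Alpine tool.

```shell
#!/bin/sh
# list_loose_certs DIR
# Print every *.pem or *.pfx file under DIR whose permission bits are
# not exactly 600 (owner read/write only). Prints nothing if all is well.
list_loose_certs() {
    find "${1:-.}" -type f \( -name '*.pem' -o -name '*.pfx' \) ! -perm 600
}

# Example (hypothetical location): list_loose_certs /etc/racoon
```

An empty result means every matching file is already restricted to its owner.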

Spoke Node

A local spoke node network has support for multiple ISP connections, along with redundant layer 2 switches. At least one 802.1q capable switch is required, and a second is optional for redundancy purposes. The typical spoke node network looks like:

Alpine Setup

We will set up the network interfaces as follows:

bond0.3 = Management (not implemented below yet)
bond0.8 = LAN
bond0.64 = DMZ
bond0.80 = Voice (not implemented below yet)
bond0.96 = Internet Access Only (no access to the DMVPN network)(not implemented below yet)
bond0.256 = ISP1
bond0.257 = ISP2

Boot Alpine in diskless mode and run setup-alpine

You will be prompted with something like the following. Each prompt is followed by a suggestion for what you could enter.
Select keyboard layout [none]: Type an appropriate layout for you
Select variant: Type an appropriate layout for you (if prompted)
Enter system hostname (short form, e.g. 'foo') [localhost]: Enter the hostname, e.g. vpnc
Available interfaces are: eth0
Enter '?' for help on bridges, bonding and vlans.
Which one do you want to initialize? (or '?' or 'done')
Enter bond0.8
Available bond slaves are: eth0 eth1
Which slave(s) do you want to add to bond0? (or 'done') [eth0]
eth0 eth1
IP address for bond0? (or 'dhcp', 'none', '?') [dhcp]: Type 'none' and press Enter
IP address for bond0.8? (or 'dhcp', 'none', '?') [dhcp]: Enter the IP address of your LAN interface, e.g. 10.1.0.1
Netmask? [255.255.255.0]: Press Enter confirming '255.255.255.0' or type another appropriate subnet mask
Gateway? (or 'none') [none]: Press Enter confirming 'none'
Do you want to do any manual network configuration? [no] yes
Make a copy of the bond0.8 configuration for bond0.64, bond0.256 and bond0.257 (optional) interfaces.
Don't forget to add a gateway and a metric value for ISP interfaces when multiple gateways are set.
Save and close the file (:wq)
DNS domain name? (e.g. 'bar.com') []: Enter the domain name of your intranet, e.g., example.net
DNS nameserver(s)? []: 8.8.8.8 8.8.4.4 (we will change them later)
Changing password for root
New password:
Enter a secure password for the console
Retype password: Retype the above password
Which timezone are you in? ('?' for list) [UTC]: Press Enter confirming 'UTC'
HTTP/FTP proxy URL? (e.g. 'http://proxy:8080', or 'none') [none] Press Enter confirming 'none'
Enter mirror number (1-9) or URL to add (or r/f/e/done) [f]: Select a mirror close to you and press Enter
Which SSH server? ('openssh', 'dropbear' or 'none') [openssh]: Press Enter confirming 'openssh'
Which NTP client to run? ('openntpd', 'chrony' or 'none') [chrony]: Press Enter confirming 'chrony'
Which disk(s) would you like to use? (or '?' for help or 'none') [none]: Press Enter confirming 'none' or type 'none' if needed
Enter where to store configs ('floppy', 'usb' or 'none') [usb]: Press Enter confirming 'usb'
Enter apk cache directory (or '?' or 'none') [/media/usb/cache]: Press Enter confirming '/media/usb/cache'
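
The manual configuration step above asks for copies of the bond0.8 stanza for the other VLANs, with a gateway and metric on the ISP interfaces. A minimal sketch of a helper that prints such stanzas follows. The function name iface_stanza, every address other than 10.1.0.1, and the metric values are hypothetical, and the metric keyword is assumed to be supported by the ifupdown implementation in use.

```shell
#!/bin/sh
# iface_stanza NAME ADDRESS NETMASK [GATEWAY METRIC]
# Print a static stanza for /etc/network/interfaces. GATEWAY and METRIC
# are only emitted when both are given (intended for the ISP VLANs).
iface_stanza() {
    printf 'auto %s\n' "$1"
    printf 'iface %s inet static\n' "$1"
    printf '\taddress %s\n' "$2"
    printf '\tnetmask %s\n' "$3"
    if [ -n "${4:-}" ] && [ -n "${5:-}" ]; then
        printf '\tgateway %s\n' "$4"
        printf '\tmetric %s\n' "$5"
    fi
    printf '\n'
}

# Hypothetical addressing (documentation ranges), lower metric preferred:
# iface_stanza bond0.64  10.1.64.1    255.255.255.0
# iface_stanza bond0.256 192.0.2.2    255.255.255.0 192.0.2.1    100
# iface_stanza bond0.257 198.51.100.2 255.255.255.0 198.51.100.1 200
```

The output can be reviewed and pasted into /etc/network/interfaces by hand; the sketch deliberately does not write to the file itself.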

Bonding

Update the bonding configuration:

echo bonding mode=balance-tlb miimon=100 updelay=500 >> /etc/modules
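
Once the bonding module has been loaded with these options and bond0 is up, the kernel exposes the bond state in a status file. The sketch below summarizes it, assuming the usual /proc/net/bonding/bond0 layout with "Bonding Mode:" and "Slave Interface:" lines. The function name bond_summary is illustrative.

```shell
#!/bin/sh
# bond_summary [STATUS_FILE]
# Print the bonding mode followed by one line per slave interface, read
# from the kernel's bonding status file (default: /proc/net/bonding/bond0).
bond_summary() {
    awk -F': ' '
        /^Bonding Mode:/    { print "mode=" $2 }
        /^Slave Interface:/ { print "slave=" $2 }
    ' "${1:-/proc/net/bonding/bond0}"
}
```

With the configuration above, both eth0 and eth1 would be expected to show up as slaves.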

Physically install

At this point, you're ready to connect the VPN Spoke Node to the network if you haven't already done so. Set up an 802.1q capable switch with the VLANs listed in the Alpine Setup section. Once done, tag all of the VLANs on one port and connect that port to eth0. Then connect your first ISP's CPE to a switchport with VLAN 256 untagged.

Recursive DNS

apk add unbound

With your favorite editor open /etc/unbound/unbound.conf and add the following configuration. The stub-zone stanza covers the case where you have a domain that is internal to your network only and that you still want unbound to resolve:

server:
        verbosity: 1
        interface: 10.1.0.1
        do-ip4: yes
        do-ip6: no
        do-udp: yes
        do-tcp: yes
        do-daemonize: yes
        access-control: 10.1.0.0/16 allow
        access-control: 127.0.0.0/8 allow

        do-not-query-localhost: no

        root-hints: "/etc/unbound/named.cache"

forward-zone:
        name: "example.net"
        forward-addr: 172.16.255.1
        forward-addr: 172.16.255.2
        forward-addr: 172.16.255.3
        forward-addr: 172.16.255.4
        forward-addr: 172.16.255.5
        forward-addr: 172.16.255.7

forward-zone:
        name: "example2.net"
        forward-addr: 172.16.255.1
        forward-addr: 172.16.255.2
        forward-addr: 172.16.255.3
        forward-addr: 172.16.255.4
        forward-addr: 172.16.255.5
        forward-addr: 172.16.255.7

stub-zone:
        name: "location1.example.net"
        stub-addr: 10.1.0.2

python:
remote-control:
        control-enable: no

Fetch the latest copy of root hints:

wget http://ftp.internic.net/domain/named.cache -O /etc/unbound/named.cache
/etc/init.d/unbound start
rc-update add unbound
echo nameserver 10.1.0.1 > /etc/resolv.conf

Local DNS Zone

If you have a DNS zone that is only resolvable internally to your network, you will need a second IP address on your LAN interface, and you can use NSD to host the zone.

First, add the following to the end of the bond0.8 stanza in /etc/network/interfaces:

auto bond0.8
     ...
     ...
     up ip addr add 10.1.0.2/24 dev bond0.8

Then, install nsd:

apk add nsd

Create /etc/nsd/nsd.conf:

server:
        ip-address: 10.1.0.2
        port: 53
        server-count: 1
        ip4-only: yes
        hide-version: yes
        identity: ""
        zonesdir: "/etc/nsd"
zone:
        name: location1.example.net
        zonefile: location1.example.net.zone

Create the zone file /etc/nsd/location1.example.net.zone:

;## location1.example.net authoritative zone

$ORIGIN location1.example.net.
$TTL 86400

@ IN SOA ns1.location1.example.net. webmaster.location1.example.net. (
                2013081901      ; serial
                28800           ; refresh
                7200            ; retry
                86400           ; expire
                86400           ; min TTL
                )

                NS              ns1.location1.example.net.
                MX      10      mail.location1.example.net.
ns1             IN      A       10.1.0.2
mail            IN      A       10.1.0.4
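
The SOA serial above follows the common YYYYMMDDNN convention and is normally increased each time the zone file is edited. A minimal sketch of a helper that computes the next value follows. The name next_serial is an assumption for illustration, and the sketch does not guard against more than 99 edits in one day.

```shell
#!/bin/sh
# next_serial CURRENT [TODAY]
# CURRENT is a YYYYMMDDNN serial; TODAY is YYYYMMDD (default: today's date).
# Same day: increment the whole serial by one. Otherwise: start at TODAY01.
next_serial() {
    current=$1
    today=${2:-$(date +%Y%m%d)}
    if [ "${current%??}" = "$today" ]; then
        echo $((current + 1))
    else
        echo "${today}01"
    fi
}

# Example: next_serial 2013081901
```

Incrementing the full ten-digit number avoids parsing the two-digit counter on its own, which a shell would otherwise read as octal when it starts with 0.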

Check the configuration, then start NSD:

nsd-checkconf /etc/nsd/nsd.conf
nsdc rebuild
/etc/init.d/nsd start
rc-update add nsd

NTP

apk add chrony
/etc/init.d/chronyd start
rc-update add chronyd

GRE Tunnel

With your favorite editor open /etc/network/interfaces and add the following:

auto gre1
iface gre1 inet static
      pre-up ip tunnel add $IFACE mode gre ttl 64 tos inherit key 12.34.56.78 || true
      address 172.16.1.1
      netmask 255.255.0.0
      post-down ip tunnel del $IFACE || true

Save and close the file.

ifup gre1
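
The usable MTU inside the tunnel is smaller than that of the underlying link. GRE over IPv4 adds a 20-byte outer IP header and a 4-byte GRE header, and the key used above adds another 4 bytes. The sketch below computes the result; IPsec overhead is left out on purpose because it depends on the negotiated algorithms and on NAT traversal. The function name gre_payload_mtu is illustrative.

```shell
#!/bin/sh
# gre_payload_mtu LINK_MTU [HAS_KEY]
# Largest inner packet that fits in one GRE-over-IPv4 packet, ignoring
# IPsec overhead. HAS_KEY=1 (the default) accounts for the 4-byte key field.
gre_payload_mtu() {
    link_mtu=$1
    has_key=${2:-1}
    overhead=$((20 + 4))                # outer IPv4 header + base GRE header
    if [ "$has_key" -eq 1 ]; then
        overhead=$((overhead + 4))      # optional GRE key
    fi
    echo $((link_mtu - overhead))
}

# Example: gre_payload_mtu 1500
```

The value can be compared with the mtu reported by ip link show gre1.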

IPSEC

apk add ipsec-tools

With your favorite editor open /etc/ipsec.conf and change the content to the following:

spdflush;
spdadd 0.0.0.0/0 0.0.0.0/0 gre -P out	ipsec esp/transport//require;
spdadd 0.0.0.0/0 0.0.0.0/0 gre -P in 	ipsec esp/transport//require;

With your favorite editor open /etc/racoon/racoon.conf and change the content to the following:

remote anonymous {
	exchange_mode main;
	lifetime time 2 hour;
	certificate_type x509 "/etc/racoon/cert.pem" "/etc/racoon/key.pem";
	ca_type x509 "/etc/racoon/ca.pem";
	my_identifier asn1dn;
	nat_traversal on;
	script "/etc/opennhrp/racoon-ph1dead.sh" phase1_dead;
	dpd_delay 120;
	proposal {
		encryption_algorithm aes 256;
		hash_algorithm sha1;
		authentication_method rsasig;
		dh_group modp4096;
	}
	proposal {
		encryption_algorithm aes 256;
		hash_algorithm sha1;
		authentication_method rsasig;
		dh_group 2;
	}
}

sainfo anonymous {
	pfs_group 2;
	lifetime time 2 hour;
	encryption_algorithm aes 256;
	authentication_algorithm hmac_sha1;
	compression_algorithm deflate;
}

Save and close the file.

/etc/init.d/racoon start
rc-update add racoon

Next Hop Resolution Protocol (NHRP)

apk add opennhrp

With your favorite editor open /etc/opennhrp/opennhrp.conf and change the content to the following:

interface gre1
	dynamic-map 172.16.0.0/16 hub.example.com
	shortcut
	redirect
	non-caching

interface bond0.8
	shortcut-destination

interface bond0.64
	shortcut-destination

With your favorite editor open /etc/opennhrp/opennhrp-script and change the content to the following:

#!/bin/sh

# Local AS number, taken from the "router bgp <asn>" line in bgpd.conf
MYAS=$(sed -n 's/router bgp \([0-9]*\)/\1/p' < /etc/quagga/bgpd.conf)

case $1 in
interface-up)
    echo "Interface $NHRP_INTERFACE is up"
    if [ "$NHRP_INTERFACE" = "gre1" ]; then
        ip route flush proto 42 dev $NHRP_INTERFACE
        ip neigh flush dev $NHRP_INTERFACE

        vtysh -d bgpd \
            -c "configure terminal" \
            -c "router bgp $MYAS" \
            -c "no neighbor core" \
            -c "neighbor core peer-group"
    fi
    ;;
peer-register)
    ;;
peer-up)
    if [ -n "$NHRP_DESTMTU" ]; then
        ARGS=`ip route get $NHRP_DESTNBMA from $NHRP_SRCNBMA | head -1`
        ip route add $ARGS proto 42 mtu $NHRP_DESTMTU
    fi
    echo "Create link from $NHRP_SRCADDR ($NHRP_SRCNBMA) to $NHRP_DESTADDR ($NHRP_DESTNBMA)"
    racoonctl establish-sa -w isakmp inet $NHRP_SRCNBMA $NHRP_DESTNBMA || exit 1
    racoonctl establish-sa -w esp inet $NHRP_SRCNBMA $NHRP_DESTNBMA gre || exit 1
    ;;
peer-down)
    echo "Delete link from $NHRP_SRCADDR ($NHRP_SRCNBMA) to $NHRP_DESTADDR ($NHRP_DESTNBMA)"
    racoonctl delete-sa isakmp inet $NHRP_SRCNBMA $NHRP_DESTNBMA
    ip route del $NHRP_DESTNBMA src $NHRP_SRCNBMA proto 42
    ;;
nhs-up)
    echo "NHS UP $NHRP_DESTADDR"
    (
        flock -x 200
        vtysh -d bgpd \
            -c "configure terminal" \
            -c "router bgp $MYAS" \
            -c "neighbor $NHRP_DESTADDR remote-as 65000" \
            -c "neighbor $NHRP_DESTADDR peer-group core" \
            -c "exit" \
            -c "exit" \
            -c "clear bgp $NHRP_DESTADDR"
    ) 200>/var/lock/opennhrp-script.lock
    ;;
nhs-down)
    (
        flock -x 200
        vtysh -d bgpd \
            -c "configure terminal" \
            -c "router bgp $MYAS" \
            -c "no neighbor $NHRP_DESTADDR"
    ) 200>/var/lock/opennhrp-script.lock
    ;;
route-up)
    echo "Route $NHRP_DESTADDR/$NHRP_DESTPREFIX is up"
    ip route replace $NHRP_DESTADDR/$NHRP_DESTPREFIX proto 42 via $NHRP_NEXTHOP dev $NHRP_INTERFACE
    ip route flush cache
    ;;
route-down)
    echo "Route $NHRP_DESTADDR/$NHRP_DESTPREFIX is down"
    ip route del $NHRP_DESTADDR/$NHRP_DESTPREFIX proto 42
    ip route flush cache
    ;;
esac

exit 0
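
The AS number lookup at the top of the script can be tried on its own before opennhrp is started. The sketch below is a portable variant, assuming a POSIX sed; it uses [0-9] because POSIX sed does not define \d as a digit class. The function name bgp_asn is an assumption for illustration.

```shell
#!/bin/sh
# bgp_asn [BGPD_CONF]
# Print the AS number from the first "router bgp <asn>" line of a Quagga
# bgpd.conf (default: /etc/quagga/bgpd.conf).
bgp_asn() {
    sed -n 's/^[[:space:]]*router bgp \([0-9][0-9]*\).*/\1/p' \
        "${1:-/etc/quagga/bgpd.conf}" | head -n 1
}
```

An empty result means the script would run vtysh with an empty AS number, so it is worth checking once bgpd.conf exists.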

Save and close the file. Make it executable, then start opennhrp:

chmod +x /etc/opennhrp/opennhrp-script
/etc/init.d/opennhrp start
rc-update add opennhrp

BGP

apk add quagga
touch /etc/quagga/zebra.conf

With your favorite editor open /etc/quagga/bgpd.conf and change the content to the following:

password strongpassword
enable password strongpassword
log syslog

access-list 1 remark Command line access authorized IP
access-list 1 permit 127.0.0.1
line vty
 access-class 1

hostname vpnc.example.net

router bgp 65001
	bgp router-id 172.16.1.1
	network 10.1.0.0/16
	neighbor %HUB_GRE_IP% remote-as 65000
	neighbor %HUB_GRE_IP% remote-as 65000
        ...

Add the line neighbor %HUB_GRE_IP% remote-as 65000 for each Hub host you have in your NBMA cloud.
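
With several hubs, the neighbor lines can be generated rather than typed. A minimal sketch follows; the function name bgp_neighbors and the hub tunnel addresses in the example are hypothetical placeholders for your own %HUB_GRE_IP% values.

```shell
#!/bin/sh
# bgp_neighbors REMOTE_AS HUB_GRE_IP...
# Print one "neighbor <ip> remote-as <asn>" line per hub address,
# indented with a tab to match the router bgp stanza.
bgp_neighbors() {
    remote_as=$1
    shift
    for hub in "$@"; do
        printf '\tneighbor %s remote-as %s\n' "$hub" "$remote_as"
    done
}

# Hypothetical hub tunnel addresses:
# bgp_neighbors 65000 172.16.0.1 172.16.0.2
```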

Save and close the file.

/etc/init.d/bgpd start
rc-update add bgpd

OpenVPN

echo tun >> /etc/modules
modprobe tun
apk add openvpn
openssl dhparam -out /etc/openvpn/dh1024.pem 1024

Set up the configuration in /etc/openvpn/openvpn.conf:

dev tun
proto udp
port 1194

server 10.1.128.0 255.255.255.0
push "route 10.0.0.0 255.0.0.0"
push "dhcp-option DNS 10.1.0.1"

tls-server
ca /etc/openvpn/cacert.pem
cert /etc/openvpn/servercert.pem
key /etc/openvpn/serverkey.pem

crl-verify /etc/openvpn/crl.pem

dh /etc/openvpn/dh1024.pem

persist-key
persist-tun

keepalive 10 120

comp-lzo

status /var/log/openvpn.status
mute 20
verb 3

/etc/init.d/openvpn start
rc-update add openvpn

Firewall

apk add awall

With your favorite editor, edit the following files and set their contents as follows:


/etc/awall/optional/params.json

{
  "description": "params",

  "variable": {
    "B_IF": "bond0.8",
    "C_IF": "bond0.64",
    "ISP1_IF": "bond0.256",
    "ISP2_IF": "bond0.257"
  }
}


/etc/awall/optional/internet-host.json

{
  "description": "Internet host",

  "import": "params",

  "zone": {
    "E": { "iface": [ "$ISP1_IF", "$ISP2_IF" ] },
    "ISP1": { "iface": "$ISP1_IF" },
    "ISP2": { "iface": "$ISP2_IF" }
  },

  "filter": [
    {
      "in": "E",
      "service": "ping",
      "action": "accept",
      "flow-limit": { "count": 10, "interval": 6 }
    },
    {
      "in": "E",
      "out": "_fw",
      "service": [ "ssh", "https" ],
      "action": "accept",
      "conn-limit": { "count": 3, "interval": 60 }
    },

    {
      "in": "_fw",
      "out": "E",
      "service": [ "dns", "http", "ntp" ],
      "action": "accept"
    },
    {
      "in": "_fw",
      "service": [ "ping", "ssh" ],
      "action": "accept"
    }
  ]
}


/etc/awall/optional/clampmss.json

{
  "description": "Deal with ISPs afraid of ICMP",

  "import": "internet-host",

  "clamp-mss": [ { "out": "E" } ]
}

/etc/awall/optional/mark.json

{
  "description": "Mark traffic based on ISP",

  "import": [ "params", "internet-host" ],

  "route-track": [
    { "out": "ISP1", "mark": 1 },
    { "out": "ISP2", "mark": 2 }
  ]
}


/etc/awall/optional/dmvpn.json

{
  "description": "DMVPN router",

  "import": "internet-host",

  "variable": {
    "A_ADDR": [ "10.0.0.0/8", "172.16.0.0/16" ]
  },

  "zone": {
    "A": { "addr": "$A_ADDR", "iface": "gre1" }
  },

  "filter": [
    { "in": "E", "out": "_fw", "service": "ipsec", "action": "accept" },
    { "in": "_fw", "out": "E", "service": "ipsec", "action": "accept" },
    {
      "in": "E",
      "out": "_fw",
      "ipsec": "in",
      "service": "gre",
      "action": "accept"
    },
    {
      "in": "_fw",
      "out": "E",
      "ipsec": "out",
      "service": "gre",
      "action": "accept"
    },

    { "in": "_fw", "out": "A", "service": "bgp", "action": "accept" },
    { "in": "A", "out": "_fw", "service": "bgp", "action": "accept"},
    { "out": "E", "dest": "$A_ADDR", "action": "reject" }
  ]
}


/etc/awall/optional/vpnc.json

{
  "description": "VPNc",

  "import": [ "params", "internet-host", "dmvpn" ],

  "zone": {
    "B": { "iface": "$B_IF" },
    "C": { "iface": "$C_IF" }
  },


  "policy": [
    { "in": "A", "action": "accept" },
    { "in": "B", "out": "A", "action": "accept" },
    { "in": "C", "out": [ "A", "E" ], "action": "accept" },
    { "in": "E", "action": "drop" },
    { "in": "_fw", "out": "A", "action": "accept" }
  ],

  "snat": [
    { "out": "E" }
  ],

  "filter": [
    {
      "in": "A",
      "out": "_fw",
      "service": [ "ping", "ssh", "http", "https" ],
      "action": "accept"
    },

    {
      "in": [ "B", "C" ],
      "out": "_fw",
      "service": [ "dns", "ntp", "http", "https", "ssh" ],
      "action": "accept"
    },

    {
      "in": "_fw",
      "out": [ "B", "C" ],
      "service": [ "dns", "ntp" ],
      "action": "accept"
    },

    { 
      "in": [ "A", "B", "C" ],
      "out": "_fw",
      "proto": "icmp",
      "action": "accept"
    }

  ]
}

awall enable clampmss
awall enable vpnc
awall activate
rc-update add iptables

ISP Failover

apk add pingu
echo -e "1\tisp1">> /etc/iproute2/rt_tables
echo -e "2\tisp2">> /etc/iproute2/rt_tables
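
Appending with echo adds a duplicate entry every time the commands are repeated. The sketch below is an idempotent alternative that only appends when the entry is missing. The function name add_rt_table is an assumption for illustration.

```shell
#!/bin/sh
# add_rt_table ID NAME [FILE]
# Append "ID<TAB>NAME" to FILE (default: /etc/iproute2/rt_tables) unless a
# line with that ID and NAME already exists.
add_rt_table() {
    id=$1
    name=$2
    file=${3:-/etc/iproute2/rt_tables}
    if ! grep -Eq "^[[:space:]]*${id}[[:space:]]+${name}([[:space:]]|\$)" "$file" 2>/dev/null; then
        printf '%s\t%s\n' "$id" "$name" >> "$file"
    fi
}

# Example:
# add_rt_table 1 isp1
# add_rt_table 2 isp2
```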

Configure pingu to monitor the bond0.256 and bond0.257 interfaces in /etc/pingu/pingu.conf. Add the hosts to ping for ISP failover and bind them to the primary ISP. We also set the ping timeout to 4 seconds:

timeout 4
required 2
retry 11

interface bond0.256 { 
  # route-table must correspond with mark in /etc/awall/optional/mark.json
  route-table 1
  fwmark 1
  # the rule-priority must be a higher number than the priority in /etc/shorewall/route_rules <-- FIXME
  rule-priority 20000
  # google dns
  ping 8.8.8.8
  # opendns
  ping 208.67.222.222
}

interface bond0.257 {
  # route-table must correspond with mark in /etc/awall/optional/mark.json
  route-table 2
  fwmark 2
  rule-priority 20000
}

Make sure we can reach the public IP from our LAN by adding static route rules for our private net(s). Edit /etc/pingu/route-rules:

to 10.0.0.0/8 table main prio 1000
to 172.16.0.0/12 table main prio 1000

Start pingu:

/etc/init.d/pingu start
rc-update add pingu

Now, if both hosts stop responding to pings, ISP-1 will be considered down and all gateways via bond0.256 will be removed from the main route table. Note that the gateway will not be removed from route table '1'. This lets us keep pinging via bond0.256 so that we can detect when the ISP is back online. When the ISP starts working again, the gateways will be added back to the main route table.

Commit Configuration

lbu ci

Hub Node

Troubleshooting the DMVPN

Broken Path MTU Discovery (PMTUD)

ISPs afraid of ICMP (a concern that is somewhat legitimate) often blindly add no ip unreachables to their router interfaces, effectively creating a blackhole router that breaks PMTUD, since ICMP Type 3 Code 4 packets (Fragmentation Needed) never reach the sender. PMTUD is needed by ISAKMP, which runs over UDP (TCP works because its MSS is clamped).

For technical details see http://packetlife.net/blog/2008/oct/9/disabling-unreachables-breaks-pmtud/

PMTUD could also be broken due to badly configured DSL modem/routers or bugged firmware.

You can detect which host is the blackhole router by pinging each hop shown in your traceroute to the destination, with the DF bit set and with packets of standard MTU size:

ping -M do -s 1472 %IP%

Note: "-M do" requires the ping from the iputils package.

If you don't get a response back (either Echo Reply or Fragmentation Needed), there is a firewall dropping ICMP packets. If the host answers normal ping packets (DF bit cleared), you have most likely hit a blackhole router.
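
The 1472 in the ping command is the standard 1500-byte MTU minus the 20-byte IPv4 header and the 8-byte ICMP header. The sketch below computes that payload for any MTU and probes a list of hops read from standard input. The function names icmp_payload and probe_hops are assumptions for illustration, and the ping flags are those of iputils ping.

```shell
#!/bin/sh
# icmp_payload MTU
# ICMP echo payload size that yields an IPv4 packet of exactly MTU bytes.
icmp_payload() {
    echo $(( $1 - 20 - 8 ))
}

# probe_hops [MTU]
# Read one hop address per line on stdin and send each a single echo
# request with the DF bit set. Requires a ping that supports -M do.
probe_hops() {
    size=$(icmp_payload "${1:-1500}")
    while read -r hop; do
        [ -n "$hop" ] || continue
        if ping -c 1 -W 2 -M do -s "$size" "$hop" >/dev/null 2>&1; then
            echo "$hop: echo reply to a ${size}-byte DF probe"
        else
            echo "$hop: no echo reply to a ${size}-byte DF probe"
        fi
    done
}

# Example, with hop addresses collected from traceroute into hops.txt:
# probe_hops 1500 < hops.txt
```

A hop that fails the DF probe but answers a plain ping is a candidate for the blackhole router described above.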