User talk:Jch/consul

This material is work-in-progress ...

Do not follow instructions here until this notice is removed.
(Last edited by Jch on 20 Apr 2015.)

Consul

Introduction

This page is currently my experimental log about consul on Alpine Linux (AL).

Downloaded package: consul-0.5.0-r0.apk from http://repos.mauras.ch/alpinelinux/x86_64/

We will prepare 5 KVM guests on an isolated network: 3 consul servers (in run-from-RAM mode) and 2 consul agents (a SAN (NBD) and an LXC host in data mode).

We will experiment to find the right specs to allocate to the consul servers for 100 to 1000 agents.

We plan to orchestrate our VM fleet based on consul events, envconsul and consul-template, all with ash scripts and smart setup.

We combine our PXE servers with the consul servers so that the PXE service inherits consul's resilience (through the consul leader) and becomes highly available (HA), while still running in RAM only.

Currently, we have scripts to adapt a freshly PXE booted machine into a

  • PXE and consul server
  • SAN (san.service.consul)
  • generic diskless machine (for KVM or services) (kvm.service.consul)
  • generic data mode machine (/var) (for LXC or services) (lxc.service.consul)
  • machine with physical drives (raid + KVM + SAN) (raid.service.consul; kvm.service.consul)
  • specific sys mode machine (sys.service.consul)

We have defined:

  • *.kvm.service.consul
  • *.lxc.service.consul
  • *.san.service.consul
  • *.nbd.service.consul
  • *.raid.service.consul
  • *.nas.service.consul
  • *.sys.service.consul
  • time.service.consul
  • repo.service.consul
  • resolver.service.consul
  • dns.service.consul
  • collectd.service.consul
  • webcache.service.consul
  • mariadb.service.consul
  • *.ldap.service.consul
  • relay.service.consul
  • syslog.service.consul
  • *.ceph.service.consul
  • *.git.service.consul
  • *.vpn.service.consul
  • etc

Install

We will just focus on the consul installation and configuration parts.

wget http://repos.mauras.ch/alpinelinux/x86_64/consul-0.5.0-r0.apk
apk add consul-0.5.0-r0.apk --allow-untrusted --force

Configuration

Consul v0.5.0
Consul Protocol: 2 (Understands back to: 1)

By default, there are 4 files in /etc/consul

acl.json.sample      encrypt.json.sample  server.json          tls.json.sample

Consul Server

In our setup we want as many PXE servers (in stand-by mode) as consul servers, so installing a consul server is done by a script that performs the following steps:

  1. whoami on the LAN ?
  2. find IP of boot server (is consul leader)
  3. preconfigure OS
  4. install package from stable
  5. install package from edge
  6. rsync data
  7. install experimental consul package
  8. start consul as server
  9. untie from boot server
  10. get ENV from consul
  11. configure networking
  12. start networking
  13. start sshd
  14. (restart consul?)
  15. configure dnsmasq
  16. start dnsmasq
  17. configure ntpd
  18. start ntpd
  19. register time.service.consul
  20. configure collectd
  21. start collectd
  22. configure dhcpd
  23. configure pxelinux
  24. start in.tftpd
  25. configure NFS
  26. start NFS
  27. configure darkhttpd
  28. start darkhttpd
  29. register repo.service.consul
  30. register pxe.service.consul

dhcpd will be started manually for now but later by the consul leader election.
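
The "register ... .service.consul" steps above could be sketched as follows. This is only a sketch: it assumes the agent loads its configuration from /etc/consul (where the sample files live) and that a service is declared by dropping a JSON definition there and running `consul reload`. The function names and the port number in the usage note are illustrative and not taken from our scripts.

```shell
#!/bin/sh
# Print a minimal consul service definition.
# $1: service name (becomes <name>.service.consul), $2: port
service_definition() {
    printf '{"service": {"name": "%s", "port": %s}}\n' "$1" "$2"
}

# Write the definition where the agent looks for config and reload it.
# ASSUMPTION: /etc/consul is the agent's config directory.
# Not called here; it needs a running agent.
register_service() {
    service_definition "$1" "$2" > "/etc/consul/$1.json" && consul reload
}
```

For example, `register_service time 123` would announce the local ntpd as time.service.consul.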

DNSmasq with consul

apk add dnsmasq
mkdir -p /etc/dnsmasq.d
echo "server=/consul/127.0.0.1#8600" > /etc/dnsmasq.d/10-consul
echo "resolv-file=/etc/resolv.dnsmasq" > /etc/dnsmasq.d/20-resolv
echo "nameserver ${DNS1}" > /etc/resolv.dnsmasq
echo "nameserver ${DNS2}" >> /etc/resolv.dnsmasq
cat <<EOF > /etc/conf.d/dnsmasq
MASQ_OPTS="--user=dnsmasq --group=dnsmasq --interface=eth0 --conf-dir=/etc/dnsmasq.d"
EOF
rc-service dnsmasq start

Consul Agent

We need to launch the consul service only after we are able to read the IP address of the PXE server involved.
This particular address will be used to join the consul cluster as an agent.

  1. whoami on the LAN ?
  2. find IP of boot server (is consul leader)
  3. preconfigure OS
  4. install package from stable
  5. install package from edge
  6. install experimental consul package
  7. get standard agent config
  8. start consul as agent
  9. join consul cluster
  10. untie from boot server
  11. get ENV from consul
  12. configure dnsmasq
  13. restart dnsmasq
  14. configure collectd
  15. start collectd

Whenever possible, a service will be referenced by its name (like collectd.service.consul for the collectd server).
The consul agent and dnsmasq are the first two services to be started on each machine.
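
Steps 2 and 9 (find the boot server, then join through it) might look like the sketch below. How the boot server IP is really obtained is not described on this page. The sketch assumes pxelinux is configured with "IPAPPEND 1", which appends ip=<client-ip>:<boot-server-ip>:<gateway-ip>:<netmask> to the kernel command line. The function names are hypothetical.

```shell
#!/bin/sh
# Extract the boot server IP from a kernel command line.
# ASSUMPTION: the command line carries the pxelinux "IPAPPEND 1" field
#   ip=<client-ip>:<boot-server-ip>:<gateway-ip>:<netmask>
# $1: command line text (e.g. the content of /proc/cmdline)
boot_server_ip() {
    for word in $1; do
        case "$word" in
            ip=*:*)
                rest=${word#ip=}     # client:server:gateway:netmask
                rest=${rest#*:}      # server:gateway:netmask
                echo "${rest%%:*}"   # server
                return 0
                ;;
        esac
    done
    return 1
}

# Join the cluster through the boot server.
# Not called here; it needs a running local agent.
join_via_boot_server() {
    ip=$(boot_server_ip "$(cat /proc/cmdline)") || return 1
    consul join "$ip"
}
```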

Bootstrap the consul cluster

Bootstrapping means launching the first 3 servers while lying to them about the first consul server, because we do not want to switch PXE production at the same time as we are introducing consul...

We have scripts to adapt a freshly PXE-launched machine to the role of:
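
As a point of comparison, consul itself can wait for a given number of servers before electing a leader, through the `bootstrap_expect` setting. The sketch below writes such a config fragment. It is not the "lying" approach described above, and the file name bootstrap.json and the data_dir path are assumptions made for illustration.

```shell
#!/bin/sh
# Write a server config fragment that waits for N servers before
# bootstrapping the cluster.
# $1: config directory, $2: number of servers to wait for
# ASSUMPTION: bootstrap.json and /var/lib/consul are illustrative names.
write_bootstrap_config() {
    cat > "$1/bootstrap.json" <<EOF
{
  "server": true,
  "bootstrap_expect": $2,
  "data_dir": "/var/lib/consul"
}
EOF
}

# Possible use on each of the 3 first servers (not run here):
#   write_bootstrap_config /etc/consul 3
#   consul agent -config-dir /etc/consul
```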

  • PXE and consul server
  • SAN
  • generic diskless machine (for KVM or services)
  • generic data mode machine (/var) (for LXC or services)
  • machine with physical drives (raid + KVM + SAN)
  • specific sys mode machine

Usage

Machine discovery

List machines = "consul members"

List PXE servers = servers in "consul members"

whois active.PXEservers = leader in "consul members"
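
A possible way to script these lookups is sketched below. It assumes the `consul members` table has the node name in column 1 and the type (server or client) in column 4, as the 0.5.0 output appears to. As far as we can tell, `consul members` does not mark the leader, so the sketch asks the HTTP status endpoint of the local agent instead. The function names are ours.

```shell
#!/bin/sh
# Print the names of the server nodes.
# Reads the output of "consul members" on stdin.
# ASSUMPTION: columns are Node, Address, Status, Type, ...
servers_from_members() {
    awk 'NR > 1 && $4 == "server" { print $1 }'
}

# Print the address of the current leader as a JSON string.
# Not called here; it needs a running agent on the default HTTP port.
current_leader() {
    curl -s http://127.0.0.1:8500/v1/status/leader
}

# Possible use (not run here):
#   consul members | servers_from_members
```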

Service discovery

Register service

List services

Checks

Key/Value storage

k/v
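
A minimal sketch of reading and writing keys through the HTTP API of the local agent, using curl. The variable CONSUL_API and the function names are hypothetical, and the default address assumes the agent listens on 127.0.0.1:8500.

```shell
#!/bin/sh
# ASSUMPTION: the local agent serves HTTP on its default address.
CONSUL_API=${CONSUL_API:-http://127.0.0.1:8500}

# Build the URL of a key. $1: key path, e.g. env/ntp
kv_url() {
    echo "$CONSUL_API/v1/kv/$1"
}

# Store a value. $1: key, $2: value. Not called here.
kv_put() {
    curl -s -X PUT -d "$2" "$(kv_url "$1")"
}

# Fetch the raw value of a key. $1: key. Not called here.
kv_get() {
    curl -s "$(kv_url "$1")?raw"
}
```

For example, `kv_put env/ntp time.service.consul` followed by `kv_get env/ntp` would store and read back one setting.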

Events

Define event

Watch event
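
A sketch of firing an event and reacting to it with `consul event` and `consul watch`. The watch handler receives the matching events as JSON on stdin, with the payload base64-encoded. The function names are hypothetical, the sed-based extraction is a shortcut rather than a real JSON parser, and `base64 -d` is assumed to be available (busybox provides it).

```shell
#!/bin/sh
# Fire a named event with a payload. Not called here.
# $1: event name, $2: payload
fire_event() {
    consul event -name="$1" "$2"
}

# Run a handler each time the named event fires. Not called here.
# $1: event name, $2: handler command
watch_event() {
    consul watch -type=event -name="$1" "$2"
}

# Decode the payloads found in the JSON a watch handler gets on stdin.
# Prints one decoded payload per line.
event_payloads() {
    tr ',' '\n' |
    sed -n 's/.*"Payload": *"\([^"]*\)".*/\1/p' |
    while read -r b64; do
        printf '%s' "$b64" | base64 -d
        echo
    done
}
```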

Triggers