User talk:Jch/consul

From Alpine Linux
This page is currently my experimental log about consul on AL.


<s>Downloaded package: consul-0.5.0-r0.apk from http://repos.mauras.ch/alpinelinux/x86_64/</s><br/>
Now in repo edge/testing


We will experiment to find the right spec to allocate the consul servers for 100 to 1000 agents. For now, we have a 52-node cluster.

We plan to orchestrate our VM fleet based on consul events, envconsul and consul-template. All with ash scripts and smart setup.


We combine our PXE servers and the consul ones to inherit from the consul resilience (consul leader) and offer high availability (HA) (but still in RAM only) for the PXE service.
Currently, we have scripts to adapt a freshly PXE-booted machine into a:
* PXE and consul server
* SAN (san.service.consul)
* generic diskless machine (for KVM or services) (kvm.service.consul)
* generic data mode machine (/var) (for LXC or services) (lxc.service.consul)
* machine with physical drives (raid + KVM + SAN) (raid.service.consul; kvm.service.consul)
* specific sys mode machine (sys.service.consul)
We have defined:
* *.kvm.service.consul
* *.lxc.service.consul
* *.san.service.consul
* *.nbd.service.consul
* *.raid.service.consul
* *.nas.service.consul
* *.sys.service.consul
* time.service.consul
* repo.service.consul
* resolver.service.consul
* dns.service.consul
* collectd.service.consul
* webcache.service.consul
* mariadb.service.consul
* *.ldap.service.consul
* relay.service.consul
* syslog.service.consul
* *.ceph.service.consul
* *.git.service.consul
* *.vpn.service.consul
* etc


== Install ==

We will just focus on the consul installation and configuration parts.


<pre>
apk add consul@testing
</pre>
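The @testing tag only works if the edge/testing repository is declared under that tag in /etc/apk/repositories. A minimal sketch, assuming the standard Alpine mirror layout (use your local mirror if you have one):

```shell
# declare the edge/testing repository under the "testing" tag
# (the mirror URL is illustrative)
echo "@testing http://dl-cdn.alpinelinux.org/alpine/edge/testing" >> /etc/apk/repositories
apk update
apk add consul@testing
```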
 
=== Web UI ===
 
One needs to download and unpack <nowiki>https://dl.bintray.com/mitchellh/consul/0.5.0_web_ui.zip</nowiki> in some directory.<br/>
The contents of this zip file is an index.html with some js and css files in a subdirectory: <pre>
Archive:  0.5.0_web_ui.zip
  creating: dist/
  inflating: dist/.gitkeep
  inflating: dist/index.html
  creating: dist/static/
  inflating: dist/static/application.min.js
  inflating: dist/static/base.css
  inflating: dist/static/base.css.map
  inflating: dist/static/bootstrap.min.css
  inflating: dist/static/consul-logo.png
  inflating: dist/static/favicon.png
  inflating: dist/static/loading-cylon-purple.svg
</pre>
To use it, one has to specify "-ui-dir /path/to/ui" in /etc/conf.d/consul.<br/>
The UI is available at the /ui path on the same port as the HTTP API. By default this is http://localhost:8500/ui.
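For instance, assuming the zip was unpacked into /srv/consul-ui, the line in /etc/conf.d/consul might look like the following. This is a sketch: the exact variable name depends on the init script shipped by the package, so check the file before editing.

```shell
# /etc/conf.d/consul -- sketch; the variable name may differ in your init script
consul_opts="-ui-dir /srv/consul-ui/dist"
```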


== Configuration ==

<pre>
Consul v0.5.0
Consul Protocol: 2 (Understands back to: 1)
</pre>

By default, there are 4 files in /etc/consul:
<pre>
acl.json.sample      encrypt.json.sample  server.json          tls.json.sample
</pre>
=== Consul Server ===


The default consul configuration is for a standalone server...

It is important to bootstrap the cluster properly before starting agents, otherwise agents will never find the consul leader!

As in our setup we want as many PXE servers (in stand-by mode) as consul servers, installing a consul server is done by a script performing:
# whoami on the LAN ?
# find IP of boot server (is consul leader)
# preconfigure OS
# install package from stable
# install package from edge
# rsync data
# install experimental consul package
# start consul as server
# untie from boot server
# get ENV from consul
# configure networking
# start networking
# start sshd
# (restart consul?)
# configure dnsmasq
# start dnsmasq
# configure ntpd
# start ntpd
# register time.service.consul
# configure collectd
# start collectd
# configure dhcpd
# configure pxelinux
# start in.tftp
# configure NFS
# start NFS
# configure darkhttpd
# start darkhttpd
# register repo.service.consul
# register pxe.service.consul
 
dhcpd will be started manually for now, but later it will be started by the consul leader election.
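Registering a service like time.service.consul in the steps above follows the same pattern used later for pxe: drop a JSON definition in /etc/consul and reload. A sketch, where the file name and the port are illustrative assumptions:

```shell
# register the NTP daemon as service "time" (port 123 assumed)
echo '{"service": {"name": "time", "port": 123}}' > /etc/consul/time.json
consul reload
```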


=== DNSmasq with consul ===


<pre>
NIC="eth0" # PXEserver
NIC="lan"  # generic machine
apk add dnsmasq
rc-update add dnsmasq
mkdir -p /etc/dnsmasq.d
echo "server=/consul/127.0.0.1#8600" > /etc/dnsmasq.d/10-consul
cat <<EOF > /etc/conf.d/dnsmasq
MASQ_OPTS="--user=dnsmasq --group=dnsmasq --interface=${NIC} --conf-dir=/etc/dnsmasq.d"
EOF
rc-service dnsmasq start
DNS1=resolver.service.consul # need an IP here, not a fqdn...
DNS2=resolver.service.consul # should be different due to round-robin
echo "resolv-file=/etc/resolv.dnsmasq" > /etc/dnsmasq.d/20-resolv
echo "nameserver ${DNS1}" > /etc/resolv.dnsmasq
echo "nameserver ${DNS2}" >> /etc/resolv.dnsmasq
rc-service dnsmasq restart
</pre>


=== Consul Agent ===
We need to launch the consul service after being able to read the IP address of the PXE server involved.<br/>
This particular address will be used to join the consul cluster as an agent.
# whoami on the LAN ?
# find IP of boot server (is consul leader)
# preconfigure OS
# install package from stable
# install package from edge
# install experimental consul package
# get standard agent config
# start consul as agent
# join consul cluster
# untie from boot server
# get ENV from consul
# configure dnsmasq
# restart dnsmasq
# configure collectd
# start collectd


Whenever possible, a service will be referenced by its name (like collectd.service.consul for the collectd server).<br/>
Consul agent and dnsmasq are the first two services to be started on each machine.


== Bootstrap the consul cluster ==

This is now included in our generic default apkovl for PXE boot and in our custom LXC creation script.

If an agent (or server) has several private IP addresses, one has to specify the address to be used in /etc/conf.d/consul (with -bind=<IP>). That address has to be reachable by all consul members!

The idea is to launch the 3 first servers while lying to them about the first consul server, because we do not want to switch PXE production over at the same time we are introducing consul...

=== Securing consul ===

We will follow coredumb's recommendations as in https://www.mauras.ch/securing-consul.html, based on the sample config provided in the package.

With gossip encryption enabled, we observe some flapping in node presence...
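To script the -bind requirement mentioned above, a boot script can extract the first IPv4 address of the LAN interface. A minimal sketch: pick_bind_ip is a hypothetical helper that just parses `ip -4 addr show` output read on stdin, and the interface name is illustrative.

```shell
# pick_bind_ip: read "ip -4 addr show <iface>" output on stdin,
# print the first IPv4 address (without the /prefix length)
pick_bind_ip() {
    awk '/inet /{split($2, a, "/"); print a[1]; exit}'
}

# on a real host (illustrative interface name):
#   BIND_IP=$(ip -4 addr show lan | pick_bind_ip)
# then pass -bind=$BIND_IP to consul via /etc/conf.d/consul
```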


== Usage ==


Be sure to have the http_proxy environment variable unset if you want to use curl against localhost...

How to list services, or to interact with the k/v store?


=== Machines discovery ===
List machines = <pre>consul members</pre> or <pre>curl localhost:8500/v1/catalog/nodes</pre>
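The `consul members` output above is easy to post-process from ash. For example, a hypothetical helper that keeps only the names of alive nodes, assuming the consul 0.5 column layout (Node, Address, Status, ...):

```shell
# list_alive: read "consul members" output on stdin,
# print the node names whose Status column is "alive"
list_alive() {
    awk 'NR > 1 && $3 == "alive" {print $1}'
}

# usage on a real host:
#   consul members | list_alive
```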


list PXEservers = servers in "consul members"

get consul leader = ???

=== Services discovery ===


Register service = <pre>echo '{"service": {"name": "pxe"}}' > /etc/consul/pxe.json ; consul reload</pre>
List services : <pre>
curl http://localhost:8500/v1/agent/services : Returns the services the local agent is managing
curl http://localhost:8500/v1/catalog/services : Lists services in a given DC
curl http://localhost:8500/v1/catalog/service/<service> : Lists the nodes in a given service
curl http://localhost:8500/v1/catalog/node/<node> : Lists the services provided by a node
</pre>
For instance: <pre>
http_proxy= curl http://localhost:8500/v1/catalog/services?pretty
</pre>
==== What if ? ====
What if I register a service service1 with tag tag1 on node node1,<br/>
then I register a service service1 with tag tag2 on node2,<br/>
then I register a service service1 with tag tag3 on node1?<br/>
Will the tags just add up?<br/>
''NO, the latest registration overwrites the previous one.''

I plan to advertise available NBDs with {"service": "nbd", "tags": ["<name>"]} and access them with nbd:name.nbd.service.consul:name.<br/>
''But before that, we need to use the new nbd format with names on a unique port, instead of differentiation by port (the old scheme).''

Is there a clean way to dynamically add a tag to a service on a given consul node without losing existing tags?
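Given that the latest registration overwrites the previous one, the only safe approach seems to be re-registering the service with the complete, merged tag list each time. A sketch following the same pxe.json pattern as above (file and tag names are illustrative):

```shell
# re-register "nbd" with the full, merged tag list
echo '{"service": {"name": "nbd", "tags": ["disk1", "disk2"]}}' > /etc/consul/nbd.json
consul reload
```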


=== Checks ===
=== Key/Value storage ===


<pre>
curl -X PUT -d 'value' http://localhost:8500/v1/kv/path/to/key
</pre>
 
<pre>
curl http://localhost:8500/v1/kv/path/to/key
</pre>
 
<pre>
curl -X DELETE http://localhost:8500/v1/kv/path/to/key?recurse
</pre>
 
'''We need a standard function to convert from JSON to ash variables and vice versa'''...
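As a starting point for that conversion helper, here is a deliberately naive sketch. It only handles flat, one-level JSON objects without escaped quotes, nested values, or commas inside values; json_to_ash is a hypothetical name.

```shell
# json_to_ash: convert a flat JSON object like {"Key":"value"} into
# ash-style key=value lines; breaks on nesting, escapes and embedded commas
json_to_ash() {
    tr -d '{}' | tr ',' '\n' | \
    sed -n 's/^[[:space:]]*"\([A-Za-z_][A-Za-z0-9_]*\)"[[:space:]]*:[[:space:]]*"\{0,1\}\([^"]*\)"\{0,1\}$/\1=\2/p'
}
```

For anything beyond the flat parts of an API response (e.g. `curl -s http://localhost:8500/v1/agent/self`), a real JSON parser is the better design choice.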


=== Events ===

==== Define event ====

==== Watch event ====
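Defining and watching a custom event can be sketched with the consul CLI. The event name and handler path are illustrative; check `consul event --help` and `consul watch --help` for the flags supported by the installed version.

```shell
# fire a custom user event called "deploy"
consul event -name deploy

# run a handler script every time a "deploy" event is received
consul watch -type event -name deploy /usr/local/bin/on-deploy.sh
```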


=== Triggers ===
=== Open questions ===
* How to know (from any node in the cluster) which is the active leader?
** an alternative is to define an application-level leader as in https://consul.io/docs/guides/leader-election.html
= envconsul =
I tried to build a package for it, but without success so far...
= consul-template =

Latest revision as of 01:52, 28 August 2023 (last edited by Zcrayfish on 28 Aug 2023).

This material is work-in-progress ...

Do not follow instructions here until this notice is removed.