User talk:Jch

User_talk:Jch/How to automate KVM creation

User_talk:Jch/Starting_AL_from_network

User_talk:Jch/Building_a_complete_infrastucture_with_AL

About NFS

NFS is now working with AL, both as server and client, with the nfs-utils package.
However, using NFS as a client inside an LXC container does not seem to work yet, as shown below:

nfstest:~# mount -t nfs -o ro 192.168.1.149:/srv/boot/alpine /mnt
mount.nfs: Operation not permitted
mount: permission denied (are you root?)
nfstest:~# tail /var/log/messages 
Apr  4 10:05:59 nfstest daemon.notice rpc.statd[431]: Version 1.3.1 starting
Apr  4 10:05:59 nfstest daemon.warn rpc.statd[431]: Flags: TI-RPC 
Apr  4 10:05:59 nfstest daemon.warn rpc.statd[431]: Failed to read /var/lib/nfs/state: Address in use
Apr  4 10:05:59 nfstest daemon.notice rpc.statd[431]: Initializing NSM state
Apr  4 10:05:59 nfstest daemon.warn rpc.statd[431]: Failed to write NSM state number: Operation not permitted
Apr  4 10:05:59 nfstest daemon.warn rpc.statd[431]: Running as root.  chown /var/lib/nfs to choose different user
nfstest:~# ls -l /var/lib/nfs
total 12
-rw-r--r--    1 root     root             0 Nov 10 15:43 etab
-rw-r--r--    1 root     root             0 Nov 10 15:43 rmtab
drwx------    2 nobody   root          4096 Apr  4 10:05 sm
drwx------    2 nobody   root          4096 Apr  4 10:05 sm.bak
-rw-r--r--    1 root     root             4 Apr  4 10:05 state
-rw-r--r--    1 root     root             0 Nov 10 15:43 xtab

Message from ncopa: "dmesg should tell you that grsecurity tries to prevent you from doing this.

grsecurity does not permit the mount syscall from within a chroot, since that is a way to break out of a chroot. This affects LXC containers too.

I would recommend that you do the mounting from the LXC host, in the container config, with lxc.mount.entry or similar.

https://linuxcontainers.org/lxc/manpages/man5/lxc.container.conf.5.html#lbAR

If you still want to disable mount protection in grsecurity, then you can do that with: echo 0 > /proc/sys/kernel/grsecurity/chroot_deny_mount"

This is not working with

lxc.mount.entry=nfsserver:/srv/boot/alpine mnt nfs nosuid,intr 0 0

on the host machine, even with all NFS modules and helper software installed and loaded.

backend:~# lxc-start -n nfstest
lxc-start: conf.c: mount_entry: 2049 Invalid argument - failed to mount
'nfsserver:/srv/boot/alpine' on '/usr/lib/lxc/rootfs/mnt'
lxc-start: conf.c: lxc_setup: 4163 failed to setup the mount entries for
'nfstest'
lxc-start: start.c: do_start: 688 failed to setup the container
lxc-start: sync.c: __sync_wait: 51 invalid sequence number 1. expected 2
lxc-start: start.c: __lxc_start: 1080 failed to spawn 'nfstest'

Nor with

echo 0 > /proc/sys/kernel/grsecurity/chroot_deny_mount

on the host machine, with all NFS modules and helper software installed and loaded.
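
A workaround worth trying (untested here; the mount point is hypothetical) is to mount the NFS export on the host and bind-mount it into the container, so no mount syscall happens inside the chroot:

# on the host: mount the NFS export once
mkdir -p /srv/nfs/alpine
mount -t nfs -o ro 192.168.1.149:/srv/boot/alpine /srv/nfs/alpine
# in the container config: bind-mount the already-mounted directory instead of NFS
lxc.mount.entry=/srv/nfs/alpine mnt none bind,ro 0 0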

Finding a proper way to use NFS shares from AL LXC containers is an important topic, in order to be able, for instance, to load-balance web servers sharing content uploaded by users.

The next step will be to have HA for the NFS server itself (with only AL machines).

About NBD

NBD is now in edge/testing thanks to clandmeter.

I cannot test it properly at the moment because all the machines are busy in production, and this package supports the newstyle protocol only. I'm waiting for my new lab machine...

We still miss xnbd for its proxy features allowing live migration. We are very excited by xnbd's capabilities!
We will be avid testers!

We are also still looking for the right solution to back up an NBD as a whole (versus by its content) while in use. dd | nc is the way used nowadays.
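
A minimal sketch of that dd | nc approach, with hypothetical device, host and port names (BusyBox nc syntax):

# on the backup host: listen and write the incoming stream to an image file
nc -l -p 9000 > /backup/nbd0.img
# on the machine holding the block device: stream it over the network
dd if=/dev/nbd0 bs=1M | nc backuphost 9000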

New lab machine

Very soon, I will receive a brand new lab machine.

I plan to use lxc in qemu (KVM) in qemu (yes, twice!) to simulate a rack of servers running AL.

There will be 8 first-level KVMs: a firewall, a router, storage nodes and compute nodes.

OpenVSwitch (OVS) will be used to simulate the networks (isp, internet, lan, storage, wan, ipmi).
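
For instance, a minimal sketch of creating one OVS bridge per simulated network (bridge names taken from the list above):

# create one OVS bridge per simulated network
for net in isp internet lan storage wan ipmi; do
    ovs-vsctl add-br $net
done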

The first-level KVMs will receive block devices (BD) as logical volumes (LV) in LVM2 on top of an mdadm raid array composed of the physical hard disk drives.
They will assemble the received BDs with mdadm and pass the raw raid as a single BD to the second-level SAN KVMs. Those SANs will use LVM2 to publish LVs as NBD on the OVS "lan".
Some second-level KVMs will mount NBDs to expose NFS shares.
Others will mount NBDs and NFS for real data access with containers (LXC) and expose services on the OVS "wan" or "lan".
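
A rough sketch of that SAN layer, assuming the edge/testing nbd package and hypothetical volume and export names:

# on a second-level SAN KVM: LVM2 on top of the received raw raid device (vda)
pvcreate /dev/vda
vgcreate san /dev/vda
lvcreate -L 50G -n data0 san
# publish the LV as a newstyle NBD export
cat > /etc/nbd-server/config <<EOF
[generic]
[data0]
    exportname = /dev/san/data0
EOF
nbd-server -C /etc/nbd-server/config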

The first second-level KVM to be launched will be a virtual laptop booted from a virtual USB stick. This particular machine will offer a PXEboot environment to the OVS "lan".
The storage and compute nodes will be launched with PXE on the OVS "lan" but will be able to run totally from RAM, with no strings attached to the boot devices (for instance the initial NFS share).

As soon as 1 SAN and 1 compute node are available, the PXEboot server will reproduce itself from the virtual laptop USB stick to the compute node, using the storage node to store the information about the setup; then live-migrate (keeping the status of running machines).

eth0 is always connected to the OVS "lan", except on the firewall (which is connected to the OVS "internet" and "isp").
The router is connected to all OVS but "isp" and "storage".
The storage nodes are connected to OVS "storage".
The compute nodes are connected to OVS "wan".

The DHCP lease is offered with no time limit, after an absence check, on the OVS "lan".
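
A hedged dhcpd.conf fragment for such a PXE setup on the OVS "lan" (all addresses, file names and the MAC address are hypothetical):

subnet 10.0.0.0 netmask 255.255.255.0 {
    range 10.0.0.100 10.0.0.200;
    next-server 10.0.0.1;          # the PXEboot server
    filename "pxelinux.0";
}
# one host entry per node, keyed by the hwaddr that also serves as our UUID
host compute01 {
    hardware ethernet 52:54:00:12:34:56;
    fixed-address 10.0.0.101;
}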

As a matter of fact, the only difference between a first-level and a second-level KVM is that the first sees sda and the second sees vda.

All machines run a consul instance.
The PXEboot server is a fixed, known consul server guaranteed to be present (otherwise boot doesn't even happen!).
On the first N compute nodes launched, a consul server KVM will be started (configured to reach a quorum of N) to replace the standard consul client.
As the state of a running cluster is always kept in the PXEboot server, this capacity is present in all consul servers but active only on the current consul leader.
We need to link or maintain the PXE configuration and bootstrap files (including the relevant apkovl) in the consul key/value datastore to benefit from its resilience.
We need to hack lbu commit to push the resulting apkovl to all consul servers (as they are also stand-by copies of the consul leader); see the sketch below.
Each consul election needs to enforce the consul leader as the active PXE server.
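
A minimal sketch of that lbu hook idea, with hypothetical hostnames and paths:

# package the local configuration and push the apkovl to every consul server
lbu package /tmp/$(hostname).apkovl.tar.gz
for srv in pxe1 pxe2 pxe3; do
    scp /tmp/$(hostname).apkovl.tar.gz $srv:/srv/pxe/apkovl/
done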

In the real rack, at this stage, we just switch machines on, connected to the right switches, after checking that they will boot through PXE on the first NIC (eth0).
In our simulator, we can manually start a KVM as a fake physical machine (sda) or have a script on the real physical lab machine driving the life cycle of those KVMs.

About consul

nothing yet but big hopes ^^
I'm lurking IRC about it ;)

We plan to use its dynamic DNS feature, its hosts listing, services inventory, events, k/v store...
and even semi-high-availability for our PXE infrastructure, the consul leader being the active PXE server and the other consul servers being dormant PXE servers.
All config scripts will be adapted to pull values out of the consul k/v datastore, based on profiles found in consul's various lists (see the sketch after this paragraph).
As the key for dhcpd and PXEboot is the hwaddr, it will become our UUID for the LAN and for consul too.
We are very excited by consul's capabilities!
We will be avid testers!
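
A minimal sketch of such a config script, assuming consul's HTTP k/v API on localhost and a hypothetical profiles/<hwaddr> key layout:

# pull this host's profile out of the consul k/v store, keyed by its hwaddr
HWADDR=$(cat /sys/class/net/eth0/address)
PROFILE=$(curl -s "http://localhost:8500/v1/kv/profiles/${HWADDR}?raw")
echo "profile for ${HWADDR}: ${PROFILE}"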

Open questions

  1. What memory footprint is needed?
  2. What about dynamically adapting the quorum size?
  3. Can checks be used as triggers? (see the sketch after this list)
    • consul watch -type <type> -name <name> /path/to/executable
    • consul event [options] -name <name> [payload]
  4. What is the best practice for storing /etc configurations?
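
For question 3, a hedged sketch of wiring an event watch to an executable handler (the event name and handler path are hypothetical):

# on every node: run a handler whenever the "reload-pxe" event fires
consul watch -type event -name reload-pxe /usr/local/bin/reload-pxe.sh &
# anywhere in the cluster: fire the event, with an optional payload
consul event -name reload-pxe "new apkovl pushed"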

envconsul

Seems a very interesting feature!
Hope to see it packaged as soon as consul is ;)

consul-template

Seems a very interesting feature!
Hope to see it packaged as soon as consul is ;)

About CEPH

CEPH is supposed to solve the problem of high availability for the data stores, be they block devices (disks) or character devices (files).

The current situation is not satisfactory.

We are very excited by CEPH's capabilities!
We will be avid testers!

About Docker

Not a lot of information on the Docker page yet...