Talk:LXC: Difference between revisions

From Alpine Linux
(→‎Alternative Network Setup: got macvlan bridge mode working)
(Add unprivileged containers with shadow-uidmap)
(5 intermediate revisions by 3 users not shown)
Line 2: Line 2:
= Alternative Network Setup =
= Alternative Network Setup =


These are notes on macvlan on a box with real vlans.  The goal here is to have the host on a management vlan, and several guests each on other vlans.  There's no need for the host to talk to the guests.  I wanted to try to see if the guest could get dhcp addresses. Something like this:
These are notes on macvlan on a box with real vlans.  The goal here is to have the host on a management vlan, and several guests each on other vlans.  There's no need for the host to talk to the guests.  The host resides on the "OOB" network, and if the host needs to talk to a guest, it does so with lxc-console, like having a KVM.  Each guest should get its address from the DHCP server on the appropriate vlan. Something like this:


Setup:
Setup:
Line 17: Line 17:
|-
|-
| guest3
| guest3
| dhcp (different address) on vlan 64
| dhcp on vlan 64 (different address)
|}
|}


Line 45: Line 45:
* Here's /etc/lxc/lxc.conf
* Here's /etc/lxc/lxc.conf
  lxc.network.type  =  macvlan
  lxc.network.macvlan.mode = bridge # allow guests on the same vlan to see each other
  lxc.network.link  =  eth0.65
  lxc.network.name  =  eth0
  # lxc.network.flags  =  up       # We will bring the interface up inside the container
  # lxc.network.ipv4  =  0.0.0.0  # We are going to do dhcp later

  lxc.network.type  =  macvlan
  # Allow guests on the same vlan to see each other
  lxc.network.macvlan.mode = bridge
  lxc.network.link  =  eth0.65
  lxc.network.name  =  eth0
  # lxc.network.hwaddr = de:ad:be:ef:c0:00    # macvlan will make one up, but possible if wanted
  # lxc.network.flags  =  up                 # Do NOT bring up the interface, we will do so within the container
  # lxc.network.ipv4  =  0.0.0.0           # Do NOT assign an address, we do so within the container

  # Capabilities to drop (for instance, to stop the guest from mounting sys)
  # Taken from http://sourceforge.net/mailarchive/message.php?msg_id=28285704
  # sys_boot is not listed here, as it causes problems when the host tries to stop the guest
  # If you trust the guest, then you can get by without dropping capabilities

  lxc.cap.drop= sys_admin audit_control audit_write fsetid ipc_lock
  lxc.cap.drop= ipc_owner lease linux_immutable mac_admin mac_override mknod setfcap
  lxc.cap.drop= setpcap sys_module sys_nice sys_pacct sys_ptrace sys_rawio
  lxc.cap.drop= sys_tty_config sys_time
* Create the guests
* Create the guests
  for a in `seq 1 3`; do  
  for a in `seq 1 3`; do  
Line 57: Line 70:
* vi /var/lib/lxc/guest2/config
* vi /var/lib/lxc/guest2/config
   change lxc.network.link to eth0.129
   change lxc.network.link to eth0.129
* Start and enter the first guest (this is where the fun starts)
* Start and enter the first guest (this is where the fun starts)
  /etc/init.d/lxc.guest1 start
  /etc/init.d/lxc.guest1 start
Line 64: Line 76:
=== Fun inside the guest ===
=== Fun inside the guest ===


* The /etc/network/interfaces file is already set up for dhcp, so let's just restart networking:
  guest1:~# /etc/init.d/networking restart
  * Stopping networking ...
  *  eth0 ...
  cat: can't open '/var/run/udhcpc.eth0.pid': No such file or directory
  ifdown: warning: no dhcp clients found and stopped  [ !! ]
  * Starting networking ...
  *   eth0 ...
  cat: can't open '/sys/class/net/eth0/ifindex': No such file or directory
  /usr/share/udhcpc/default.script: line 125: arithmetic syntax error
  /usr/share/udhcpc/default.script: line 125: arithmetic syntax error
* But.. lookie there... we do have a real ip address.
* The reason for the syntax errors is we don't have sys/class/net mounted... So let's mount it and try again....
  guest1:~# mount -t sysfs none /sys
  guest1:~# /etc/init.d/networking restart
  * Stopping networking ...
  *  eth0 ...  [ ok ]
  * Starting networking ...
  *  eth0 ...  [ ok ]
  guest1:~#
* We just opened ourselves to a [http://blog.bofh.it/debian/id_413 world of hurt].  But more on that later
* Let's see if we can make this 'just work'.  We're going to do some weird things, don't worry... it's not standard
  guest1:~# rc-update add networking
  guest1:~# echo "sysfs /sys sysfs auto,defaults 0 0" >>/etc/fstab
  guest1:~# cat - << EOF >/etc/network/interfaces
    #auto lo
    iface lo inet loopback

    auto eth0
    iface eth0 inet dhcp
        pre-up /bin/mount -a
        hostname guest1
    EOF
  ctrl-a q
  lxchost# /etc/init.d/lxc.guest1 restart
  lxchost# lxc-console -n guest1
* We have Networking!
* Repeat the configuration for guest2 and 3

* /dev/null is currently created as a regular file
* /dev/zero doesn't exist

To create these, do the following from ''the host'':

<pre>
rm -f /var/lib/lxc/[guest-name]/rootfs/dev/null
rm -f /var/lib/lxc/[guest-name]/rootfs/dev/zero
mknod -m 666 /var/lib/lxc/[guest-name]/rootfs/dev/zero c 1 5
mknod -m 666 /var/lib/lxc/[guest-name]/rootfs/dev/null c 1 3
</pre>

We do this in the host because our default config drops the mknod capability in the guest.

=== What Works, What Doesn't ===
=== What Works, What Doesn't ===
* Pro
* Pro
** Each guest has its own mac address
** Each guest has its own mac address
** Can ping from one guest to another
** Network connectivity between each guest  
** No communication allowed between host and guests (this is a plus in our case - management vlan != user vlan)
** No communication allowed between host and guests (this is a plus in our case - management vlan != user vlan)
** If iptables modules are loaded in the host, each guest can create its own iptables rules (awall for all! sweet)
* Con
* Con
** Real /sys is mounted - the guest can shut down the host<br />
in guest1: echo /sbin/poweroff > /sys/kernel/uevent_helper<br />
in host: /etc/init.d/lxc.guest1 stop
** No communication allowed between host and guests because we are not using a bridge interface (this is a plus in our case - management vlan != user vlan)

== About lxc-attach ==
 
I cannot connect to any AL LXC build under AL... the response is always <pre>
infra:~# lxc-attach --name=git -- "ps ax"
lxc_container: attach.c: lxc_attach_to_ns: 196 Operation not permitted - failed to set namespace 'pid'
lxc_container: attach.c: lxc_attach: 844 failed to enter the namespace
</pre>
What could I have done wrong?<br/>
Or is it a bug in AL LXC?
 
 
=== Update about lxc-attach ===
 
'''LXC-host: lxc-attach fails with "lxc_attach_to_ns: 270 Operation not permitted - failed to set namespace 'pid'"'''
 
'''Issue:''' lxc-attach fails when you try to run it, and "use of CAP_SYS_ADMIN in chroot denied for /usr/bin/lxc-attach" appears in dmesg.<br/>
'''Cause:''' This issue is due to a grsecurity restriction on the LXC host.<br/>
'''Workaround:''' Add the following settings to your sysctl.conf file:<br/>
<pre>
kernel.grsecurity.chroot_caps=0
kernel.grsecurity.chroot_deny_chmod=0
</pre>
 
sysctl.conf is only read at boot, so an LXC host that has not been rebooted since the settings were added will not have them loaded yet.
To apply them immediately:
 
<pre>
echo 0 > /proc/sys/kernel/grsecurity/chroot_caps
echo 0 > /proc/sys/kernel/grsecurity/chroot_deny_chroot
</pre>
 
or simply run:
 
<pre>sysctl -p</pre>
 
== Unprivileged containers ==
 
To use unprivileged containers, install shadow-uidmap and add 'name:100000:65536' (where name is the user that runs the containers) to both /etc/subuid and /etc/subgid. Otherwise lxc-create fails with errors like:
 
unshare: Operation not permitted
read pipe: Permission denied
lxc-create: lxccontainer.c: do_create_container_dir: 985 Failed to chown container dir
lxc-create: tools/lxc_create.c: main: 318 Error creating container test
 
[[User:Pickfire|Pickfire]] ([[User talk:Pickfire|talk]]) 16:05, 23 February 2017 (UTC)

Revision as of 16:05, 23 February 2017

Alternative Network Setup

These are notes on macvlan on a box with real vlans. The goal here is to have the host on a management vlan, and several guests each on other vlans. There's no need for the host to talk to the guests. The host resides on the "OOB" network, and if the host needs to talk to a guest, it does so with lxc-console, like having a KVM. Each guest should get its address from the DHCP server on the appropriate vlan. Something like this:

Setup:

host dhcp on vlan 8
guest1 dhcp on vlan 64
guest2 dhcp on vlan 129
guest3 dhcp on vlan 64 (different address)
  • Host's /etc/network/interfaces file
auto lo
iface lo inet loopback
 
# MGMT vlan
auto eth0.8
iface eth0.8 inet dhcp
     hostname lxchost

# USR vlan - we bring it up, but don't assign an address
auto eth0.65
iface eth0.65 inet manual
   up ip link set $IFACE addr de:ad:be:ef:ca:fe
   up ip link set $IFACE up
   down ip link set $IFACE down

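# VoIP vlan - we bring it up, but don't assign an address
# The first octet of the address must be even: an odd value such as 0f sets the
# multicast bit, which is not valid for an interface address
auto eth0.129
iface eth0.129 inet manual
   up ip link set $IFACE addr 0e:f1:ce:c0:ff:ee
   up ip link set $IFACE up
   down ip link set $IFACE down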
  • Here's /etc/lxc/lxc.conf
lxc.network.type   =   macvlan
# Allow guests on the same vlan to see each other                                   
lxc.network.macvlan.mode = bridge                                                    
lxc.network.link   =   eth0.65                     
lxc.network.name   =   eth0                                                                   
# lxc.network.hwaddr = de:ad:be:ef:c0:00    # macvlan will make one up, but possible if wanted                 
# lxc.network.flags  =   up                 # Do NOT bring up the interface, we will do so within the container
# lxc.network.ipv4   =   0.0.0.0            # Do NOT assign an address, we do so within the container          
                                                                                                     
# Capabilities to drop (for instance, to stop the guest from mounting sys)   
# Taken from http://sourceforge.net/mailarchive/message.php?msg_id=28285704  
# sys_boot is not listed here, as it causes problems when the host tries to stop the guest

# If you trust the guest, then you can get by without dropping capabilities
                                                                                  
lxc.cap.drop= sys_admin audit_control audit_write fsetid ipc_lock                 
lxc.cap.drop= ipc_owner lease linux_immutable mac_admin mac_override mknod setfcap
lxc.cap.drop= setpcap sys_module sys_nice sys_pacct sys_ptrace sys_rawio
lxc.cap.drop= sys_tty_config sys_time  
  • Create the guests
for a in `seq 1 3`; do 
  lxc-create -n guest${a} -f /etc/lxc/lxc.conf -t alpine
  ln -s /etc/init.d/lxc /etc/init.d/lxc.guest${a}
done
  • vi /var/lib/lxc/guest2/config
  change lxc.network.link to eth0.129
  • Start and enter the first guest (this is where the fun starts)
/etc/init.d/lxc.guest1 start
lxc-console -n guest1
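
The guest2 step above (editing lxc.network.link by hand in vi) could also be scripted. This is only a sketch: set_guest_link is a made-up helper name, and it assumes the config contains a single uncommented lxc.network.link line, as the lxc.conf shown above does.

```shell
# Hypothetical helper (not part of LXC): point a guest config at another
# VLAN interface by rewriting the value of its lxc.network.link line.
# Usage: set_guest_link /var/lib/lxc/guest2/config eth0.129
set_guest_link() {
    config=$1
    iface=$2
    # Keep the key and the spacing around "=", replace only the value.
    # Commented-out lines (starting with #) are left untouched.
    sed -i "s|^\([[:space:]]*lxc\.network\.link[[:space:]]*=[[:space:]]*\).*|\1${iface}|" "$config"
}
```

Running it as root on the host with the guest2 config path and eth0.129 should have the same effect as the manual edit.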

Fun inside the guest

  • /dev/null is currently created as a regular file
  • /dev/zero doesn't exist

To create these, do the following from the host:

rm -f /var/lib/lxc/[guest-name]/rootfs/dev/null
rm -f /var/lib/lxc/[guest-name]/rootfs/dev/zero
mknod -m 666 /var/lib/lxc/[guest-name]/rootfs/dev/zero c 1 5
mknod -m 666 /var/lib/lxc/[guest-name]/rootfs/dev/null c 1 3

We do this in the host because our default config drops the mknod capability in the guest.
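
To see which guests still need the device nodes fixed, something like the following could be run on the host. It is a rough sketch: is_char_dev is a hypothetical helper, the guest names guest1 to guest3 are the ones used in these notes, and it assumes a stat that supports -c with %t and %T (hex major and minor), as GNU and busybox stat do.

```shell
# Hypothetical helper: succeed only if the path is a character device
# with the given major and minor numbers (given in decimal).
is_char_dev() {
    path=$1
    major=$2
    minor=$3
    [ -c "$path" ] || return 1
    # stat reports the device major:minor in hex, so convert before comparing.
    want=$(printf '%x:%x' "$major" "$minor")
    [ "$(stat -c '%t:%T' "$path")" = "$want" ]
}

# Report guests whose /dev/null (1,3) or /dev/zero (1,5) is missing or wrong.
for guest in guest1 guest2 guest3; do
    dev=/var/lib/lxc/$guest/rootfs/dev
    is_char_dev "$dev/null" 1 3 || echo "$guest: /dev/null needs fixing"
    is_char_dev "$dev/zero" 1 5 || echo "$guest: /dev/zero needs fixing"
done
```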

What Works, What Doesn't

  • Pro
    • Each guest has its own mac address
    • Network connectivity between each guest
    • No communication allowed between host and guests (this is a plus in our case - management vlan != user vlan)
    • If iptables modules are loaded in the host, each guest can create its own iptables rules (awall for all! sweet)
  • Con
    • No communication allowed between host and guests because we are not using a bridge interface (this is a plus in our case - management vlan != user vlan)

About lxc-attach

I cannot connect to any AL LXC build under AL... the response is always

infra:~# lxc-attach --name=git -- "ps ax"
lxc_container: attach.c: lxc_attach_to_ns: 196 Operation not permitted - failed to set namespace 'pid'
lxc_container: attach.c: lxc_attach: 844 failed to enter the namespace

What could I have done wrong?
Or is it a bug in AL LXC?


Update about lxc-attach

LXC-host: lxc-attach fails with "lxc_attach_to_ns: 270 Operation not permitted - failed to set namespace 'pid'"

Issue: lxc-attach fails when you try to run it, and "use of CAP_SYS_ADMIN in chroot denied for /usr/bin/lxc-attach" appears in dmesg.
Cause: This issue is due to a grsecurity restriction on the LXC host.
Workaround: Add the following settings to your sysctl.conf file:

kernel.grsecurity.chroot_caps=0
kernel.grsecurity.chroot_deny_chmod=0

sysctl.conf is only read at boot, so an LXC host that has not been rebooted since the settings were added will not have them loaded yet. To apply them immediately:

echo 0 > /proc/sys/kernel/grsecurity/chroot_caps 
echo 0 > /proc/sys/kernel/grsecurity/chroot_deny_chroot

or simply run:

sysctl -p
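
The dotted names in sysctl.conf and the paths under /proc/sys are the same keys written two ways, with dots replaced by slashes. A small sketch of that mapping, which can help when checking whether a setting is actually loaded; sysctl_to_proc is a made-up helper name, not a standard tool:

```shell
# Hypothetical helper: turn a dotted sysctl key into its /proc/sys path.
sysctl_to_proc() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# Print the live value if the key exists on this kernel.
key=kernel.grsecurity.chroot_caps
path=$(sysctl_to_proc "$key")
if [ -r "$path" ]; then
    echo "$key = $(cat "$path")"
else
    echo "$key is not available on this kernel"
fi
```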

Unprivileged containers

To use unprivileged containers, install shadow-uidmap and add 'name:100000:65536' (where name is the user that runs the containers) to both /etc/subuid and /etc/subgid. Otherwise lxc-create fails with errors like:

unshare: Operation not permitted
read pipe: Permission denied
lxc-create: lxccontainer.c: do_create_container_dir: 985 Failed to chown container dir
lxc-create: tools/lxc_create.c: main: 318 Error creating container test

Pickfire (talk) 16:05, 23 February 2017 (UTC)
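
The subuid/subgid step in the note above could be done along these lines. This is a sketch only: add_subid is a hypothetical helper, and "name" stands for whichever user will run the containers. It appends the range only if that user has no entry in the file yet.

```shell
# Hypothetical helper: add "name:start:count" to a subuid/subgid file
# unless the user already has a range there.
add_subid() {
    file=$1
    name=$2
    start=$3
    count=$4
    # An existing entry for this user is left as it is.
    grep -q "^${name}:" "$file" 2>/dev/null && return 0
    printf '%s:%s:%s\n' "$name" "$start" "$count" >> "$file"
}

# As root, with "name" replaced by the real user name:
# add_subid /etc/subuid name 100000 65536
# add_subid /etc/subgid name 100000 65536
```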