User:Darkfader/distcc
Revision as of 09:31, 3 March 2026
Hi,
I'm preparing this page. It can take a long time till I finish. If you also want to write on this topic, feel free to integrate the content.
goal
to describe a working setup for building aports in the easiest/fastest fashion. not planning to add versatility or features where it would make the setup more error-prone.
audience
people who run software builds on alpine and have multiple computers
installation
you need, on each host
distcc
distccd-openrc
you also need stuff to do compiles
alpine sdk
clang
binutils
...
elfutils(-dev)
settings for distcc
there's /etc/default/distcc
there's /etc/conf.d/distcc
make all your settings here
command_whitelist.sh
this is half functional, you need to set things here but you also need to maintain the symlinks that are collected under /usr/lib/distcc (for your compilers) and /usr/lib/distcc/bin (for itself)
you MUST run the script to update the compilers!
distcc hosts file
idk about that thing it's odd
settings for aports
detail infos
hosts syntax
- myhost otherhost
- myhost,cpp,lzo myotherhost,cpp,lzo
the host
hostname/ip
localhost
127.0.0.1
::1 - does not work
protocol
- no protocol given
- ,cpp,lzo protocol
cpp implies lzo, it requires compression, even if you have 10gbit/s or more, it's just hardcoded
threads
/number of workers
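To make the hosts syntax above a bit more concrete, here is a minimal sketch that splits a hosts line into host, worker limit and options. It is only an illustration, not distcc's own parser: `HostSpec` and `parse_hosts` are made-up names, and it assumes entries of the form host[/limit][,option...]. Ports, ssh-style entries and anything else are not handled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class HostSpec:
    """One entry of a hosts line (hypothetical helper type)."""
    host: str
    limit: Optional[int] = None                        # number after "/", None if absent
    options: List[str] = field(default_factory=list)   # e.g. ["cpp", "lzo"]


def parse_hosts(line: str) -> List[HostSpec]:
    """Split a whitespace-separated hosts line into HostSpec entries.

    Simplified sketch: a ":port" part, if present, simply stays
    inside the host string.
    """
    specs = []
    for entry in line.split():
        # Everything after the first comma is a list of options.
        head, *options = entry.split(",")
        # "myhost/4" -> host "myhost", limit "4"; no "/" -> empty limit.
        host, _, limit = head.partition("/")
        specs.append(HostSpec(host, int(limit) if limit else None, options))
    return specs
```

For example, `parse_hosts("myhost/4,cpp,lzo otherhost")` gives one entry for myhost with a limit of 4 and the options cpp and lzo, and one entry for otherhost with no limit and no options.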
architecture
it can handle C, C++, ObjC and some other stuff
- what happens with normal xmit
- what happens with pump mode
- at which step the include server is used and how it collects the includes
distribution algorithm
honestly I simply don't get it
- The order matters
- The number of threads matters
localhost
- localhost precedence
- localhost fallback
variable: DISTCC_FALLBACK
0 = Fail to compile if it would need to fall back to a normal local gcc call
1 = If remote compile fails, just do it yourself
Latency
Operation
startup and shutdown
service distcc stop is not entirely reliable: it can take a minute after the stop until the processes are gone, and sometimes it will never stop.
this is very bad with openrc: the openrc script returns after a second and only relies on its service flags, not the process status.
manually check after stopping, wait a minute, and if needed, kill it all.
at some point the rc file needs to be rewritten, it can't stay like it is.
if you used a pump mode session, that also needs a logout (pump --shutdown).
avoid running multiple startups without shutdown in one session. it's safe as far as I can tell, but nothing cleans up these processes.
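The "check after stopping, wait a minute" step could be scripted roughly like this. It is a sketch under assumptions: the daemon process is assumed to be named distccd, pgrep is assumed to be installed, and `distccd_running` and `wait_until_stopped` are made-up helper names. It only reports whether the processes went away within the timeout; killing leftovers is left to you.

```python
import subprocess
import time


def distccd_running() -> bool:
    """True if pgrep finds a process named exactly 'distccd' (assumed name)."""
    result = subprocess.run(["pgrep", "-x", "distccd"],
                            stdout=subprocess.DEVNULL)
    return result.returncode == 0


def wait_until_stopped(is_running=distccd_running, timeout=60.0,
                       interval=1.0, sleep=time.sleep) -> bool:
    """Poll until is_running() returns False.

    Returns True once the processes are gone, False if they are
    still there after roughly `timeout` seconds.
    """
    waited = 0.0
    while is_running():
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
    return True
```

Run the stop command first, then call `wait_until_stopped()`. If it returns False, the processes are still around and need manual cleanup.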
ccache and memcached
ccache is said to conflict with pump mode unless you call it in the backend.
so, where you start the compile, you don't use it
where the compile happens, you use it
they can share the cache via memcached, this is a nice trick for consistency
dockerized / native
it remains mostly the same, a container needs to make sure it monitors the right services (distccd, nginx, include_server).
if you're using zeroconf, you need to somehow expose the mdns service broadcasts & reception
troubleshooting / analysis
testing
1. turn off fallback
2. do example compiles
2.1 code example C
with included header
2.2 code example C++
with included header
2.3 code example ObjC
with included header
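A rough sketch of steps 1 and 2.1 (fallback off, C example with an included header). The file names test.c and greeting.h, the helper names and the choice of gcc as the compiler are all assumptions for illustration. The script writes the two files into a temp directory and, if distcc is on the PATH, compiles with DISTCC_FALLBACK=0 so a failed remote compile shows up as an error instead of silently building locally.

```python
import os
import shutil
import subprocess
import tempfile
from pathlib import Path

# Tiny test project: one header, one C file that includes it.
HEADER = '#define GREETING "hello from the distcc test"\n'
SOURCE = (
    '#include <stdio.h>\n'
    '#include "greeting.h"\n'
    '\n'
    'int main(void) {\n'
    '    puts(GREETING);\n'
    '    return 0;\n'
    '}\n'
)


def write_test_project(directory) -> Path:
    """Write greeting.h and test.c into directory, return the path of test.c."""
    directory = Path(directory)
    (directory / "greeting.h").write_text(HEADER)
    source = directory / "test.c"
    source.write_text(SOURCE)
    return source


def compile_without_fallback(source: Path, compiler: str = "gcc"):
    """Compile source to an object file via distcc with fallback turned off."""
    env = dict(os.environ, DISTCC_FALLBACK="0")
    cmd = ["distcc", compiler, "-c", source.name, "-o", source.stem + ".o"]
    return subprocess.run(cmd, env=env, cwd=source.parent)


if __name__ == "__main__":
    src = write_test_project(tempfile.mkdtemp(prefix="distcc-test-"))
    if shutil.which("distcc"):
        result = compile_without_fallback(src)
        print("exit status:", result.returncode)
    else:
        print("distcc not found, files written to", src.parent)
```

The same pattern should carry over to the C++ and ObjC cases by swapping the source text and the compiler name.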
security
the security model consists of ip restrictions. there seems to also be some GSSAPI user auth.
further, commands that can be called are restricted by name and location. this appears to be a runtime whitelist lookup, meaning it's done and authorized by the same parts of the daemon that process the compile request along with the intended compiler. so the main weaknesses against malicious clients seem to be in sending things to compile, and in overriding the remote compiler to use. it can be assumed that a malicious client able to exploit the compiler handshake can then run arbitrary stuff.
There's at least a github issue regarding this google.com/search?q=distcc+seccomp&rlz=1C5CHFA_enDE1121DE1121&oq=distcc+seccomp&gs_lcrp=EgZjaHJvbWUyBggAEEUYOTIHCAEQIRigATIHCAIQIRigAdIBCDM1NjRqMGo3qAIAsAIA&sourceid=chrome&ie=UTF-8 suggesting running over ssh. That only partially alleviates the risk: it gives a key based verification of a client instead of the standard ip restrictions, which always include some parsing. So this protects against someone directly exploiting the TCP code of distcc. It does not protect against malicious clients. (ssh force command can't be used or you'll not compile anything)
The basic step for protecting access should be filtering who can access the distcc server, so use nftables etc. to restrict access to port 3632 (the distcc default). set up the internal filter the same way.
The next thing is to confine the compiler calls so that they can only write in their temp directory and can only run compilers (using nsjail, apparmor, selinux etc)
The other internal security bit is that they do some privilege dropping. it runs as a dedicated user (distcc), so you can also have an audit policy, and can/could use something like iptables to ensure it can only connect to the other distcc/memcached hosts, but nothing else.
I think that's safe enough, the main vector is breakouts and confinement is possible. there's also a selinux policy for distcc if one is so inclined.