Conversation
Maybe I can get away without The symlink issue will be fixed with containers/fuse-overlayfs#205. The missing .MTREE might be a race condition in overlayfs or bash globs? It seems to work pretty well if I add

I'm a bit skeptical of adding

The fuse-overlayfs issue is fixed. Nice. So I'll try to dig into how nsjail works and how we can try to land this :)

Here are some more tests I did. I'm not sure if that is the best way, though. Maybe there should only be one nsjail/nspawn command that in turn runs a custom script instead of calling it for each command separately.

As a side note, there is also https://github.com/containers/bubblewrap, which tries to do some of the same things as nsjail.

Also, if you are able to rebase on top of master, it would be easier for me to test things.

It should be rebased already.
Seems like bubblewrap can operate without SUID: https://github.com/containers/bubblewrap/blob/ff533b84d056f2c22633a84b34323dd085bd977a/bwrap.xml#L57

It's not rebased :)

Ah, rebased onto the wrong master branch. Now it should be ready.

Thanks for the work! I'll mark this as next on my todo and consider patching our fuse-overlayfs package

For testing purposes I have published

Zomg! It works! I managed to rebuild There are a few issues which I need to look at.

λ devtools-repro rootless» git diff
diff --git a/repro.in b/repro.in
index 759c4a9..d06b437 100755
--- a/repro.in
+++ b/repro.in
@@ -84,7 +84,7 @@ function require_userns_tools() {
 function mountoverlay() {
     if ((rootless_userns)); then
-        ~/Projekte/fuse-overlayfs/fuse-overlayfs "$@"
+        fuse-overlayfs "$@"
     else
         mount -t overlayfs overlayfs "$@"
     fi
@@ -285,6 +285,11 @@ function init_chroot(){
     mkdir -p "$BUILDDIRECTORY/root"
     tar xvf "$IMGDIRECTORY/$bootstrap_img" -C "$BUILDDIRECTORY/root" --strip-components=1 > /dev/null
+    # Custom rootless_userns configs here
+    if ((rootless_userns)); then
+        cat /etc/resolv.conf > "$BUILDDIRECTORY"/root/etc/resolv.conf
+    fi
+
     printf 'Server = %s\n' "$HOSTMIRROR" > "$BUILDDIRECTORY"/root/etc/pacman.d/mirrorlist
     printf '%s.UTF-8 UTF-8\n' en_US de_DE > "$BUILDDIRECTORY"/root/etc/locale.gen
     printf 'LANG=en_US.UTF-8\n' > "$BUILDDIRECTORY"/root/etc/locale.conf

}

# Desc: Enter a user namespace with virtual privileges
function become_rootless() {

I struggle to understand why we need to set this up as a replacement for sudo and su. Why can't we simply skip this and contain the logic in the exec_nsjail function?

The reason I have this is that fuse-overlayfs has to run inside the user namespace so it can map the subuids, and I think some files inside the container filesystem are accessed from outside. With some refactoring, become_rootless can be removed, I think, i.e. by calling exec_nsjail only twice: once for the root setup and once for the build.

function build_package(){
    ## [snip]
    mkdir -p "./build"
    for pkgfile in "$BUILDDIRECTORY/$build"/pkgdest/*; do
        mv "$pkgfile" "./build/"
    done
    chown -R "$src_owner" "./build"
}

This logic doesn't work for rootless containers. The permissions we end up with are from the container, it seems.
This is fixed by setting

While inside the user namespace, it should be chowned to root, which will be mapped to the ID of the user that created the namespace. I guess --keep-env in the first nsjail command causes problems. Either remove that and only pass the necessary environment variables along, or explicitly set $USER to root or something like that.
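
The ownership choice described above could be sketched roughly as follows. This is only an illustration under assumptions: `resolve_build_owner` is a hypothetical helper that does not exist in repro, and it assumes the caller passes in the `rootless_userns` flag from the diff and the `src_owner` value used in `build_package`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of repro): decide who should own ./build.
# Assumption: the first argument is 1 when running inside the user namespace,
# and 0 for the regular privileged code path.
resolve_build_owner() {
    local rootless="$1" src_owner="$2"
    if [ "$rootless" -ne 0 ]; then
        # Inside the namespace, root maps to the user who created it,
        # so chowning to root yields files owned by that user outside.
        printf 'root\n'
    else
        # Privileged path: hand the files back to the original owner.
        printf '%s\n' "$src_owner"
    fi
}

# Possible use inside build_package (sketch, not the project's code):
#   chown -R "$(resolve_build_owner "$rootless_userns" "$src_owner")" "./build"
```

Keeping the decision in one small function avoids relying on `$USER`, which may leak in from the host environment when the environment is passed through.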

Current todo for this draft
Any other things from your side which need testing and/or some improvements? If the restructuring takes too much time or is complicated, I don't think it's needed for this run.

These changes allow running repro without root privileges. Due to some issues, the resulting builds are not reproducible. This mode can be enabled with the -r flag.

Requirements
- Unprivileged user namespaces enabled (sysctl kernel.unprivileged_userns_clone=1)
- /etc/subuid and /etc/subgid. I used the following for both files:
- become-root has to be installed.
  nsjail did deny /proc/self/setgroups, which caused issues with sudo/su. Maybe it is a kernel bug that become-root works without that setting. I did not completely understand that section in the user_namespaces man page.
  It may be possible to get sudo (not su) working with setgroups disabled with the preserve_groups option.
- nsjail is required instead of systemd-nspawn.
- fuse-overlayfs is required since overlayfs is not allowed for unprivileged namespaces in the upstream kernel.
- unshare from util-linux. I use it to create a mount namespace for the container and as pid1 in the nsjail commands. The mount namespace might not be necessary, and for pid1 it may be better to use dumb-init. I don't think it would be a good idea to directly run e.g. pacman as pid1, and I don't know about using bash.

Known Issues
- (touch -h fails to set the date for symlinks containers/fuse-overlayfs#204)

Additional notes
I changed systemd-machine-id-setup to run inside the container, so rootless repro might run without systemd installed.
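
The requirements listed above could be verified up front with something like the following. This is a hedged sketch, not code from this PR: `userns_clone_enabled`, `has_subid_entry` and `check_rootless_requirements` are hypothetical names, the file paths are passed in as arguments purely so the checks can be exercised against test files, and the `unprivileged_userns_clone` switch is assumed to exist only on kernels that carry that patch.

```shell
#!/usr/bin/env bash
# Sketch only: all function names here are hypothetical, not part of repro.

# Succeeds if the unprivileged_userns_clone switch is absent or set to 1.
# Assumption: an absent file means this particular switch does not apply.
userns_clone_enabled() {
    local knob="${1:-/proc/sys/kernel/unprivileged_userns_clone}"
    [ ! -e "$knob" ] || [ "$(cat "$knob")" = "1" ]
}

# Succeeds if FILE has a "name:start:count" line for USER.
has_subid_entry() {
    local user="$1" file="$2"
    [ -r "$file" ] && awk -F: -v u="$user" \
        '$1 == u && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ { found = 1 }
         END { exit !found }' "$file"
}

# Usage: check_rootless_requirements USER SUBUID_FILE SUBGID_FILE CMD...
# Prints every problem to stderr and returns non-zero if any was found.
check_rootless_requirements() {
    local user="$1" subuid="$2" subgid="$3" missing=0 cmd
    shift 3
    has_subid_entry "$user" "$subuid" ||
        { echo "no entry for $user in $subuid" >&2; missing=1; }
    has_subid_entry "$user" "$subgid" ||
        { echo "no entry for $user in $subgid" >&2; missing=1; }
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 ||
            { echo "missing command: $cmd" >&2; missing=1; }
    done
    return "$missing"
}
```

A call along the lines of `userns_clone_enabled && check_rootless_requirements "$USER" /etc/subuid /etc/subgid become-root nsjail fuse-overlayfs unshare` would then cover the tools named in the list; whether repro should fail hard or only warn in that case is left open here.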