src/linux-hardening/privilege-escalation/docker-security/docker-breakout-privilege-escalation/sensitive-mounts.md
61 additions & 1 deletion
@@ -291,13 +291,73 @@ locate the other containers' filesystems and SA / web identity tokens
### Other Sensitive Host Sockets and Directories (2023-2025)
Mounting certain host Unix sockets or writable pseudo-filesystems is equivalent to giving the container full root on the node. **Treat the following paths as highly sensitive and never expose them to untrusted workloads**:
```bash
chroot /host /bin/bash  # full root shell on the host
```
A similar technique works with **crictl**, **podman** or the **kubelet** API once their respective sockets are exposed.
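A quick way to see whether any runtime socket is reachable from inside a container is to probe for it. The sketch below is only an illustration: the candidate paths are commonly used defaults and are assumptions, not a list taken from this page, and `find_exposed_sockets` is a hypothetical helper name.

```bash
# List runtime sockets visible from inside the container.
# The candidate paths are assumed common defaults; adjust them for your environment.
find_exposed_sockets() {
  root="${1:-}"   # optional prefix, e.g. a host mount such as /host
  for p in \
    /var/run/docker.sock \
    /run/containerd/containerd.sock \
    /run/crio/crio.sock \
    /run/podman/podman.sock
  do
    # -S is true only for Unix domain sockets
    if [ -S "${root}${p}" ]; then
      printf '%s\n' "${root}${p}"
    fi
  done
}

find_exposed_sockets
```

Any path printed here deserves a closer look, since a reachable socket usually means the container can talk to the runtime directly.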
Writable **cgroup v1** mounts are also dangerous. If `/sys/fs/cgroup` is bind-mounted **rw** and the host kernel is vulnerable to **CVE-2022-0492**, an attacker can set a malicious `release_agent` and execute arbitrary code in the *initial* namespace:
```bash
# assuming the container has CAP_SYS_ADMIN and a vulnerable kernel
sh -c "echo 0 > /tmp/x/cgroup.procs"  # triggers the empty-cgroup event
```
When the last process leaves the cgroup, `/tmp/pwn` runs **as root on the host**. Patched kernels (upstream commit `24f600856418`, "cgroup-v1: Require capabilities to set release_agent", plus its stable backports) validate the writer’s capabilities and block this abuse.
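To check whether a container is exposed to this class of issue, one option is to look for cgroup v1 hierarchies mounted read-write. The following is a minimal sketch, assuming a mounts table in the standard `/proc/mounts` format; `writable_cgroup_v1_mounts` is a hypothetical helper, and a writable `release_agent` file only indicates exposure, not that the kernel is actually vulnerable.

```bash
# Print cgroup v1 mount points that are mounted read-write.
# Reads a mounts table in /proc/mounts format (default: /proc/mounts).
writable_cgroup_v1_mounts() {
  # field 2 = mount point, field 3 = fs type ("cgroup" is v1, "cgroup2" is v2), field 4 = options
  awk '$3 == "cgroup" && $4 ~ /(^|,)rw(,|$)/ { print $2 }' "${1:-/proc/mounts}"
}

# For each writable hierarchy, report whether its release_agent file is writable by us.
writable_cgroup_v1_mounts | while read -r mnt; do
  if [ -w "$mnt/release_agent" ]; then
    echo "WARNING: $mnt/release_agent is writable"
  fi
done
```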
runc ≤1.1.11 leaked an open directory file descriptor that could point to the host root. A malicious image or `docker exec` could start a container whose *working directory* is already on the host filesystem, enabling arbitrary file read/write and privilege escalation. Fixed in runc 1.1.12 (Docker ≥25.0.3, containerd ≥1.7.14).
```Dockerfile
FROM scratch
# fd 4 is the directory descriptor leaked by the runtime; it resolves to "/" on the host
WORKDIR /proc/self/fd/4
```
A race condition in the BuildKit snapshotter let an attacker replace a file that was about to be *copied up* into the container’s rootfs with a symlink to an arbitrary path on the host, gaining write access outside the build context. Fixed in BuildKit v0.12.5 / Buildx 0.12.0. Exploitation requires an untrusted `docker build` on a vulnerable daemon.
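The fixed releases quoted in this section can be compared against what is installed with a small version check. This is a sketch under assumptions: `version_lt` is a hypothetical helper, it relies on `sort -V` (available in GNU coreutils and most BSD `sort` implementations), and the `installed` value is a placeholder that would normally be parsed from the tool's own version output.

```bash
# Return 0 if version $1 is strictly older than version $2 (relies on sort -V).
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Example: compare a runc version against the fixed release quoted above.
installed="1.1.11"   # hypothetical value; in practice parse it from `runc --version`
if version_lt "$installed" "1.1.12"; then
  echo "runc $installed predates 1.1.12"
fi
```

The same helper works for the BuildKit threshold, for example `version_lt "$buildkit_version" "0.12.5"` with the leading `v` stripped first.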
### Hardening Reminders (2025)
1. Bind-mount host paths **read-only** whenever possible and add `nosuid,nodev,noexec` mount options.
2. Prefer dedicated side-car proxies or rootless clients instead of exposing the runtime socket directly.
4. In Kubernetes, use `securityContext.readOnlyRootFilesystem: true`, the *restricted* PodSecurity profile and avoid `hostPath` volumes pointing to the paths listed above.
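The first reminder can be audited from inside a running container by reading the mount table. The sketch below assumes the standard `/proc/mounts` format; `missing_mount_opts` is a hypothetical helper, and `/host` in the usage comment is only an example mount point.

```bash
# Print which of the hardening options (ro, nosuid, nodev, noexec) are missing on a mount point.
# $1 = mount point, $2 = mounts table in /proc/mounts format (default: /proc/mounts).
# If the mount point is not found, all four options are reported as missing.
missing_mount_opts() {
  # keep the options of the last matching entry, since later mounts shadow earlier ones
  opts=$(awk -v mp="$1" '$2 == mp { o = $4 } END { print o }' "${2:-/proc/mounts}")
  for want in ro nosuid nodev noexec; do
    case ",$opts," in
      *",$want,"*) ;;                 # option present
      *) printf '%s\n' "$want" ;;     # option missing
    esac
  done
}

# Usage (example mount point): missing_mount_opts /host
```

An empty result means the bind mount already carries all four options.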
- [Understanding and Hardening Linux Containers](https://research.nccgroup.com/wp-content/uploads/2020/07/ncc_group_understanding_hardening_linux_containers-1-1.pdf)
- [Abusing Privileged and Unprivileged Linux Containers](https://www.nccgroup.com/globalassets/our-research/us/whitepapers/2016/june/container_whitepaper.pdf)