Sporadic errors with Kubernetes liveness probes
Due to a still-under-investigation bug outside of Prisma Cloud (either in runc or the kernel), there have been cases where Kubernetes liveness probes fail under load. If your Defender log file shows the following error, then permanently activate Prisma Cloud’s runC shim layer.
false Error: OCI runtime exec failed: exec failed: container_linux.go:348: starting container process caused "process_linux.go:138: adding pid 3101 to cgroups caused \"failed to write 3101 to cgroup.procs: write /sys/fs/cgroup/cpu,cpuacct/docker/1d7b29b96dfd7bc97c6a2d6cbff82b00509cdcc4dbf2ac72ef5dd2bef9db7067/cgroup.procs: invalid argument\"": unknown
Prisma Cloud has added retry logic to our runC shim (runc.tw) to automatically retry commands when a failure is returned from runC/kernel. Since failures are intermittent, the probe simply works on the next retry. For more information about Prisma Cloud’s runC shim, see the article on Defender architecture. Note that when Always enable Defender runc proxy is enabled, the shim is always in place, even when you don’t have any blocking rules. That way, Twistock can intercept and retry all exec commands.
To apply the work-around:
Log into Console
Go to Manage > Defenders > Manage.
Click on Advanced Settings.
Turn Always enable Defender runc proxy to On.