fix: reap HEALTHCHECK zombies by running tini as PID 1 #4252

Open
JamBalaya56562 wants to merge 1 commit into Dokploy:canary from JamBalaya56562:fix/tini-pid1-zombie-reap-v2
JamBalaya56562 commented Apr 19, 2026

Problem

On a running Dokploy host (v0.29.0), [curl] <defunct> zombies accumulate inside the container — over a hundred observed after several days. Their parent is the Node.js pnpm start process running as PID 1.

Root cause

  1. The Dockerfile's final CMD runs exec pnpm start, making Node.js PID 1 inside the container.
  2. The image's HEALTHCHECK spawns curl every 10s in the same PID namespace.
  3. Node.js only reaps children it spawned itself via child_process; it installs no generic waitpid reaper. Any process parented to PID 1 by the container runtime stays <defunct> until PID 1 exits.

Grepping the repo confirms no Node code invokes curl via child_process, so the image's HEALTHCHECK is the sole runtime source. This is the well-known "Node.js as PID 1" containerization pitfall.
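
The mechanism can be reproduced outside Docker. The following is a minimal, Linux-only sketch (it reads /proc) and is not taken from the PR: a shell starts a short-lived child and then execs into `sleep`, which never calls wait(), standing in for a PID 1 that does not reap. The helper name `zombie_children` and the timings are illustrative assumptions.

```shell
#!/bin/sh
# Linux-only sketch: relies on /proc/<pid>/status. Names and timings are illustrative.

# Print how many zombie (state Z) processes have PID $1 as their parent.
zombie_children() {
  for f in /proc/[0-9]*/status; do
    awk -v parent="$1" '
      $1 == "State:" { state = $2 }
      $1 == "PPid:"  { ppid = $2 }
      END { if (state == "Z" && ppid == parent) print "zombie" }
    ' "$f" 2>/dev/null
  done | wc -l
}

# The inner shell backgrounds a 1-second child, then execs into `sleep 5`.
# exec keeps the PID, so the child's parent is now a process that never reaps.
sh -c 'sleep 1 & exec sleep 5' &
parent=$!

sleep 2                                   # give the child time to exit
zombies=$(zombie_children "$parent")
echo "unreaped children of $parent: $zombies"

# Clean up the stand-in parent; its zombie is then reparented and reaped.
kill "$parent" 2>/dev/null
wait "$parent" 2>/dev/null || true
```

On a typical Linux host this should report one unreaped child for as long as the stand-in parent is alive, which is the same state the curl processes are left in under a non-reaping PID 1.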

Fix

Install tini (available as a Debian package, ~28 KB) and set it as ENTRYPOINT. It reaps orphaned children regardless of origin while still forwarding signals (SIGTERM/SIGINT) to Node, the same mechanism docker run --init uses internally.

The HEALTHCHECK is preserved unchanged: it is still useful as a Swarm rolling-update readiness gate and for docker ps operator visibility. With tini in place its curl children are reaped.

Only the main Dockerfile is modified. Dockerfile.cloud, Dockerfile.schedule, and Dockerfile.server declare no HEALTHCHECK and are out of scope.

Testing

Built the image locally and verified:

  • /usr/bin/tini --version reports tini version 0.19.0.
  • Container PID 1 is tini.
  • Spawned 20 short-lived processes as PID 1 children via docker exec -d; zombie count after 3s was 0.
  • Let the baked-in HEALTHCHECK fire for 65s (6 cycles, Postgres absent so the check fails as expected); zombie count was 0.
  • docker stop terminated the container promptly, confirming signal forwarding works.
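
The zombie counts above can be taken in several ways; the PR does not say which command was used. One possible sketch is a small filter that counts entries whose process state starts with Z. The function name `count_zombies` is hypothetical, and it assumes its input is one state string per line, as `ps -eo stat=` prints on systems with procps.

```shell
# Count zombie entries in a process-state listing read from stdin.
# Expects one state string per line (for example the output of `ps -eo stat=`).
count_zombies() {
  awk '$1 ~ /^Z/ { n++ } END { print n + 0 }'
}

# Sample listing: two of the four states are zombies (Z and Z+).
printf 'Ss\nZ\nSl\nZ+\n' | count_zombies   # prints 2
```

Against a running container this might be used as `docker exec <container> ps -eo stat= | count_zombies`, assuming the image ships a `ps` that supports those options; `<container>` is a placeholder.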

Greptile Summary

This PR fixes zombie process accumulation ([curl] <defunct>) caused by Node.js running as PID 1 and failing to reap HEALTHCHECK curl children. It installs tini via apt-get and sets it as ENTRYPOINT, delegating PID 1 reaping duties to tini while leaving the existing HEALTHCHECK, CMD, and all other image contents unchanged.

Confidence Score: 5/5

Safe to merge — the fix is minimal, correct, and well-tested; no behaviour changes beyond zombie reaping and signal forwarding.

Only one file is touched with two small changes: adding tini to an existing apt-get line and replacing a bare CMD with a proper ENTRYPOINT+CMD pair. The approach (tini as PID 1 init) is the canonical Docker solution for this problem, /usr/bin/tini is the correct Debian Bookworm path, and the -- separator is used correctly. No logic, API, or schema changes are involved.

No files require special attention.

Reviews (1): Last reviewed commit: "fix: reap HEALTHCHECK zombies by running..."

Node.js as PID 1 does not reap children it did not spawn, so the
HEALTHCHECK curl fired every 10s accumulates as <defunct>. Install
tini and set it as ENTRYPOINT so any orphaned child is reaped while
signals still reach Node.

dosubot bot added the size:XS label (This PR changes 0-9 lines, ignoring generated files.) on Apr 19, 2026