PID Files

#resources #resources/programming #resources/programming/systems #resources/programming/operating-systems

A small text file containing a single number — the process ID of a daemon. The convention dates to early Unix and is still ubiquitous: /var/run/sshd.pid, /var/run/nginx.pid, ~/.lazydap/daemon.pid.

A daemon writes its own PID to a known file at startup; other tools read it to send signals, check liveness, or coordinate.

Why a PID file

Three uses:

Liveness check: "is the daemon running?" Read the PID and call kill(pid, 0). Signal 0 delivers nothing; the kernel only checks that the process exists and that you are allowed to signal it. Success means alive. ESRCH means dead, so the file is stale. EPERM means a process with that PID exists but belongs to another user.
Signalling: cat /var/run/nginx.pid | xargs kill -HUP reloads nginx. This is the standard way to control daemons from shell scripts and init systems.
Single-instance enforcement: if a PID file exists and the PID is alive, refuse to start another instance.

File location

Conventions:

System daemons (root): /var/run/<name>.pid or /run/<name>.pid (modern; /run is tmpfs).
User daemons: $XDG_RUNTIME_DIR/<name>/<name>.pid on Linux, ~/Library/Application Support/<name>/<name>.pid on macOS.
Per-project daemons (mxr, lazydap): instance-keyed, ~/.cache/<name>/instance-<hash>/<name>.pid.

Use a directory the daemon can reasonably create and write to. Don't pretend /var/run/ works for user daemons.
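A minimal sketch of the user-daemon conventions above, written as a pure function so it can be tested without touching the environment. The names user_pid_path and Platform are hypothetical, and the ~/.cache fallback for an unset $XDG_RUNTIME_DIR is an assumption for illustration, not something the conventions prescribe.

```rust
use std::path::{Path, PathBuf};

/// Target platform, passed in explicitly so the logic is easy to test.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Platform {
    Linux,
    MacOs,
}

/// Build the PID file path for a user-level daemon called `name`.
///
/// `xdg_runtime_dir` is the value of $XDG_RUNTIME_DIR, if it is set.
pub fn user_pid_path(
    name: &str,
    platform: Platform,
    home: &Path,
    xdg_runtime_dir: Option<&Path>,
) -> PathBuf {
    let dir = match (platform, xdg_runtime_dir) {
        (Platform::MacOs, _) => home.join("Library/Application Support").join(name),
        (Platform::Linux, Some(runtime)) => runtime.join(name),
        // Assumption: fall back to ~/.cache/<name> when no runtime dir is set.
        (Platform::Linux, None) => home.join(".cache").join(name),
    };
    dir.join(format!("{name}.pid"))
}
```

A caller would typically pass std::env::var_os("XDG_RUNTIME_DIR") and the user's home directory, then create the parent directory before writing.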

Format

The file contains the decimal PID, optionally followed by a newline. That's it. Nothing else.

$ cat /var/run/sshd.pid
1234

Some daemons add metadata in adjacent files (.pid for the number, .lock for an exclusive flock, .socket for the IPC socket path). Don't pile metadata into the PID file itself; it breaks the simple cat | xargs kill pattern.
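Reading that format back could look roughly like the sketch below. The function names parse_pid and read_pid_file are hypothetical; the only rule taken from the format itself is "a decimal PID, optionally followed by a newline", and rejecting anything else is a design choice of this sketch.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Parse the contents of a PID file: a decimal PID with optional
/// surrounding whitespace. Anything else is rejected.
pub fn parse_pid(contents: &str) -> Option<i32> {
    let pid: i32 = contents.trim().parse().ok()?;
    // 0 and negative values make kill(2) target process groups (or every
    // process), never a single daemon, so they are not valid here.
    if pid > 0 {
        Some(pid)
    } else {
        None
    }
}

/// Read and parse a PID file. `Ok(None)` means the file was readable
/// but did not contain a valid PID.
pub fn read_pid_file(path: &Path) -> io::Result<Option<i32>> {
    Ok(parse_pid(&fs::read_to_string(path)?))
}
```

Being strict here pays off later: a value of 0 or -1 passed to kill would signal far more than one process.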

Atomic write

Always write atomically. Write to a temp file in the same directory, then rename it over the target. rename is atomic within a filesystem, so any process reading sees either the old PID or the new one, never a half-written file:

let tmp = pid_path.with_extension("tmp");
std::fs::write(&tmp, format!("{}\n", std::process::id()))?;
std::fs::rename(&tmp, &pid_path)?;

Naive File::create + write can leave a zero-byte file if the daemon crashes mid-write. The atomic rename pattern avoids this.

File locking

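Stronger guarantee than just writing the PID: take an exclusive flock on the file (or a sibling .lock file). Two daemons can't hold the lock at the same time, which closes the race where two start-up attempts collide. If the PID file is also replaced by rename, put the lock on the sibling file: rename swaps in a new inode, and the lock stays attached to the old one.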

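use fs2::FileExt; // assumed: try_lock_exclusive comes from the fs2 (or fs4) crate
use std::io::Write;

let mut file = std::fs::OpenOptions::new()
    .write(true).create(true) // no truncate(true): a second instance must not wipe a live daemon's PID
    .open(&pid_path)?;
file.try_lock_exclusive()?; // returns an error immediately if another process holds the lock
file.set_len(0)?; // safe to clear old contents now that we own the lock
writeln!(file, "{}", std::process::id())?;
// hold the file handle for the daemon's lifetime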

When the process exits (cleanly or not), the kernel closes its descriptors and releases the lock automatically, as long as no forked child still holds a copy of the descriptor. The next attempt can lock and start.
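The lock-then-write order can be sketched without any crate by declaring flock(2) by hand. This is an illustration under assumptions: acquire_pid_lock is a hypothetical name, the LOCK_EX and LOCK_NB values are the ones used on Linux and macOS, and the unsafe extern syntax needs Rust 1.82 or newer. Real code would more likely pull these from a crate such as libc, nix, or fs2.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Write};
use std::os::raw::c_int;
use std::os::unix::io::AsRawFd;
use std::path::Path;

// flock(2) from the platform C library, declared by hand to stay dependency-free.
unsafe extern "C" {
    fn flock(fd: c_int, operation: c_int) -> c_int;
}
const LOCK_EX: c_int = 2; // exclusive lock (value on Linux and macOS)
const LOCK_NB: c_int = 4; // fail instead of blocking (value on Linux and macOS)

/// Open `path`, take an exclusive non-blocking lock, and only then replace
/// the contents with our PID. Dropping the returned handle releases the lock.
pub fn acquire_pid_lock(path: &Path) -> io::Result<File> {
    // Opened without truncation: a losing contender must not wipe the winner's PID.
    let mut file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(false)
        .open(path)?;
    if unsafe { flock(file.as_raw_fd(), LOCK_EX | LOCK_NB) } != 0 {
        // Typically EWOULDBLOCK: someone else holds the lock.
        return Err(io::Error::last_os_error());
    }
    file.set_len(0)?; // we own the lock, so clearing old contents is safe
    writeln!(file, "{}", std::process::id())?;
    file.flush()?;
    Ok(file)
}
```

The daemon keeps the returned File alive for as long as it runs. A second start-up attempt gets an error from acquire_pid_lock and leaves the existing PID file untouched.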

Stale PID files

When a daemon crashes or is killed without cleanup, the PID file remains pointing at a dead PID. Worse: the OS may eventually reuse that PID for an unrelated process.

Detection on startup:

1. Read the PID file.
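2. kill(pid, 0): does the process exist?
   - ESRCH (no such process): stale. Delete and continue.
   - EPERM (process exists, owned by another user): risky. The PID was most likely reused, or the daemon runs under a different account. Refuse to start, alert the user, ask them to investigate.
   - 0 (process exists, signalable by us): probably another instance. Refuse to start. Note that a reused PID now held by one of our own unrelated processes looks exactly the same, so this check can give false positives.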

Locking via flock dodges this entirely: a stale lock isn't possible because the kernel released it on exit.
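For setups that still rely on the kill(pid, 0) probe, the three outcomes listed above can be sketched as follows. The names PidState and probe_pid are hypothetical, kill(2) is declared by hand to avoid a crate dependency (Rust 1.82 or newer for the unsafe extern syntax), and the errno values are the Linux and macOS ones; treat all of that as assumptions of this sketch.

```rust
use std::io;
use std::os::raw::c_int;

// kill(2) from the platform C library, declared by hand to stay dependency-free.
unsafe extern "C" {
    fn kill(pid: c_int, sig: c_int) -> c_int;
}
const EPERM: i32 = 1; // value on Linux and macOS
const ESRCH: i32 = 3; // value on Linux and macOS

/// Result of probing a PID with signal 0.
#[derive(Debug, PartialEq)]
pub enum PidState {
    /// ESRCH: no such process, so the PID file is stale.
    Dead,
    /// kill returned 0: a process with this PID exists and we may signal it.
    AliveSignalable,
    /// EPERM: a process exists but we are not allowed to signal it.
    AliveForeign,
}

/// Probe `pid` with signal 0, which runs the existence and permission
/// checks without delivering a signal.
pub fn probe_pid(pid: i32) -> io::Result<PidState> {
    if pid <= 0 {
        // 0 and negative values address process groups, not one process.
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "pid must be positive"));
    }
    if unsafe { kill(pid, 0) } == 0 {
        return Ok(PidState::AliveSignalable);
    }
    let err = io::Error::last_os_error();
    match err.raw_os_error() {
        Some(ESRCH) => Ok(PidState::Dead),
        Some(EPERM) => Ok(PidState::AliveForeign),
        _ => Err(err),
    }
}
```

On startup, Dead maps to "delete the file and continue", while both alive states map to "refuse to start". Neither alive state proves the process is actually the daemon, which is the weakness the lock avoids.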

Cleanup on exit

A graceful shutdown deletes the PID file (or unlocks the flock). A crashing process may not — that's why the lock-based approach is more robust. Belt-and-suspenders: trap SIGTERM/SIGINT and clean up; rely on the kernel to release the flock if you didn't.

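// In Tokio (Unix): ctrl_c() only covers SIGINT, so wait for SIGTERM as well.
let mut term = tokio::signal::unix::signal(tokio::signal::unix::SignalKind::terminate())?;
tokio::select! {
    _ = tokio::signal::ctrl_c() => {}
    _ = term.recv() => {}
}
let _ = std::fs::remove_file(&pid_path); // best effort; the flock is released on exit regardless
std::process::exit(0);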

What mxr and lazydap do

Per mxr's crates/daemon/src/server.rs (lazydap inherits the same pattern):

PID file at {data_dir}/daemon.pid.
Atomic write + flock at startup.
Probe sequence on client start: check socket exists → ping daemon → if no response, read PID file → check process alive → if stale, fork new daemon.
Cleanup on graceful shutdown via SIGTERM handler.
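The probe sequence can be modelled as a small decision function. This is an illustrative sketch, not mxr's or lazydap's actual code: Probe, Action, and decide are hypothetical names, and the WaitAndRetry branch for "process alive but not answering" is an assumption, since the sequence above does not say what happens in that case.

```rust
/// What the client observed while probing for a daemon.
#[derive(Clone, Copy, Debug)]
pub struct Probe {
    pub socket_exists: bool,
    pub ping_answered: bool,
    /// `None` if there is no readable PID file, otherwise whether the PID is alive.
    pub pid_alive: Option<bool>,
}

/// What the client should do next.
#[derive(Debug, PartialEq)]
pub enum Action {
    Connect,
    SpawnDaemon,
    /// Assumption: a live but unresponsive daemon may still be starting up.
    WaitAndRetry,
}

/// Walk the probe sequence: socket, ping, then the PID file as a fallback.
pub fn decide(p: Probe) -> Action {
    if p.socket_exists && p.ping_answered {
        return Action::Connect;
    }
    match p.pid_alive {
        Some(true) => Action::WaitAndRetry,
        // Stale or missing PID file: nothing is running, so start a daemon.
        Some(false) | None => Action::SpawnDaemon,
    }
}
```

Keeping the decision separate from the I/O that gathers the observations makes each branch easy to test.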

Common pitfalls

Race between two simultaneous client invocations: both probe, both find no daemon, both try to spawn. Lock-based PID files make the second spawn fail; the second client then retries the probe and finds the first's daemon.
PID reuse on long-running systems: the OS may reuse a PID after the original process exits. Without flock, a naive PID-file check can mistake a stale entry for a live daemon. Use locks.
Wrong permissions: a PID file that other users can write lets them point your signals at an arbitrary process, and a socket that other users can reach lets them talk to the daemon. Both should be owner-only (0600 for the files, 0700 for the directory that holds them).
No cleanup path: a daemon killed with SIGKILL can't clean up, because SIGKILL can't be trapped. Use flock so the kernel cleans up for you.
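For the permissions pitfall, a sketch using only the standard library's Unix extensions might look like this. create_private_pid_file and its file_name parameter are hypothetical names for illustration. Note that both modes take effect only when this call creates the directory or file, and that the process umask can remove bits but never add them.

```rust
use std::fs::{DirBuilder, File, OpenOptions};
use std::io;
use std::os::unix::fs::{DirBuilderExt, OpenOptionsExt};
use std::path::Path;

/// Create `dir` as owner-only (0700) and open a PID file inside it as 0600.
pub fn create_private_pid_file(dir: &Path, file_name: &str) -> io::Result<File> {
    // recursive(true) tolerates an existing directory; the mode is applied
    // only to directories this call actually creates.
    DirBuilder::new().recursive(true).mode(0o700).create(dir)?;
    OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(false)
        .mode(0o600) // applied only if the file is newly created
        .open(dir.join(file_name))
}
```

If the directory or file might already exist with looser permissions, an explicit std::fs::set_permissions call afterwards would be needed to tighten them.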

Modern alternatives

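systemd tracks service processes itself, so daemons under systemd often don't need PID files at all. Use Type=notify and sd_notify(0, "READY=1") for readiness signalling; liveness is the watchdog's job (WatchdogSec= plus periodic WATCHDOG=1 notifications).
launchd (macOS) is similar: it supervises the processes it starts, so no PID file is needed.
Abstract Unix sockets (Linux only, no filesystem path) sidestep PID files for IPC purposes: the name disappears when the owning process exits, so there is nothing stale to clean up.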

For mxr and lazydap (auto-spawning, user-level, cross-platform): old-school PID file + flock is the sweet spot.
