PID Files
A small text file containing a single number — the process ID of a daemon. The convention dates to early Unix and is still ubiquitous: /var/run/sshd.pid, /var/run/nginx.pid, ~/.lazydap/daemon.pid.
A daemon writes its own PID to a known file at startup; other tools read it to send signals, check liveness, or coordinate.
Why a PID file
Three uses:
- Liveness check — "is the daemon running?" Read the PID and call kill(pid, 0). Signal 0 delivers nothing but still performs the existence and permission checks: success means the process exists, ESRCH means it does not and the file is stale, and EPERM means it exists but belongs to another user.
- Signalling — cat /var/run/nginx.pid | xargs kill -HUP to reload nginx. This is the standard way to control daemons from shell scripts and init systems.
- Single-instance enforcement — if a PID file exists and the PID is alive, refuse to start another instance.
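The signalling use can also be done from a program rather than a shell pipeline. The following is a minimal sketch, not a definitive implementation: signal_pid_file is a hypothetical helper name, and kill is declared by hand only so the example needs no external crate (real code would more likely use the libc or nix crate).

```rust
use std::fs;
use std::io;
use std::path::Path;

// Hand-written declaration of POSIX kill(2) so the sketch has no dependencies.
// `unsafe extern` needs Rust 1.82 or newer.
unsafe extern "C" {
    fn kill(pid: i32, sig: i32) -> i32;
}

/// Reads a PID from `path` and sends it signal `sig`.
/// Passing 0 as `sig` only probes whether the process can be signalled.
pub fn signal_pid_file(path: &Path, sig: i32) -> io::Result<()> {
    let text = fs::read_to_string(path)?;
    let pid: i32 = text
        .trim()
        .parse()
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
    // Reject 0 and negative values: kill() would treat them as process groups.
    if pid <= 0 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "PID must be positive",
        ));
    }
    // SAFETY: kill takes two plain integers and has no memory-safety preconditions.
    if unsafe { kill(pid, sig) } == 0 {
        Ok(())
    } else {
        Err(io::Error::last_os_error())
    }
}
```

Calling signal_pid_file(Path::new("/var/run/nginx.pid"), 1) would correspond to the kill -HUP one-liner, since SIGHUP is signal 1 on Linux and macOS.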
File location
Conventions:
- System daemons (root): /var/run/<name>.pid or /run/<name>.pid (modern; /run is tmpfs).
- User daemons: $XDG_RUNTIME_DIR/<name>/<name>.pid on Linux, ~/Library/Application Support/<name>/<name>.pid on macOS.
- Per-project daemons (mxr, lazydap): instance-keyed, ~/.cache/<name>/instance-<hash>/<name>.pid.
Use a directory the daemon can reasonably create and write to. Don't pretend /var/run/ works for user daemons.
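As an illustration of the user-daemon convention, here is a small sketch that builds the path from its inputs. The Os enum and user_pid_path are hypothetical names, and returning None when XDG_RUNTIME_DIR is unset is an assumption made here so that the caller has to choose a fallback explicitly.

```rust
use std::path::{Path, PathBuf};

/// The two platforms covered by the user-daemon convention.
pub enum Os {
    Linux,
    MacOs,
}

/// Returns `<base>/<name>/<name>.pid` for a user daemon.
/// On Linux the base is `XDG_RUNTIME_DIR`; `None` is returned when it is
/// unset (an assumption of this sketch: the caller picks the fallback).
pub fn user_pid_path(
    os: Os,
    name: &str,
    xdg_runtime_dir: Option<&Path>,
    home: &Path,
) -> Option<PathBuf> {
    let base = match os {
        Os::Linux => xdg_runtime_dir?.to_path_buf(),
        Os::MacOs => home.join("Library").join("Application Support"),
    };
    Some(base.join(name).join(format!("{}.pid", name)))
}
```

In a real program the xdg_runtime_dir argument would typically come from std::env::var_os("XDG_RUNTIME_DIR"); passing it in keeps the function easy to test.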
Format
The file contains the decimal PID, optionally followed by a newline. That's it. Nothing else.
$ cat /var/run/sshd.pid
1234
Some daemons add metadata in adjacent files (.pid for the number, .lock for an exclusive flock, .socket for the IPC socket path). Don't pile metadata into the PID file itself; it breaks the simple cat | xargs kill pattern.
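Because the format is this narrow, a reader can afford to be strict. The sketch below (parse_pid is a hypothetical helper) accepts only decimal digits with at most one trailing newline; rejecting 0 is an extra assumption, made because 0 never identifies a single process.

```rust
/// Parses PID-file contents: decimal digits, optionally followed by one newline.
/// Returns `None` for anything else, including extra metadata after the number.
pub fn parse_pid(contents: &str) -> Option<u32> {
    let digits = contents.strip_suffix('\n').unwrap_or(contents);
    // str::parse::<u32> would also accept a leading '+', so check the bytes first.
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    match digits.parse::<u32>() {
        // Err means the number overflowed u32; 0 is not a usable daemon PID.
        Ok(0) | Err(_) => None,
        Ok(pid) => Some(pid),
    }
}
```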
Atomic write
Always write atomically. Writing to a temp file then renaming guarantees that any process reading sees either the old PID or the new one, never half-written:
let tmp = pid_path.with_extension("tmp");
std::fs::write(&tmp, format!("{}\n", std::process::id()))?;
std::fs::rename(&tmp, &pid_path)?;
Naive File::create + write can leave a zero-byte file if the daemon crashes mid-write. The atomic rename pattern avoids this.
File locking
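A stronger guarantee than just writing the PID: take an exclusive flock on the file (or a sibling .lock file). Two daemons can't hold the lock simultaneously, which eliminates the race where two start-up attempts collide. If the PID file is also written by rename, put the lock on the sibling .lock file: rename swaps in a new inode, so a lock taken on the old PID file no longer protects the path.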
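use fs2::FileExt; // assumption: try_lock_exclusive comes from the fs2 crate
use std::io::Write;

// Open without truncating: truncating before the lock is held would wipe
// the PID written by an instance that is already running.
let file = std::fs::OpenOptions::new()
    .write(true).create(true)
    .open(&pid_path)?;
file.try_lock_exclusive()?; // fails immediately if another instance holds the lock
file.set_len(0)?; // safe to clear old contents now that we own the lock
write!(&file, "{}\n", std::process::id())?;
// hold the file handle for the daemon's lifetime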
When the process exits (cleanly or not), the kernel releases the lock automatically. The next attempt can lock and start.
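To make the lock behaviour concrete, here is a dependency-free sketch that separates "another instance holds the lock" from genuine I/O errors. It is an illustration under stated assumptions: acquire_pid_lock is a hypothetical name, flock is declared by hand, and the LOCK_EX and LOCK_NB values are the ones used on Linux and macOS (a real project would take them from the libc crate).

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Write};
use std::os::unix::io::AsRawFd;
use std::path::Path;

// Hand-written declaration of flock(2); `unsafe extern` needs Rust 1.82+.
unsafe extern "C" {
    fn flock(fd: i32, operation: i32) -> i32;
}
const LOCK_EX: i32 = 2; // exclusive lock (value on Linux and macOS)
const LOCK_NB: i32 = 4; // do not block if the lock is already held

/// Returns `Ok(Some(file))` when this process now holds the lock,
/// `Ok(None)` when another holder exists, and `Err` for other failures.
/// The lock lives as long as the returned `File` stays open.
pub fn acquire_pid_lock(path: &Path) -> io::Result<Option<File>> {
    // No truncate here: the contents belong to whoever holds the lock.
    let mut file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .open(path)?;
    // SAFETY: flock takes a valid file descriptor and an integer flag set.
    if unsafe { flock(file.as_raw_fd(), LOCK_EX | LOCK_NB) } != 0 {
        let err = io::Error::last_os_error();
        return if err.kind() == io::ErrorKind::WouldBlock {
            Ok(None)
        } else {
            Err(err)
        };
    }
    // We own the lock, so it is now safe to replace the old contents.
    file.set_len(0)?;
    writeln!(file, "{}", std::process::id())?;
    Ok(Some(file))
}
```

A daemon would keep the returned File in a variable that lives until shutdown; dropping it (or exiting for any reason) closes the descriptor and releases the lock.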
Stale PID files
When a daemon crashes or is killed without cleanup, the PID file remains pointing at a dead PID. Worse: the OS may eventually reuse that PID for an unrelated process.
Detection on startup:
1. Read the PID file.
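2. kill(pid, 0) — does the process exist?
   - ESRCH (no such process): stale. Delete and continue.
   - EPERM (process exists but belongs to another user): risky. The PID may have been reused by an unrelated process, or the daemon may be running under a different account. Refuse to start, alert the user, ask them to investigate.
   - 0 (process exists and we may signal it): most likely another instance is running, although a reused PID owned by the same user looks identical. Refuse to start.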
Locking via flock dodges this entirely: a stale lock isn't possible because the kernel released it on exit.
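The kill(pid, 0) classification from the detection steps can be sketched as follows. PidState and probe_pid are hypothetical names, kill is declared by hand to avoid a dependency, and the errno values (ESRCH = 3, EPERM = 1) are the Linux and macOS ones; treat this as an illustration rather than a hardened implementation.

```rust
use std::io;

// Hand-written declaration of POSIX kill(2); `unsafe extern` needs Rust 1.82+.
unsafe extern "C" {
    fn kill(pid: i32, sig: i32) -> i32;
}
const EPERM: i32 = 1; // value on Linux and macOS
const ESRCH: i32 = 3; // value on Linux and macOS

#[derive(Debug, PartialEq, Eq)]
pub enum PidState {
    /// kill returned 0: the process exists and we may signal it.
    Alive,
    /// ESRCH: no such process, so the PID file is stale.
    Stale,
    /// EPERM: the process exists but belongs to another user.
    ForeignOwner,
}

/// Probes `pid` with signal 0, which checks existence without delivering anything.
pub fn probe_pid(pid: i32) -> io::Result<PidState> {
    // 0 and negative values address process groups, not a single process.
    if pid <= 0 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "PID must be positive",
        ));
    }
    // SAFETY: kill takes two plain integers and has no memory-safety preconditions.
    if unsafe { kill(pid, 0) } == 0 {
        return Ok(PidState::Alive);
    }
    let err = io::Error::last_os_error();
    match err.raw_os_error() {
        Some(ESRCH) => Ok(PidState::Stale),
        Some(EPERM) => Ok(PidState::ForeignOwner),
        _ => Err(err),
    }
}
```

A startup routine would delete the file on Stale and refuse to start on either of the other two states.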
Cleanup on exit
A graceful shutdown deletes the PID file (or unlocks the flock). A crashing process may not — that's why the lock-based approach is more robust. Belt-and-suspenders: trap SIGTERM/SIGINT and clean up; rely on the kernel to release the flock if you didn't.
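// In Tokio (Unix): wait for SIGINT or SIGTERM, then clean up.
let mut term = tokio::signal::unix::signal(tokio::signal::unix::SignalKind::terminate())?;
tokio::select! {
    _ = tokio::signal::ctrl_c() => {}
    _ = term.recv() => {}
}
let _ = std::fs::remove_file(&pid_path);
std::process::exit(0);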
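Another way to arrange the cleanup, sketched here under the assumption that the daemon shuts down by returning from main, is a guard whose Drop removes the file. PidFileGuard is a hypothetical type, not something taken from mxr or lazydap.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Owns a PID file and removes it when dropped.
pub struct PidFileGuard {
    path: PathBuf,
}

impl PidFileGuard {
    /// Writes the current process ID to `path` and returns the guard.
    pub fn create(path: &Path) -> io::Result<Self> {
        fs::write(path, format!("{}\n", std::process::id()))?;
        Ok(PidFileGuard {
            path: path.to_path_buf(),
        })
    }
}

impl Drop for PidFileGuard {
    fn drop(&mut self) {
        // Best effort: the file may already be gone, and Drop cannot return errors.
        let _ = fs::remove_file(&self.path);
    }
}
```

Drop runs on a normal return and on an unwinding panic, but not after std::process::exit, an abort, or SIGKILL, so the guard complements the lock-based approach rather than replacing it.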
What mxr and lazydap do
Per Mxr's crates/daemon/src/server.rs (and Lazydap inherits the same pattern):
- PID file at {data_dir}/daemon.pid.
- Atomic write + flock at startup.
- Probe sequence on client start: check socket exists → ping daemon → if no response, read PID file → check process alive → if stale, fork new daemon.
- Cleanup on graceful shutdown via SIGTERM handler.
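The probe sequence can be read as a small decision function. The sketch below is not taken from mxr's source: Observed, Action, and decide are hypothetical names, and two branches are assumptions of this example because the description above does not cover them (a missing PID file leads to spawning, and a live but unresponsive process is reported rather than replaced).

```rust
/// What the client observed while probing. All names here are hypothetical.
pub struct Observed {
    pub socket_exists: bool,
    pub ping_ok: bool,
    /// `None` means there was no PID file to read.
    pub pid_alive: Option<bool>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    /// A daemon answered the ping: just use it.
    Connect,
    /// No daemon is running (stale or missing PID file): start one.
    SpawnDaemon,
    /// A process with the recorded PID exists but did not answer.
    ReportUnresponsive,
}

/// Mirrors the order: socket exists, ping, read PID file, check liveness.
pub fn decide(o: &Observed) -> Action {
    if o.socket_exists && o.ping_ok {
        return Action::Connect;
    }
    match o.pid_alive {
        // Assumption: do not spawn over a process that may still be starting up.
        Some(true) => Action::ReportUnresponsive,
        // Stale PID file, or (assumption) no PID file at all.
        Some(false) | None => Action::SpawnDaemon,
    }
}
```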
Common pitfalls
- Race between two simultaneous client invocations — both probe, both find no daemon, both try to spawn. Lock-based PID files prevent the second from succeeding; the second client retries the probe and finds the first's daemon.
- PID reuse on long-running systems — the OS may reuse a PID after the original process exits. Without flock, naive PID-file checks can think a stale entry is alive. Use locks.
- Wrong permissions — a socket that other users can connect to is a security hole, and a PID file or directory that others can write lets them point your tooling at an arbitrary process. Keep both owner-only (0600 for the files, 0700 for the directory).
- No cleanup path — a daemon killed with SIGKILL cannot clean up, because SIGKILL can't be trapped. Use flock so the kernel cleans up for you.
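For the permissions pitfall, the owner-only modes can be requested at creation time with the Unix extension traits in the standard library. This is a minimal sketch; create_private_pid_file is a hypothetical helper name.

```rust
use std::fs::{DirBuilder, File, OpenOptions};
use std::io::{self, Write};
use std::os::unix::fs::{DirBuilderExt, OpenOptionsExt};
use std::path::Path;

/// Creates `dir` with mode 0700 and `dir/name` with mode 0600, then writes
/// the current PID. The modes only apply to entries this call creates;
/// an existing directory or file keeps its current permissions.
pub fn create_private_pid_file(dir: &Path, name: &str) -> io::Result<File> {
    DirBuilder::new().recursive(true).mode(0o700).create(dir)?;
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .mode(0o600)
        .open(dir.join(name))?;
    // Clear any previous contents before writing the new PID.
    file.set_len(0)?;
    writeln!(file, "{}", std::process::id())?;
    Ok(file)
}
```

The process umask can only remove bits from these modes, so the result is never more permissive than requested.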
Modern alternatives
- systemd manages PIDs internally; daemons under systemd often don't need PID files at all. Use Type=notify and sd_notify(READY=1) for readiness signalling.
- launchd (macOS) plays a similar role.
- Abstract Unix sockets (Linux only, no filesystem path) — sidestep PID files for IPC purposes.
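For the systemd route, the READY=1 message is a single datagram sent to the Unix socket that systemd names in the NOTIFY_SOCKET environment variable. The sketch below sends it with the standard library only; notify_ready is a hypothetical helper, and it assumes a filesystem socket path (systemd may also hand out an abstract address starting with '@', which this example does not handle).

```rust
use std::io;
use std::os::unix::net::UnixDatagram;
use std::path::Path;

/// Sends the `READY=1` readiness message to the notification socket.
/// `socket_path` would normally be the value of `NOTIFY_SOCKET`.
pub fn notify_ready(socket_path: &Path) -> io::Result<()> {
    // An unbound datagram socket is enough: the message is one-way.
    let sock = UnixDatagram::unbound()?;
    sock.send_to(b"READY=1", socket_path)?;
    Ok(())
}
```

A service using this would declare Type=notify in its unit file and call notify_ready once initialization has finished.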
For mxr and lazydap (auto-spawning, user-level, cross-platform): old-school PID file + flock is the sweet spot.
See also
- Daemons — what PID files are for
- Signal Handling — what you do when you have the PID
- How Daemons Work — synthesis