Question

I'm working on a daemon that communicates with several processes. The daemon can't monitor the processes all the time, but it must be able to reliably detect when a process dies so that it can release the scarce resources it holds for that process.

The processes can communicate with the daemon, giving it some information at the start, but not vice versa. So the daemon can't just ask a process its identity.

The simplest form would be to use just their PID. But eventually another process could be assigned the same PID without my tool noticing.

A better approach would be to use the PID plus the time the process started. A new process with the same PID would have a distinct start time. But I couldn't find a way to get the process start time in a POSIX way. Using ps or looking at /proc/<pid>/stat doesn't seem portable enough.

A more complicated idea that seems POSIX-compliant would be:

  • Each process creates a temporary file.
  • It locks the file using flock.
  • It tells my daemon "my identity is connected with this file".
  • At any time, the daemon can check the temporary file. If it's locked, the process is alive. If it's not, the process is dead.
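
The steps above can be sketched roughly as follows. This is only an illustration: the names claim_identity and owner_is_alive are made up for the example, and it uses flock as the question does. Note that flock is a BSD-derived call rather than part of POSIX; the POSIX-specified alternative is fcntl() record locking, which has different semantics (for one, a process never conflicts with its own fcntl locks).

```python
import fcntl
import os
import tempfile


def claim_identity(directory):
    """Process side: create a lock file, lock it, and return (fd, path).

    Keep the descriptor open for as long as the process runs. The lock goes
    away once every descriptor sharing it is closed, which normally happens
    at exit at the latest (a child that inherited the descriptor keeps it).
    """
    fd, path = tempfile.mkstemp(suffix=".lock", dir=directory)
    fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    return fd, path


def owner_is_alive(path):
    """Daemon side: True while some other open descriptor holds the lock."""
    try:
        fd = os.open(path, os.O_RDWR)
    except FileNotFoundError:
        return False
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        return True   # the lock is taken, so the owner is still running
    else:
        return False  # we obtained the lock, so nobody else holds it
    finally:
        os.close(fd)  # closing also drops any lock we just took
```

Because the daemon only needs the path, it can store that path on disk and probe it again after a restart.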

But this seems unnecessarily complicated.

Is there a better, or standard way?

Edit: The daemon must be able to resume after a restart, so it's not possible to keep a persistent connection for each process.


Solution

"But I couldn't find a way to get the process start time in a POSIX way."

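Try the standard "etime" format specifier, which reports the time elapsed since each process started (subtract it from the current time to get the start time): LC_ALL=C ps -o etime= -p "$PIDS"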
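
As a rough sketch of how that output could be turned into a (PID, start time) identity: POSIX specifies the etime field as [[dd-]hh:]mm:ss, so it can be converted to seconds and subtracted from the current time. The function names parse_etime and approx_start_time are invented for this example, and the result is only accurate to about a second, so any comparison of start times should allow some tolerance.

```python
import os
import subprocess
import time


def parse_etime(text):
    """Convert ps 'etime' output ([[dd-]hh:]mm:ss) to a number of seconds."""
    text = text.strip()
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    fields = [int(f) for f in text.split(":")]
    while len(fields) < 3:       # the hours field is optional
        fields.insert(0, 0)
    hours, minutes, seconds = fields
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def approx_start_time(pid):
    """Return the approximate epoch start time of pid, or None if ps finds nothing."""
    result = subprocess.run(
        ["ps", "-o", "etime=", "-p", str(pid)],
        capture_output=True,
        text=True,
        env=dict(os.environ, LC_ALL="C"),
    )
    if result.returncode != 0 or not result.stdout.strip():
        return None
    return time.time() - parse_etime(result.stdout)
```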

In fairness, I would probably construct my own table of live processes rather than relying on the process table and elapsed time. That's fundamentally your file-locking approach, though I'd probably aggregate all the lockfiles together in a known place and name them by PID, e.g., /var/run/my-app/8819.lock. Indeed, this might even be retrofitted onto the long-running processes, since file locks on file descriptors can be inherited across exec().
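
A minimal sketch of the daemon side of such a lockfile directory, assuming each process holds an flock lock on its own <pid>.lock file: the daemon walks the directory and treats every file it can lock as belonging to a dead process. The function name reap_dead is hypothetical, and the sketch ignores the small window in which a brand-new process has created its file but not yet locked it.

```python
import fcntl
import os


def reap_dead(lock_dir):
    """Return PIDs whose <pid>.lock is no longer locked, removing those files."""
    dead = []
    for name in os.listdir(lock_dir):
        stem, ext = os.path.splitext(name)
        if ext != ".lock" or not stem.isdigit():
            continue                      # not one of our lockfiles
        path = os.path.join(lock_dir, name)
        try:
            fd = os.open(path, os.O_RDWR)
        except FileNotFoundError:
            continue                      # vanished since listdir()
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            continue                      # still locked: the owner is alive
        else:
            os.unlink(path)               # stale file, removed while we hold the lock
            dead.append(int(stem))
        finally:
            os.close(fd)
    return sorted(dead)
```

Since all state lives in the directory, a restarted daemon can call this again without remembering anything from its previous run, and then release the resources tied to each returned PID.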

(Of course, if the long-running processes I cared about had a common parent, then I'd rather query the common parent, who can be a reliable authority on which processes are running and which are not.)

Other tips

The standard way is the unnecessarily complicated one. That's life in a POSIX-compliant environment...

Methods other than the lock file exist and have various benefits and tradeoffs. Most of the "standard" IPC mechanisms would work for this as well: a socket, pipe, message queue, shared memory... Basically, pick one mechanism that allows your application to announce to the daemon that it has started (and maybe that it's exiting, for an orderly shutdown). In between, it could send periodic "I'm still here" messages and the daemon could notice when it doesn't get one, or the daemon could poll periodically. There are quite a few ways to accomplish what you want, but without knowing more about the exact architecture you're trying to achieve, it's difficult to point at the "one best way".
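
To make the "I'm still here" idea concrete, here is a small sketch of the bookkeeping the daemon might do, independent of which IPC mechanism delivers the messages. The class name HeartbeatTable, its methods, and the timeout value are all invented for illustration; the transport (socket, pipe, queue) is deliberately left out.

```python
import time


class HeartbeatTable:
    """Track the last heartbeat per client and report clients that went silent."""

    def __init__(self, timeout):
        self.timeout = timeout    # seconds of silence before a client counts as dead
        self.last_seen = {}       # client id -> timestamp of its latest heartbeat

    def beat(self, client_id, now=None):
        """Record a start announcement or an "I'm still here" message."""
        self.last_seen[client_id] = time.monotonic() if now is None else now

    def goodbye(self, client_id):
        """Record an orderly shutdown announcement."""
        self.last_seen.pop(client_id, None)

    def reap(self, now=None):
        """Forget and return the clients whose last heartbeat is too old."""
        now = time.monotonic() if now is None else now
        dead = [c for c, t in self.last_seen.items() if now - t > self.timeout]
        for client_id in dead:
            del self.last_seen[client_id]
        return dead
```

For example, with HeartbeatTable(timeout=10), a client that last called beat at t=0 is returned by reap at t=12, while one that called beat at t=5 is not. The table lives in memory, so after a daemon restart it would be rebuilt from the next round of heartbeats; resources held for clients that died in the meantime would need to be tracked some other way.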

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow