Question

I realize "fast" is a bit subjective so I'll explain with some context. I'm working on a Python module called psutil for reading process information in a cross-platform way. One of the functions is a pid_exists(pid) function for determining if a PID is in the current process list.

Right now I'm doing this the obvious way, using EnumProcesses() to pull the process list, then interating through the list and looking for the PID. However, some simple benchmarking shows this is dramatically slower than the pid_exists function on UNIX-based platforms (Linux, OS X, FreeBSD) where we're using kill(pid, 0) with a 0 signal to determine if a PID exists. Additional testing shows it's EnumProcesses that's taking up almost all the time.

Anyone know a faster way than using EnumProcesses to determine if a PID exists? I tried OpenProcess() and checking for an error opening the nonexistent process, but this turned out to be over 4x slower than iterating through the EnumProcesses list, so that's out as well. Any other (better) suggestions?

NOTE: This is a Python library intended to avoid third-party lib dependencies like pywin32 extensions. I need a solution that is faster than our current code, and that doesn't depend on pywin32 or other modules not present in a standard Python distribution.

EDIT: To clarify - we're well aware that there are race conditions inherent in reading process iformation. We raise exceptions if the process goes away during the course of data collection or we run into other problems. The pid_exists() function isn't intended to replace proper error handling.

UPDATE: Apparently my earlier benchmarks were flawed - I wrote some simple test apps in C and EnumProcesses consistently comes out slower and OpenProcess (in conjunction with GetProcessExitCode in case the PID is valid but the process has stopped) is actually much faster not slower.

Was it helpful?

Solution

OpenProcess could tell you w/o enumerating all. I have no idea how fast.

EDIT: note that you also need GetExitCodeProcess to verify the state of the process even if you get a handle from OpenProcess.

OTHER TIPS

There is an inherent race condition in the use of pid_exists function: by the time the calling program gets to use the answer, the process may have already disappeared, or a new process with the queried id may have been created. I would dare say that any application that uses this function is flawed by design and that optimizing this function is therefore not worth the effort.

Turns out that my benchmarks evidently were flawed somehow, as later testing reveals OpenProcess and GetExitCodeProcess are much faster than using EnumProcesses after all. I'm not sure what happened but I did some new tests and verified this is the faster solution:

int pid_is_running(DWORD pid)
{
    HANDLE hProcess;
    DWORD exitCode;

    //Special case for PID 0 System Idle Process
    if (pid == 0) {
        return 1;
    }

    //skip testing bogus PIDs
    if (pid < 0) {
        return 0;
    }

    hProcess = handle_from_pid(pid);
    if (NULL == hProcess) {
        //invalid parameter means PID isn't in the system
        if (GetLastError() == ERROR_INVALID_PARAMETER) { 
            return 0;
        }

        //some other error with OpenProcess
        return -1;
    }

    if (GetExitCodeProcess(hProcess, &exitCode)) {
        CloseHandle(hProcess);
        return (exitCode == STILL_ACTIVE);
    }

    //error in GetExitCodeProcess()
    CloseHandle(hProcess);
    return -1;
}

Note that you do need to use GetExitCodeProcess() because OpenProcess() will succeed on processes that have died recently so you can't assume a valid process handle means the process is running.

Also note that OpenProcess() succeeds for PIDs that are within 3 of any valid PID (See Why does OpenProcess succeed even when I add three to the process ID?)

I'd code Jay's last function this way.

int pid_is_running(DWORD pid){
    HANDLE hProcess;
    DWORD exitCode;
    //Special case for PID 0 System Idle Process
    if (pid == 0) {
        return 1;
    }
    //skip testing bogus PIDs
    if (pid < 0) {
        return 0;
    }
    hProcess = handle_from_pid(pid);
    if (NULL == hProcess) {
        //invalid parameter means PID isn't in the system
        if (GetLastError() == ERROR_INVALID_PARAMETER) {
             return 0;
        }
        //some other error with OpenProcess
        return -1;
    }
    DWORD dwRetval = WaitForSingleObject(hProcess, 0);
    CloseHandle(hProcess); // otherwise you'll be losing handles

    switch(dwRetval) {
    case WAIT_OBJECT_0;
        return 0;
    case WAIT_TIMEOUT;
        return 1;
    default:
        return -1;
    }
}

The main difference is closing the process handle (important when the client of this function is running for a long time) and the process termination detection strategy. WaitForSingleObject gives you the opportunity to wait for a while (changing the 0 to a function parameter value) until the process ends.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top