Question

I need to call a library function that sometimes won't terminate within a given time, unfortunately. Is there a way to call the function but abort it if it doesn't terminate within n seconds?

I cannot modify the function, so I cannot put the abort condition into it directly. I have to add a timeout to the function externally.

Would it work to start it as a (boost) thread that I can then terminate after a certain time? I believe the function is not thread-safe, but that wouldn't matter if it runs as the only thread calling it, right? Are there other (better) solutions?


Solution

You could spawn a boost::thread to call the API:

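#include <boost/thread.hpp>

// api_function, arg1 and arg2 come from the library being called.
// timed_join is deprecated in newer Boost releases in favour of try_join_for.
boost::thread api_caller(::api_function, arg1, arg2);
if (api_caller.timed_join(boost::posix_time::milliseconds(500)))
{
    // API call returned within 500ms
}
else
{
    // API call timed out; detach explicitly so the thread object
    // can be destroyed while the call is still running
    api_caller.detach();
}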

Boost doesn't allow you to kill the worker thread, though. In this example, it's just orphaned.

You'll have to be careful about what that API call does, because it may never release resources it's acquired.

OTHER TIPS

I think the only safe way to accomplish this would be to spawn a separate sandbox process that calls the library function as a proxy to your application. You'll need to implement some type of IPC between your application and the proxy. Implementing a timeout on reading the IPC reply is then fairly trivial. If the read fails due to timeout, you can then safely terminate the proxy without risking the health of your application.
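A minimal sketch of the proxy-process idea, assuming a POSIX system and using a pipe as the IPC channel. The function `risky_api` is a hypothetical stand-in for the library call, and `call_in_child` is an illustrative name, not part of any library:

```cpp
#include <poll.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for the library function; hangs for negative input.
int risky_api(int x) {
    if (x < 0) { for (;;) pause(); }
    return x + 1;
}

// Calls risky_api(arg) in a child process. Returns true and sets `out` if the
// child replies within timeout_ms; otherwise kills the child and returns false.
bool call_in_child(int arg, int timeout_ms, int& out) {
    int fds[2];
    if (pipe(fds) != 0) return false;
    pid_t pid = fork();
    if (pid < 0) { close(fds[0]); close(fds[1]); return false; }
    if (pid == 0) {                          // child: the proxy
        close(fds[0]);
        int r = risky_api(arg);
        ssize_t n = write(fds[1], &r, sizeof r);
        _exit(n == (ssize_t)sizeof r ? 0 : 1);
    }
    close(fds[1]);                           // parent keeps only the read end
    struct pollfd pfd = {fds[0], POLLIN, 0};
    bool ok = false;
    // The timeout lives here: poll() gives up after timeout_ms.
    // (An interrupted poll is treated like a timeout in this sketch.)
    if (poll(&pfd, 1, timeout_ms) == 1 && (pfd.revents & POLLIN)) {
        ok = read(fds[0], &out, sizeof out) == (ssize_t)sizeof out;
    }
    if (!ok) kill(pid, SIGKILL);             // safe: only the proxy dies
    close(fds[0]);
    waitpid(pid, nullptr, 0);                // reap the child in either case
    return ok;
}
```

Because the hung call lives in its own address space, killing it cannot corrupt the caller's heap or locks. A real proxy would need a richer protocol than a single `int`.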

What you are talking about is typically called a "watchdog" system. The watchdog is usually a second thread that periodically checks the status of all the other threads. If a thread has not responded, the watchdog can notify the user, or even kill the offending thread if it's possible to do so safely (this depends on your application).
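As a rough sketch of a watchdog check in standard C++, assuming the monitored thread can call a heartbeat function between library calls (the library function itself stays untouched). `Heartbeat` and `watchdog_detects_stall` are hypothetical names chosen for illustration:

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Shared between one worker and the watchdog.
struct Heartbeat {
    std::atomic<unsigned long> beats{0};
    void beat() { beats.fetch_add(1); }      // worker calls this regularly
};

// Checks the heartbeat once per `period`. Returns true if the counter did not
// advance for `max_missed` consecutive periods (the worker looks stalled),
// or false if `stop` was set first.
bool watchdog_detects_stall(const Heartbeat& hb, const std::atomic<bool>& stop,
                            std::chrono::milliseconds period, int max_missed) {
    unsigned long last = hb.beats.load();
    int missed = 0;
    while (!stop.load()) {
        std::this_thread::sleep_for(period);
        unsigned long now = hb.beats.load();
        missed = (now == last) ? missed + 1 : 0;
        last = now;
        if (missed >= max_missed) return true;
    }
    return false;
}
```

What to do once a stall is detected (notify, abandon the thread, restart the process) is left to the application, for the reasons the other answers give.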

The problem with threads is that some resources cannot be freed after the thread is terminated. If the function acquires no resources that have to be released, go with threads.

The problem is that with an in-process solution without support from the function you end up with potentially invalid state.

Example: When you terminate the thread while a memory allocation is taking place, your process heap may be corrupted.

So you might terminate the call, but then you also have to terminate the process. In many cases, the chances for destructive side effects are small, but I wouldn't bet my computation on that.

You can, as Ben Straub suggests, just orphan the thread: put it on the lowest priority and let it run forever. That is of course only a limited solution: if the thread consumes resources (likely), it will slow down the system. There is also a limit on the number of threads per process (usually due to the address space needed for thread stacks).

Generally, I'd prefer the external process solution. A simple pattern is this:
Write the input data to a file and start the external process with the file as its argument. The external process writes progress (if any) to a disk file that can be monitored, which may even allow the process to resume from where it left off. Results are written to disk, and the parent process can read them in.

When you terminate the process, you still have to deal with synchronizing access to external resources (like files), and with abandoned mutexes, half-written files etc. But it's generally THE way to a robust solution.

What you need is a thread and a Future Object that can hold the result from the function call.

For an example using boost see here.

You would need to check the future after the timeout and if it is not set, act accordingly.
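A minimal sketch of the thread-plus-future idea using only the C++11 standard library rather than Boost. `slow_api` is a hypothetical stand-in for the library function, and `call_with_timeout` is an illustrative helper name:

```cpp
#include <chrono>
#include <future>
#include <thread>
#include <utility>

// Hypothetical stand-in for the library function: sleeps x ms, returns 2*x.
int slow_api(int x) {
    std::this_thread::sleep_for(std::chrono::milliseconds(x));
    return x * 2;
}

// Runs fn(arg) on a detached thread and waits at most `timeout` for the result.
// Returns true and sets `out` if the future became ready in time. On timeout
// it returns false and the worker thread keeps running (it is orphaned).
template <typename Fn>
bool call_with_timeout(Fn fn, int arg, std::chrono::milliseconds timeout, int& out) {
    std::packaged_task<int(int)> task(fn);
    std::future<int> result = task.get_future();
    std::thread(std::move(task), arg).detach();
    if (result.wait_for(timeout) == std::future_status::ready) {
        out = result.get();
        return true;
    }
    return false;
}
```

A `std::packaged_task` on a detached thread is used instead of `std::async` because the destructor of a future returned by `std::async` blocks until the task finishes, which would defeat the timeout.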

Go with an Orphan process, launch it and time its execution. If it runs out of time, invoke the OS to kill it.

How to avoid race conditions with this pattern:

  • Create a file that stores the input arguments (of course, everything is passed by value). The orphan process is only allowed to read data from this file.

  • The orphan process reads the input data, writes the result values to an output file and closes it.

  • Only when everything is done does the orphan delete the input file, which signals to the master process that the work is finished.

This avoids the problem of reading half-written files: the master first notices the absence of the input file, then opens the output file for reading. The output file is certain to be complete, because it was closed before the input file was deleted and the orphan's system calls run in order.
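The handshake above can be sketched as follows, assuming a POSIX system (for `access`) and a result that fits on one line of text. The function names `publish_result` and `wait_for_result` are hypothetical:

```cpp
#include <chrono>
#include <cstdio>
#include <fstream>
#include <string>
#include <thread>
#include <unistd.h>   // access(), POSIX

// Orphan side: write the result, close the file, and only then delete the input.
void publish_result(const std::string& input_path, const std::string& output_path,
                    const std::string& result) {
    {
        std::ofstream out(output_path);
        out << result << '\n';
    }                                    // output file is flushed and closed here
    std::remove(input_path.c_str());     // deletion is the "done" signal
}

// Master side: poll until the input file disappears, then read the result.
// Returns false on timeout; the caller would then ask the OS to kill the orphan.
bool wait_for_result(const std::string& input_path, const std::string& output_path,
                     std::chrono::milliseconds timeout, std::string& result) {
    auto deadline = std::chrono::steady_clock::now() + timeout;
    while (access(input_path.c_str(), F_OK) == 0) {
        if (std::chrono::steady_clock::now() >= deadline) return false;
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
    std::ifstream in(output_path);
    return static_cast<bool>(std::getline(in, result));
}
```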

"I need to call a library function that sometimes won't terminate within a given time, unfortunately. Is there a way to call the function but abort it if it doesn't terminate within n seconds?"

The short answer is no. The call itself must terminate at some point (by implementing its own timeout). Blocking calls such as gethostbyname() are usually trouble because you are then at the mercy of their (or the system's) timeout, not yours.

So, whenever possible try to make the code running in the thread exit cleanly when necessary--the code itself must detect and handle the error. It can send a message and/or set statuses so that the main (or another) thread knows what went on.

As a personal preference, in highly available systems I like my threads to spin often (no busy-waiting though) with specific timeouts, calling non-blocking functions, and with precise exit conditions in place. A global or thread-specific 'done' variable does the trick for a clean exit.
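A small sketch of the 'done' variable approach in standard C++. It only applies when you control the loop that runs in the thread; the names `done`, `iterations` and `run_worker_for` are illustrative:

```cpp
#include <atomic>
#include <chrono>
#include <thread>

std::atomic<bool> done{false};     // set by the controlling thread to request exit
std::atomic<int> iterations{0};    // lets the caller see that work happened

void worker() {
    while (!done.load()) {
        // One short, non-blocking unit of work would go here.
        iterations.fetch_add(1);
        std::this_thread::sleep_for(std::chrono::milliseconds(1));  // no busy loop
    }
    // Clean exit: release resources here before returning.
}

// Runs the worker for `run_for`, then asks it to stop and joins it.
// Returns how many loop iterations the worker completed.
int run_worker_for(std::chrono::milliseconds run_for) {
    done = false;
    iterations = 0;
    std::thread t(worker);
    std::this_thread::sleep_for(run_for);
    done = true;                   // the worker notices on its next check
    t.join();                      // returns promptly because each step is short
    return iterations.load();
}
```

The join is only bounded because every step inside the loop is short; a single blocking library call inside the loop would bring back the original problem.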

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow