Question

I have a thread pool (executor) which I would like to monitor for excessive resource usage (time since cpu and memory seem to be quite harder). I would like to 'kill' threads that are running too long, like killing an OS process. The workers spend most time calculating, but significant time is also spent waiting for I/O, mostly database...

I have been reading up on stopping threads in java and how it is deprecated for resource cleanup reasons (not properly releasing locks, closing sockets and files and so on). The recommended way is to periodically check in a worker thread whether it should stop and then exit. This obviously expect that client threads be written in certain ways and that they are not blocked waiting on some external I/O. There is also ThreadDeth and InterruptedException which might be able to do the job, but they may actually be circumvented in improperly/malicously written worker threads, and also I got an impression (though no testing yet) they InterruptedException might not work properly in some (or even all) cases when the worker thread is waiting for I/O.

Another way to mitigate it would be to use multiple OS processes to isolate parts of the system, but it brings some unwanted increases in resource consumption.

That led me to that old story about isolates and/or MVM from more than five years ago, but nothing seems to have happened on that front, maybe in java 8 or 9...

So, actually, this all has made me to wander whether some poor mans simulation of processes could be achieved through using threads that would each have their own classloader? Could that be used to simulate processes if each thread (or group) would be loaded in its own classloader? I am not sure how much an increase in resource consumption would that bring (as there would not be much code sharing and code is not tiny). At least process copy-on-write semantics enable code sharing..

Any recommendations/ideas?

EDIT:

I am asking because of general interest and kind of disappointment that no solutions for this exist in the JVM to date (I mean shared application servers are not really possible - application domains, or something like that, in .NET seem to address exactly this kind of problem). I understand that killing a process does not guarantee reverting all system state to some initial condition, but at least all resorces like handles, memory and cpu are released. I was thinking of using classloaders since they might help with releasing locks held by the thread, which is one of the reasons that Thread.stop is deprecated. In my current situation the only other thing should be released (I can think of currently) is database connection, that could be handled separately/externally (by watchdog thread) if needed.. Though, really, in my case Thread.stop might actually be workable, I just dislike using deprecated methods..

Also I am considering this as a safety net for misbehaving processes, Ideally they should behave nicely, and are in a quite high degree under my control.

So, to clarify, I am asking how do for example java people on the server side handle runaway threads? I suspect by using many machines in the cluster to offset the problem and restarting misbehaving ones - when the application is stateless at least..

Was it helpful?

Solution

The difference between a thread and a process is that thread implicitly share memory and resources like sockets and files (making thread local memory a workaround). Processes implicitly have private memory and resources.

Killing the thread is not the problem. The problem is that a poorly behaving thread or even a reasonable behaviour thread can leave resources in an inconsistent state. Using a class loader will not help you track this, or solve the problem for you. For processes its easier to track what resources they are using, as most of the resources are isolated. Even for processes they can leave locks, temporary files and shared IPC resources in an incorrect state if killed.

The real solution is to write code which behaves properly so it can be managed and working around and trying to handle every possible poorly behaving code is next to impossible. If you have a bad third party library you have to use, you can try killing and cleaning it up and you can come up with an ok solution, but you can't expect it to be a clean one.


EDIT: Here is a simple program which will deadlock between two processes or machines because it has a bug in it. The way to stop deadlocks is to fix the code.

public static void main(String... args) throws IOException {
    switch(args.length) {
        case 1: {
            // server
            ServerSocket ss = new ServerSocket(Integer.parseInt(args[0]));
            Socket s = ss.accept();
            ObjectInputStream ois = new ObjectInputStream(s.getInputStream());
            ObjectOutputStream oos = new ObjectOutputStream(s.getOutputStream());
            // will deadlock before it gets here
            break;
        }
        case 2: {
            Socket s = new Socket(args[0], Integer.parseInt(args[1]));
            ObjectInputStream ois = new ObjectInputStream(s.getInputStream());
            ObjectOutputStream oos = new ObjectOutputStream(s.getOutputStream());
            // will deadlock before it gets here
            break;
        }
        default:
            System.err.println("Must provide either a port as server or hostname port as client");
    }
}
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top