How to design: avoid resource leaks when randomly accessing files

https://stackoverflow.com/questions/9852073

26-05-2021

Question

I have a client/server application where the client app opens files. Those files get split into chunks and sent to the server.

Besides file chunks, the client sends other data as well. Each message (data or file chunk) has a priority level.

All messages get buffered and written to the TCP stream according to their priority. Currently input and output streams get closed on both sides (client/server) whenever a file is fully sent or received. This means that the streams remain open for as long as it takes to send the files.

The TCP connection is allowed to fail, as it will reconnect and message sending will be resumed, therefore, the streams will be closed at some point.

But if the JVM is killed, for example, the streams will not be cleaned up.

My first thought on solving this was to add cleanup code in the finalizers, but I understand these may not run when the JVM gets killed (or if System.exit is called).

My second thought is to rewrite part of the application and only use the streams for as long as it takes to read/write one chunk. I would therefore end up opening and closing the files as many times as there are chunks. This method has the advantage of allowing me to use try-catch-finally constructs, but I have the gut feeling that opening and closing the files this often implies a fair bit of overhead.
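To make the per-chunk idea concrete, here is a minimal sketch, assuming chunks are addressed by a byte offset and a length. The class name `ChunkReader` and the method `readChunk` are hypothetical and not part of the application described above. The file is opened, read at the given position, and closed again by try-with-resources, so no stream outlives a single chunk.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: opens the file only for the duration of one chunk read.
final class ChunkReader {

    /** Reads up to `length` bytes starting at `offset`; returns fewer bytes at end of file. */
    static byte[] readChunk(Path file, long offset, int length) throws IOException {
        // try-with-resources closes the channel even if the read throws
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(length);
            long position = offset;
            while (buffer.hasRemaining()) {
                // positional read: does not depend on any state kept between chunks
                int read = channel.read(buffer, position);
                if (read < 0) {
                    break; // end of file reached
                }
                position += read;
            }
            buffer.flip();
            byte[] chunk = new byte[buffer.remaining()];
            buffer.get(chunk);
            return chunk;
        }
    }
}
```

With this shape the number of simultaneously open files no longer depends on the number of priorities. Whether the repeated open/close cost matters is something to measure for the actual chunk size rather than assume.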

So how does one clean up resources when the design doesn't allow for finally{} blocks? Or should such a design be avoided, maybe in a way similar to what I described?

I'm also concerned about possibly having as many files open as there are priorities (which in theory are unlimited).

Most files will typically be a few KiB in size, but in some cases they may be as large as several GB.

Thanks in advance for your input.

EDIT: Added image

[Image: File transfer as it is]

Solution

If I understand the question correctly, you are worried about not cleaning up properly in case the JVM is terminated in an uncontrolled manner.

While this is a general problem, in your particular case I think there is no issue to worry about. You only open your files for reading, so there is no persistent state that your application could break (as I understand, the other side of the TCP connection can handle disconnections gracefully). A file being open is a kind of state which is internal to an application. If the app is killed, the operating system takes care of cleaning up all locks or any data structures it may be internally using to handle operations on that file. This means that no "garbage" will be left over in the OS kernel, and while it's neither an elegant nor a recommended way of cleaning up, it just works (of course, make use of this only for emergencies, not as a normal way of handling things). Killing your app will close open files automatically and should not bring the files into any inconsistent state.

Of course, if the app is killed, it won't finish any operations it was performing. This may lead to inconsistent application-level state or to other issues with some kinds of application logic, but still can't hurt any OS-level structures (assuming the OS isn't buggy). You may lose data or break your app's state, but you shouldn't be able to break the filesystem or data in the kernel.

So, you may run into issues if you kill an application in the middle of a transaction-style operation (one which you want to be either completely performed or not performed at all, with no intermediate state ever becoming visible to the outside world). An example would be a file that you need to replace with a newer version. If you first truncate the old file and then write the new contents to it (which is the obvious way to do it), and your app is killed after truncating the file but before writing the new contents, you're in trouble. However, to provoke such risks you need mutable state, i.e. writing something. If you only read stuff, you are almost certainly safe.

If you do run into such a case, you can take several paths. One is trying to make the app bulletproof and to ensure that it always cleans up nicely. In practice, this is very hard. You can add shutdown hooks to a Java app, which will be executed when the application closes, but they only run on a controlled shutdown (like a regular kill (SIGTERM) on Linux). They won't protect you from the app getting forcefully killed by an administrator (kill -9 on Linux), by the OOM killer (also Linux), etc. Of course, other operating systems have equivalents of these situations or other cases where an app is closed in a way it cannot control. If you are not writing a real-time app that runs in a controlled hardware and software environment, it's next to impossible to prevent all the ways an app could be forcefully terminated and prevented from running its cleanup procedures.

So, a reasonable compromise is often to take only simple precautions in the app (like shutdown hooks), but to keep in mind that you can't prevent everything and therefore to make manual cleanup possible and easy. For example, a solution for the case of overwriting a file would be to split the operation into first moving the old file to a new name to keep as a backup (this operation is usually atomic at OS level and therefore safe), then writing the new contents of the file to the old name, and then after checking that the new file was correctly written, deleting the backup. This way, if the app is killed between operations, there exists a simple cleanup procedure: moving the backup file over to the original name and thus reverting to an older but consistent state. This can be accomplished manually (but should be documented), or you could add a cleanup script or a special "cleanup" command to your app in order to make this operation simple, visible and to remove the possibility of human error during the procedure. Assuming your app is only killed rarely, this is usually more effective than spending lots of time on trying to make the app bulletproof only to realize it's not really possible. Embracing failure is often better than trying to prevent it (not only in programming).
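The backup-then-replace procedure described above could be sketched as follows. This is only an illustration under stated assumptions: the class `SafeReplace`, its method names, and the `.bak` suffix are hypothetical, "checking that the new file was correctly written" is reduced to a size comparison, and `ATOMIC_MOVE` is only honored where the filesystem supports it (otherwise `Files.move` throws `AtomicMoveNotSupportedException`).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper illustrating the move-to-backup, write, verify, delete-backup sequence.
final class SafeReplace {

    // Assumption: the backup lives next to the target with a ".bak" suffix.
    static Path backupOf(Path target) {
        return target.resolveSibling(target.getFileName() + ".bak");
    }

    static void replace(Path target, byte[] newContents) throws IOException {
        Path backup = backupOf(target);
        if (Files.exists(backup)) {
            // A leftover backup means an earlier replace was interrupted; do not overwrite it.
            throw new IllegalStateException("Backup exists, run recover() first: " + backup);
        }
        // Step 1: keep the old version under a new name (a rename on the same filesystem).
        Files.move(target, backup, StandardCopyOption.ATOMIC_MOVE);
        // Step 2: write the new contents under the original name.
        Files.write(target, newContents);
        // Step 3: minimal sanity check (a real check might compare a checksum instead).
        if (Files.size(target) != newContents.length) {
            throw new IOException("New file has unexpected size: " + target);
        }
        // Step 4: the new version is in place, so the backup can go.
        Files.delete(backup);
    }

    /** Cleanup after a kill: restores the backup if one exists. Returns true if it did. */
    static boolean recover(Path target) throws IOException {
        Path backup = backupOf(target);
        if (!Files.exists(backup)) {
            return false; // nothing was interrupted
        }
        Files.deleteIfExists(target); // discard the possibly half-written new file
        Files.move(backup, target, StandardCopyOption.ATOMIC_MOVE);
        return true;
    }
}
```

Calling `recover` at startup, or exposing it as the "cleanup" command mentioned above, turns the manual procedure into a single step. If the process is killed inside `recover` itself, the backup is still present and the call can simply be repeated.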

You could still get burned at OS and hardware level, but that's really hard to prevent with commodity hardware and a commodity OS. Even if your app sees the new file in the correct location, it may not have been physically written to disk yet. Data that only resides in the OS cache won't get lost if your app is killed, but it will get lost if someone pulls the power plug on the machine. But dealing with this kind of failure is a completely different story.
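As a small illustration of the cache point, Java lets an application ask the OS to flush a file to the storage device via `FileChannel.force`. The sketch below assumes a hypothetical helper `DurableWrite.writeAndSync`; it narrows the window for losing data on power failure but does not remove it, since whether the device itself honors the flush is outside the application's control.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: writes a file and asks the OS to push it out of its cache.
final class DurableWrite {

    static void writeAndSync(Path file, byte[] contents) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buffer = ByteBuffer.wrap(contents);
            while (buffer.hasRemaining()) {
                channel.write(buffer); // a single write may not consume the whole buffer
            }
            // true = also flush file metadata, not only the content
            channel.force(true);
        }
    }
}
```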

Long story short, in your particular case where you are only reading files, the OS should perform all cleanup if your app is killed, and for other cases some tips are mentioned above.

OTHER TIPS

Have a look at java.lang.Runtime.addShutdownHook(). I'd try adding a hook that closes all open streams, which means you have to maintain a list of them. However, note this caveat from its documentation:

In rare circumstances the virtual machine may abort, that is, stop running without shutting down cleanly. This occurs when the virtual machine is terminated externally, for example with the SIGKILL signal on Unix or the TerminateProcess call on Microsoft Windows. The virtual machine may also abort if a native method goes awry by, for example, corrupting internal data structures or attempting to access nonexistent memory. If the virtual machine aborts then no guarantee can be made about whether or not any shutdown hooks will be run.
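A hook of the kind suggested here might look like the following sketch. The class `OpenStreamRegistry` and its methods are hypothetical names introduced for illustration; only `Runtime.getRuntime().addShutdownHook(Thread)` is the actual JDK API. As the quoted documentation says, the hook is best effort and will not run on SIGKILL or a JVM abort.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry: tracks open streams so a shutdown hook can close them.
final class OpenStreamRegistry {

    // Concurrent set, because streams are registered from worker threads
    // while the hook runs on its own thread.
    private static final Set<Closeable> OPEN = ConcurrentHashMap.newKeySet();

    static {
        // Registered once, when the class is first used.
        Runtime.getRuntime().addShutdownHook(
                new Thread(OpenStreamRegistry::closeAll, "close-open-streams"));
    }

    /** Call right after opening a stream; returns the same stream for chaining. */
    static <T extends Closeable> T register(T stream) {
        OPEN.add(stream);
        return stream;
    }

    /** Call after closing a stream normally, so the registry does not grow. */
    static void unregister(Closeable stream) {
        OPEN.remove(stream);
    }

    static void closeAll() {
        for (Closeable stream : OPEN) {
            try {
                stream.close();
            } catch (IOException e) {
                // Best effort: the JVM is shutting down, nothing useful left to do.
            }
            OPEN.remove(stream);
        }
    }

    static int openCount() {
        return OPEN.size();
    }
}
```

A stream would be opened as `InputStream in = OpenStreamRegistry.register(Files.newInputStream(path));` and removed with `unregister(in)` once it has been closed in the normal flow.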

Licensed under: CC-BY-SA with attribution
