Question

I have a Java application that monitors a folder for incoming XML files. When a new file is detected, I need to verify that it is closed and no longer being written. My thought is to use File.canWrite() to test this. Is there any issue with doing this? Is this a good way to test that a file has been completely written?

Other ideas I am throwing around are:

  • Parse the incoming XML file and test that the closing tag is there.
  • Check for the EoF character.

I just am not sure that any of these methods will handle all scenarios.


Solution

No, canWrite is not suitable for this purpose. In general the file will be writable even if another process is writing.

You need a higher level protocol to coordinate the locking. If you plan to use this code on a single platform, you may be able to use NIO's FileLock facility. But read the documentation carefully, and note that on many platforms, the lock is only advisory.
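As a rough sketch of the FileLock idea, the probe below tries to take an exclusive lock and releases it immediately. The class and method names (LockProbe, canLockExclusively) are made up for illustration, and the result only means something if the writer also takes a lock, or on a platform where locks are mandatory.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class LockProbe {
    /**
     * Hypothetical helper: returns true if an exclusive lock on the whole
     * file could be acquired. The lock is released again before returning.
     */
    static boolean canLockExclusively(Path file) throws IOException {
        // An exclusive lock requires a channel that is open for writing.
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
            FileLock lock = channel.tryLock();
            if (lock == null) {
                return false; // another process holds an overlapping lock
            }
            lock.release();
            return true;
        } catch (OverlappingFileLockException e) {
            return false; // this JVM already holds an overlapping lock
        }
    }
}
```

A watcher could call LockProbe.canLockExclusively(path) before parsing, but a writer that never locks the file will not be detected where locks are advisory.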

Another approach is to have the writing process create the file under a name that your process won't recognize, then rename it to a recognizable name when the write is complete. On most platforms, the rename operation is atomic if the source and destination are on the same file system volume. The name change might use a different file extension, or even move the file from one directory to another (on the same volume).
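If the writer is also Java, the write-then-rename approach might look roughly like this. The ".part" suffix and the names AtomicPublish and publish are assumptions for illustration; the monitoring side would simply ignore anything ending in ".part".

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class AtomicPublish {
    /**
     * Hypothetical writer-side helper: writes the content under a temporary
     * name, then renames it to the final name in a single step.
     */
    static Path publish(Path dir, String finalName, String xml) throws IOException {
        Path temp = dir.resolve(finalName + ".part"); // assumed "in progress" suffix
        Path target = dir.resolve(finalName);
        Files.write(temp, xml.getBytes(StandardCharsets.UTF_8));
        // ATOMIC_MOVE throws AtomicMoveNotSupportedException when the file
        // system cannot do the move atomically (e.g. across volumes).
        return Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

Because both names live in the same directory, the move stays on one volume, which is the condition for the rename being atomic.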

Since in this case you are working exclusively with XML, looking for a close tag would work, but it isn't foolproof: what if there are comments after the final markup, or the writer simply doesn't write valid XML?
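One variation on the close-tag idea is to let a parser decide whether the document is well-formed, which also copes with trailing comments. This is only a sketch with made-up names (WellFormedCheck, isWellFormed), and it shares the weakness above: a writer that produces malformed XML will never pass.

```java
import java.io.IOException;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

final class WellFormedCheck {
    /**
     * Hypothetical helper: returns true if the file parses as well-formed
     * XML. A truncated file fails to parse and yields false.
     */
    static boolean isWellFormed(Path file) throws IOException, ParserConfigurationException {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(file.toFile());
            return true;
        } catch (SAXException e) {
            return false; // not well-formed, possibly still being written
        }
    }
}
```

The default parser may print the parse error to standard error before the exception is caught.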

Looking for the EOF will not work. There will always be an EOF, even when the writer has just opened the file and hasn't written anything yet. If this weren't so, the easiest thing would be to allow the reader to start parsing as soon as the file showed up; it would simply block until the writer closed the file. But the file system doesn't work this way. Every file has an end, even if some process is currently moving it.

OTHER TIPS

Additionally, if you do a check followed by a write, then you have a race condition. The state could change between the check and the write. Sometimes it's best to just try the thing you want and handle errors gracefully, perhaps with an n-attempt retry mechanism that increases the delay between attempts.
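A minimal sketch of such a retry loop, assuming the operation signals failure with an IOException and that doubling the delay each time is acceptable. Retry, IoAction and withBackoff are illustrative names, not an existing API.

```java
import java.io.IOException;

final class Retry {
    /** Hypothetical callback type for an operation that may fail with IOException. */
    @FunctionalInterface
    interface IoAction<T> {
        T run() throws IOException;
    }

    /**
     * Runs the action up to maxAttempts times, sleeping between attempts and
     * doubling the delay after each failure. Rethrows the last IOException.
     */
    static <T> T withBackoff(IoAction<T> action, int maxAttempts, long initialDelayMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return action.run();
            } catch (IOException e) {
                if (attempt >= maxAttempts) {
                    throw e; // out of attempts, let the caller handle it
                }
                Thread.sleep(delay);
                delay *= 2;
            }
        }
    }
}
```

For example, Retry.withBackoff(() -> Files.readAllBytes(path), 5, 200L) would try the read up to five times, waiting 200 ms, 400 ms, 800 ms and 1600 ms between attempts.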

Or redefine your test. In this case, you could perhaps test that the filesize hasn't changed over a period of time before processing it.
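The size check could be sketched like this. The names StableSize and isStable are made up, and the quiet period is something you would have to tune: a writer that stalls for longer than the period will still fool it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class StableSize {
    /**
     * Hypothetical helper: returns true if the file size is the same before
     * and after waiting quietMillis. This is a heuristic, not a guarantee.
     */
    static boolean isStable(Path file, long quietMillis) throws IOException, InterruptedException {
        long sizeBefore = Files.size(file);
        Thread.sleep(quietMillis);
        return Files.size(file) == sizeBefore;
    }
}
```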

Another option is to split the code in two: you could have another thread, perhaps a Quartz task, responsible for moving finished files into a different directory that your main code processes.

One thing that appears to work in Windows is this:

  • Create a File object that represents the file in question (using the constructor with the full filename).
  • Create a second, identical File object the same way.
  • Try firstFile.renameTo(secondFile).

This dummy renaming exercise seems to succeed with files that are not open for editing by another app (I tested with Word), but fails if they are open.

And as the new filename is the same as the old filename, it doesn't create any other work.
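In code, the rename-to-itself probe is only a few lines. RenameProbe and renameToSelfSucceeds are illustrative names, and the return value is only meaningful on platforms such as Windows, where an open file cannot be renamed. Elsewhere it may succeed even while the file is open.

```java
import java.io.File;

final class RenameProbe {
    /**
     * Hypothetical helper: renames the file to its own name and reports
     * whether the rename succeeded. Platform-dependent by design.
     */
    static boolean renameToSelfSucceeds(String fullPath) {
        File first = new File(fullPath);
        File second = new File(fullPath);
        return first.renameTo(second);
    }
}
```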

As far as I know, there is no way to tell from Java whether another process currently has an open handle to a file. One option is to use the FileLock class from NIO (java.nio.channels). This isn't supported on all platforms, but if the files are local and the process writing the file cooperates, this should work for any platform supporting locks.

If you control both the reader and writer, then a potential locking technique would be to create a lock directory -- which is typically an atomic operation -- for the read and the write process duration. If you take this type of approach, you have to manage the potential failure of a process resulting in a "hanging" lock directory.
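A bare-bones version of the lock directory idea, assuming both reader and writer agree on the lock path. DirLock, tryAcquire and release are made-up names, and nothing here cleans up a lock left behind by a crashed process.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

final class DirLock {
    /**
     * Hypothetical helper: tries to create the lock directory. Returns true
     * if this caller created it, false if it already existed.
     */
    static boolean tryAcquire(Path lockDir) throws IOException {
        try {
            // createDirectory checks for existence and creates in one atomic step.
            Files.createDirectory(lockDir);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false; // someone else holds the lock
        }
    }

    /** Releases the lock by removing the (empty) lock directory. */
    static void release(Path lockDir) throws IOException {
        Files.deleteIfExists(lockDir);
    }
}
```

Callers would normally wrap their work in try/finally so that release runs even when processing fails.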

As Cheekysoft mentioned, files are not atomic and are ill suited for locking.

If you don't control the writer -- for instance if it's being produced by an FTP daemon -- then the rename technique or delay for time span technique are your best options.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow