Question

Our application is using Commons VFS to read various types of files. We use the automatic file type detection VFS provides, via its file extension mapping.

The problem: VFS misclassifies gz files (i.e. files whose names end in .gz) as regular files rather than as GZIP files. This prevents us from using VFS to read the decompressed content of gz files without a special-case workaround.

I've traced the problem to org.apache.commons.vfs2.impl.FileContentInfoFilenameFactory.create(), which calls

FileNameMap fileNameMap = URLConnection.getFileNameMap();
contentType = fileNameMap.getContentTypeFor(name);
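To check what the JDK's built-in map reports on a particular installation, a small probe along these lines may help. The class name ContentTypeProbe and the sample file names are made up for illustration, and the result depends on the JVM's content-types.properties, so no output is shown here.

```java
import java.net.FileNameMap;
import java.net.URLConnection;

// Hypothetical helper: prints what the JVM-wide FileNameMap reports for a few names.
class ContentTypeProbe {
    // Returns the content type the current FileNameMap assigns to the name (may be null).
    static String probe(String fileName) {
        FileNameMap map = URLConnection.getFileNameMap();
        return map.getContentTypeFor(fileName);
    }

    public static void main(String[] args) {
        for (String name : new String[] {"data.gz", "data.zip", "notes.txt"}) {
            System.out.println(name + " -> " + probe(name));
        }
    }
}
```

If data.gz comes back as application/octet-stream, the installation has the mapping described in this question.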

This loads the file content-types.properties from the current Java installation. This file (on Windows, at least) contains this mapping:

application/octet-stream: \
    description=Generic Binary Stream;\
    file_extensions=.saveme,.dump,.hqx,.arc,.obj,.lib,.bin,.exe,.zip,.gz    

According to the source code, org.apache.commons.vfs2.impl.FileTypeMap allows this mapping to take precedence over the file extension map with which VFS was configured.
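The precedence described above can be pictured with a simplified model. This is only a sketch of the lookup order as I read it, not the actual VFS source: the class SchemeLookupModel and its methods are hypothetical, and it assumes that a detected content type with no matching entry leaves the file without a scheme, which is why it ends up treated as a regular file.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the lookup order described above; not the actual VFS source.
class SchemeLookupModel {
    private final Map<String, String> mimeTypeMap = new HashMap<>();
    private final Map<String, String> extensionMap = new HashMap<>();

    void addMimeType(String mimeType, String scheme) {
        mimeTypeMap.put(mimeType, scheme);
    }

    void addExtension(String extension, String scheme) {
        extensionMap.put(extension, scheme);
    }

    // A non-null content type decides the result on its own; the extension
    // map is consulted only when no content type was detected.
    String getScheme(String contentType, String extension) {
        if (contentType != null) {
            return mimeTypeMap.get(contentType);
        }
        return extensionMap.get(extension);
    }

    public static void main(String[] args) {
        SchemeLookupModel model = new SchemeLookupModel();
        model.addMimeType("application/zip", "zip");
        model.addExtension("gz", "gz");
        // Content type detected but unmapped: the extension map is never reached.
        System.out.println(model.getScheme("application/octet-stream", "gz")); // prints "null"
        // No content type detected: the extension map decides.
        System.out.println(model.getScheme(null, "gz")); // prints "gz"
    }
}
```

Under this model, the only way to reach the extension map is for content type detection to return null.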

Can anyone think of a way of either (a) extending a class or two of VFS to work around this problem or (b) configuring VFS and/or Java itself so that VFS correctly classifies gz files?


Solution

Create a class like the following, which implements FileNameMap, wraps the JVM's default map, and suppresses the troublesome application/octet-stream result in getContentTypeFor:

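// Required imports (at the top of the enclosing source file):
import java.net.FileNameMap;
import java.net.URLConnection;

// Declared as a static nested class inside one of your own classes.
public static class MyFileNameMap implements FileNameMap
{
    // The JVM's current map, captured before this one is installed
    private final FileNameMap delegate = URLConnection.getFileNameMap();

    @Override
    public String getContentTypeFor( String fileName )
    {
        String contentType = delegate.getContentTypeFor( fileName );
        if( "application/octet-stream".equals( contentType ) )
        {
            // The JDK's default map classifies .zip and .gz as
            // application/octet-stream, which VFS then uses instead of
            // consulting its extension map for a more specific type.
            // Returning null lets VFS fall back to the extension map.
            return null;
        }
        return contentType;
    }
}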

Install this new class via:

URLConnection.setFileNameMap( new MyFileNameMap() );

Now when you call FileSystemManager.resolveFile(), VFS will choose the correct file type for gz files by falling back to its extension map.

Note: URLConnection.setFileNameMap() replaces the map for the entire JVM, so be careful if any other code running in it relies on the application/octet-stream MIME type for things like .exe files.
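If other code does depend on that entry, a narrower variant might limit the side effects. The following is a sketch only: SelectiveFileNameMap and its constructor parameters are hypothetical names, and it suppresses application/octet-stream just for the extensions you list, passing everything else through unchanged. It assumes fileName is never null.

```java
import java.net.FileNameMap;
import java.util.Locale;
import java.util.Set;

// Hypothetical variant: hides application/octet-stream only for chosen extensions.
class SelectiveFileNameMap implements FileNameMap {
    private static final String OCTET_STREAM = "application/octet-stream";

    private final FileNameMap delegate;
    private final Set<String> suppressedExtensions; // lower-case, without the dot

    SelectiveFileNameMap(FileNameMap delegate, Set<String> suppressedExtensions) {
        this.delegate = delegate;
        this.suppressedExtensions = suppressedExtensions;
    }

    @Override
    public String getContentTypeFor(String fileName) {
        String contentType = delegate.getContentTypeFor(fileName);
        if (!OCTET_STREAM.equals(contentType)) {
            return contentType; // specific types (and null) pass through untouched
        }
        int dot = fileName.lastIndexOf('.');
        String extension = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        // Return null only for listed extensions so that VFS consults its extension map;
        // other generic binaries such as .exe keep application/octet-stream.
        return suppressedExtensions.contains(extension) ? null : contentType;
    }
}
```

It could be installed the same way, for example with URLConnection.setFileNameMap( new SelectiveFileNameMap( URLConnection.getFileNameMap(), java.util.Collections.singleton( "gz" ) ) ). The change is still JVM-wide, but only .gz lookups behave differently.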

Licensed under: CC-BY-SA with attribution