Question

I have a URL to a file that I can download. It looks like this:

 http://<server>/recruitment-mantis/plugin.php?page=BugSynchronizer/getfile&fileID=139&filehash=3e7a52a242f90c23539a17f6db094d86

How can I get the content type of this file? I have to admit that in this case the simple approach:

   URL url = new URL(stringUrl);

   URLConnection urlConnection = url.openConnection();
   urlConnection.connect();

   String urlContent = urlConnection.getContentType();

returns the application/force-download content type for every file (no matter whether it is a JPG or a PDF). I need the real type because I want to set the extension of the downloaded file, which can vary. How can I 'get around' this application/force-download content type? Thanks in advance for your help.

Solution

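Check urlConnection.getHeaderField("Content-Disposition") for a filename. That header is also used for parts of multipart content, but servers that force a download often send it with a filename parameter (for example, attachment; filename="report.pdf"), so it doesn't hurt to check.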
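
As a rough illustration of that check, here is a minimal sketch that pulls the filename and its extension out of a Content-Disposition value. The class and method names (ContentDispositionSketch, filenameFrom, extensionOf) are made up for this example, and the regular expression only handles the plain filename= form, not the encoded filename*= variant.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the JDK.
final class ContentDispositionSketch {
    // Matches filename=report.pdf and filename="report.pdf".
    // The encoded form filename*=UTF-8''... is deliberately not handled.
    private static final Pattern FILENAME =
            Pattern.compile("filename\\s*=\\s*\"?([^\";]+)\"?", Pattern.CASE_INSENSITIVE);

    // Returns the filename parameter, or empty if the header is missing or has none.
    static Optional<String> filenameFrom(String headerValue) {
        if (headerValue == null) {
            return Optional.empty();
        }
        Matcher m = FILENAME.matcher(headerValue);
        return m.find() ? Optional.of(m.group(1).trim()) : Optional.empty();
    }

    // Returns the lower-cased text after the last dot, or empty if there is no usable extension.
    static Optional<String> extensionOf(String filename) {
        int dot = filename.lastIndexOf('.');
        if (dot <= 0 || dot == filename.length() - 1) {
            return Optional.empty();
        }
        return Optional.of(filename.substring(dot + 1).toLowerCase());
    }
}
```

It could be called as ContentDispositionSketch.filenameFrom(urlConnection.getHeaderField("Content-Disposition")).flatMap(ContentDispositionSketch::extensionOf), which yields an empty Optional when the server sends no usable filename.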

If that header is not present, you can save the URL to a temporary file, and use probeContentType to get a meaningful MIME type:

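import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

Path tempFile = Files.createTempFile(null, null);
try (InputStream urlStream = urlConnection.getInputStream()) {
    // REPLACE_EXISTING is needed because createTempFile already created the (empty) file.
    Files.copy(urlStream, tempFile, StandardCopyOption.REPLACE_EXISTING);
}
String mimeType = Files.probeContentType(tempFile);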

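Be aware that probeContentType may return null if it can't determine the type of the file. How it detects the type is implementation specific: it may look at the file name, at file attributes, or at the bytes, so results can differ between platforms.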
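
One way to cope with a null result is to fall back to the JDK's own signature sniffer, URLConnection.guessContentTypeFromStream. The sketch below assumes that fallback is acceptable for your files; the class and method names (MimeFallbackSketch, detect, sniff) are invented for the example. The sniffer only knows a limited set of signatures (GIF, PNG and JPEG among them), so it can return null as well.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the JDK.
final class MimeFallbackSketch {
    // Tries the platform detector first, then the JDK's built-in signature sniffer.
    static String detect(Path file) throws IOException {
        String probed = Files.probeContentType(file);
        if (probed != null) {
            return probed;
        }
        // guessContentTypeFromStream needs mark/reset, which BufferedInputStream provides.
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            return URLConnection.guessContentTypeFromStream(in); // may still be null
        }
    }

    // Same sniffer applied to bytes that are already in memory.
    static String sniff(byte[] head) {
        try {
            return URLConnection.guessContentTypeFromStream(new ByteArrayInputStream(head));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the temporary file from above, the call would be MimeFallbackSketch.detect(tempFile).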

OTHER TIPS

How can I 'get around' this application/force-download content type?

I had the same problem with the content type of my uploaded files. Rather than rely on the content type reported for the URL, I went looking for a content-type utility that determines the type from the bytes themselves.

After trying 5 or so implementations I decided to reinvent the wheel and released my SimpleMagic package which makes use of the magic(5) Unix content-type files to implement the same functionality as the Unix file(1) command. It uses either internal config files or can read /etc/magic, /usr/share/file/magic, or other magic(5) files and determine file content from File, InputStream, or byte[].

The GitHub sources, Javadocs, and some documentation are available from the home page.

With SimpleMagic, you do something like the following:

ContentInfoUtil util = new ContentInfoUtil();
ContentInfo info = util.findMatch(byteArray);

It works from the contents of the data (File, InputStream, or byte[]), not the file name.

I guess this content type is set by the server you are downloading from. Some servers use this kind of content type to force browsers to download the file instead of trying to open it. For example, when my server returns the content type "application/pdf", Chrome tries to open the file as a PDF, but when the server returns "application/force-download", the browser saves it to disk because it has no idea what else to do with it.

So you need to change the server to return the correct content type or, better, use some other heuristic to determine the file type, because the server can always lie to you, for example by declaring a JPG while sending you an EXE.
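
One such heuristic is to look at the first few bytes of the download, since many formats start with a fixed signature. The following is only a sketch under that assumption: MagicBytesSketch and extensionFor are names made up here, and the table covers just four well-known signatures (PDF, JPEG, PNG, GIF).

```java
import java.util.Optional;

// Hypothetical helper: maps a few well-known leading signatures to a file extension.
final class MagicBytesSketch {
    static Optional<String> extensionFor(byte[] head) {
        if (startsWith(head, 0x25, 0x50, 0x44, 0x46)) return Optional.of("pdf"); // "%PDF"
        if (startsWith(head, 0xFF, 0xD8, 0xFF))       return Optional.of("jpg"); // JPEG start-of-image marker
        if (startsWith(head, 0x89, 0x50, 0x4E, 0x47)) return Optional.of("png"); // 0x89 "PNG"
        if (startsWith(head, 0x47, 0x49, 0x46, 0x38)) return Optional.of("gif"); // "GIF8"
        return Optional.empty(); // unknown signature: let the caller decide what to do
    }

    // Compares the leading bytes of data, as unsigned values, against the signature.
    private static boolean startsWith(byte[] data, int... signature) {
        if (data == null || data.length < signature.length) {
            return false;
        }
        for (int i = 0; i < signature.length; i++) {
            if ((data[i] & 0xFF) != signature[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Because this works on the content rather than on anything the server claims, it also catches the case where the declared type and the actual bytes disagree.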

I see with Java 7 you can try this method: http://docs.oracle.com/javase/7/docs/api/java/nio/file/Files.html#probeContentType%28java.nio.file.Path%29

Licensed under: CC-BY-SA with attribution