How to find out the name of the default page displayed by a webserver?
22-10-2019
Question
I'm downloading various files through I/O streaming in my Java application. Receiving and saving those files works well as long as I have a full URL path including the file name, but how can I find out the name of the index file (as defined in, for example, Apache's DirectoryIndex) of a domain? The HTTP header doesn't provide this information and neither does the URLConnection method.
Thanks a lot!
Be well
S.
Solution
As far as I know, there is no way of retrieving this information. The HTTP specification doesn't provide it, and I think this isn't a bad thing. Your client requests the URL "/", and it's up to the web server how to handle that; there is no obligation to return a filename too.
It's also worth pointing out (I'm sure you're aware of it, but just in case) that just because a URL looks like /somedir/somefile.html, it doesn't mean that is the actual file being served. The response could come from a proxy to another host, from a mod_rewrite rule, and so on. In other words, the name is arbitrary and doesn't necessarily bear any relation to the physical name on disk.
In short, I think your best bet would be to pick a default filename, e.g. index.html, for those cases and stick to it.
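A minimal sketch of that fallback in Java, assuming the download URL is available as a string. The class name DefaultFileName, the method fromUrl, and the constant FALLBACK are hypothetical names chosen for illustration; "index.html" is only a local convention and says nothing about what the server actually uses.

```java
import java.net.URI;

class DefaultFileName {
    // Assumed local default; the server's real index file name is not visible to the client.
    static final String FALLBACK = "index.html";

    // Returns the last path segment of the URL, or FALLBACK when the URL
    // has no path or points at a directory (path ends with "/").
    static String fromUrl(String url) {
        String path = URI.create(url).getPath();
        if (path == null || path.isEmpty() || path.endsWith("/")) {
            return FALLBACK;
        }
        return path.substring(path.lastIndexOf('/') + 1);
    }
}
```

With this sketch, DefaultFileName.fromUrl("http://example.com/docs/") yields the fallback name, while a URL ending in /docs/report.pdf yields report.pdf. Query strings are ignored because only the path component is inspected.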
OTHER TIPS
The only way out is to:
- Inspect the Content-Disposition header and use it to generate the filename. If the server is serving a file, it may set this header. For example, the URL http://server:port/DownLoadServlet might set this header to indicate the name "statement.pdf".
- If this header is missing, use heuristics to generate a filename. This is what browsers do when they produce names like Doc[10].pdf, Doc[12].pdf, etc.
- Use the Content-Type header (if available) to guess the file extension.
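The steps above could be sketched in Java roughly as follows. This is an illustration under assumptions, not a complete implementation: the class ResponseFileName, the method guess, and the small extension table are hypothetical, the pattern only handles the plain filename= parameter (not the encoded filename*= form), and the caller supplies its own base name as the heuristic for the case where no header helps.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResponseFileName {
    // Matches filename=value or filename="value"; the extended filename*= form is not handled.
    private static final Pattern FILENAME =
            Pattern.compile("filename\\s*=\\s*\"?([^\";]+)\"?", Pattern.CASE_INSENSITIVE);

    // Tiny, assumed mapping from media type to extension; extend as needed.
    private static String extensionFor(String mime) {
        switch (mime) {
            case "text/html":       return ".html";
            case "text/plain":      return ".txt";
            case "application/pdf": return ".pdf";
            case "image/png":       return ".png";
            default:                return "";
        }
    }

    static String guess(String contentDisposition, String contentType, String baseName) {
        // Step 1: prefer the name the server announces.
        if (contentDisposition != null) {
            Matcher m = FILENAME.matcher(contentDisposition);
            if (m.find()) {
                String name = m.group(1).trim();
                // Drop any directory part so a header cannot steer the save location.
                int cut = Math.max(name.lastIndexOf('/'), name.lastIndexOf('\\'));
                name = name.substring(cut + 1);
                if (!name.isEmpty()) {
                    return name;
                }
            }
        }
        // Steps 2 and 3: fall back to the caller's base name plus a guessed extension.
        String ext = "";
        if (contentType != null) {
            int semi = contentType.indexOf(';');
            String mime = (semi >= 0 ? contentType.substring(0, semi) : contentType)
                    .trim().toLowerCase(Locale.ROOT);
            ext = extensionFor(mime);
        }
        return baseName + ext;
    }
}
```

With a URLConnection named conn, the two header values could be obtained via conn.getHeaderField("Content-Disposition") and conn.getContentType(), both of which return null when the header is absent.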