Question

Last week I started quite a fuss in my Computer Networks class over the need for a mandatory Host clause in the header of HTTP 1.1 GET messages.

The reason I'm given, be it written on the Web or shouted at me by my classmates, is always the same: the need to support virtual hosting. However, and I'll try to be as clear as possible, this does not appear to make sense.

I understand that in order to allow two domains to be hosted on a single machine (and consequently share the same IP address), there has to be a way of distinguishing between the two domain names.

What I don't understand is why it isn't possible to achieve this without a Host clause (HTTP 1.0 style) by using an absolute URL (e.g. GET http://www.example.org/index.html) instead of a relative one (e.g. GET /index.html). When the HTTP message reached the server, the server would route it to the appropriate host, not by looking at the Host clause but by looking at the hostname in the URL on the message's request line.
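To make the two alternatives concrete, here is a minimal Python sketch of the raw request text in each style. The function names are made up for illustration, and www.example.org is just the example host used above.

```python
def origin_form_request(host: str, path: str) -> str:
    """HTTP/1.1 style: relative path on the request line, hostname in a Host header."""
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"


def absolute_form_request(host: str, path: str) -> str:
    """The alternative asked about: full URL on the request line, no Host header."""
    return f"GET http://{host}{path} HTTP/1.0\r\n\r\n"


if __name__ == "__main__":
    # Show both requests with the line endings made visible.
    print(repr(origin_form_request("www.example.org", "/index.html")))
    print(repr(absolute_form_request("www.example.org", "/index.html")))
```

Both messages carry the same hostname; the only difference is where on the wire it appears.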

I would be very grateful if any of you hardcore hackers could help me understand what exactly I am missing here.

Solution

This was discussed in this thread:

modest suggestions for HTTP/2.0 with their rationale.

  1. Add a header to the client request that indicates the hostname and port of the URL which the client is accessing.

Rationale: One of the most requested features from commercial server maintainers is the ability to run a single server on a single port and have it respond with different top level pages depending on the hostname in the URL.

Making an absolute request URI required (because there's no way for the client to know beforehand whether the server hosts one or more sites) was suggested:

Re the first proposal, to incorporate the hostname somewhere. This would be cleanest put into the URL itself :-

GET http://hostname/fred http/2.0

This is the syntax for proxy redirects.

To which this argument was made:

Since there will be a mix of clients, some supporting host name reporting and some not, it just doesn't matter how this info gets to the server. Since it doesn't matter, the easier to implement solution is a new HTTP request header field. It allows all clients and servers to operate as they do now with NO code changes. Clients and servers that actually need host name information can have tiny mods made to send the extra header field containing the URL and process it.

[...]

All I'm suggesting is that there is a better way to implement the delivery of host name info to the server that doesn't involve hacking the request syntax and can be backwards compatible with ALL clients and servers.

Feel free to read on to discover the final decision yourself. But be warned, it's easy to get lost in there.

Other tips

The reason for adding support for specifying a host in an HTTP request was the limited supply of IP addresses (which was not an issue yet when HTTP 1.0 came out).

If your question is "why specify the host in a Host header as opposed to on the Request-Line", the answer is the need for interoperability between HTTP/1.0 and 1.1.
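As a rough illustration of that interoperability, a server can accept both styles and simply work out the hostname from whichever one the client sent. The sketch below is Python written for this answer, not taken from any real server. It assumes the headers arrive as a dict with lowercased keys, it only handles http:// URLs, and it ignores IPv6 literals. Preferring the request-line URL over the Host header follows what the HTTP/1.1 specification says for absolute-form requests.

```python
from urllib.parse import urlsplit


def effective_host(request_line: str, headers: dict):
    """Return the hostname a request is aimed at, or None if the client gave none.

    `headers` is assumed to be a dict with lowercased header names.
    """
    method, target, version = request_line.split()

    # Absolute URL on the request line: take the hostname from it.
    if target.lower().startswith("http://"):
        return urlsplit(target).hostname

    # Relative path: fall back to the Host header, dropping any ":port" suffix.
    host = headers.get("host")
    if host:
        return host.split(":")[0].strip().lower()

    # Old client that sent neither; the server has to fall back to a default site.
    return None
```

An old client that sends a plain "GET /index.html HTTP/1.0" keeps working unchanged (it just gets the default site), which is the backwards-compatibility point made in the thread quoted above.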

If the question is "why is the Host header mandatory", this has to do with the desire to speed up the transition away from assigning a separate IP address to every site.

Here's some background on the Internet address conservation with respect to HTTP/1.1.

The reason for the 'Host' header is to make explicit which host this request refers to. Without 'Host', the server must know ahead of time that it is supposed to route 'http://joesdogs.com/' to Joe's Dogs while it is supposed to route 'http://joscats.com/' to Jo's Cats even though they are on the same webserver. (What if a server has 2 names, like 'joscats.com' and 'joescats.com' that should refer to the same website?)

Having an explicit 'Host' header makes these kinds of decisions much easier to program.
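For example, name-based routing can come down to a single table lookup. This is only a sketch in Python: the table, the function name and the /srv/... directories are hypothetical, and the hostnames are the ones from the example above, with 'joescats.com' treated as an alias.

```python
# Hypothetical mapping from hostname to the directory that site is served from.
SITES = {
    "joesdogs.com": "/srv/joesdogs",
    "joscats.com": "/srv/joscats",
    "joescats.com": "/srv/joscats",  # second name for the same website
}


def document_root(host_header: str, default=None):
    """Pick a site from the value of the Host header.

    Strips an optional ":port" suffix and ignores case. IPv6 literals are
    not handled in this sketch.
    """
    name = host_header.split(":")[0].strip().lower()
    return SITES.get(name, default)


if __name__ == "__main__":
    print(document_root("joesdogs.com"))
    print(document_root("JoesCats.com:80"))
```

Adding an alias is just one more entry in the table, and an unknown name falls through to whatever default the server chooses.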

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow