Parsing data from a network stream?

Question 1

HttpListener and other such components can parse the messages because the format is deterministic. The request format is well documented in the HTTP spec. The request header is a series of CRLF-terminated lines, followed by a blank line (two CRLFs in a row).
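To make the header layout concrete, here is a minimal sketch in Python (chosen only for brevity; the same logic applies in .NET). The function name parse_request_head and the sample request are made up for illustration, and the code ignores details such as repeated header names, so treat it as a sketch rather than a conforming parser.

```python
def parse_request_head(data: bytes):
    """Split a raw HTTP request into (method, target, version, headers).

    Hypothetical helper: assumes the whole header is already in `data`.
    """
    # The header ends at the first blank line, i.e. CRLF CRLF.
    head, _, _ = data.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # First line is the request line: METHOD SP TARGET SP VERSION
    method, target, version = lines[0].split(" ", 2)

    # Remaining lines are "Name: value" pairs.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers


raw = (b"GET /index.html HTTP/1.1\r\n"
       b"Host: example.com\r\n"
       b"Content-Length: 0\r\n"
       b"\r\n")
method, target, version, headers = parse_request_head(raw)
```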

The message body can be difficult to parse, but it's deterministic in that the header tells you how long the body is (Content-Length) or how it is framed (Transfer-Encoding), whether it's compressed (Content-Encoding), etc. Even multi-part messages are not terribly difficult to parse.

Yes, you do need a state machine to parse HTTP messages. And yes, you have to parse it byte-by-byte. It's somewhat involved, but it's very fast. Typically you read a bunch of data from the stream into a buffer and then process that buffer byte-by-byte. You don't read the stream one byte at a time, because the per-call overhead will kill performance.
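As an illustration of that approach, here is a small sketch of a state machine that is fed whole buffers but examines them byte-by-byte, looking for the CRLF CRLF that ends the header. It is written in Python for brevity; the class name HeaderEndScanner is hypothetical, and this is not how HttpListener itself is implemented. The point is that the state survives between buffers, so it does not matter where the network happens to split the data.

```python
class HeaderEndScanner:
    """Tiny state machine that finds the CRLF CRLF ending an HTTP header."""

    def __init__(self):
        self.state = 0            # number of bytes of CR LF CR LF matched so far
        self.head = bytearray()   # header bytes collected so far
        self.done = False

    def feed(self, chunk: bytes) -> bytes:
        """Consume one buffer; return any bytes that follow the header."""
        if self.done:
            return chunk
        for i, b in enumerate(chunk):
            self.head.append(b)
            if b == 0x0D:                              # CR
                self.state = 3 if self.state == 2 else 1
            elif b == 0x0A and self.state in (1, 3):   # LF right after CR
                self.state += 1
            else:
                self.state = 0
            if self.state == 4:                        # saw CR LF CR LF
                self.done = True
                return chunk[i + 1:]
        return b""


scanner = HeaderEndScanner()
# The terminator is deliberately split across three reads.
chunks = [b"GET / HTTP/1.1\r\nHost: a\r", b"\n\r", b"\nBODY"]
leftovers = [scanner.feed(c) for c in chunks]
```

After the last call, scanner.done is true, scanner.head holds the complete header, and the final leftover is the start of the body.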

You should take a look at the HttpListener source code to see how it all works. Go to http://referencesource.microsoft.com/netframework.aspx and download the .NET 4.5 Update 1 source.

Be prepared to spend a lot of time digging through that and through the HTTP spec.

By the way, it's not difficult to create a program that handles a small subset of HTTP requests. But I wonder why you'd want to do that when you can just use HttpListener and have all the details handled for you.

Update

You are talking about two different protocols. HTTP and WebSocket are two entirely different things. As the Wikipedia article says:

The WebSocket Protocol is an independent TCP-based protocol. Its only relationship to HTTP is that its handshake is interpreted by HTTP servers as an Upgrade request.

With HTTP, in the simplest case you know that the server will send the response and then close the connection; it's a stream of bytes with a defined end. (With persistent connections, the Content-Length header or chunked encoding marks the end instead.) WebSocket is a message-based protocol; it enables a stream of messages. Those messages have to be delineated in some way; the sender has to tell the receiver where the end of the message is. That can be implicit or explicit. There are several different ways this is done:

The sender includes the length of the message in the first few bytes of the message. For example, the first four bytes are a binary integer that says how many bytes follow in that message. So the receiver reads the first four bytes, converts them to an integer, and then reads that many bytes.
The length of the message is implicit. For example, sender and receiver agree that all messages are 80 bytes long.
The first byte of the message is a message type, and each message type has a defined length. For example, message type 1 is 40 bytes, message type 2 is 27 bytes, etc.
Messages have some terminator. In a line-oriented message system, for example, messages are terminated by CRLF. The sender sends the text and then CRLF. The receiver reads bytes until it receives CRLF.
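The first scheme in the list (a four-byte length prefix) can be sketched as follows. This is Python for brevity, and the names LengthPrefixedDecoder and frame are hypothetical. It assumes sender and receiver have agreed on an unsigned big-endian length. The decoder buffers whatever arrives and only hands back complete messages, because a single read from a TCP stream may contain part of a message or several messages at once.

```python
import struct


def frame(payload: bytes) -> bytes:
    """Prefix the payload with its length as a 4-byte big-endian integer."""
    return struct.pack(">I", len(payload)) + payload


class LengthPrefixedDecoder:
    """Reassembles length-prefixed messages from arbitrary chunks of bytes."""

    def __init__(self):
        self.buffer = bytearray()

    def feed(self, data: bytes) -> list:
        """Add received bytes; return the list of messages now complete."""
        self.buffer += data
        messages = []
        while len(self.buffer) >= 4:
            (length,) = struct.unpack_from(">I", self.buffer, 0)
            if len(self.buffer) < 4 + length:
                break                      # body not fully received yet
            messages.append(bytes(self.buffer[4:4 + length]))
            del self.buffer[:4 + length]   # drop the consumed message
        return messages


decoder = LengthPrefixedDecoder()
wire = frame(b"hello") + frame(b"") + frame(b"world!")
first = decoder.feed(wire[:3])      # not even a full length prefix yet
second = decoder.feed(wire[3:11])   # completes "hello", plus part of the next prefix
third = decoder.feed(wire[11:])     # completes the empty message and "world!"
```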

Whatever the case, sender and receiver must agree on how messages are structured. Otherwise the case that you're worried about does crop up: the receiver is left waiting for bytes that will never be received.

To handle possible communications problems, you set the ReceiveTimeout property on the socket, so that a blocking receive will throw SocketException (wrapped in an IOException if you read through a NetworkStream) when no data arrives within the timeout. The timeout applies to each receive call, not to the message as a whole. That way, your program won't be left waiting indefinitely for data that is not forthcoming. But this should only happen in the case of communications problems. Any reasonable message format will include a way to determine the length of a message; either you know how much data is coming, or you know when you've reached the end of a message.
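The same idea can be sketched in Python, where settimeout plays roughly the role that ReceiveTimeout plays in .NET. This is an analogue, not the .NET API. The helper name recv_with_timeout is hypothetical, and the local socket pair exists only to make the sketch self-contained.

```python
import socket


def recv_with_timeout(sock, nbytes, seconds):
    """Return received bytes, or None if nothing arrives within `seconds`.

    An empty bytes object still means the peer closed the connection.
    """
    sock.settimeout(seconds)
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        return None


if __name__ == "__main__":
    a, b = socket.socketpair()           # two connected local sockets
    nothing = recv_with_timeout(a, 16, 0.05)   # peer is silent, so this times out
    b.sendall(b"ping")
    data = recv_with_timeout(a, 16, 0.05)      # data is waiting, returns at once
    a.close()
    b.close()
```

What to do after a timeout is a design choice: with a well-defined message format it usually means the connection is broken and should be dropped.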

Question 2

If you want to send a message, you can just prepend the size of the message to it. Get the number of bytes in the message and prepend it as a ulong (8 bytes, in a byte order both sides agree on). At the receiver, read the 8 bytes of the ulong, convert them to a number, then read that many bytes from the stream and then close it.
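Here is a sketch of both ends of that exchange, in Python for brevity. The function names are hypothetical, and big-endian byte order is an assumption; any order works as long as both sides use the same one. Note the loop in recv_exact: a single receive call may return fewer bytes than requested, so the receiver keeps reading until it has the full count.

```python
import socket
import struct


def send_message(sock, payload: bytes) -> None:
    """Send an 8-byte unsigned length (big-endian), then the payload."""
    sock.sendall(struct.pack(">Q", len(payload)) + payload)


def recv_exact(sock, count: int) -> bytes:
    """Read exactly `count` bytes, looping over short reads."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("connection closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_message(sock) -> bytes:
    """Read the 8-byte length prefix, then that many payload bytes."""
    (length,) = struct.unpack(">Q", recv_exact(sock, 8))
    return recv_exact(sock, length)


if __name__ == "__main__":
    sender, receiver = socket.socketpair()   # stand-in for a real connection
    send_message(sender, b"hello, stream")
    received = recv_message(receiver)
    sender.close()
    receiver.close()
```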

HTTP uses the same idea. Among the HTTP headers you will find Content-Length: the length of the request body in octets (8-bit bytes).