Question

I've had a new found interest in building a small, efficient web server in C and have had some trouble parsing POST methods from the HTTP Header. Would anyone have any advice as to how to handle retrieving the name/value pairs from the "posted" data?

POST /test HTTP/1.1
Host: test-domain.com:7017
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://test-domain.com:7017/index.html
Cookie: __utma=43166241.217413299.1220726314.1221171690.1221200181.16; __utmz=43166241.1220726314.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none)
Cache-Control: max-age=0
Content-Type: application/x-www-form-urlencoded
Content-Length: 25

field1=asfd&field2=a3f3f3
// ^-this

I see no tangible way to retrieve the bottom line as a whole and ensure that it works every time. I'm not a fan of hard-coding in anything.

Was it helpful?

Solution

You can retrieve the name/value pairs by searching for newline newline or more specifically \r\n\r\n (after this, the body of the message will start).

Then you can simply split the list by the &, and then split each of those returned strings between the = for name/value pairs.

See the HTTP 1.1 RFC.

OTHER TIPS

Once you have Content-Length in the header, you know the amount of bytes to be read right after the blank line. If, for any reason (GET or POST) Content-Length is not in the header, it means there's nothing to read after the blank line (crlf).

You need to keep parsing the stream as headers until you see the blank line. The rest is the POST data.

You need to write a little parser for the post data. You can use C library routines to do something quick and dirty, like index, strtok, and sscanf. If you have room for it in your definition of "small", you could do something more elaborate with a regular expression library, or even with flex and bison.

At least, I think this kind of answers your question.

IETF RFC notwithstanding, here is a more to the point answer. Assuming that you realize that there is always an extra /r/n after the Content-Length line in the header, you should be able to do the work to isolate it into a char* variable named data. This is where we start.

char *data = "f1=asfd&f2=a3f3f3";
char f1[100], 
char f2[100];
sscanf(data, "%s&%s", &f1, &f2); // get the field tuples

char f1_name[50];
char f1_data[50];
sscanf(f1, "%s=%s", f1_name, f1_data);  

char f2_name[50];
char f2_data[50];
sscanf(f2, "%s=%s", f2_name, f2_data);  
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top