What's the easiest way to grab a web page in C?
Question
I'm working on an old-school Unix-like system (QNX, to be exact) and need a way to grab a web page (no cookies or login; the target URL is just a text file) using nothing but sockets and arrays.
Anyone got a snippet for this?
note: I don't control the server, and I've got very little to work with besides what's already on the box (adding additional libraries isn't really "easy" given the constraints -- although I do love libcurl)
Solution
I do have some code, but it also supports (Open)SSL so it's a bit long to post here.
In essence:
parse the URL (split out the URL scheme, host name, port number, and scheme-specific part)
create the socket:
s = socket(PF_INET, SOCK_STREAM, 0); /* 0 = default protocol for the type, i.e. TCP */
populate a sockaddr_in structure with the remote IP and port
connect the socket to the far end:
err = connect(s, (struct sockaddr *)&addr, sizeof(addr));
make the request string:
n = snprintf(headers, sizeof(headers), "GET /%s HTTP/1.0\r\nHost: %s\r\n\r\n", ...);
send the request string:
write(s, headers, n);
read the data:
while ((n = read(s, buffer, bufsize)) > 0) { ... }
close the socket:
close(s);
nb: pseudo-code above would collect both response headers and data. The split between the two is the first blank line.
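Stitched together, the steps above look roughly like this -- a minimal HTTP/1.0 fetch, assuming the host name and path have already been parsed out of the URL, and assuming the box has getaddrinfo (very old systems may only offer gethostbyname). Error handling is deliberately terse; this is a sketch, not production code:

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/socket.h>

/* Fetch http://host/path and write the raw response (headers + body)
   to stdout. Returns 0 on success, -1 on any failure. */
static int http_get(const char *host, const char *path)
{
    struct addrinfo hints, *res;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, "80", &hints, &res) != 0)
        return -1;                       /* name lookup failed */

    int s = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (s < 0 || connect(s, res->ai_addr, res->ai_addrlen) < 0) {
        freeaddrinfo(res);
        if (s >= 0) close(s);
        return -1;
    }
    freeaddrinfo(res);

    /* Build and send the request; HTTP/1.0 keeps things simple
       (no chunked encoding, server closes the connection when done). */
    char headers[512];
    int n = snprintf(headers, sizeof(headers),
                     "GET /%s HTTP/1.0\r\nHost: %s\r\n\r\n", path, host);
    if (write(s, headers, n) != n) { close(s); return -1; }

    /* Read until the server closes the connection. */
    char buffer[4096];
    ssize_t got;
    while ((got = read(s, buffer, sizeof(buffer))) > 0)
        fwrite(buffer, 1, (size_t)got, stdout);

    close(s);
    return 0;
}

int main(void)
{
    /* example.com is just a placeholder target */
    return http_get("example.com", "") == 0 ? 0 : 1;
}
```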
OTHER TIPS
I'd look at libcurl if you want SSL support or anything fancy.
However, if you just want to get a simple web page from port 80, then open a TCP socket, send "GET /index.html HTTP/1.0\r\n\r\n", and parse the output.