Question

I am writing a C++ program for fun (using the Poco net libraries) that will email me every few hours with updates on the TwitchPlaysPokemon stream (stupid, I know). Here is my code:

#include <iostream>   
#include "Poco/Net/SocketAddress.h"
#include "Poco/Net/StreamSocket.h"
#include "Poco/Net/SocketStream.h"
#include "Poco/StreamCopier.h"

using namespace std;
using namespace Poco::Net;
using namespace Poco;

int main(int argc, char *argv[])
{   
    string url = "www.reddit.com";

    string fullPage;
    SocketAddress sa(url, 80);
    StreamSocket socket(sa);
    SocketStream str(socket);
    str << "GET / HTTP/1.1\r\n"
     "Host: " << url << "\r\n"
     "\r\n";
    str.flush();

    StreamCopier::copyStream(str, cout);    
}

This exact code works perfectly fine. It grabs the raw html of www.reddit.com and prints it to the console. However, I'm trying to get information from one of two places for my program:

Either:

Here (url = "http://www.reddit.com/live/sw7bubeycai6hey4ciytwamw3a")

or

Here (url = "https://sites.google.com/site/twitchplayspokemonstatus/")

Either of these will be fine for my purposes. The problem is that when I plug these values in as the url in my program, the program has no idea what I'm talking about. Specifically, I get the following:

Unhandled Exception

so clearly it cannot find the host. This is where I am stuck, as I know very little about internet protocols, hosts, etc. I tried to look up an IP address for the page (using ping from the command prompt), but that failed too (it says "Ping request could not find the host www.reddit.com/live/sw7bubeycai6hey4ciytwamw3a"). The Poco library accepts hostnames (www.reddit.com), IPv4 addresses, and IPv6 addresses as the host argument to SocketAddress (this is where I pass the variable url; the other argument is the port, which I've been told should basically always be 80).

Question: I need help figuring out how I should be identifying the host to the Poco library. In other words, how do I refer to the host for either of the two sites listed above so that my code can recognize it and grab the HTML from the page?


Solution

It sounds as though you may not understand HTTP correctly. Here's a brief refresher.

To get the contents of the URL http://www.example.com/path/page.html, the corresponding HTTP request would be sent to www.example.com on port 80, and would have the contents:

GET /path/page.html HTTP/1.1\r\n
Host: www.example.com\r\n
\r\n    

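The critical step your code is missing is splitting the URL into its hostname and path components. The hostname (www.example.com) is what you connect to and what goes in the Host header; the path (/path/page.html) goes on the GET line. Passing the whole URL in a single url variable won't work unless you split it yourself at the first slash after the hostname.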
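
As a rough illustration of that split, here is a minimal sketch using only std::string. The helper names splitUrl and buildGetRequest are hypothetical (they are not part of Poco), and the code assumes a simple URL with no port number, user info, or fragment.

```cpp
#include <string>
#include <utility>

// Hypothetical helper (not part of Poco): splits a URL such as
// "http://www.example.com/path/page.html" into {host, path}.
// Assumes the URL contains no port number, user info, or fragment.
std::pair<std::string, std::string> splitUrl(const std::string& url)
{
    std::string rest = url;

    // Drop a leading "scheme://" (e.g. "http://") if present.
    const std::string::size_type schemeEnd = rest.find("://");
    if (schemeEnd != std::string::npos)
        rest = rest.substr(schemeEnd + 3);

    // Everything before the first remaining slash is the host;
    // the slash and everything after it is the path.
    const std::string::size_type slash = rest.find('/');
    if (slash == std::string::npos)
        return std::make_pair(rest, std::string("/")); // no path: request the root

    return std::make_pair(rest.substr(0, slash), rest.substr(slash));
}

// Hypothetical helper: builds the request text shown above.
std::string buildGetRequest(const std::string& host, const std::string& path)
{
    return "GET " + path + " HTTP/1.1\r\n"
           "Host: " + host + "\r\n"
           "\r\n";
}
```

In the program from the question, the host part would be passed to SocketAddress (with port 80) and the request text written to the SocketStream in place of the hard-coded "GET / ..." lines. Note that the https:// Google Sites URL would additionally need a TLS connection (normally on port 443), which a plain StreamSocket on port 80 does not provide.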

Licensed under: CC-BY-SA with attribution