Question

What is the right way to read chunked data (from an HTTP request) from a socket?

sf::TcpSocket socket;
socket.connect("0.0.0.0", 80);

std::string message = "GET /address HTTP/1.1\r\n";
socket.send(message.c_str(), message.size() + 1);

// Receive an answer from the server
char buffer[128];
std::size_t received = 0;
socket.receive(buffer, sizeof(buffer), received);
std::cout << "The server said: " << buffer << std::endl;

But the server sends an endless stream of data and socket.receive() does not return control. Is there a proper way to read chunked data piece by piece? (The response is chunked.)

Solution

The right way to process HTTP requests is to use a higher-level library that manages the socket connections for you. In C++ one example would be pion-net; there are others too like Mongoose (which is C, but fine to use in C++).

OTHER TIPS

Well, infinite data is theoretically possible, although the practical implementation differs from process to process.

  • Approach 1 - Many protocols send the size in the first few bytes (e.g. 4 bytes), so you can read those in a while loop:

{

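std::size_t i = 0, received = 0;
unsigned char buffer[4];
// receive() may deliver fewer bytes than requested, so loop until all 4 arrive
while (i < 4 && socket.receive(buffer + i, 4 - i, received) == sf::Socket::Done)
    i += received;

// then loop again to read the amount of data announced by the prefix,
// allocating the buffer accordingly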

}

  • Approach 2 - Or, as in your case, where you don't know the length (infinite):

{

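char *buffer = (char *)malloc(TCP_MAX_BUF_SIZE);
std::size_t total = 0, received = 0;
// append each read after the bytes already stored, until the buffer is full
// or the socket reports an error or disconnection
while (total < TCP_MAX_BUF_SIZE &&
       socket.receive(buffer + total, TCP_MAX_BUF_SIZE - total, received) == sf::Socket::Done) {
    total += received;
}

// do something with your data, then free(buffer)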

}

You will have to break at some point and process your data: dispatch it to another thread or release the memory.
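
As an illustration of Approach 1, here is a minimal sketch of length-prefixed framing on bytes that have already been received. It assumes a 4-byte big-endian prefix, which is only an assumption (the actual protocol defines the width and byte order), and decodeLength and splitFrames are hypothetical helper names:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Decode a 4-byte big-endian length prefix (byte order is an assumption;
// the real protocol defines it).
std::uint32_t decodeLength(const unsigned char prefix[4])
{
    return (std::uint32_t(prefix[0]) << 24) |
           (std::uint32_t(prefix[1]) << 16) |
           (std::uint32_t(prefix[2]) << 8)  |
            std::uint32_t(prefix[3]);
}

// Split a byte stream into messages, each preceded by its length.
// Stops at the first incomplete message and leaves it for later.
std::vector<std::string> splitFrames(const std::string &stream)
{
    std::vector<std::string> frames;
    std::size_t pos = 0;

    while (stream.size() - pos >= 4)
    {
        const unsigned char *p =
            reinterpret_cast<const unsigned char *>(stream.data() + pos);
        std::uint32_t len = decodeLength(p);

        if (stream.size() - pos - 4 < len)
            break; // payload not fully received yet

        frames.push_back(stream.substr(pos + 4, len));
        pos += 4 + len;
    }

    return frames;
}
```

With a socket, the bytes passed to splitFrames would be whatever has accumulated from repeated receive() calls; anything after the last complete frame stays in the buffer until more data arrives.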

If by "chunked data" you are referring to the Transfer-Encoding: chunked HTTP header, then you need to read each chunk and parse the chunk headers to know how much data to read in each chunk and to know when the last chunk has been received. You cannot just blindly call socket.receive(), as chunked data has a defined structure to it. Read RFC 2616 Section 3.6.1 for more details.
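
To make the chunk structure concrete: a body sent as 4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n carries two chunks, "Wiki" and "pedia", followed by the zero-size last chunk. The sketch below decodes such a body once it is fully in memory. It uses only the standard library, decodeChunked is a hypothetical helper name (not part of SFML), and trailer headers are ignored:

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Decode a complete chunked body held in memory.
// Returns false if the data is truncated or malformed.
bool decodeChunked(const std::string &raw, std::string &body)
{
    std::size_t pos = 0;
    body.clear();

    for (;;)
    {
        // Chunk header: hex size, optional ";extensions", then CRLF.
        std::size_t eol = raw.find("\r\n", pos);
        if (eol == std::string::npos)
            return false;

        std::string sizeLine = raw.substr(pos, eol - pos);
        sizeLine = sizeLine.substr(0, sizeLine.find(';')); // drop extensions

        char *end = 0;
        unsigned long size = std::strtoul(sizeLine.c_str(), &end, 16);
        if (end == sizeLine.c_str())
            return false; // no hex digits found

        pos = eol + 2;

        if (size == 0)
            return true; // last chunk; trailer headers (if any) follow

        // The chunk data must be present and followed by its own CRLF.
        std::size_t remaining = raw.size() - pos;
        if (size > remaining || remaining - size < 2)
            return false;
        if (raw.compare(pos + size, 2, "\r\n") != 0)
            return false;

        body.append(raw, pos, size);
        pos += size + 2;
    }
}
```

The sketch works on bytes that are already in memory. On a live connection the same steps have to be driven by reads from the socket.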

You need to do something more like the following (error handling omitted for brevity - DON'T omit it in your real code):

std::string ReadALine(sf::TcpSocket &socket)
{
    std::string result;

    // read from socket until a LF is encountered, then
    // return everything up to, but not including, the
    // LF, stripping off CR if one is also present...

    return result;
}

void ReadHeaders(sf::TcpSocket &socket, std::vector<std::string> &headers)
{
    std::string line;

    do
    {
        line = ReadALine(socket);
        if (line.empty()) return;
        headers.push_back(line);
    }
    while (true);
}

std::string UpperCase(const std::string &s)
{
    std::string result = s;
    // std::for_each would discard toupper's result; std::transform writes it back
    std::transform(result.begin(), result.end(), result.begin(), ::toupper);
    return result;
}

std::string GetHeader(const std::vector<std::string> &headers, const std::string &s)
{
    std::string prefix = UpperCase(s) + ":";

    for (std::vector<std::string>::const_iterator iter = headers.begin(), end = headers.end(); iter != end; ++iter)
    {
        if (UpperCase(*iter).compare(0, prefix.length(), prefix) == 0)
            return iter->substr(prefix.length());
    }

    return std::string();
}

sf::TcpSocket socket; 
socket.connect("0.0.0.0", 80); 

std::string message = "GET /address HTTP/1.1\r\nHost: localhost\r\n\r\n"; 
socket.send(message.c_str(), message.length()); 

std::vector<std::string> headers;

std::string statusLine = ReadALine(socket);
ReadHeaders(socket, headers);

// Refer to RFC 2616 Section 4.4 for details about how to properly
// read a response body in different situations...

int statusCode = 0;
sscanf(statusLine.c_str(), "HTTP/%*d.%*d %d %*s", &statusCode);

if (
    ((statusCode / 100) != 1) &&
    (statusCode != 204) &&
    (statusCode != 304)
    )
{
    std::string header = GetHeader(headers, "Transfer-Encoding");

    if (UpperCase(header).find("CHUNKED") != std::string::npos)
    {
        std::string line, extensions;
        std::string::size_type pos;
        std::size_t chunkSize;

        do
        {
            line = ReadALine(socket);
            pos = line.find(";");
            if (pos != std::string::npos)
            {
                extensions = line.substr(pos+1);
                line.resize(pos);
            }
            else
                extensions.clear();

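            chunkSize = std::strtoul(line.c_str(), NULL, 16); // size is in hex
            if (chunkSize == 0)
                break;

            // receive() may return fewer bytes than asked for, so loop
            std::vector<char> someBuffer(chunkSize);
            std::size_t total = 0, received = 0;
            while (total < chunkSize &&
                   socket.receive(&someBuffer[total], chunkSize - total, received) == sf::Socket::Done)
                total += received;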
            ReadALine(socket);

            // process extensions as needed...
            // copy someBuffer into your real buffer...
        }
        while (true);

        std::vector<std::string> trailer;
        ReadHeaders(socket, trailer);

        // merge trailer into main header...
    }
    else
    {
        header = GetHeader(headers, "Content-Length");

        if (!header.empty())
        {
            uint64_t contentLength = 0;
            contentLength = std::strtoull(header.c_str(), NULL, 10);

            // read from socket until contentLength number of bytes have been read...
        }
        else
        {
            // read from socket until disconnected...
        }
    }
}
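
ReadALine() is left as a stub above. A minimal sketch of the logic its comments describe, written against a hypothetical byte-source callback rather than sf::TcpSocket so that it runs without a network (readLine and StringSource are illustrative names, not SFML API):

```cpp
#include <cstddef>
#include <string>

// Reads one line from a byte source. `nextByte` is a hypothetical
// callable returning the next byte (0-255), or -1 when the source is
// exhausted; with SFML it would wrap a one-byte socket.receive() call.
template <typename NextByte>
std::string readLine(NextByte &nextByte)
{
    std::string result;

    for (;;)
    {
        int c = nextByte();
        if (c < 0 || c == '\n')
            break; // end of input or end of line
        result.push_back(static_cast<char>(c));
    }

    // Strip the CR of a CRLF line ending, if present.
    if (!result.empty() && result[result.size() - 1] == '\r')
        result.erase(result.size() - 1);

    return result;
}

// In-memory byte source used to exercise readLine() without a socket.
struct StringSource
{
    std::string data;
    std::size_t pos;

    int operator()()
    {
        return pos < data.size()
            ? static_cast<unsigned char>(data[pos++])
            : -1;
    }
};
```

Reading one byte per call keeps the sketch short; a real implementation would more likely receive into a buffer and scan it for the LF.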
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow