GetResponseStream() or ReadBytes(): which is actually responsible for downloading the data, and how?

Question 1

GetResponseStream opens and returns a Stream object. The stream object is sourced from the underlying Socket. This Socket is sent data by the network adapter asynchronously. The data just arrives and is buffered. GetResponseStream will block execution until the first data arrives.

ReadByte pulls the data up from the socket layer into C#. This method will block execution until there is a byte available.

Closing the stream prematurely will end the asynchronous transfer (it closes the Socket, and the sender is notified because their connection fails) and discard (flush) any buffered data that you have not used yet.

Question 2

var webRequest = HttpWebRequest.Create("url of a big file approx 700MB") as HttpWebRequest;

Okay, we're set up ready to go. It's a bit different if you PUT or POST a stream of your own, but the differences are analogous.

var webResponse = webRequest.GetResponse();

When GetResponse() returns, it will at the very least have read all of the HTTP headers. It may well have read the headers of a redirect and made another request to the URI it was redirected to. It's also possible that it's actually hitting a cache (either directly or because the web server sent 304 Not Modified), but by default the details of that are hidden from you.

There will likely be some more bytes in the socket's buffer.

using (BinaryReader ns = new BinaryReader(webResponse.GetResponseStream()))
{

At this point, we've got a stream representing the network stream.

Let's remove the Thread.Sleep(). It does nothing except add a risk of the connection timing out. Even assuming it doesn't time out while waiting, the connection will have "backed off" from sending bytes since you weren't reading them, so the effect will be to slow things down even more than the deliberate delay you added.

var buffer = ns.ReadBytes(bufferToRead);

At this point, either bufferToRead bytes have been read to create a byte[] or else fewer than bufferToRead because the total size of the stream was less than that, in which case buffer contains the entire stream. This will take as long as it takes.

At this point, because a successful HTTP GET was performed, the underlying web-access layer may cache the response (probably not if it's very large - the default assumption is that very large requests don't get repeated a lot and don't benefit from caching).

Error conditions will raise exceptions if they occur, and in that case no caching will ever be done (there is no point caching a buggy response).

There is no need to sleep, or otherwise "wait" on it.
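Putting the reader-based steps above together, a minimal sketch might look like the following. The names ChunkedReaderSketch, ReadInChunks and handleChunk are hypothetical and only for illustration; the chunkSize parameter plays the role of bufferToRead, and any Stream can stand in for the response stream.

```csharp
using System;
using System.IO;

public static class ChunkedReaderSketch
{
    // Hypothetical helper (not part of .NET): reads 'source' to the end
    // in chunks of at most 'chunkSize' bytes and returns the total read.
    public static long ReadInChunks(Stream source, int chunkSize, Action<byte[]> handleChunk)
    {
        long total = 0;
        // Disposing the reader also closes the underlying stream.
        using (var reader = new BinaryReader(source))
        {
            byte[] chunk;
            // ReadBytes blocks until it has chunkSize bytes or the stream ends,
            // so only the last chunk can be shorter. An empty array means
            // the end of the stream has been reached.
            while ((chunk = reader.ReadBytes(chunkSize)).Length > 0)
            {
                handleChunk(chunk);
                total += chunk.Length;
            }
        }
        return total;
    }
}
```

With the question's variables this would be called as ChunkedReaderSketch.ReadInChunks(webResponse.GetResponseStream(), bufferToRead, chunk => { /* use chunk */ }). Note that there is no sleep anywhere: each ReadBytes call does its own waiting.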

It's worth considering the following variant that works at just a slightly lower level by manipulating the stream directly rather than through a reader:

using(var stm = webResponse.GetResponseStream())
{

We're going to work on the stream directly.

byte[] buffer = new byte[4096];
int read; // declared outside the loop so the while condition can see it
do
{
    read = stm.Read(buffer, 0, buffer.Length);

This will return up to 4096 bytes. It may return fewer, because it already has a chunk of bytes available and hands those over immediately. It will only return 0 bytes if it is at the end of the stream. This gives us a balance between waiting and not waiting: the call promises to wait long enough to get at least one byte, but the stream decides whether it is more efficient to wait for all 4096 bytes or to return fewer.

    DoSomething(buffer, 0, read);

We work with the bytes we got.

} while(read != 0);

Read() gives us zero bytes only if it's at the end of the stream.

And again, when the stream is disposed, the response may or may not be cached.

As you can see, even at the lowest level that .NET gives us access to when using HttpWebResponse, there's no need to add code to wait on anything, as that is always done for us.
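For reference, here is the stream-level loop as one self-contained sketch. StreamReadSketch and ReadAll are hypothetical names, and the doSomething callback stands in for the DoSomething call above. It uses a while loop rather than do/while, which is my own variation so the callback is never invoked with zero bytes at the end.

```csharp
using System;
using System.IO;

public static class StreamReadSketch
{
    // Hypothetical helper (not part of .NET): reads 'source' to the end and
    // passes each chunk to 'doSomething' as (buffer, offset, count).
    public static long ReadAll(Stream source, Action<byte[], int, int> doSomething, int bufferSize = 4096)
    {
        var buffer = new byte[bufferSize];
        long total = 0;
        int read;
        // Read blocks until at least one byte is available and
        // returns 0 only at the end of the stream.
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Only the first 'read' bytes of the buffer are valid here.
            doSomething(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

The buffer is reused on every pass, so the callback must consume or copy the bytes before returning.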

You can use asynchronous access to the stream to avoid waiting, but then the asynchronous mechanism still means you get the result when it's available.
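A rough sketch of the asynchronous version, assuming .NET 4.5 or later where Stream.ReadAsync and Stream.WriteAsync are available. AsyncReadSketch and CopyInChunksAsync are hypothetical names. The shape of the loop is the same; the only difference is that the calling thread is released while the read is pending.

```csharp
using System.IO;
using System.Threading.Tasks;

public static class AsyncReadSketch
{
    // Hypothetical helper (not part of .NET): copies 'source' to 'destination'
    // in chunks without blocking the calling thread while data is in flight.
    public static async Task<long> CopyInChunksAsync(Stream source, Stream destination, int bufferSize = 4096)
    {
        var buffer = new byte[bufferSize];
        long total = 0;
        int read;
        // ReadAsync completes once at least one byte is available,
        // and yields 0 only at the end of the stream.
        while ((read = await source.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            await destination.WriteAsync(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

There is still no explicit waiting: the await resumes when the data is there, which is the "you get the result when it's available" part.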

Question 3

To answer your question about when streaming starts, GetResponseStream() will start receiving data from the server. However, at some point the network buffers will become full, and the server will stop sending data if you don't read it off the buffers. For a detailed description of the TCP buffers, etc., see here.

So your sleep of 60000 will not help you much, as the network buffers along the way will fill up and data will stop arriving until you read it off. It is better to read it off and write it in chunks as you go.
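As a sketch of "read it off and write it in chunks as you go", something like the following would stream the response body straight to a file. DownloadSketch and DownloadToFile are hypothetical names, and the 64 KB buffer size is an arbitrary choice of mine, not a recommendation. It takes a WebRequest so that it works with the HttpWebRequest from the question.

```csharp
using System.IO;
using System.Net;

public static class DownloadSketch
{
    // Hypothetical helper (not part of .NET): sends the request and writes
    // the response body to 'destinationPath' chunk by chunk.
    public static long DownloadToFile(WebRequest request, string destinationPath)
    {
        using (WebResponse response = request.GetResponse())
        using (Stream body = response.GetResponseStream())
        using (FileStream file = File.Create(destinationPath))
        {
            var buffer = new byte[64 * 1024]; // arbitrary size, an assumption
            long total = 0;
            int read;
            // Reading continuously keeps the network buffers draining,
            // so the server is not forced to pause sending.
            while ((read = body.Read(buffer, 0, buffer.Length)) > 0)
            {
                file.Write(buffer, 0, read);
                total += read;
            }
            return total;
        }
    }
}
```

Only one buffer's worth of the 700MB file is held in memory at any time, and nothing sleeps.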

More info on the workings of ResponseStream here. If you are wondering about what buffer size to use, see here.