Streams: what has happened after instantiation but before reading?

https://softwareengineering.stackexchange.com/questions/302907

08-12-2020

Question

I'm trying to grok streams - my world is C# but I suspect the principles are general.

I understand the general principle of reading/writing bytes from/to a store. However, what I don't understand, specifically in a read scenario, is what has happened when you've created an instance of a stream (FileStream, for instance) but have not yet invoked any read methods.

e.g.

var fileStream = new FileStream( "Test.txt", FileMode.Open );

At this point the fileStream.Length property is available to me. This leads me to wonder the following:

Has the FileStream class already read all the bytes in the file and ascertained the length of the file?
Has the file been able to report its length to the FileStream?
Is the file now already loaded into memory and the filestream is offering that up to me in a chunked fashion?
Am I reading into memory the file's bytes at the time I call the read methods?

Solution

Has the FileStream class already read all the bytes in the file and ascertained the length of the file?

Has the file been able to report its length to the FileStream?

You can check this for yourself in the reference source. It appears to call into the Win32 native API, so getting the length of the stream is essentially the same as right-clicking a file and viewing its properties: the size of the file is stored as metadata. In short: no, it does not read the entire file. Most implementations of Stream do not read everything up front, since that would defeat the purpose of streaming.
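A minimal sketch of this, assuming a scratch file in the temp directory (the file name is made up for the example): the stream reports its Length right after construction, its Position is still 0, and FileInfo reports the same size without opening a stream at all.

```csharp
using System;
using System.IO;

// Hypothetical scratch file, created only for this demonstration.
string path = Path.Combine(Path.GetTempPath(), "stream-length-demo.bin");
File.WriteAllBytes(path, new byte[4096]);

long lengthFromStream;
long positionAfterOpen;
using (var fileStream = new FileStream(path, FileMode.Open, FileAccess.Read))
{
    // No Read call has been made yet; Length is answered from file metadata.
    lengthFromStream = fileStream.Length;
    positionAfterOpen = fileStream.Position;
}

// FileInfo reports the same size without opening a stream.
long lengthFromMetadata = new FileInfo(path).Length;

Console.WriteLine($"Stream length:   {lengthFromStream}");
Console.WriteLine($"Metadata length: {lengthFromMetadata}");
Console.WriteLine($"Position:        {positionAfterOpen}");

File.Delete(path);
```

Both lengths should match the number of bytes written, and Position stays at 0 because nothing has been consumed from the stream.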

Is the file now already loaded into memory and the filestream is offering that up to me in a chunked fashion?

No. A FileStream is what it says on the tin: a file stream that "streams" data from the file on demand. If you copy the FileStream to a MemoryStream, then you have an in-memory stream.

Am I reading into memory the file's bytes at the time I call the read methods?

Yes.
To clarify: when you call ReadByte(), you read one byte into memory, starting at the stream's Position. When you call Read(), you read up to the specified number of bytes into memory (your buffer); the return value tells you how many bytes were actually read.
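As a rough illustration, assuming a small scratch file with five known bytes (the file name and contents are invented for the example), the following shows ReadByte() and Read() each advancing Position, and Read() returning fewer bytes than requested when the file runs out.

```csharp
using System;
using System.IO;

// Hypothetical scratch file with five known bytes.
string path = Path.Combine(Path.GetTempPath(), "stream-read-demo.bin");
File.WriteAllBytes(path, new byte[] { 10, 20, 30, 40, 50 });

int firstByte;
long positionAfterReadByte;
int bytesRead;
long positionAfterRead;
var buffer = new byte[16];

using (var fileStream = new FileStream(path, FileMode.Open, FileAccess.Read))
{
    // Reads a single byte at Position 0 and advances Position by one.
    firstByte = fileStream.ReadByte();
    positionAfterReadByte = fileStream.Position;

    // Asks for up to 16 bytes; only 4 remain, so Read returns 4.
    bytesRead = fileStream.Read(buffer, 0, buffer.Length);
    positionAfterRead = fileStream.Position;
}

Console.WriteLine($"First byte: {firstByte}, position: {positionAfterReadByte}");
Console.WriteLine($"Bytes read: {bytesRead}, position: {positionAfterRead}");

File.Delete(path);
```

Always use the return value of Read() rather than assuming the buffer was filled.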

One of the main points of a stream is that you do not have to hold everything in memory at once. A byte array does hold everything in memory, and MemoryStream is basically a wrapper around a byte array/buffer.
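If you do want the whole file in memory, you have to ask for it explicitly. A small sketch, again using a made-up scratch file: copying the FileStream into a MemoryStream is the point at which all of the bytes end up in a byte array.

```csharp
using System;
using System.IO;

// Hypothetical scratch file, created only for this demonstration.
string path = Path.Combine(Path.GetTempPath(), "stream-copy-demo.bin");
File.WriteAllBytes(path, new byte[] { 1, 2, 3, 4 });

byte[] inMemoryCopy;
using (var fileStream = new FileStream(path, FileMode.Open, FileAccess.Read))
using (var memoryStream = new MemoryStream())
{
    // CopyTo reads the file to its end and writes the bytes into the
    // MemoryStream's internal buffer.
    fileStream.CopyTo(memoryStream);

    // ToArray returns a copy of that buffer as a plain byte array.
    inMemoryCopy = memoryStream.ToArray();
}

Console.WriteLine($"Bytes held in memory: {inMemoryCopy.Length}");

File.Delete(path);
```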

OTHER TIPS

The answer to all these questions is: "You don't know!".

Please bear in mind that this is a good thing. The entire point of a stream class is to abstract from the tedious detail of how much to read when, and where to allocate the buffer to store stuff that you've already been given by the controller but not yet delivered to your caller. If you wanted to manage all that yourself, you might as well not use a stream and opt to work directly with open() and read() calls. Usually this is not what you want, so a typical stream class not only manages the details, but in fact hides them completely.

A FileStream's length is stored as metadata about the file. The file itself has not been read yet. Think of a file several gigabytes in size: reading it takes a long time, but returning its length is quick.

Some other types of stream do not support seeking and therefore cannot report a Length (the property throws NotSupportedException). See https://msdn.microsoft.com/en-us/library/system.io.stream.canseek.aspx .
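One way to see this, sketched with a DeflateStream wrapped around a MemoryStream as an example of a non-seekable stream (DescribeLength is a hypothetical helper, not a framework method): check CanSeek before touching Length.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper: only asks for Length when the stream is seekable.
static string DescribeLength(Stream stream)
{
    return stream.CanSeek
        ? $"{stream.Length} bytes"
        : "length unavailable (stream is not seekable)";
}

string seekableDescription;
string nonSeekableDescription;
bool lengthThrew = false;

using (var memory = new MemoryStream(new byte[128]))
using (var deflate = new DeflateStream(memory, CompressionMode.Decompress, leaveOpen: true))
{
    seekableDescription = DescribeLength(memory);
    nonSeekableDescription = DescribeLength(deflate);

    // Asking a non-seekable stream for its Length throws.
    try { _ = deflate.Length; }
    catch (NotSupportedException) { lengthThrew = true; }
}

Console.WriteLine(seekableDescription);
Console.WriteLine(nonSeekableDescription);
Console.WriteLine($"Length threw NotSupportedException: {lengthThrew}");
```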

Only when you Read from the file are the bytes actually read from the media into memory. (Sometimes the OS reads ahead data that you have not requested yet, but that is an OS implementation detail.)
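A typical read loop makes this concrete. The sketch below assumes a 10,000-byte scratch file and a 4 KB buffer (both arbitrary choices for the example); your code never holds more than one buffer's worth of the file at a time, however large the file is.

```csharp
using System;
using System.IO;

// Hypothetical scratch file, larger than the buffer used below.
string path = Path.Combine(Path.GetTempPath(), "stream-chunk-demo.bin");
File.WriteAllBytes(path, new byte[10_000]);

var buffer = new byte[4096];   // the only file data your code holds at once
long totalBytes = 0;
int chunkCount = 0;

using (var fileStream = new FileStream(path, FileMode.Open, FileAccess.Read))
{
    int bytesRead;
    // Read returns 0 once the end of the file has been reached.
    while ((bytesRead = fileStream.Read(buffer, 0, buffer.Length)) > 0)
    {
        totalBytes += bytesRead;
        chunkCount++;
    }
}

Console.WriteLine($"Read {totalBytes} bytes in {chunkCount} chunks");

File.Delete(path);
```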

Licensed under: CC-BY-SA with attribution
