Question

Much like the search feature in the HxD hex editor, I am implementing a program that searches for a specific hex value (say, 32 bits wide) in big binary files (> 1 GB). Memory is limited, and reading chunk by chunk with the BinaryReader class seems quite slow. HxD returns the search result (reaching almost the end of the file) in about 12 seconds, which is acceptable.


Solution

BinaryReader should be able to read a gigabyte in 12 seconds, provided your disk subsystem can handle it (which it apparently can, since HxD is doing it). The key is opening the file with a larger input buffer. That is, rather than:

var f = File.OpenRead(filename);

call:

var f = new FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.None, 65536);

That will cause .NET to read the file in 64 KB chunks rather than the default 4 KB chunks.

That said, why you're using BinaryReader at all is something of a mystery. Why not read the stream directly? For example:

var buff = new byte[1024 * 1024]; // 1 MB working buffer
int bytesRead = f.Read(buff, 0, buff.Length); // may return fewer bytes near EOF

With a 64 KB file buffer, .NET has to make only 16 calls to the OS to fill that 1 MB request. With the default 4 KB buffer, it would have to make 256 calls. The difference is remarkable.
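One wrinkle the chunked approach introduces: a 4-byte pattern can straddle a chunk boundary, so the search has to carry the tail of each chunk over into the next read. Here is a minimal sketch of such a loop, assuming a hypothetical file name and pattern; note that the byte order of the pattern must match how the value is stored in the file:

using System;
using System.IO;

class PatternSearch
{
    // Returns the file offset of the first occurrence of pattern, or -1.
    static long FindPattern(string path, byte[] pattern)
    {
        using (var f = new FileStream(path, FileMode.Open, FileAccess.Read,
                                      FileShare.Read, 65536))
        {
            var buff = new byte[1024 * 1024];
            int overlap = pattern.Length - 1; // bytes carried between chunks
            long chunkStart = 0;              // file offset of buff[0]
            int carried = 0;                  // carried-over bytes at the front

            int bytesRead;
            while ((bytesRead = f.Read(buff, carried, buff.Length - carried)) > 0)
            {
                int valid = carried + bytesRead;
                // Naive byte-by-byte scan; I/O dominates at these sizes anyway.
                for (int i = 0; i <= valid - pattern.Length; i++)
                {
                    int j = 0;
                    while (j < pattern.Length && buff[i + j] == pattern[j]) j++;
                    if (j == pattern.Length) return chunkStart + i;
                }
                // Move the last few bytes to the front so a match that
                // straddles the boundary isn't missed on the next pass.
                carried = Math.Min(overlap, valid);
                Array.Copy(buff, valid - carried, buff, 0, carried);
                chunkStart += valid - carried;
            }
        }
        return -1;
    }

    static void Main()
    {
        // 0xDEADBEEF stored big-endian; reverse the bytes for little-endian data.
        long pos = FindPattern("big.bin", new byte[] { 0xDE, 0xAD, 0xBE, 0xEF });
        Console.WriteLine(pos >= 0 ? "Found at offset " + pos : "Not found");
    }
}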

A buffer size argument larger than 64 KB doesn't give you much in the way of performance improvement, and a buffer larger than 256 KB actually caused the system to read slower in my tests. 64 KB seems to be the "sweet spot," at least on the systems I tested with.
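If you want to see where the sweet spot falls on your own hardware, a quick timing loop like the following (file name is a placeholder) is enough. Keep in mind that the OS file cache will make second and later runs unrealistically fast, so test against a file larger than RAM or flush the cache between runs:

using System;
using System.Diagnostics;
using System.IO;

class BufferBench
{
    static void Main()
    {
        foreach (int bufSize in new[] { 4096, 65536, 262144, 1048576 })
        {
            var sw = Stopwatch.StartNew();
            using (var f = new FileStream("big.bin", FileMode.Open, FileAccess.Read,
                                          FileShare.Read, bufSize))
            {
                var buff = new byte[1024 * 1024];
                while (f.Read(buff, 0, buff.Length) > 0) { } // just drain the file
            }
            Console.WriteLine(bufSize + " byte buffer: " + sw.ElapsedMilliseconds + " ms");
        }
    }
}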

If you decide to use BinaryReader for some reason, you should expect a similar performance increase with the larger buffer.
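For completeness, that combination looks like this (same placeholder file name as above); the BinaryReader simply reads through whatever buffering the underlying stream was opened with:

using (var f = new FileStream("big.bin", FileMode.Open, FileAccess.Read,
                              FileShare.Read, 65536))
using (var reader = new BinaryReader(f))
{
    // ReadBytes pulls from the 64 KB FileStream buffer under the hood.
    var chunk = reader.ReadBytes(1024 * 1024);
}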
