Is there a faster way (than this) to calculate the hash of a file (using hashlib) in Python?

Question

With an 874 MiB file of random data, which the openssl md5 tool hashed in 2 seconds, I was able to improve the speed as follows.

Using your first method required 21 seconds.
Reading the entire file into a buffer took 21 seconds; updating the hash from that buffer then required 2 seconds.
Using the following function with a buffer size of 8096 required 17 seconds.
Using the following function with a buffer size of 32767 required 11 seconds.
Using the following function with a buffer size of 65536 required 8 seconds.
Using the following function with a buffer size of 131072 required 8 seconds.
Using the following function with a buffer size of 1048576 required 12 seconds.

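import hashlib
import time

def md5_speedcheck(path, size):
    pts = time.process_time()  # CPU time at start
    ats = time.time()          # wall-clock time at start
    m = hashlib.md5()
    with open(path, 'rb') as f:
        # Feed the file to the hash in chunks of `size` bytes.
        b = f.read(size)
        while len(b) > 0:
            m.update(b)
            b = f.read(size)
    print("{0:.3f} s".format(time.process_time() - pts))  # CPU time used
    print("{0:.3f} s".format(time.time() - ats))          # elapsed wall-clock time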

The times noted above are human (wall-clock) time. Processor time is about the same for all of them; the difference is time spent blocked on I/O.

The key is to pick a buffer size that is big enough to mitigate disk latency, but small enough to avoid VM page swaps. For my particular machine it appears that 64 KiB is about optimal.
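
If you want to repeat the comparison on your own machine, here is a minimal sketch of a chunked hash that defaults to the 64 KiB buffer, plus a small helper that times several buffer sizes. The names md5_of_file and time_buffer_sizes are my own for illustration, and the default sizes are only a starting point, not a recommendation.

```python
import hashlib
import time


def md5_of_file(path, buffer_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    m = hashlib.md5()
    with open(path, 'rb') as f:
        # iter() with a sentinel keeps calling f.read(buffer_size)
        # until it returns b'' at end of file.
        for chunk in iter(lambda: f.read(buffer_size), b''):
            m.update(chunk)
    return m.hexdigest()


def time_buffer_sizes(path, sizes=(8192, 65536, 1048576)):
    """Return {buffer_size: wall-clock seconds} for hashing `path`."""
    results = {}
    for size in sizes:
        start = time.perf_counter()
        md5_of_file(path, size)
        results[size] = time.perf_counter() - start
    return results
```

Keep in mind that the operating system may cache the file after the first pass, so later sizes in the same run can look faster than they would on a cold read. Treat the numbers as a rough guide for your own disk rather than a general result.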