Check the signature on large data sets efficiently using JCA
03-07-2019
Question
I have to verify the signature on a file that may be as large as 2 GB, and I want to do so in a way that is as memory-efficient as possible. For various reasons, the file will already be loaded completely into memory, and is accessed using an InputStream by the application. I would like to verify the signature using the stream interface, but the JCA Signature class's update method only accepts byte[] and related classes.
How can I do this efficiently? I don't want to load the beast into a second byte array, otherwise we'll be seeing some seriously high memory use, but the interface doesn't seem to support it otherwise.
Update
If it matters, the signing algorithm is SHA-1.
Solution
Why not just read the input stream one block at a time (4096 bytes or whatever size is convenient) and call update() for each block?
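A minimal sketch of that loop, under some assumptions: the class name StreamingSignatureCheck, the helper verify and the demo in main are made up for illustration, and the RSA key pair and "SHA256withRSA" algorithm are only there so the example runs on its own. For a SHA-1 based signature you would pass the matching JCA name for your key type instead (for instance "SHA1withRSA", if the key is RSA).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;

public class StreamingSignatureCheck {

    // Hypothetical helper: feeds the stream to the Signature one buffer at a
    // time, so only bufferSize extra bytes are allocated whatever the stream length.
    public static boolean verify(InputStream in, PublicKey key, byte[] signatureBytes,
                                 String algorithm, int bufferSize)
            throws IOException, GeneralSecurityException {
        Signature sig = Signature.getInstance(algorithm);
        sig.initVerify(key);
        byte[] buffer = new byte[bufferSize];
        int n;
        while ((n = in.read(buffer)) != -1) {
            // read() may fill only part of the buffer, so pass just the bytes read
            sig.update(buffer, 0, n);
        }
        return sig.verify(signatureBytes);
    }

    public static void main(String[] args) throws Exception {
        // Throwaway key pair and dummy data, for demonstration only.
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        KeyPair pair = gen.generateKeyPair();
        byte[] data = new byte[100_000];

        // Sign the data so there is something to verify.
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(pair.getPrivate());
        signer.update(data);
        byte[] signatureBytes = signer.sign();

        boolean ok = verify(new ByteArrayInputStream(data), pair.getPublic(),
                signatureBytes, "SHA256withRSA", 4096);
        System.out.println("signature valid: " + ok);
    }
}
```

The three-argument update(buffer, 0, n) matters here: the last read usually returns fewer bytes than the buffer holds, and passing the whole array would feed stale bytes into the signature.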
OTHER TIPS
Create a byte array to act as a buffer and read one buffer's worth at a time from the InputStream, calling update() on the Signature each time. Provided the buffer is of a reasonable size, the CPU time consumed transferring the data from one process to another (I'm guessing that's what you're doing?) is likely to be negligible compared to the calculation time. When reading from disk, the point of diminishing returns on CPU usage appears to be a buffer size of around 8K, and I suspect that this will more or less apply in your case too. (In case it's interesting, see the page I put together on InputStream buffer sizes.)