Your method 1 (the for loop) is bad for the cache. Because arr1, arr2, and arr3 may not be anywhere near each other in memory, and very likely will not be in the cache together, you could get frequent cache misses: the CPU would have to keep fetching new pieces from memory just to set them to zero.
By doing a series of memset operations, you'll hit all of arr1 in one sequential pass, almost certainly entirely from the cache. Then you'll cache and set all of arr2 very quickly, and then arr3 the same way.
That, and because memset may well have assembly tricks and optimizations to make it faster, I would definitely prefer option 2 over option 1.
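
For reference, here is a rough sketch of the two options as I understand them. The function names, the int element type, and the length N are my own assumptions for illustration, not taken from your code, so adjust them to match what you actually have.

```c
#include <stddef.h>
#include <string.h>

#define N 1024 /* hypothetical array length, chosen only for this sketch */

/* Option 1: a single loop that writes to all three arrays on every
 * iteration, so each pass jumps between three separate memory regions. */
static void zero_with_loop(int *arr1, int *arr2, int *arr3, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        arr1[i] = 0;
        arr2[i] = 0;
        arr3[i] = 0;
    }
}

/* Option 2: one memset per array, so each array is cleared in a single
 * sequential pass before the next one is touched. */
static void zero_with_memset(int *arr1, int *arr2, int *arr3, size_t n)
{
    memset(arr1, 0, n * sizeof *arr1);
    memset(arr2, 0, n * sizeof *arr2);
    memset(arr3, 0, n * sizeof *arr3);
}
```

Both functions leave all three arrays zeroed; the only difference is the order in which memory is touched. If you want to know how much it matters on your machine, time both with your real array sizes.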