Thanks to Robert Crovella, who commented:
My guess is that there is some sort of display timeout on the Mac (also here). As you increase to a larger size, the matrix multiply kernel takes longer. At some point the display driver timeout in Mac OS resets the GPU. If that is the case, you could work around it by switching to a system/GPU where the GPU is not hosting a display. Both Linux and Windows (TDR) also have such timeout mechanisms.
You have to boot into console mode in Mac OS and also disable automatic graphics switching. Console mode turns off Aqua (the Mac GUI), which is supposed to remove the limitation.
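
To check whether a display timeout is really what is killing the kernel, something like the sketch below may help. It asks the CUDA runtime whether each device has a run time limit on kernels, then launches a kernel and looks for the launch-timeout error code. The `placeholderKernel` name and the 32-element buffer are made up for illustration and stand in for the real matrix multiply; this is only a rough diagnostic, not a fix.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the real matrix multiply kernel.
__global__ void placeholderKernel(float *out) {
    out[threadIdx.x] = static_cast<float>(threadIdx.x);
}

int main() {
    int deviceCount = 0;
    cudaError_t err = cudaGetDeviceCount(&deviceCount);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    // Report, per device, whether a kernel run time limit (watchdog) is active.
    for (int dev = 0; dev < deviceCount; ++dev) {
        int timeoutEnabled = 0;
        err = cudaDeviceGetAttribute(&timeoutEnabled,
                                     cudaDevAttrKernelExecTimeout, dev);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "device %d: %s\n", dev, cudaGetErrorString(err));
            continue;
        }
        std::printf("device %d: run time limit on kernels: %s\n",
                    dev, timeoutEnabled ? "yes" : "no");
    }
    if (deviceCount == 0) return 0;

    // Launch on the current device (device 0 by default) and inspect the result.
    float *d_out = nullptr;
    err = cudaMalloc(&d_out, 32 * sizeof(float));
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    placeholderKernel<<<1, 32>>>(d_out);
    err = cudaGetLastError();            // catches launch configuration errors
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();   // catches errors raised during execution
    }

    if (err == cudaErrorLaunchTimeout) {
        std::fprintf(stderr, "kernel was stopped by the run time limit\n");
    } else if (err != cudaSuccess) {
        std::fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
    } else {
        std::printf("kernel completed\n");
    }

    cudaFree(d_out);
    return err == cudaSuccess ? 0 : 1;
}
```

If the attribute reports "yes" and the large matrix sizes fail with the timeout error while small ones succeed, that would be consistent with the display timeout explanation above. A driver reset can also surface as a different error code, so treat a mismatch as inconclusive rather than as proof that the timeout is not involved.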