An array of pointers requires an extra level of indirection, and will therefore be slower. It may also cause more cache misses, since each row can end up in a different place in memory.
So I tried it with a 1000*1000 matrix:
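To make the difference concrete, here is a minimal sketch of the two layouts in C. The function names (`alloc_rows`, `alloc_flat`, `sum_rows`, `sum_flat`) are hypothetical and used only for illustration; this is not the code of the test programs measured here.

```c
#include <stdlib.h>

/* Layout A (int**): an array of row pointers plus one allocation per row.
   The rows are not guaranteed to be adjacent in memory. */
int **alloc_rows(size_t n, size_t m)
{
    int **a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        a[i] = calloc(m, sizeof **a);
        if (a[i] == NULL) {          /* undo the rows allocated so far */
            while (i > 0)
                free(a[--i]);
            free(a);
            return NULL;
        }
    }
    return a;
}

void free_rows(int **a, size_t n)
{
    if (a == NULL)
        return;
    for (size_t i = 0; i < n; i++)
        free(a[i]);
    free(a);
}

/* Layout B (int*): one contiguous block; element (i, j) is at i * m + j. */
int *alloc_flat(size_t n, size_t m)
{
    return calloc(n * m, sizeof(int));
}

long sum_rows(int **a, size_t n, size_t m)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < m; j++)
            s += a[i][j];        /* read the pointer a[i], then element j */
    return s;
}

long sum_flat(const int *a, size_t n, size_t m)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < m; j++)
            s += a[i * m + j];   /* a single read at a computed offset */
    return s;
}
```

Both sum functions return the same result for the same data; only the memory layout and the access path differ. An optimizing compiler may keep `a[i]` in a register for a row-wise loop like this one, so the cost of the extra indirection depends on the access pattern and on the compiler flags.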
Test 2 (int**)
$ perf stat ./test2
Performance counter stats for './test1':
8561,688348 task-clock (msec) # 1,000 CPUs utilized
13 context-switches # 0,002 K/sec
9 cpu-migrations # 0,001 K/sec
3 058 page-faults # 0,357 K/sec
24 844 304 630 cycles # 2,902 GHz [83,32%]
21 302 837 742 stalled-cycles-frontend # 85,75% frontend cycles idle [83,32%]
2 110 745 845 stalled-cycles-backend # 8,50% backend cycles idle [66,68%]
7 030 427 722 instructions # 0,28 insns per cycle
# 3,03 stalled cycles per insn [83,36%]
1 004 889 984 branches # 117,371 M/sec [83,37%]
1 077 360 branch-misses # 0,11% of all branches [83,32%]
8,561758966 seconds time elapsed
Test 3 (int*)
$ perf stat ./test3
Performance counter stats for './test2':
1367,856713 task-clock (msec) # 0,995 CPUs utilized
195 context-switches # 0,143 K/sec
3 cpu-migrations # 0,002 K/sec
3 001 page-faults # 0,002 M/sec
3 977 759 335 cycles # 2,908 GHz [83,41%]
975 477 913 stalled-cycles-frontend # 24,52% frontend cycles idle [83,41%]
179 003 530 stalled-cycles-backend # 4,50% backend cycles idle [66,81%]
7 017 803 848 instructions # 1,76 insns per cycle
# 0,14 stalled cycles per insn [83,41%]
1 002 321 021 branches # 732,768 M/sec [83,42%]
1 046 066 branch-misses # 0,10% of all branches [82,97%]
1,374613079 seconds time elapsed
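As we can see, test2 needs far more cycles: about 24.8 billion versus 4.0 billion, roughly six times as many, for almost the same number of instructions (0,28 versus 1,76 instructions per cycle).
I also measured the cache misses: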
Test 2 (int**)
$ perf stat -e 'L1-dcache-loads,L1-dcache-load-misses,L1-dcache-stores,L1-dcache-store-misses' ./test2
Performance counter stats for './test1':
3 009 028 415 L1-dcache-loads
1 259 622 058 L1-dcache-load-misses # 41,86% of all L1-dcache hits
6 427 561 L1-dcache-stores
1 141 461 L1-dcache-store-misses
8,650905202 seconds time elapsed
Test 3 (int*)
$ perf stat -e 'L1-dcache-loads,L1-dcache-load-misses,L1-dcache-stores,L1-dcache-store-misses' ./test3
Performance counter stats for './test2':
2 004 807 223 L1-dcache-loads
596 388 192 L1-dcache-load-misses # 29,75% of all L1-dcache hits
2 111 264 L1-dcache-stores
384 198 L1-dcache-store-misses
1,384352257 seconds time elapsed