Question

I have code that uses the standard Numerical Recipes routines to find the eigenvectors of a 3x3 matrix. The code runs perfectly on Linux machines, but on a Mac it fails with "Segmentation fault: 11". Backtracing with gdb gives:

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x0000000140400008
0x0000000100002a88 in tqli (d=0x7fff5fbffaa4, e=0x7fff5fbffa98, n=3, z=0x140400000) at     ac_nr.c:402
402                         f=z[k][i+1];

where tqli is the standard routine from Numerical Recipes, and z is defined properly. I say this with confidence because on Linux machines the program runs without difficulty and gives the correct answer. A Google search has not turned up anything relevant. Can anybody hint at what is happening on the Mac, or how to go about fixing it?

Thanks a lot,


Solution

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x0000000140400008
0x0000000100002a88 in tqli (d=0x7fff5fbffaa4, e=0x7fff5fbffa98, n=3, z=0x140400000) at     ac_nr.c:402
402                         f=z[k][i+1];

As you can see, your z pointer is 0x140400000 and the faulting address is 0x0000000140400008, which is 8 bytes past it. In all probability this is an out-of-bounds access on z (a buffer overflow).
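The arithmetic behind that observation can be sketched in a few lines of C. The helper names are hypothetical, and the 8-byte element size is an assumption: it fits an array of pointers on a 64-bit Mac, but the backtrace does not confirm how z is declared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte distance from the base pointer to the faulting address.
 * Hypothetical helper, only meant to illustrate the arithmetic. */
uint64_t fault_offset(uint64_t base, uint64_t fault_addr)
{
    assert(fault_addr >= base); /* the fault must lie at or after the base */
    return fault_addr - base;
}

/* Index of the element that contains the faulting address,
 * for elements of elem_size bytes (8 is an assumption for pointers). */
uint64_t fault_index(uint64_t base, uint64_t fault_addr, size_t elem_size)
{
    assert(elem_size > 0);
    return fault_offset(base, fault_addr) / elem_size;
}
```

With the values from the backtrace, fault_offset(0x140400000, 0x140400008) is 8, and with 8-byte elements fault_index gives 1, that is, the slot immediately after the one z points at.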

Why does it work on Linux? That may come down to allocation strategy. For example, if the allocation library hands out memory in 16-byte blocks to improve performance and you request 8 bytes, then 16 bytes will be allocated. Without protectors or canaries, nothing stops you from addressing one item beyond the bounds, and because what amounts to one extra item has been allocated, you won't get a segfault.

But on a machine that allocates 8-byte blocks, the same code will crash.
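As a rough illustration of that rounding effect, the sketch below computes how many spare bytes a request would get under a given block size. The function names are hypothetical, and the 16-byte and 8-byte granules are only the illustrative figures used above, not measured properties of any real Linux or Mac allocator.

```c
#include <assert.h>
#include <stddef.h>

/* Round a request up to the next multiple of the allocator's block size.
 * Hypothetical model of an allocator, not any real malloc implementation. */
size_t rounded_request(size_t request, size_t granule)
{
    assert(granule > 0);
    return (request + granule - 1) / granule * granule;
}

/* Bytes past the requested size that still belong to the same block.
 * An overrun that stays inside this slack goes unnoticed. */
size_t slack_bytes(size_t request, size_t granule)
{
    return rounded_request(request, granule) - request;
}
```

In this model an 8-byte request with a 16-byte granule leaves 8 bytes of slack, enough to hide an overrun of one 8-byte item, while an 8-byte granule leaves none.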

I strongly endorse Andreas Florath's suggestion: run valgrind against your program on Linux and check for out-of-bounds accesses. Alternatively, verify the size of z and the indices at which you actually access it.
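One way to do that second check by hand is to route reads of z through a bounds-checked accessor while debugging. This is only a sketch: index_in_bounds and checked_get are hypothetical names, and it assumes z is a float ** indexed from 1 to n in both dimensions (the usual Numerical Recipes convention), which the question does not show.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if (row, col) lies inside a 1-based n x n matrix, else 0.
 * The 1-based range is an assumption about how z is laid out. */
int index_in_bounds(int row, int col, int n)
{
    return row >= 1 && row <= n && col >= 1 && col <= n;
}

/* Debug-only read of z[row][col]. Aborts via assert on a NULL pointer
 * or an out-of-range index instead of silently reading stray memory.
 * It cannot detect a non-NULL pointer that is simply garbage. */
float checked_get(float **z, int row, int col, int n)
{
    assert(z != NULL);
    assert(index_in_bounds(row, col, n));
    assert(z[row] != NULL);
    return z[row][col];
}
```

Temporarily replacing an expression such as z[k][i+1] with checked_get(z, k, i+1, n) turns a bad index into an immediate assertion failure on either platform. An invalid but non-NULL z will still crash, which is where valgrind or the debugger remains the better tool.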

OTHER TIPS

I'm pretty certain that your z is NOT defined properly (or some calling convention setting or similar is wrong).

It may well be that the code "works" on Linux but not on Mac, perhaps because memory regions are arranged differently, but I'm almost certain that the problem is with what you are passing to tqli(). If you look at the d and e values, they look like sane addresses (for stack variables). The z value is in a completely different range, and the address that faults is just after it, so this is probably the first read. (I bet i and k are zero if you check the variables in a debugger.)

Licensed under: CC-BY-SA with attribution