Question

I am writing a virtual machine for a small programming language. It is written in C and uses the GNU utility Flex, so the project is built with Flex and then GNU GCC.

Within this virtual machine, I have a Stop & Copy GC. Before my changes, the GC's working memory could not grow, e.g. from 512 bytes to 1024 bytes when the first flip did not free enough space for a new allocation.

These changes seemed to work. In fact, I don't know whether they really worked, because since making them I have a bug. It appears during the first flip: while data is being copied, a variable that should stay constant changes. This variable is important because it points to the item I want to copy. In Stop & Copy, it is used to update a slot (here SLOT_FORWARD) that records the new position of the data in memory (in case the same item is reached again).

So I have a loop that copies each cell of the old container, whose position in memory is given by the variable old, into a new container that is filled starting at position addr. But the value of old changes during the iteration! After the copy, I want to set the forward slot to the new container's address, but since old has changed, this value gets written to the wrong place.

I spent a long time debugging, because this happens very rarely (sometimes after 2 flips with 3 to 4 containers). I used GDB to find out where the value changed, and it pointed at one of my debug functions (the behavior also shifted as I added debug functions). I then switched compilers (clang to gcc) and reran GDB, which reported that a brace character (still in a debug function) changed the value... Finally, I made the parameters of all functions const wherever possible, and now I am told that the value was changed in the file iofwrite.c, line 37. It is therefore a bug from another world.

The code where the bug occurs is here:

static t_case
copy(t_dono *dono, const t_case old)
{
  t_case  addr;
  t_case  size;
  t_case  temp;
  int     i;

  temp = old;

  if (mem[old + SLOT_FORWARD] >= ns
      && mem[old + SLOT_FORWARD] <= ts)
    return (mem[old + SLOT_FORWARD]);
  else
    {
      addr = mp;
      size = mem[old + SLOT_SIZE];
      i = 0;

      fprintf(stderr, "change:\t");
      dump(stderr, mem, old);

      assert(old == temp);

      while (i < size)
        {
          fprintf(stderr, "!!!COPY:\t");
          dump(stderr, mem, old);
          assert(old == temp);
          mem[addr + i] = mem[old + i]; /* BUG IS HERE */
          i = i + 1;
        }

      mem[old + SLOT_FORWARD] = addr;
      fprintf(stderr, "change:\t");
      dump(stderr, mem, old);
      assert(old == temp);
      mp = mp + size;

      return (addr);
    }
}

As you can see, I added a lot of debug output to narrow down the error, and I got this log file:

ref:            [ 0005 0001 0003 0004 0035 ]
copy:           [ 0007 0001 0003 0004 0075 0001 00f9 ]
change:         [ 0007 0001 0003 0004 0075 0001 00f9 ]
!!!COPY:        [ 0007 0001 0003 0004 0075 0001 00f9 ]
!!!COPY:        [ 0007 0001 0003 0004 0075 0001 00f9 ]
!!!COPY:        [ 0007 0001 0003 0004 0075 0001 00f9 ]
!!!COPY:        [ 0003 0001 0003 ]
!!!COPY:        [ 0003 0004 0003 ]
!!!COPY:        [ 0003 0004 0075 ]
!!!COPY:        [ 0003 0004 0075 ]
change:         [ 0003 0033 0075 ]

I also ran Valgrind, which reports many errors, but only after this bug (which is expected, since the GC is then accessing random data). While the variable is changing, it reports no errors at all.

We can see that other containers passing through the function copy (l: 662) do not show this undefined behavior (see the log file at lines 10, 48, 54, 66, 82, 120, 126 and 134). Everything only goes wrong at execution time, which, of course, corrupts all the GC data.

The code is really long (about 1000 lines) because the goal is to have the VM in a single C file. I'm sorry I cannot make the code clearer. The problem just appears as if by magic, and I am not able to go further and make the language of the future that will surpass Python (joke).

The repository is at: git.osau.re
The change is at: ompldr

Kind regards.


Solution

Tracking down random memory clobbers (which is what this sounds like) is very hard. My approach would be to figure out where the compiler has decided to put the changing variable (probably somewhere on the stack), and then set a watchpoint on that location. This should tell you where the code is modifying the value -- you'll likely find it is in some called function that is stepping back up the stack to step on your variable. Then you need to figure out why that is happening.
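The same idea can also be tried in software when a debugger watchpoint is awkward to set up. The sketch below is only an illustration, not code from the project: `t_watch`, `watch_arm`, `watch_tripped` and `copy_watched` are hypothetical names, and `t_case` is assumed to be an unsigned 16-bit cell. It watches a heap cell (`mem[old]`) rather than a stack slot. In the posted log the `assert(old == temp)` lines do not appear to abort (assuming assertions were enabled), so it may be worth checking whether the contents of `mem` at `old` change rather than `old` itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef unsigned short t_case; /* assumption: cell type of the VM heap */

/* Hypothetical software watchpoint: one heap cell and its value when armed. */
typedef struct {
    size_t index;    /* watched position in mem[] */
    t_case expected; /* value the cell held when the watch was armed */
} t_watch;

static t_watch watch_arm(const t_case *mem, size_t index)
{
    t_watch w = { index, mem[index] };
    return w;
}

/* Returns 1 and reports on stderr if the watched cell no longer holds
 * the value it had when the watch was armed, otherwise returns 0. */
static int watch_tripped(const t_watch *w, const t_case *mem, const char *where)
{
    if (mem[w->index] == w->expected)
        return 0;
    fprintf(stderr, "cell %zu changed from %04x to %04x in %s\n",
            w->index, (unsigned)w->expected, (unsigned)mem[w->index], where);
    return 1;
}

/* Copy loop shaped like the one in the question. Returns the value of i
 * at which mem[old] first changed, or -1 if it never changed.
 * The GDB counterpart would be a breakpoint in the copy function followed
 * by `watch mem[old]` (or `watch -l mem[old]` to pin the address). */
static int copy_watched(t_case *mem, size_t old, size_t addr, size_t size)
{
    t_watch w;
    size_t  i;

    assert(mem != NULL);
    w = watch_arm(mem, old);
    for (i = 0; i < size; i++) {
        mem[addr + i] = mem[old + i];
        if (watch_tripped(&w, mem, "copy loop"))
            return (int)i;
    }
    return -1;
}
```

Calling `copy_watched(mem, old, addr, size)` in place of the plain loop reports the first iteration at which the header cell of the source container is overwritten. If it fires inside the loop itself, one thing to check is whether the range starting at `addr` overlaps the range starting at `old`, for instance after the heap has been enlarged.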

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow