Question

How are pascal strings laid out in memory?

I read http://www.freepascal.org/docs-html/ref/refsu12.html, which says that strings are stored on the heap and reference counted. To figure out where the length and the reference count are stored, I created a string and ran a number of tests on it:

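type PInt = ^Integer;

var
    str: String;
begin
    //assumes String is an AnsiString ($H+) and Integer has the width of the header fields (32-bit target)
    str := 'hello';
    writeln(PInt(PChar(@str[1]) - (sizeof(integer) * 1))^); //length
    writeln(PInt(PChar(@str[1]) - (sizeof(integer) * 2))^); //reference count
end.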

The first line prints the length and the second prints the reference count. Both work as expected.

Now I tried to emulate the same thing in C:

#include <stdlib.h> //malloc, free
#include <string.h> //strlen, strcpy

Export char* NewCString()
{
    const char* hello_ptr = "hello";

    int length = strlen(hello_ptr);

    //allocate space on the heap for: sizeof(refcount) + sizeof(length) + string length + 1 for the null terminator that strcpy writes
    char* pascal_string = (char*)malloc((sizeof(int) * 2) + length + 1);

    *((int*)&pascal_string[0]) = 0; //reference count to 0.
    *((int*)&pascal_string[sizeof(int)]) = length;  //length of the string.

    strcpy(&pascal_string[sizeof(int) * 2], hello_ptr); //copy hello to the pascal string.

    return &pascal_string[sizeof(int) * 2]; //return a pointer to the data.
}

Export void FreeCString(char* &ptr)
{
    int data_offset = sizeof(int) * 2;
    free(ptr - data_offset);
    ptr = NULL;
}

Then in Pascal I do:

var
    str: string;
begin
    str := string(NewCString());
    writeln(PInt(PChar(@str[1]) - (sizeof(integer) * 1))^); //length - prints 5. correct.
    writeln(PInt(PChar(@str[1]) - (sizeof(integer) * 2))^); //reference count - prints 1! correct.
   //FreeCString(str);  //works fine if I call this..
end.

The Pascal code prints the length correctly, and the reference count is increased by one due to the assignment. This is correct.

However, as soon as it is finished executing, it crashes badly! It seems to be trying to free the string/heap. If I call FreeCString myself, it works just fine! I'm not sure what is going on.

Any ideas why it crashes?

Solution 2

Just because the runtime system lays out strings a particular way in memory doesn't mean that writing C code to duplicate that layout will work. String management may involve additional constraints or external data structures. To make a string compatible with FreePascal, use FreePascal's own library routines.

It sounds like FreePascal requires something besides free() to happen when the refcount goes to zero, but it's likely impossible to tell what without some reverse engineering or digging into ABI specs.
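To make the point concrete, here is a minimal sketch of a refcounted string with its header in front of the data. The names (ToyHeader, toy_new, toy_acquire, toy_release) are made up for illustration, and the field order simply mirrors what the question observed; it is an assumption, not FreePascal's documented ABI. The thing to notice is that the release routine has to hand the block back to the same allocator that created it, which is exactly what a foreign runtime cannot know.

```c
#include <stdlib.h>
#include <string.h>

/* Toy header stored immediately before the character data.
   Field order and widths are an assumption taken from the question,
   not FreePascal's real layout. */
typedef struct {
    int refcount;
    int length;
} ToyHeader;

/* Step back from the data pointer to the header in front of it. */
static ToyHeader *toy_header(char *data)
{
    return (ToyHeader *)(data - sizeof(ToyHeader));
}

/* Allocate header + text + null terminator with malloc. */
char *toy_new(const char *text)
{
    size_t len = strlen(text);
    ToyHeader *h = (ToyHeader *)malloc(sizeof(ToyHeader) + len + 1);
    if (h == NULL)
        return NULL;
    h->refcount = 1;
    h->length = (int)len;
    char *data = (char *)(h + 1);
    memcpy(data, text, len + 1);
    return data;
}

int toy_length(char *data)   { return toy_header(data)->length; }
int toy_refcount(char *data) { return toy_header(data)->refcount; }

void toy_acquire(char *data) { toy_header(data)->refcount++; }

/* Drop one reference. Returns 1 if the block was freed, 0 otherwise. */
int toy_release(char *data)
{
    ToyHeader *h = toy_header(data);
    if (--h->refcount > 0)
        return 0;
    free(h); /* only valid because toy_new used malloc */
    return 1;
}
```

If a different runtime decremented the count and then called its own deallocator on this block, the layout would look right and the free would still be wrong.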

OTHER TIPS

  1. "string" is an alias that can refer to three different string types (shortstring, ansistring and unicodestring).
  2. The layout of ansistring and unicodestring changed going from FPC 2.6 to FPC 2.7.x+ (the equivalent of going from Delphi 2007 to Delphi 2009).
  3. Any Delphi memory allocator must be able to tell the size of an allocated block. Usually this is done by storing the 32-bit size in the block.
  4. FreePascal and Delphi have pluggable memory allocators. The default Free Pascal memory manager is its own suballocator. To make it use whatever libc uses (on *nix), list unit cmem as the first unit in your main program.
  5. Because ansistring and unicodestring are refcounted, if you use manual tricks you are responsible for maintaining the integrity of the reference count. That includes respecting the Pascal ABI at every Pascal <-> C changeover.
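Tips 3 and 4 can be illustrated with a toy suballocator. This is only a sketch under stated assumptions: the names (sub_alloc, sub_block_size, sub_owns), the fixed 1024-byte arena and the 8-byte header are all invented here and are not how the Free Pascal heap manager is implemented. It shows the general idea of an allocator that carves blocks out of its own memory and stores a 32-bit size in each block, so a pointer that came from malloc means nothing to it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARENA_BYTES  1024u
#define HEADER_BYTES 8u /* size field plus padding, keeps payloads 8-byte aligned */

/* The arena is declared as uint64_t so that it is suitably aligned. */
static uint64_t arena_store[ARENA_BYTES / 8u];
static size_t arena_used = 0;

static unsigned char *arena_base(void)
{
    return (unsigned char *)arena_store;
}

/* Carve a block out of the arena and record its size in front of it. */
void *sub_alloc(uint32_t size)
{
    if (size > ARENA_BYTES)
        return NULL;
    size_t payload = ((size_t)size + 7u) & ~(size_t)7u; /* round up to 8 */
    size_t total = HEADER_BYTES + payload;
    if (total > ARENA_BYTES - arena_used)
        return NULL;
    unsigned char *block = arena_base() + arena_used;
    memcpy(block, &size, sizeof size); /* the 32-bit size lives in the block */
    arena_used += total;
    return block + HEADER_BYTES;
}

/* Read the size back; only meaningful for pointers from sub_alloc. */
uint32_t sub_block_size(const void *p)
{
    uint32_t size;
    memcpy(&size, (const unsigned char *)p - HEADER_BYTES, sizeof size);
    return size;
}

/* Does this pointer lie inside the part of the arena handed out so far? */
int sub_owns(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)arena_base() + HEADER_BYTES;
    uintptr_t hi = (uintptr_t)arena_base() + arena_used;
    return addr >= lo && addr < hi;
}
```

A block obtained from malloc fails sub_owns, and calling sub_block_size on it would read whatever bytes happen to sit in front of it. Handing a C-allocated block to a Pascal memory manager runs into the same kind of mismatch.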

In short: don't. In the rare case that you must, add a constructor and a destructor function on the Pascal side, and do all allocation through them.
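One way to keep all allocation on the Pascal side is to have the C function fill a buffer that the caller owns. The sketch below is an illustration of that idea, not the only option: CopyGreeting is a hypothetical name, and the "return the needed length, copy what fits" convention is an assumption chosen for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the text into a caller-owned buffer.
   Returns the number of characters the full text needs (terminator
   excluded), so the caller can size its buffer and call again.
   Nothing is allocated on the C side. */
size_t CopyGreeting(char *buf, size_t cap)
{
    static const char text[] = "hello";
    size_t needed = sizeof(text) - 1;

    if (buf != NULL && cap > 0) {
        /* copy as much as fits, always leaving room for the terminator */
        size_t n = needed < cap - 1 ? needed : cap - 1;
        memcpy(buf, text, n);
        buf[n] = '\0';
    }
    return needed;
}
```

The caller would first call it with a NULL buffer to learn the length, size its own string accordingly, and then call it again with a pointer to that storage and a capacity of length + 1. The string is then created and destroyed entirely by the Pascal runtime, so the reference count and the memory manager never see foreign memory.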

P.S. You may want to have a look at rtl/inc/astrings.inc.

P.S. 2: On Windows it might be easiest to use the COM-compatible widestring (BSTR) for interlanguage string types.

Licensed under: CC-BY-SA with attribution