Question

I know that if you define a bunch of @"" NSString objects in source code on Mac OS X, these NSStrings are stored in a section of the Mach-O library:

Section
sectname __ustring
 segname __TEXT
    addr 0x000b3b54
    size 0x000001b7
  offset 731988
   align 2^1 (2)
  reloff 0
  nreloc 0
   flags 0x00000000
reserved1 0
reserved2 0

If I hex dump the binary, the strings are packed one after another with 0x0000 as a separator. What I want to know is: how does the loader in Mac OS X load these NSStrings when the program runs? Are they located simply by recognizing the 0x0000 separator, or is there a string offset table elsewhere in the binary pointing to the separate NSString objects? Thanks.

(What I really want to do is increase the length of one of the NSStrings, so I need to know how the loader recognizes these separate objects.)

Added: I know that if you define a string like @"abc" in the code, it goes to the __cstring section. If the string contains non-ASCII characters, like @"“”", it goes to the __ustring section, according to my digging.


Solution

There is a __cstring section that holds all the constant C strings. Each constant NSString just refers to one of those C strings. The C struct for a constant NSString looks like this:

struct NSConstantString {
  Class isa;
  char *bytes;
  int numBytes;
};

Look in the __DATA __cfstring section.
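To make the lookup concrete, here is a minimal sketch of how a record in __cfstring can be mapped back to bytes in a string section. It rests on assumptions not stated above: as far as I know, the records the compiler emits into __cfstring carry an extra flags word between isa and the data pointer, and the field widths are 32-bit because the dump in the question shows 8-digit addresses. The names cfstring_record, section_view and string_offset_in_section are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed 32-bit layout of one __DATA,__cfstring record. Field names are
   illustrative; the extra flags word is an assumption, not part of the
   simplified struct shown above. */
struct cfstring_record {
    uint32_t isa;     /* address of the constant-string class object */
    uint32_t flags;   /* assumed to describe the encoding of the data */
    uint32_t data;    /* virtual address of the characters */
    uint32_t length;  /* character count (UTF-16 code units for __ustring) */
};

/* Hypothetical view of a section, using two of the fields otool -l prints. */
struct section_view {
    uint32_t addr;  /* "addr": virtual address where the section starts */
    uint32_t size;  /* "size": section length in bytes */
};

/* Returns the byte offset of the record's characters inside the section,
   or -1 if the record does not point wholly inside it. unit_size is 1 for
   __cstring data and 2 for __ustring data. */
long string_offset_in_section(const struct section_view *s,
                              const struct cfstring_record *r,
                              uint32_t unit_size)
{
    assert(unit_size == 1 || unit_size == 2);
    uint64_t start = r->data;
    uint64_t end = start + (uint64_t)r->length * unit_size;
    uint64_t sect_end = (uint64_t)s->addr + s->size;
    if (start < s->addr || end > sect_end) {
        return -1;
    }
    return (long)(start - s->addr);
}
```

With the __ustring section from the question (addr 0x000b3b54, size 0x1b7), a record whose data field is 0x000b3b58 maps to offset 4 in the section. The point is that each string is found through its record's pointer and length, not by scanning for separators.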

Edit:

The __ustring section is the equivalent of the __cstring section, except that it holds UTF-16 strings. So a constant NSString may refer to either __ustring or __cstring data.
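As a rough illustration of what the __ustring bytes look like, the sketch below counts the UTF-16 code units in front of the 0x0000 terminator. It assumes a little-endian target (on a big-endian one the two bytes of each unit would be swapped), and utf16le_units is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counts the UTF-16 code units that precede the 0x0000 terminator.
   Reads at most max_bytes bytes; assumes little-endian byte order. */
size_t utf16le_units(const uint8_t *bytes, size_t max_bytes)
{
    assert(bytes != NULL);
    size_t units = 0;
    while (2 * units + 1 < max_bytes) {
        uint16_t unit = (uint16_t)(bytes[2 * units] |
                                   (bytes[2 * units + 1] << 8));
        if (unit == 0) {
            break;  /* terminator found */
        }
        units++;
    }
    return units;
}
```

For the string @"“”" from the question (U+201C, U+201D), the section would be expected to hold the bytes 1C 20 1D 20 00 00, which this function reports as 2 units.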

The only reference to the __ustring data is probably from the cfstring that uses it. If you lengthen one string, the cfstring referring to the next string will instead refer to the tail of the lengthened string unless you fix it. You may be able to find some free space elsewhere that you can point the cfstring at.
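The two options in that paragraph can be sketched as follows: check whether growing a string in place would run into the next one, and if so, point the record at a copy placed in free space. This only models the arithmetic, assuming UTF-16 data with a 2-byte terminator; string_ref, growth_clobbers_next and repoint are hypothetical names, and finding the free space is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The two fields of a string record that matter here (illustrative). */
struct string_ref {
    uint32_t data;    /* virtual address of the first UTF-16 unit */
    uint32_t length;  /* length in UTF-16 code units */
};

/* True if the string, grown in place to new_length units plus its 2-byte
   terminator, would overwrite bytes at or beyond next_start. */
bool growth_clobbers_next(const struct string_ref *r, uint32_t new_length,
                          uint32_t next_start)
{
    uint32_t end = r->data + 2u * new_length + 2u;  /* one past terminator */
    return end > next_start;
}

/* Points the record at a longer copy placed in free space elsewhere. */
void repoint(struct string_ref *r, uint32_t free_addr, uint32_t new_length)
{
    assert(free_addr % 2 == 0);  /* the dump in the question shows align 2^1 */
    r->data = free_addr;
    r->length = new_length;
}
```

Because the strings are packed back to back, a 3-unit string at 0x000b3b54 ends exactly where the next one starts (0x000b3b5c), so even one extra character clobbers its neighbour and repointing is the only in-place option.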

OTHER TIPS

No. Each string has an address in the binary. If you insert a character into one string, the addresses of all the strings after it increase, and you'll need to adjust those addresses wherever they are referred to in the binary. In addition, if you make the section bigger, you may need to adjust the locations of any subsequent sections, depending on how much alignment padding there was. It's far easier to just recompile the program and let the linker take care of it.
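The address fix-up described here amounts to the following arithmetic, shown as a sketch over an array of string addresses that is assumed to have been collected already. In a real binary the hard part is finding every place that holds such an address, which this does not attempt; shift_references is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Adds inserted_bytes to every address that lies after the insertion point.
   Returns how many addresses were adjusted. */
size_t shift_references(uint32_t *addrs, size_t count,
                        uint32_t insert_at, uint32_t inserted_bytes)
{
    assert(addrs != NULL || count == 0);
    size_t adjusted = 0;
    for (size_t i = 0; i < count; i++) {
        if (addrs[i] > insert_at) {
            addrs[i] += inserted_bytes;
            adjusted++;
        }
    }
    return adjusted;
}
```

Inserting one UTF-16 character (2 bytes) inside the first string moves every later string by 2, so every reference to those strings has to be rewritten as well.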

NB: NSStrings are not stored internally as sequences of C chars. It's an implementation detail, but I suspect that NSStrings use a 16-bit character width.

Licensed under: CC-BY-SA with attribution