Question

How can I create a reserved pointer value?

The context is this: I have been thinking of how to implement a data structure for a dynamic scripting language (I am not planning on implementing this - just wondering how it would be done).

Strings may contain arbitrary bytes, including NUL. Thus, it is necessary to store the value separately. This requires a pointer (to point to the array) and a number. The first trick is that if the pointer is NULL, it cannot possibly be a valid string, so the number can be used for an actual integer.

If a second reserved pointer value could be created, this could be used to imply that the other field is now being used as a floating-point value. Can this be done?

One thought is to mmap() an address with no permissions, which could also be done to replace the usage of the NULL pointer.

Was it helpful?

Solution

On any modern system, you can just use the pointer values 1, 2, ... 4095 for such purposes. Another frequent choice is (uintptr_t)-1, which is technically inferior, but used more frequently than 1 nevertheless.

Why are these values "safe"?
Modern systems safeguard against NULL pointer accesses by making it impossible to map anything at virtual address zero. Almost any dereferencing of a NULL pointer will hit this nonexistant region, and the hardware will tell the OS system that something bad happened, which triggers the OS to segfault the process.
Since virtual memory pages are page aligned (at least 4k on current hardware), and nothing is mapped to address zero, nothing can be mapped to the entire range 0, ..., 4095, protecting all these addresses in the same way, and you can use them as special purpose values.

How much virtual memory space is reserved for this purpose is a system parameter, on linux it is controlled by /proc/sys/vm/mmap_min_addr, and the root user can change it to zero, which would disable this protection (which would not be a very smart idea). The default on Ubuntu is 64k (i. e. 16 pages).

This is also the reason why (uintptr_1)-1 is less safe than 1; even though any load of more than one byte will hit the zero page, the address (uintptr_1)-1 itself is not necessarily protected in this way. Consequently, doing string operations on (char*)-1 does not necessarily segfault.

Edit:
My original explanation with the special mapping seems to have been a bit stale, probably this was the way things were handled on the old Mac/PPC platform. Even though the effect is pretty much the same, I changed the details of the answer to reflect modern linux. Anyway, the important point is not how the null page protection is achieved, the important point is that any sane, modern system will have some null page protection that encompasses at least the mentioned address range. Some more details can be found in this SO answer: https://stackoverflow.com/a/12645890/2445184

OTHER TIPS

In standard C (and standard C++), the approach that's 100% valid and works is simple: declare a variable, use its address as a magic value.

char *ptr;
char magic;
if (ptr == &magic) { ... }

This guarantees that magic will never have any overlap with another object.

Magic pointer values such as (char *) 1 have their advantages too, but it's so easy to get them wrong (even if you disregard the theoretical implementations where (char *) 1 may be a valid object, if you use (int *) 1 as a magic pointer value, and the optimiser assumes int * values are suitably aligned, it may removes checks that are no-ops only in 100% valid code, not in your code) that I'd recommend the standard approach, and optionally temporarily switch to magic pointer values only if you find they help you debug.

mmaping an address can fail if the address is already assigned. Probably it would better to use an address of some static variable or function. Or to obtain an unique address via malloc(1).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top