Question

While reading "Understanding the Linux Kernel" I found that a union is used for the process descriptor data structure.

union thread_union {
   struct thread_info thread_info;
   unsigned long stack[2048]; /* 1024 for 4KB stacks */
};

Why use a union for thread_union here, when both data structures are in use?


Solution

First of all, it is

union thread_union {
    struct thread_info thread_info;
    unsigned long stack[THREAD_SIZE/sizeof(long)];
};

(as defined in include/linux/sched.h in the kernel sources). This is important, because the macro THREAD_SIZE is used in a lot of places (several hundred times in the kernel sources, overall), and its value varies between architectures.

The OP is wondering why not use a structure instead:

struct thread_struct {
    struct thread_info thread_info;
    unsigned long stack[(THREAD_SIZE - sizeof (struct thread_info))/sizeof (long)];
};

(I am assuming the related macros, init_thread_info and init_stack, are adjusted accordingly, i.e. to both refer to the start of the init_thread_union, so that actual memory layout is not changed.)

The simple reason is that the two members of the union are intended to reside in the same memory area, therefore a union is more appropriate.
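To make the difference concrete, here is a minimal user-space sketch that compares the two layouts. The THREAD_SIZE value and the fields of struct thread_info are placeholders I made up for illustration (the real ones are architecture-specific), and the helper function names are hypothetical as well.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value for illustration; the kernel's THREAD_SIZE is per-architecture. */
#define THREAD_SIZE 8192

/* Hypothetical stand-in; the real struct thread_info has other fields. */
struct thread_info {
    unsigned long flags;
    int           cpu;
};

/* The union form: both members start at offset 0 and share the storage. */
union thread_union {
    struct thread_info thread_info;
    unsigned long      stack[THREAD_SIZE / sizeof(long)];
};

/* The struct alternative: the stack array follows thread_info. */
struct thread_struct {
    struct thread_info thread_info;
    unsigned long      stack[(THREAD_SIZE - sizeof(struct thread_info)) / sizeof(long)];
};

/* Both forms cover exactly THREAD_SIZE bytes with these placeholder types. */
static_assert(sizeof(union thread_union) == THREAD_SIZE, "union spans the whole area");
static_assert(sizeof(struct thread_struct) == THREAD_SIZE, "struct spans the whole area");

/* Offset of the stack member in the union form (shares offset 0). */
size_t union_stack_offset(void)
{
    return offsetof(union thread_union, stack);
}

/* Offset of the stack member in the struct form (placed after thread_info). */
size_t struct_stack_offset(void)
{
    return offsetof(struct thread_struct, stack);
}
```

Both types have the same total size, but only in the union does stack start at the same address as thread_info, which is the property the kernel code relies on.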

The full reasoning is more complicated. The main point is that all architectures define an init_thread_union variable of this union type in init/init_task.c, used for the initial kernel thread at bootup, and preprocessor macros

#define init_thread_info    (init_thread_union.thread_info)
#define init_stack          (init_thread_union.stack)

in architecture-specific header files (for example, in arch/x86/include/asm/thread_info.h on x86). These macros refer to the initial thread (the one that boots up the kernel) and its stack, respectively.
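As a rough user-space illustration of what these two macros resolve to, the sketch below declares a stand-in init_thread_union and checks that both macros name storage beginning at the same address. THREAD_SIZE, the thread_info fields, and the helper init_macros_alias are assumptions for the example, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; architecture-specific in the kernel. */
#define THREAD_SIZE 8192

/* Hypothetical stand-in for the real, architecture-specific structure. */
struct thread_info {
    unsigned long flags;
    int           cpu;
};

union thread_union {
    struct thread_info thread_info;
    unsigned long      stack[THREAD_SIZE / sizeof(long)];
};

/* Stand-in for the variable the kernel defines for the initial thread. */
union thread_union init_thread_union;

#define init_thread_info    (init_thread_union.thread_info)
#define init_stack          (init_thread_union.stack)

/* Returns 1 when both macros refer to storage starting at the same address. */
int init_macros_alias(void)
{
    return (uintptr_t)&init_thread_info == (uintptr_t)&init_stack[0];
}
```

With a union, each macro is a plain member access, and the overlap between the two follows directly from the type.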

As far as I can tell, the union thread_union type is not used for any other purpose except for that initial stack and thread info. Furthermore, the init_thread_info part is only needed during bootup, not later on.

This means that if a structure were used instead of a union, the struct thread_info part would stay unused in memory for as long as the kernel was running. Sure, it's not a lot of bytes. But with a union (remember that in Linux, stacks grow down on nearly all architectures) the initial thread info sits at the lowest addresses of the initial stack area, which is the part a downward-growing stack reaches last. If at some point a call chain inside the kernel is deep enough to need every bit of available kernel stack, the initial thread_info would be overwritten by the stack data. Which would be okay, since it is no longer needed.

(If you are very sharp, you'll realize that using the structure would have the same practical effect: running out of init_stack would overflow into the init_thread_info member, overwriting it. That assumes, as I noted in the parentheses above, that the macros are adjusted to both point to the start of the area. If the macros were not adjusted, the initial thread info would stay in memory, unused, until reboot or shutdown.)
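The overwrite scenario can be sketched with plain pointer arithmetic in user space. This models a downward-growing stack that starts at the top of the area and checks when it reaches the bytes occupied by thread_info. The THREAD_SIZE value, the thread_info fields, and the helper names are all made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value for illustration; architecture-specific in the kernel. */
#define THREAD_SIZE 8192

/* Hypothetical stand-in for the real structure. */
struct thread_info {
    unsigned long flags;
    int           cpu;
};

union thread_union {
    struct thread_info thread_info;
    unsigned long      stack[THREAD_SIZE / sizeof(long)];
};

/* Highest address of the area: where a downward-growing stack begins. */
static char *stack_top(union thread_union *u)
{
    return (char *)u + THREAD_SIZE;
}

/*
 * Returns 1 if a stack that has consumed `depth` bytes (measured down
 * from the top, at most THREAD_SIZE) has reached into thread_info.
 */
int overlaps_thread_info(union thread_union *u, size_t depth)
{
    char *sp       = stack_top(u) - depth;
    char *info_end = (char *)&u->thread_info + sizeof(struct thread_info);

    return sp < info_end;
}
```

In this model the stack can use THREAD_SIZE - sizeof(struct thread_info) bytes before it touches thread_info; anything deeper lands on top of it.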

So, to summarize, the union is more appropriate because the kernel developers use the union type exclusively for the initial thread info and initial stack (for the thread that boots up the kernel), and it is expressly desirable that they occupy the same memory area. Although the exact same practical effects could be achieved using a structure, it would make the init_thread_info and init_stack macros needlessly complex, wasting other/future developers' time in trying to decipher the original intent.

Finally, keep in mind that kernel developers are much more interested in practical results than in theory or standards. For example, C compiler writers could point out that according to the C standards, accessing a different member of a union than was used in the last assignment to the union yields undefined results. That does not matter: the kernel depends on actual, real-world behaviour, not on the text of any standard. This also means that reading the code, the comments, and the discussions on LKML or other kernel-related mailing lists is always more instructive and reliable than relying on general C knowledge.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow