Question

I was reading How to write a simple operating system, which says:

For your very first OS, you're better off sticking with assembly language, as used in MikeOS. It's more verbose and non-portable, but you don't have to worry about compilers and linkers. Besides, you need a bit of assembly to kick-start any OS.

Why is assembly code required to kick-start a kernel? Why not just C code?

What I have seen in some implementations is an assembly file that, apart from setting some magic numbers, contains just a call to a function defined in an external C file.

Does use of assembly have something to do with dealing with the real physical memory as opposed to dealing with the abstraction of virtual memory?

EDIT

Is(n't) assembly used to load the kernel's text section at address 0x100000 in real memory?

Solution 2

C code needs, for example, a stack to be set up, so if nothing else you must set the stack pointer before you enter the first C function. And from C (not counting inline asm, which in this case counts as assembly, not C) you can't set the stack pointer. So you have a chicken-and-egg problem. Next come the .bss and .data segments. You can certainly write clean C code that does not rely on .bss or .data being set up, but to meet the C standard they have to be. So you will find startup code that zeros .bss and, if need be, copies .data into place from non-volatile storage.
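To make the .bss/.data part concrete, here is a minimal sketch of what that startup step does, written so it can run as an ordinary hosted program. The arrays `flash_data_image`, `ram_data` and `ram_bss` are hypothetical stand-ins: in real startup code the region boundaries would come from linker-script symbols, and the stack pointer would already have been set in assembly before any of this runs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the memory regions a linker script would
 * describe. Real startup code would use linker-provided symbols instead. */
enum { DATA_SIZE = 4, BSS_SIZE = 8 };

static const unsigned char flash_data_image[DATA_SIZE] = { 0xDE, 0xAD, 0xBE, 0xEF };
static unsigned char ram_data[DATA_SIZE];
static unsigned char ram_bss[BSS_SIZE];

/* Copy the initial values of .data from non-volatile storage into RAM. */
void copy_data(unsigned char *dst, const unsigned char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Zero .bss so that uninitialized statics start at 0, as C requires. */
void zero_bss(unsigned char *start, size_t n)
{
    assert(start != NULL);
    for (size_t i = 0; i < n; i++)
        start[i] = 0;
}

/* What the startup code would do after the stack pointer is valid
 * and before it calls the kernel's C entry point. */
void startup_init_memory(void)
{
    copy_data(ram_data, flash_data_image, DATA_SIZE);
    zero_bss(ram_bss, BSS_SIZE);
}
```

Note that even this routine needs a working stack to be called as a C function, which is exactly the chicken-and-egg problem: the very first instructions have to be asm.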

In general, assembly can do anything the processor can do, whereas C cannot natively do everything the processor can do. In particular, you generally can't bootstrap C with C; you need asm.

OTHER TIPS

In addition to dwelch's answer about doing initial setup, there are also some instructions and operations which are extremely specific to the processor architecture and do not belong in a portable language such as C.

For example, on x86 you need to enable protected mode or long mode, you need to set up an Interrupt Descriptor Table, and you need to set up a Global Descriptor Table. Each of these involves specialized instructions that only mean anything if you have an x86 CPU. It does not make sense to put these Intel-specific instructions into a programming language that might not need them when running on some other CPU. And of course those other CPUs have their own forms of the same concepts, which do not apply at all to x86.
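To illustrate where the line falls: the descriptor table itself is just data, so C can build it, but telling the CPU to use it takes the `lgdt` instruction, which C has no way to express. Below is a rough sketch of encoding one 8-byte segment descriptor. The function name `gdt_encode` is my own, and the access/flags values in the comments are the usual ones for a flat 32-bit setup, not something from the question.

```c
#include <assert.h>
#include <stdint.h>

/* Pack one 8-byte x86 segment descriptor.
 * base:   32-bit segment base address
 * limit:  20-bit segment limit
 * access: access byte (e.g. 0x9A = present, ring 0, readable code;
 *                           0x92 = present, ring 0, writable data)
 * flags:  4-bit flags nibble (e.g. 0xC = 4 KiB granularity, 32-bit) */
uint64_t gdt_encode(uint32_t base, uint32_t limit, uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFFu); /* the limit field is only 20 bits wide */
    assert(flags <= 0xFu);     /* the flags field is only 4 bits wide */

    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);            /* bits  0-15: limit low   */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;     /* bits 16-39: base low    */
    d |= (uint64_t)access << 40;                 /* bits 40-47: access byte */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48; /* bits 48-51: limit high  */
    d |= (uint64_t)(flags & 0xFu) << 52;         /* bits 52-55: flags       */
    d |= (uint64_t)((base >> 24) & 0xFFu) << 56; /* bits 56-63: base high   */
    return d;
}
```

For instance, `gdt_encode(0, 0xFFFFF, 0x9A, 0xC)` gives the flat 4 GiB code segment descriptor. Loading the finished table and reloading the segment registers is still a job for assembly (inline or otherwise).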

Another common use of assembly in a kernel is for atomic operations, though these are starting to make their way into higher-level language specs (C++11 comes to mind). Even so, the implementations of these will need to use assembly, and a kernel will want to have total control (you can't use the higher-level wait primitives that a user-mode implementation might, because in a new kernel these abstractions don't exist yet, or don't exist in the same form).
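As a rough illustration of what the language-level atomics look like, here is a busy-waiting lock built on C11's `<stdatomic.h>` (the C counterpart to the C++11 atomics mentioned above). The `kspinlock` names are hypothetical, and this is only a sketch: a real kernel lock would also have to deal with interrupts and would likely use a CPU pause hint in the loop, which again means asm or intrinsics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal spinlock; no scheduler or wait queue needed. */
struct kspinlock {
    atomic_flag locked;
};

#define KSPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try to take the lock once; returns true if we now own it. */
bool kspin_trylock(struct kspinlock *l)
{
    assert(l != NULL);
    /* test_and_set returns the previous value: false means it was free. */
    return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

/* Spin until the lock is acquired. */
void kspin_lock(struct kspinlock *l)
{
    while (!kspin_trylock(l)) {
        /* busy-wait: there is nothing to sleep on this early in a kernel */
    }
}

/* Release the lock so another CPU can take it. */
void kspin_unlock(struct kspinlock *l)
{
    assert(l != NULL);
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

The compiler turns the test-and-set into whatever atomic instruction the target CPU provides, so the assembly is still there; it has just moved into the compiler's code generation.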

A number of issues make it a pain to write an OS purely in C:

  • Calling conventions. The computer's bootstrapping code isn't assuming anything about the language being used for the kernel. On your typical IBM-compatible PC, it barely even does anything: it sets things up just enough to load and jump to some code from a boot sector, and assumes that code will take over and set up everything as the OS wants it. And unless the environment is set up just as C expects, the transition will generally not go very well.

    The initial bit of assembly code hands control over to the OS in an orderly fashion, setting up enough of an environment to ensure C code works as expected.

  • Machine-specific stuff. C doesn't make many assumptions about the underlying platform, and the compiler tries to abstract away a lot of the platform-specific magic. It would have a hard time switching to protected mode on x86, for example, because C doesn't assume there will even be such a thing as "protected mode", let alone give you the fine-grained control required to make the switch. Any code that switches to protected mode will either be assembly code (whether inline, or as a separate statically linked library), or compiler- and CPU-specific "intrinsics" that will almost always be non-portable (and are basically assembly code themselves).

  • Space constraints. On your standard PC, the BIOS loads a sector (512 bytes, minus a two-byte boot sector signature) from a bootable drive/partition and jumps to it. That's not a lot of space at all. Assembly code can get around that constraint quite easily, by loading another sector immediately after itself -- in effect, modifying the running code. C code would require a bit of black magic to accomplish the same task, assuming it got around the "machine-specific stuff" problem mentioned above.
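To put numbers on the space constraint, here is a small hosted sketch that lays out a 512-byte boot sector image and checks its signature. The function names are made up for illustration. The signature bytes 0x55, 0xAA at offsets 510 and 511 are the standard BIOS boot signature (the answer above only mentions that it is two bytes long).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
    SECTOR_SIZE    = 512,
    SIGNATURE_SIZE = 2,
    BOOT_CODE_MAX  = SECTOR_SIZE - SIGNATURE_SIZE /* 510 bytes for code */
};

/* A BIOS treats a sector as bootable if it ends in 0x55, 0xAA. */
bool boot_sector_is_bootable(const uint8_t sector[SECTOR_SIZE])
{
    assert(sector != NULL);
    return sector[510] == 0x55 && sector[511] == 0xAA;
}

/* Lay out a boot sector: code first, zero padding, then the signature.
 * Returns false if the code does not fit in the 510 available bytes. */
bool boot_sector_build(uint8_t sector[SECTOR_SIZE],
                       const uint8_t *code, size_t code_len)
{
    assert(sector != NULL);
    if (code_len > BOOT_CODE_MAX)
        return false;

    for (size_t i = 0; i < BOOT_CODE_MAX; i++)
        sector[i] = (i < code_len) ? code[i] : 0;

    sector[510] = 0x55;
    sector[511] = 0xAA;
    return true;
}
```

Anything past those 510 bytes has to be pulled in by the first-stage code itself, which is why that stage is normally hand-written assembly.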


To be fair, many issues somewhat lessen if the OS is designed for (U)EFI (a BIOS complement/replacement found in many x86-64 machines). A UEFI system (+libraries/headers) provides an environment a bit more hospitable to low-to-mid-level languages like C, and enough hardware support to do the basics.

There are drawbacks, though. UEFI is not universal yet (and probably won't be for a while), and is largely incompatible with the old BIOS boot process. So if you want to support dual booting, requiring UEFI support might be a non-starter for a few years.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow