C/C++ How Does Dynamic Linking Work On Different Platforms?

Question 1

To answer your questions one by one:

Dynamic linking defers part of the linking process to runtime. It can be used in two ways: implicitly and explicitly. Implicitly, the static linker will insert information into the executable which will cause the library to load and resolve the necessary symbols. Explicitly, you must call LoadLibrary or dlopen manually, and then GetProcAddress/dlsym for each symbol you need to use. Implicit loading is used for things like the system library, where the implementation will depend on the version of the system, but the interface is guaranteed. Explicit loading is used for things like plug-ins, where the library to be loaded will be determined at runtime.
The .lib file is only necessary for implicit loading. It contains the information that the library actually provides this symbol, so the linker won't complain that the symbol is undefined, and it tells the linker in what library the symbols are located, so it can insert the necessary information to cause this library to automatically be loaded. All the header files tell the compiler is that the symbols will exist, somewhere; the linker needs the .lib to know where.
Under Unix, all of the information is extracted from the .so. Why Windows requires two separate files, rather than putting all of the information in one file, I don't know; it's actually duplicating most of the information, since the information needed in the .lib is also needed in the .dll. (Perhaps licensing issues. You can distribute your program with the .dll, but no one can link against the libraries unless they have a .lib.)

The main thing to retain is that if you want implicit loading, you have to provide the linker with the appropriate information, either with a .lib or a .so file, so that it can insert that information into the executable. And that if you want explicit loading, you can't refer to any of the symbols in the library directly; you have to call GetProcAddress/dlsym to get their addresses yourself (and do some funny casting to use them).

Question 2

The .lib file on Windows is not required for loading a dynamic library, it merely offers a convenient way of doing so.

In principle, you can use LoadLibrary for loading the dll and then use GetProcAddress for accessing functions provided by that dll. The compilation of the enclosing program does not need to access the dll in that case, it is only needed at runtime (ie. when LoadLibrary actually executes). MSDN has a code example.

The disadvantage here is that you need to manually write code for loading the functions from the dll. In case you compiled the dll yourself in the first place, this code simply duplicates knowledge that the compiler could have extracted from the dll source code automatically (like the names and signatures of exported functions).

This is what the .lib file does: It contains the GetProcAddress calls for the Dlls exported functions, generated by the compiler so you don't have to worry about it. In Windows terms, this is called Load-Time Dynamic Linking, since the Dll is loaded automatically by the code from the .lib file when your enclosing program is loaded (as opposed to the manual approach, referred to as run-time dynamic linking).

Question 3

How does dynamic linking work generally?

The dynamic link library (aka shared object) file contains machine code instructions and data, along with a table of metadata saying which offsets in that code/data relate to which "symbols", the type of the symbol (e.g. function vs data), the number of bytes or words in the data, and a few other things. Different OS will tend to have different shared object file formats, and indeed the same OS may support several, but that's the gist of it.

So, imagine the shared library's a big chunk of bytes with an index like this:

SYMBOL       ADDRESS        TYPE        SIZE
my_function  1000           function    2893
my_number    4800           variable    4

In general, the exact type of the symbols need not be captured in the metadata table - it's expected that declarations in the library's header files contain all the missing information. C++ is a bit special - compared to say C - because overloading can mean there are several functions with the same name, and namespaces allow for further symbols that would otherwise be ambiguously named - for that reason name mangling is typically used to concatenate some representation of the namespace and function arguments to the function name, forming something that can be unique in the library object file.

A program wanting to use the shared object can generally do one of two things:

have the OS load both itself and the shared object around the same time (before executing main()), with the OS Loader responsible for finding the symbols and examining metadata in the program file image about the use of those symbols, then patching in symbol addresses in the memory the program uses, such that the program can then just run and work functionally as if it'd known about the symbol addresses when it was first compiled (but perhaps a little slower)
or, explicitly in its own source code call dlopen sometime after main runs, then use dlsym or similar to get the symbol addresses, save them into (function/data) pointers based on the programmer's knowledge of the expected data types, then call them explicitly using the pointers.

On Windows (LoadLibrary), you need a .dll to call at runtime, but at link time, you need to provide a corresponding .lib file or the program won't link...

That doesn't sound right. Should be one or the other I'd think.

Wtf does the .lib file contain? A description of the .dll methods? Isn't that what the headers contain?

A lib file is - at this level of description - pretty much the same as a shared object file... the main difference is that the compiler's finding the symbol addresses before the program's shipped and run.

Question 4

Modern *nix systems derive process of dynamic linking from Solaris OS. Linux, particularly, doesn't need separate .lib file because all external dependencies are contained in ELF format. .interp section of ELF file indicates that there are external symbols inside this executable that needed to be resolved dynamically. This comes for dynamic linking.

There is a way to handle dynamic linking in user space. This method is called dynamic loading. This is when you are using system calls to get function pointers to methods from external *.so.

More information can be found from this article http://www.ibm.com/developerworks/library/l-dynamic-libraries/.

Question 5

Relatedly, on OS X (and I assume *nix... dlopen), you don't need a lib file... How how does the compiler know that the methods described in the header will be available at runtime?

Compilers or linkers do not need such information. You, the programmer, need to handle the situation that the shared libraries you try to open by dlopen() may not exist.

Question 6

You can use a DLL file in Windows in two ways: Either you link with it, and you're done, nothing more to do. Or you load it dynamically during run-time.

If you link with it, then the DLL library file is used. The link-library contains information that the linker uses to actually know which DLL to load and where in the DLL functions are, so it can call them. When your program is loaded, the operating system also loads the DLL for you, basically what is does it call LoadLibrary for you.

In other operating systems (like OS X and Linux) it works in a similar way. The difference is that on these systems the linker can look directly at the dynamic library (the .so/.dynlib file) and figure out what's needed without a separate static library like on Windows.

To load a library dynamically, you don't need to link with anything related to the library you want to load.

Question 7

Like others already said: what is included in a .lib file on Windows is included directly in the .so/.dynlib on Linux/OS X. But the main question is... why? Isn't *nix solution better? I think it is, but the .lib has one advantage. The developer linking to the DLL doesn't actually need to have access to the DLL file itself.

Does a scenario like that happen often in the real world? Is it worth the effort of maintaining two files per DLL file? I don't know.

Edit: Ok, guys let's make things even more confusing! You can link directly to a DLL on Windows, using MinGW. So the whole import library problem is not directly related to Windows itself. Taken from sampleDLL article from MinGW wiki:

The import library created by the "--out-implib" linker option is required iff (==if and only if) the DLL shall be interfaced from some C/C++ compiler other than the MinGW toolchain. The MinGW toolchain is perfectly happy to directly link against the created DLL. More details can be found in the ld.exe info files that are part of the binutils package (which is a part of the toolchain).

Question 8

Linux also requires to link, but instead against a .Lib library it needs to link to the dynamic linker /lib/ld-linux.so.2, but this usually happens behind the scenes when using GCC (however if using an assembler you do need to specify it manually).

Both approaches, either the Windows .LIB approach or the Linux dynamic linker linking approach, are considered in reality as static linking. There is, however, a difference that in Windows part of the work is done at link time although it still has work at load time (I am not sure, but I think that the .LIB file is merely for the linker to know the physical library name, the symbols however are only resolved at load time), while in Linux everything besides linking to the dynamic linker happen at load time.

Dynamic linking is in general referring to open manually the DLL file at runtime (such as using LoadLinrary()), in which case the burden is entirely on the programmer.

Question 9

In shared library, such as .dll .dylib and .so, there is some information about symbol's name and address, like this:

------------------------------------
| symbol's name | symbol's address |
|----------------------------------|
| Foo           | 0x12341234       |
| Bar           | 0xabcdabcd       |
------------------------------------

And the load function, such as LoadLibrary and dlopen, loads shared library and make it available to use.

GetProcAddress and dlsym find you symbol's address. For example:

HMODULE shared_lib = LoadLibrary("asdf.dll");
void *symbol = GetProcAddress("Foo");
// symbol is 0x12341234

In windows, there is .lib file to use .dll. When you link to this .lib file, you don't need to call LoadLibrary and GetProcAddress, and just use shared library's function as if they're "normal" functions. How can it work?

In fact, the .lib contains an import information. It's like that:

void *Foo; // please put the address of Foo there
void *Bar; // please put the address of Bar there

When the operating system loads your program (strictly speaking, your module), operating system performs LoadLibrary and GetProcAddress automatically.

And if you write code such as Foo();, compiler convert it into (*Foo)(); automatically. So you can use them as if they're "normal" functions.