Why binary compatibility?

Question 1

It avoids the Fragile Binary Interface Problem. It goes like this:

Program uses library.
User upgrades library. Upgrade changes something in the library's binary interface.
Program now doesn't work until it is recompiled because it was built to the old binary interface.

One of the advantages of the PIMPL idiom is that it allows you to move things that would normally be part of the public interface of a class into its private interface (in fact, into the interface of a private class). You can change the private interface without breaking binary compatibility.

Question 2

v1.0

Let's consider the following class in v1.0 of the libMagic library:

//MagicNumber.h
struct MagicNumber {
  MagicNumber();
  int get();
  int id; // The only data member, so the object is typically 4 bytes
};
//MagicNumber.cpp
int MagicNumber::get() {
  return 42;
}

Application code:

#include <iostream>
#include "MagicNumber.h"

void foo() {
  MagicNumber m;
  int i = 27;
  std::cout << m.get() + i << '\n';
}

When the above application code is compiled and dynamically linked against libMagic.so, the foo function is compiled roughly as follows:

foo:
   Allocate 4 bytes space in stack for m
   Allocate 4 bytes space in stack for i and write 27 in it
   Call MagicNumber::get //This address is resolved on application startup.
   ... //Rest of processing

v1.0.1

Now suppose libMagic releases a new version, v1.0.1, that changes the implementation below but leaves the header files untouched:

//MagicNumber.cpp
int MagicNumber::get() {
  return call_real_magic_number_fn();
}

The application does not have to be recompiled, so it does not need to be updated. It will automatically call the updated version with the new implementation.

v1.1.0 - Binary incompatible

Let's say there is another update to the library (v1.1.0) with the changes below.

//MagicNumber.h
struct MagicNumber {
  MagicNumber();
  int get();
  int id;
  int cache; //Note: New member
};
//MagicNumber.cpp
int MagicNumber::get() {
  if(cache != 0) return cache;
  cache = call_real_magic_number_fn();
  return cache;
}

The already-compiled foo function still reserves space for the old layout only, so there is no room for the new member. The library has broken binary compatibility.

foo:
   Allocate 4 bytes space in stack for m //4 bytes is not enough for m
   Allocate 4 bytes space in stack for i and write 27 in it.
   Call MagicNumber::get //This address is resolved on application startup.
   ... //Rest of processing

The result is undefined behavior. A likely outcome is that the stack slot holding i sits exactly where the new library code expects cache, so MagicNumber::get reads 27 from it, sees a nonzero cache, and returns 27. But anything could happen.
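The size mismatch can be made visible by placing both layouts side by side. This is only a sketch: the namespaces v1_0 and v1_1 are hypothetical and exist just so both versions fit in one file, and member functions are left out because they do not affect the object layout.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical namespaces: they only let both layouts coexist in one file.
namespace v1_0 {
struct MagicNumber {
  int id;
};
}  // namespace v1_0

namespace v1_1 {
struct MagicNumber {
  int id;
  int cache;  // New member added in v1.1.0
};
}  // namespace v1_1

// Bytes an application compiled against v1.0 reserves for `m`.
constexpr std::size_t reserved_by_app = sizeof(v1_0::MagicNumber);
// Bytes the v1.1.0 library code assumes the object occupies.
constexpr std::size_t expected_by_lib = sizeof(v1_1::MagicNumber);

static_assert(expected_by_lib > reserved_by_app,
              "v1.1.0 needs more space than a v1.0 caller reserves");

// Byte offset at which v1.1.0 library code reads and writes `cache`.
inline std::size_t cache_offset() {
  const std::size_t off = offsetof(v1_1::MagicNumber, cache);
  assert(off >= reserved_by_app);  // Lies outside what a v1.0 caller reserved
  return off;
}
```

Because cache starts at or beyond the end of the old object, every access to it from the new library touches memory that the old application never set aside for m.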

If libMagic had used the PIMPL idiom, all member variables would belong to a MagicNumberImpl class whose size is not exposed to application code. Library authors could then add new members in later versions of the library without breaking binary compatibility.

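struct MagicNumberImpl; // Defined only inside the library
struct MagicNumber {
  MagicNumber();
  ~MagicNumber(); // Frees impl; defined in MagicNumber.cpp
  int get();
  private:
    MagicNumberImpl* impl;
};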

The above class definition does not change in new versions, and the size of a pointer does not change when new members are added to the class it points to.
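For a fuller picture, here is a single-file sketch of how the header and the library source might fit together. It is an illustration under assumptions, not libMagic's actual code: it swaps the raw pointer for std::unique_ptr, the body of call_real_magic_number_fn is a stand-in, and foo returns its result instead of printing it.

```cpp
#include <cassert>
#include <memory>

// ---- MagicNumber.h (all the application ever sees) ----
struct MagicNumberImpl;  // Incomplete here; defined only inside the library

class MagicNumber {
 public:
  MagicNumber();
  ~MagicNumber();  // Defined below, where MagicNumberImpl is complete
  int get();

 private:
  std::unique_ptr<MagicNumberImpl> impl;  // Assumption: smart pointer, not raw
};

// ---- MagicNumber.cpp (compiled into the library) ----
struct MagicNumberImpl {
  int id = 0;
  int cache = 0;  // Added in a later version; callers never see this layout
};

// Stand-in for the function named in the text; this body is an assumption.
static int call_real_magic_number_fn() { return 42; }

MagicNumber::MagicNumber() : impl(new MagicNumberImpl()) {}
MagicNumber::~MagicNumber() = default;

int MagicNumber::get() {
  assert(impl && "impl is always set by the constructor");
  if (impl->cache != 0) return impl->cache;
  impl->cache = call_real_magic_number_fn();
  return impl->cache;
}

// ---- Application code ----
int foo() {
  MagicNumber m;  // Only a pointer-sized handle lives on the caller's stack
  int i = 27;
  return m.get() + i;
}
```

The constructor and destructor are defined out of line on purpose: the application must never need the complete MagicNumberImpl type, otherwise its layout would leak back into the compiled client.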

Note: Binary compatibility is a concern only when both of the following hold:

The library is linked dynamically (e.g. a .so file on Linux).
The library is updated to a new version without recompiling the application code. If the library and the binary are in the same project, your build system will recompile and update both automatically, so there is no need to worry about this or about PIMPL.

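Note 2: There is another way to handle the same problem without using PIMPL: ABI versioning with namespaces (typically inline namespaces), so that a binary-incompatible version exports differently named symbols.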
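A rough sketch of what namespace-based ABI versioning can look like, assuming hypothetical names magic, abi_v1 and abi_v2 and header-only bodies for brevity. Source code keeps writing magic::MagicNumber, while the symbols a compiler emits carry the versioned namespace in their names.

```cpp
#include <cassert>
#include <type_traits>

namespace magic {

// Old ABI, kept for binaries that were built against v1.0.
namespace abi_v1 {
struct MagicNumber {
  int id = 0;
  int get() { return 42; }
};
}  // namespace abi_v1

// Current ABI. `inline` makes magic::MagicNumber name abi_v2::MagicNumber.
inline namespace abi_v2 {
struct MagicNumber {
  int id = 0;
  int cache = 0;  // Layout change that forced the new ABI version
  int get() {
    if (cache == 0) cache = 42;  // 42 stands in for the real computation
    return cache;
  }
};
}  // namespace abi_v2

}  // namespace magic

static_assert(
    std::is_same<magic::MagicNumber, magic::abi_v2::MagicNumber>::value,
    "unqualified use resolves to the current ABI version");

// Code that names magic::MagicNumber gets the current layout.
inline int use_current() {
  magic::MagicNumber m;
  assert(sizeof(m) > sizeof(magic::abi_v1::MagicNumber));
  return m.get();
}
```

Unlike PIMPL, this does not keep old binaries working against the new layout. It makes the mismatch show up as a missing or different symbol instead of silent memory corruption, and it lets a library ship both versions side by side if it chooses to.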

Question 3

The advantage of the PIMPL idiom is not so much binary compatibility but rather the reduced need for recompilation if you change the implementation or even the layout of a class. For example, if you add a new data member to a class, that changes the layout of the class and you would normally need to recompile all clients of the class, but not if you use the PIMPL idiom.

Binary compatibility is more about being compatible with multiple compilers (and compiler versions), and the usual way to do that in C++ is with interfaces (abstract classes) that are implemented by classes not exposed to clients. This works because, on a given platform, compilers in practice lay out the vtable of a simple abstract class the same way. Many APIs, such as the DirectX APIs, are exposed this way so that they can be used with any compiler.
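The interface-based approach could look roughly like this. All names here (IMagicNumber, create_magic_number, release) are hypothetical and only loosely modeled on the COM style; a real API such as DirectX has its own conventions.

```cpp
#include <cassert>

// ---- Public header: an abstract class plus a C-linkage factory ----
struct IMagicNumber {
  virtual int get() = 0;
  virtual void release() = 0;  // The library that created the object frees it

 protected:
  ~IMagicNumber() = default;  // Clients cannot `delete` through the interface
};

extern "C" IMagicNumber* create_magic_number();

// ---- Library side: the concrete class is never exposed to clients ----
namespace {
class MagicNumberImpl final : public IMagicNumber {
 public:
  int get() override {
    if (cache_ == 0) cache_ = 42;  // 42 stands in for the real computation
    return cache_;
  }
  void release() override { delete this; }

 private:
  int cache_ = 0;  // Can change freely; clients only hold an IMagicNumber*
};
}  // namespace

extern "C" IMagicNumber* create_magic_number() { return new MagicNumberImpl(); }

// ---- Client side ----
inline int client_demo() {
  IMagicNumber* m = create_magic_number();
  assert(m != nullptr);
  const int value = m->get();
  m->release();  // Never `delete m`: allocation stays on the library side
  return value;
}
```

Clients depend only on the order of the virtual functions and on the factory's C symbol name, so adding data members to the hidden class does not affect them. Inserting or reordering virtual functions in the interface would still break existing binaries.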