How does a compiler compile a compiler?

Question 1

Typically, compiler authors go one of two routes:

1. Write the entire compiler in some other existing language. This is generally the simplest option.
2. Write just enough code in some other language to have a minimally usable translator, and use that "scaffolding" as a base to write the compiler proper in the language it's intended to compile. This is more complicated, and usually takes longer, but inherently offers the chance to flush out language bugs and weaknesses by testing the language in a real project.

The first program to translate code was written at least partly in machine code -- the actual numbers that tell the CPU what to do. It's the lowest level because there is not really a "compiler" for machine code (*); it's just numbers arranged a certain way, and the CPU has circuitry within it to process them without outside help.

(*) There are programs to help design the hardware that interprets and executes the instructions, but that arguably sits outside the definition of a compiler. Such programs generate hardware descriptions -- circuit diagrams and the like -- as opposed to the directly executable files a compiler outputs.

Question 2

You can always use your favourite compiler A to write another compiler, say B. If you add some extra functionality to B, it can easily become your new favourite, and you will use it to write compiler C, and so on.

How do you start, then? In the old days people simply filled memory with raw numbers for the CPU to interpret directly. This is why source is often referred to as code. Once a minimal compiler has been programmed this way, it can be run to build another one written in the language it compiles. That one in turn can be used to build a higher-level one, and so forth.

In fact, entering raw instruction codes into memory can itself be treated as a zero-level compilation process, in which the human is the compiler.
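To make the "human as compiler" idea concrete, here is a minimal sketch in Python of a toy machine. The four opcode values are borrowed from the Z80, but the run function, the single B register and the memory layout are assumptions made up for illustration, not a real emulator.

```python
# Toy fetch-decode-execute loop. Only four one-byte opcodes are handled;
# the values match the Z80 encodings for these instructions, but everything
# else (one register, no flags) is a simplification for illustration.
NOP, INC_B, DEC_B, HALT = 0x00, 0x04, 0x05, 0x76

def run(memory, b=0):
    """Execute bytes from address 0 until HALT; return the final B register."""
    pc = 0  # program counter
    while True:
        opcode = memory[pc]
        pc += 1
        if opcode == NOP:
            pass
        elif opcode == INC_B:
            b = (b + 1) & 0xFF  # 8-bit register wraps around
        elif opcode == DEC_B:
            b = (b - 1) & 0xFF
        elif opcode == HALT:
            return b
        else:
            raise ValueError(f"unknown opcode {opcode:#04x} at address {pc - 1}")

# "Hand compilation": a human looked up each number and typed it in.
program = bytes([0x04, 0x04, 0x04, 0x05, 0x76])  # INC B x3, DEC B, HALT
final_b = run(program)
print(final_b)  # 2
```

Nothing translated the program here: the bytes were chosen by hand, which is the zero-level step described above.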

It is quite common for a compiler for a given language to be written in that same language. This is the case with the C programming language, for example. This is more than coincidental: anyone who knows a language well enough to dare write a compiler for it probably counts it among their favourite languages to program in. It is only the typical case, though, not a requirement; there are many languages to choose from, including some especially well suited to compiler construction.

Question 3

Numerical machine code is binary: 1s and 0s. Compiling it would imply reducing it to some still-lower form, and there isn't one, so it's not really compiled.

For example, from the wiki article you quoted: on the Zilog Z80 processor, the machine code 00000101, which causes the CPU to decrement the B processor register, would be represented in assembly language as DEC B.

So the compiler comes in when you write Z80 assembly language: the instruction DEC B gets compiled (strictly speaking, assembled) into '00000101' -- not vice versa.
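The direction of that translation can be sketched with a very small assembler in Python. The OPCODES table and the assemble function are hypothetical names for this example; the table covers only four one-byte Z80 instructions, whereas a real assembler also has to deal with operands, labels and addressing modes.

```python
# Hypothetical four-entry opcode table; a real Z80 assembler handles
# hundreds of instructions plus operands, labels and addressing modes.
OPCODES = {
    "NOP": 0x00,
    "INC B": 0x04,
    "DEC B": 0x05,
    "HALT": 0x76,
}

def assemble(source):
    """Translate one mnemonic per line into a bytes object of machine code."""
    output = bytearray()
    for line in source.splitlines():
        mnemonic = " ".join(line.split()).upper()  # normalise spacing and case
        if not mnemonic:
            continue  # skip blank lines
        if mnemonic not in OPCODES:
            raise ValueError(f"unknown instruction: {line!r}")
        output.append(OPCODES[mnemonic])
    return bytes(output)

machine_code = assemble("DEC B")
bits = format(machine_code[0], "08b")
print(bits)  # 00000101
```

The text DEC B goes in and the number 00000101 comes out. Nothing sits below that number to translate it into.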

Question 4

Numerical machine code represents a series of off and on states in circuits, which is what all electronic data is at the lowest level. There is no "compiler" per se for this low-level language. Instead, the circuits in a computer are combined and structured so that they "interpret" the code by reading its ons and offs, which are realized as high or low electrical states. These high or low states cause different gates and circuits to open or close, or more generally to behave differently. Read up on electronic logic gates for more.
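As a rough illustration of circuitry "reading" a pattern of highs and lows, here is a sketch in Python that models gates as functions and wires them up to recognise the single pattern 00000101. The function names are invented for this example, and a real instruction decoder is far more elaborate and is not laid out this way.

```python
# Each "gate" is a function on bits (0 = low voltage, 1 = high voltage).
def NOT(a):
    return 1 - a

def AND(*inputs):
    return int(all(inputs))

def is_dec_b(bits):
    """Return 1 only for the pattern 00000101 (most significant bit first)."""
    b7, b6, b5, b4, b3, b2, b1, b0 = bits
    # Invert every line that must be low, then AND all eight lines together.
    return AND(NOT(b7), NOT(b6), NOT(b5), NOT(b4), NOT(b3), b2, NOT(b1), b0)

def to_bits(value):
    """Split an 8-bit number into a list of bits, most significant first."""
    return [(value >> shift) & 1 for shift in range(7, -1, -1)]

dec_b_line = is_dec_b(to_bits(0b00000101))  # the matching pattern
inc_b_line = is_dec_b(to_bits(0b00000100))  # differs in the lowest bit
print(dec_b_line, inc_b_line)  # 1 0
```

The output line goes high for exactly one of the 256 possible input patterns. No program performs that check: it follows from how the gates are connected.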