How CompCert “proves” different things in its codebase

https://cs.stackexchange.com/questions/93637

05-11-2019
|

Question

In order to understand examples of formal proofs, I am interested in how CompCert applies "proof" techniques. Specifically, I am wondering what a particular example is of something CompCert "proves" in their GitHub repo, so I can get a grasp of how proofs are applied in practice (to something that is somewhat familiar, compiler technology). I have had some minimal experience with Coq so I should be able to follow along. I also have some experience with x86, so perhaps something in the x86 folder would be a good example. Or perhaps something directly to do with the C language (such as Clight), or with memory.

In x86/Asm.v, they have several Inductive definitions, which I think is a proof in itself. For example:

Inductive ireg: Type :=
  | RAX | RBX | RCX | RDX | RSI | RDI | RBP | RSP
  | R8  | R9  | R10 | R11 | R12 | R13 | R14 | R15.

Inductive instruction: Type :=
  ...

Then they have some Definition definitions:

Definition code := list instruction.

Definition is_label (lbl: label) (instr: instruction) : bool :=
  match instr with
  | Plabel lbl' => if peq lbl lbl' then true else false
  | _ => false
  end.

Wondering if this is also a "proof" of some sort since (I think) it is some sort of type.

They also have Lemma definitions, where the explicit proofs are:

Lemma is_label_correct:
  forall lbl instr,
  if is_label lbl instr then instr = Plabel lbl else instr  Plabel lbl.
Proof.
  intros.  destruct instr; simpl; try discriminate.
  case (peq lbl l); intro; congruence.
Qed.

I understand about Coq tactics, but haven't actually written any proofs myself so don't have applied knowledge.

So here we have Inductive is used in a Definition which is used in a Lemma. And then it looks like that lemma is used by other lemmas:

Lemma transl_code_label:
  forall lbl f c ep tc,
  transl_code f c ep = OK tc ->
  match Mach.find_label lbl c with
  | None => find_label lbl tc = None
  | Some c' => exists tc', find_label lbl tc = Some tc' /\ transl_code f c' false = OK tc'
  end.
Proof.
  induction c; simpl; intros.
  inv H. auto.
  monadInv H. rewrite (transl_instr_label' lbl _ _ _ _ _ EQ0).
  generalize (Mach.is_label_correct lbl a).
  destruct (Mach.is_label lbl a); intros.
  subst a. simpl in EQ. exists x; auto.
  eapply IHc; eauto.
Qed.

In this question I am wondering what a proof consists of in CompCert, given some example like the above. I would like to be able to know what I need to write a proof for my own software, by first knowing what CompCert is actually "proving", and how they are "proving it". It seems that there are types (Inductive and Definition), and then proofs (Lemma and Proof). Not sure if the Inductive/Definition things are proofs in and of themselves. Proofs traditionally are essentially just derivation trees of applying rules, but I'm not exactly sure if that is what is happening here. It seems the Coq tactics are just creating a derivation tree, correct me if I'm wrong. Then the lemma/proof is just a derivation tree (a bunch of rules applied) down to the "axioms" (definitions and inductive types). The fact that there could be an electronics error isn't accounted for (I don't think), so the proof is really of a "pure" system in some sense. So practical proofs are applied to pure systems that make some assumptions about the environment.

In the end, I'm wondering, the kind of "proof system" CompCert is. What is making it a good system of formal proofs. Knowing that, I would be able to apply "proofs" to stuff outside of CompCert but in other programming contexts. Also wondering if there is some sort of "specification" that CompCert is formally verified against, or if it is just the "model" (as in the model in model checking vs. specification).

No correct solution

Licensed under: CC-BY-SA with attribution

Not affiliated with cs.stackexchange