Question

I am learning Automated Theorem Proving / SMT solvers / Proof Assistants on my own and am posting a series of questions about the process, starting here.

Why is it that automated theorem provers, e.g. ACL2, and SMT solvers do not need human assistance, while proof assistants, e.g. Isabelle and Coq, do?

Find the series' next question here.


Solution

Validity of higher-order formulae is in general undecidable, and the search spaces are huge, so all you can hope to do is try to find a proof -- assuming one exists -- by cleverly enumerating the proof space (think Sledgehammer, aptly named), but that is rough going. Humans can play the oracle, providing the key lemmata that guide the proof.

Automated provers, on the other hand, typically deal only with decidable (subsets of) logics, for instance propositional logic or certain fragments of first-order logic. They may run for a long time, but you know they will terminate with an answer eventually.
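For propositional logic, decidability is concrete: exhaustively checking the truth table is already a (very slow) decision procedure. A minimal Python sketch of that idea -- the `implies` helper and the formula encoding are my own illustration, not taken from any actual prover:

```python
from itertools import product

def implies(a, b):
    """Material implication: a -> b."""
    return (not a) or b

def is_valid(formula, variables):
    """Decide validity of a propositional formula by enumerating
    every assignment to its variables -- guaranteed to terminate."""
    for values in product([False, True], repeat=len(variables)):
        env = dict(zip(variables, values))
        if not formula(env):
            return False  # found a falsifying assignment
    return True  # the formula holds under every assignment

# (p -> q) <-> (~q -> ~p): contraposition, a classical validity
contraposition = lambda e: implies(e["p"], e["q"]) == implies(not e["q"], not e["p"])
```

Here `is_valid(contraposition, ["p", "q"])` returns `True`, while a satisfiable-but-not-valid formula like `p or q` is rejected. Real SAT/SMT solvers replace this exponential enumeration with much cleverer search, but the termination guarantee is the same.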

Note that there are approaches to enable proof assistants to find those important lemmata themselves, e.g. IsaPlanner. The tool guesses (inductive) lemmata by enumeration and random testing and then tries to prove them. By iterating this process, many of the lemmata about, e.g., typical data type definitions can be found automatically.
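The guess-and-test part of that loop is easy to sketch. Below is a toy Python version: the candidate "lemmata" are hand-written here (a real tool would enumerate them from the function definitions), and `conjecture_lemmas` is a hypothetical helper name, not IsaPlanner's API. Random testing discards false candidates; the survivors are conjectures worth handing to a prover.

```python
import random

random.seed(0)  # fixed seed so this sketch is deterministic

def random_list():
    """A random small list of small integers, used as test data."""
    return [random.randint(0, 9) for _ in range(random.randint(0, 5))]

def conjecture_lemmas(candidates, num_tests=100):
    """Keep only the candidate lemmas that survive random testing."""
    survivors = []
    for name, prop in candidates:
        if all(prop(random_list()) for _ in range(num_tests)):
            survivors.append(name)
    return survivors

# Hand-written candidates about lists (an enumerator would generate these)
candidates = [
    ("rev(rev(xs)) = xs",       lambda xs: list(reversed(list(reversed(xs)))) == xs),
    ("rev(xs) = xs",            lambda xs: list(reversed(xs)) == xs),  # false in general
    ("len(xs ++ xs) = 2*len(xs)", lambda xs: len(xs + xs) == 2 * len(xs)),
]
```

Running `conjecture_lemmas(candidates)` keeps the two true lemmas and almost surely drops `rev(xs) = xs`, since a single non-palindromic test list refutes it. Testing can only ever falsify, of course; the survivors still need a proof.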


Small ABC

  • validity -- a formula is valid if it holds whatever values you assign to its free variables.
  • free variables -- those variables that are not bound by quantifiers such as $\forall$ and $\exists$.
  • decidability -- a (boolean) property is (Turing) decidable if there is an algorithm that answers "yes" or "no" (correctly) after a finite amount of time.
  • propositional logic -- also zero order logic; no quantification allowed.
  • first order logic -- quantification is only allowed over individuals, i.e. you can do $\forall x. P(x)$ but not $\forall f. f(4) > 0$.
  • higher order logic -- you may quantify over (and "build") arbitrarily complex objects, e.g. higher order functions (functions that take functions).
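Over a small finite domain, the difference between the quantification orders can even be made executable: a first-order quantifier ranges over the 4 individuals, while a higher-order one ranges over all $4^4 = 256$ functions on them. This is purely a toy illustration of the definitions above (not how any HOL prover works), with functions represented by their value tables:

```python
from itertools import product

domain = range(4)  # a small finite domain, so both quantifiers are enumerable

# First-order: quantify over individuals, e.g.  forall x. x*x >= x
first_order = all(x * x >= x for x in domain)

# Higher-order: quantify over functions domain -> domain, e.g.
# forall f. exists x. f(x) <= x  (true by pigeonhole: f(3) <= 3 always)
# Each f is encoded as a tuple of its values (f[0], f[1], f[2], f[3]).
higher_order = all(
    any(f[x] <= x for x in domain)
    for f in product(domain, repeat=len(domain))
)
```

Both statements come out true on this domain; the point is only that the higher-order quantifier needs a search space of function objects, which explodes far faster -- and over infinite domains stops being enumerable at all.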

OTHER TIPS

I would say that the classic distinction of "automated theorem proving" (ATP) vs. "interactive theorem proving" (ITP) needs to be reconsidered. If you take a well-known ITP system like Isabelle/HOL today (Isabelle2013 from February 2013), it integrates quite a lot of add-on tools from the ATP portfolio:

  • On-board generic automated proof tools: old-school Isabelle tools like fast and blast (by L. Paulson) and newer automated provers like metis (by J. Hurd).

  • External ATPs for first-order logic that are invoked via Sledgehammer: E prover, SPASS, Vampire. The proof that is found is analyzed to figure out which lemmas contributed to it, reducing the fact set from tens of thousands to a few dozen, and the result is fed to metis.

  • External SMT solvers with partial proof reconstruction, notably Z3 (integration by S. Boehme).

  • Tools to find counterexamples to unproven statements: Nitpick/Kodkodi (J. Blanchette) and Quickcheck (L. Bulwahn).

Does all that automated stuff make Isabelle an automated theorem prover?

Ultimately, I think the distinction of "ATP" vs. "ITP" is just a kind of "label" that tells how you want to position or "sell" your system: ATPs claim to be "push-button tools", but in practice you will have to interact (indirectly) by providing parameters or hints, or by reformulating your problem. That can be quite challenging, due to the long runtimes that are commonplace in the ATP community.

In contrast, an ITP system is made for users working interactively on the spot, with half-decent access to internal proof states, so they can see what is missing to finish a proof. An ITP system that wraps up ATP tools in the manner of Isabelle might turn out to be more appealing to more users and applications than either ITP or ATP alone.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange