How to automatically proof that two first-order formulas are equivalent?

https://stackoverflow.com/questions/11517805

21-06-2021
|

Question

What is the best way to automatically proof that two first-order formulas F and G are equivalent?

The formulas have some restrictions compared to "full" first-order formulas:

quantifier-free
function-free
implicitly universally quantified

I can transform those formulas in clause normal form and I have routines for unification for literals. However I'm not sure how to continue and if this problem is decidable.

Solution

As mentioned, to prove that F <=> G where both are closed (universally quantified) formulas, you need to prove F => G and also G => F. To prove each of these two formulas, you can use various calculi. I'll describe [resolution calculus]:

Negate the conjecture, so F => G becomes F & -G.
Convert to CNF.
Run resolution procedure.
If you derive an empty clause, you've proved the original conjecture F => G. If the search saturates and no more new clauses can be derived, the conjecture doesn't hold.

Under your conditions, all atomic formulas coming from F will be predicate symbols applied only to variables and all atomic formulas from G will be predicate symbols applied only toto skolem constants. So the resolution procedure will only produce substitutions that either map variables to other variables, or variables to those skolem constants. This implies that it can only derive a finite amount of distinct literals, and so the resolution procedure will always stop - it will be decidable.

You can also use automated tool for this purpose that will do all that work for you. I use The E Theorem Prover for such problems. As the input language I use the language of The TPTP Problem Library, which is easy to read/write for humans.

To give an example: Input file:

fof(my_formula_name, conjecture, (![X]: p(X)) <=> (![Y]: p(Y)) ).

then I run

eprover --tstp-format -xAuto -tAuto myfile

(-tAuto and -xAuto do some auto-configurations, most likely not needed in your case), and the result is

# Garbage collection reclaimed 59 unused term cells.

# Auto-Ordering is analysing problem.
# Problem is type GHNFGFFSF00SS
# Auto-mode selected ordering type KBO6
# Auto-mode selected ordering precedence scheme <invfreq>
# Auto-mode selected weight ordering scheme <precrank20>
#
# Auto-Heuristic is analysing problem.
# Problem is type GHNFGFFSF00SS
# Auto-Mode selected heuristic G_E___107_C41_F1_PI_AE_Q4_CS_SP_PS_S0Y
# and selection function SelectMaxLComplexAvoidPosPred.
#
# No equality, disabling AC handling.
#
# Initializing proof state
#
#cnf(i_0_2,negated_conjecture,(~p(esk1_0)|~p(esk2_0))).
#
#cnf(i_0_1,negated_conjecture,(p(X1)|p(X2))).
# Presaturation interreduction done
#
#cnf(i_0_2,negated_conjecture,(~p(esk1_0)|~p(esk2_0))).
#
#cnf(i_0_1,negated_conjecture,(p(X2)|p(X1))).
#
#cnf(i_0_3,negated_conjecture,(p(X3))).

# Proof found!
# SZS status Theorem
# Parsed axioms                        : 1
# Removed by relevancy pruning         : 0
# Initial clauses                      : 2
# Removed in clause preprocessing      : 0
# Initial clauses in saturation        : 2
# Processed clauses                    : 5
# ...of these trivial                  : 0
# ...subsumed                          : 0
# ...remaining for further processing  : 5
# Other redundant clauses eliminated   : 0
# Clauses deleted for lack of memory   : 0
# Backward-subsumed                    : 1
# Backward-rewritten                   : 1
# Generated clauses                    : 4
# ...of the previous two non-trivial   : 4
# Contextual simplify-reflections      : 0
# Paramodulations                      : 2
# Factorizations                       : 2
# Equation resolutions                 : 0
# Current number of processed clauses  : 1
#    Positive orientable unit clauses  : 1
#    Positive unorientable unit clauses: 0
#    Negative unit clauses             : 0
#    Non-unit-clauses                  : 0
# Current number of unprocessed clauses: 0
# ...number of literals in the above   : 0
# Clause-clause subsumption calls (NU) : 0
# Rec. Clause-clause subsumption calls : 0
# Unit Clause-clause subsumption calls : 1
# Rewrite failures with RHS unbound    : 0
# Indexed BW rewrite attempts          : 4
# Indexed BW rewrite successes         : 4
# Unification attempts                 : 12
# Unification successes                : 9
# Backwards rewriting index :     2 leaves,   1.00+/-0.000 terms/leaf
# Paramod-from index        :     1 leaves,   1.00+/-0.000 terms/leaf
# Paramod-into index        :     1 leaves,   1.00+/-0.000 terms/leaf

where the most important lines are

# Proof found!
# SZS status Theorem

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow