Question

I was taught about formal systems at university, but I was disappointed how they didn't seem to be used in the real word.

I like the idea of being able to know that some code (object, function, whatever) works, not by testing, but by proof.

I'm sure we're all familiar with the parallels that don't exist between physical engineering and software engineering (steel behaves predictably, software can do anything - who knows!), and I would love to know if there are any languages that can be use in the real word (is asking for a web framework too much to ask?)

I've heard interesting things about the testability of functional languages like scala.

As software engineers what options do we have?

Was it helpful?

Solution

Yes, there are languages designed for writing provably correct software. Some are even used in industry. Spark Ada is probably the most prominent example. I've talked to a few people at Praxis Critical Systems Limited who used it for code running on Boings (for engine monitoring) and it seems quite nice. (Here is a great summary / description of the language.) This language and accompanying proof system uses the second technique described below. It doesn't even support dynamic memory allocation!


My impression and experience is that there are two techniques for writing correct software:

  • Technique 1: Write the software in a language you're comfortable with (C, C++ or Java for instance). Take a formal specification of such language, and prove your program correct.

    If your ambition is to be 100% correct (which is most often a requirement in automobile / aerospace industry) you'd be spending little time programming, and more time proving.

  • Technique 2: Write the software in a slightly more awkward language (some subset of Ada or tweaked version of OCaml for instance) and write the correctness proof along the way. Here programming and proving goes hand in hand. Programming in Coq even equates them completely! (See the Curry-Howard correspondence)

    In these scenarios you'll always end up with a correct program. (A bug will be guaranteed to be rooted in the specification.) You'll be likely to spend more time on programming but on the other hand you're proving it correct along the way.

Note that both approaches hinges on the fact you have a formal specification at hand (how else would you tell what is correct / incorrect behavior), and a formally defined semantics of the language (how else would you be able to tell what the actual behavior of your program is).

Here are a few more examples of formal approaches. If it's "real-world" or not, depends on who you ask :-)

I know of only one "provably correct" web-application language: UR. A Ur-program that "goes through the compiler" is guaranteed not to:

  • Suffer from any kinds of code-injection attacks
  • Return invalid HTML
  • Contain dead intra-application links
  • Have mismatches between HTML forms and the fields expected by their handlers
  • Include client-side code that makes incorrect assumptions about the "AJAX"-style services that the remote web server provides
  • Attempt invalid SQL queries
  • Use improper marshaling or unmarshaling in communication with SQL databases or between browsers and web servers

OTHER TIPS

In order to answer this question, you really need to define what you mean by "provable". As Ricky pointed out, any language with a type system is a language with a built-in proof system which runs every time you compile your program. These proof systems are almost always sadly impotent — answering questions like String vs Int and avoiding questions like "is the list sorted?" — but they are proof systems none-the-less.

Unfortunately, the more sophisticated your proof goals, the harder it is to work with a system which can check your proofs. This escalates pretty quickly into undecidability when you consider the sheer size of the class of problems which are undecidable on Turing Machines. Sure, you can theoretically do basic things like proving the correctness of your sorting algorithm, but anything more than that is going to be treading on very thin ice.

Even to prove something simple like the correctness of a sorting algorithm requires a comparatively sophisticated proof system. (note: since we have already established that type systems are a proof system built into the language, I'm going to talk about things in terms of type theory, rather than waving my hands still more vigorously) I'm fairly certain that a full correctness proof on list sorting would require some form of dependent types, otherwise you have no way of referencing relative values at the type level. This immediately starts breaking into realms of type theory which have been shown to be undecidable. So, while you may be able to prove correctness on your list sorting algorithm, the only way to do it would be to use a system which will also allow you to express proofs which it cannot verify. Personally, I find this very disconcerting.

There is also the ease-of-use aspect which I alluded to earlier. The more sophisticated your type system, the less pleasant it is to use. That's not a hard and fast rule, but I think it holds true for the most part. And as with any formal system, you will often find that expressing the proof is more work (and more error prone) than creating the implementation in the first place.

With all that out of the way, it's worth noting that Scala's type system (like Haskell's) is Turing Complete, which means that you can theoretically use it to prove any decidable property about your code, provided that you have written your code in such a way that it is amenable to such proofs. Haskell is almost certainly a better language for this than Java (since type-level programming in Haskell is similar to Prolog, while type-level programming in Scala is more similar to SML). I don't advise that you use Scala's or Haskell's type systems in this way (formal proofs of algorithmic correctness), but the option is theoretically available.

Altogether, I think the reason you haven't seen formal systems in the "real world" stems from the fact that formal proof systems have yielded to the merciless tyranny of pragmatism. As I mentioned, there's so much effort involved in crafting your correctness proofs that it's almost never worthwhile. The industry decided a long time ago that it was easier to create ad hoc tests than it was to go through any sort of analytical formal reasoning process.

Typed languages prove the absence of certain categories of fault. The more advanced the type system, the more faults they can prove the absence of.

For proof that a whole program works, yes, you're stepping outside the boundaries of ordinary languages, where mathematics and programming meet each other, shake hands and then proceed to talk using Greek symbols about how programmers can't handle Greek symbols. That's about the Σ of it anyway.

You're asking a question a lot of us have asked over the years. I don't know that I have a good answer, but here are some pieces:

  • There are well-understood languages with the possibility of being proven in use today; Lisp via ACL2 is one, and of course Scheme has a well-understood formal definition as well.

  • a number of systems have tried to use pure functional languages, or nearly pure ones, like Haskell. There's a fair bit of formal methods work in Haskell.

  • Going back 20+ years, there was a short-lived thing for using hand-proof of a formal language which was then rigorously translated into a compiled language. Some examples were IBM's Software Cleanroom, ACL, Gypsy, and Rose out of Computational Logic, John McHugh's and my work on trustworthy compilation of C, and my own work on hand-proof for C systems programming. These all got some attention, but none of them made it much into practice.

An interesting question to ask, I think, is what would sufficient conditions be to get formal approaches into practice? I'd love to hear some suggestions.

You can investigate purely functional languages like Haskell, which are used when one of your requirements is provability.

This is a fun read if you're interested about functional programming languages and pure functional programming languages.

real world uses of such provable languages would be incredibly difficult to write and understand afterwoods. the business world needs working software, fast.

"provable" languages just dont scale for writing/reading a large project's code base and only seem to work in small and isolated use cases.

I'm involved in two provable languages. The first is the language supported by Perfect Developer, a system for specifying sotfware, refining it and generating code. It's been used for some large systems, including PD itself, and it's taught in a number of universities. The second provable language I'm involved with is a provable subset of MISRA-C, see C/C++ Verification Blog for more.

Check out Omega.

This introduction contains a relatively painless implementation of AVL trees with included correctness proof.

Scala is meant to be "real world", so no effort has been made to make it provable. What you can do in Scala is to produce functional code. Functional code is much easier to test, which is IMHO a good compromise between "real world" and provability.

EDIT ===== Removed my incorrect ideas about ML being "provable".

My (controversial) definition of "real-world" in the context of provably-correct code is:

  • It must involve some degree of automation, otherwise you're going to spend far too much time proving and dealing with messy little details, and it's going to be totally impractical (except maybe for spacecraft control software and that sort of thing, for which you might actually want to spend megabucks on meticulous checks).
  • It must support some degree of functional programming, which helps you write code that is very modular, reusable and easier to reason about. Obviously functional programming isn't needed for writing or proving Hello World, or a variety of simple programs, but at a certain point, functional programming does become very useful.
  • While you don't necessarily have to be able to write your code in a mainstream programming language, as an alternative, you should at least be able to machine-translate your code into reasonably efficient code written in a mainstream programming language, in a reliable way.

The above are relatively more universal requirements. Others, such as the ability to model imperative code, or the ability to prove higher-order functions correct, may be important or essential for some tasks, but not all.

In view of these points, many of the tools listed in this blog post of mine may be useful - although they are nearly all either new and experimental, or unfamiliar to the vast majority of industry programmers. There are some very mature tools covered there though.

How about Idris and Agda? Real world enough?

As a good real life example, there's a project that aims to provide a verified HTTP REST library written in Agda, called Lemmachine: https://github.com/larrytheliquid/Lemmachine

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top