What you can prove when making native HTTP calls from browser

https://softwareengineering.stackexchange.com/questions/388444

22-02-2021
|

题

From here it was advised:

Formal verification needs to have a goal: what precisely do you want to prove?

This is a good question. I am very intrigued by the idea of doing formal proofs in web applications after reading about TLA+. But all of the proof stuff is very low-level and doesn't deal directly with web applications. As such, I am not really sure what can be proved, or what would be good to prove.

When I think proof I think of the related but different unit tests. In a unit test for HTTP requests, I would check that I can make a GET, a POST, with this and that parameter, that I can do streams and this and that feature, etc. There is a lot I can test for in an HTTP request. Likewise with drawing, I can test that the colors are the colors I expect, or that the animation lasts a certain length, etc. Lots of visual properties to check too.

But my question is how to do this in terms of proofs. Going back to that post, the question is what even precisely can we prove in HTTP requests or rendering. Given that both of those features are native platform features, and that they are full of side effects we can't control, I am not sure what to do.

From the linked post, it was recommended to create axioms for the native feature, but I'm not sure if that is still okay when we're talking about more than termination. For example, take as an axiom that the HTTP request engine can make a GET request with caching. That doesn't make sense I don't think, it needs to be tested. But maybe there is a way to somehow still create an axiom after we have a suite of tests for it. That seems like the best approach. Then basically every unit test would become an axiom.

Other than that, I have a hard time thinking what could even be proven and how to prove it in HTTP requests and rendering. Even if we did take as axiom that the renderer drew a triangle with x data, I'd like to see an example of what we can build off of this in terms of proof. Maybe we can prove that it can draw n triangles or something, I'm not sure.

So my question is, what we can do to initialize a proof environment in this example of working with side-effect-ridden native features like HTTP requests and rendering, and what precisely are some examples of things we can prove. This will help demonstrate how to apply proof techniques.

Also, I am not asking whether or not it's a good idea to do, or worth the effort, etc. I just would like to know how a proof sketch would look independent of whether or not it is judged worthwhile.

解决方案

Formal verification is very rarely used:

the software system needs to be modelled in a formal system
that requires skilled labour
so formal verification is very expensive compared with other QA strategies.

It is literally not worth it for many programs, in particular for things like web applications that sit on top of a huge stack of unverified software. You are more likely to see formal verification techniques for

safety-critical embedded software (which tends to be a comparatively simple system)
systems-level software (where formal verification has higher value due to the scale at which this software is used)

For example, formal verification might be particularly interesting for parsers, or for parallel or distributed systems where ordinary tests cannot provide sufficient confidence (unless we use techniques like fuzzing or fault injection, e.g. compare Jepsen).

There is one exception to all of this: type systems. A type system is an automated proof system. With a retroactive proof, a problem frequently arises that the semantics are too complicated for the formal system. If the programming language has a built-in type system this avoids that problem, because a program that does not have a successful proof will not run: the type system restricts the expressiveness of the language. No separate modelling is required: the system being verified is identical with the proof system.

What can be proven depends dramatically on the language. Many type systems allow you to show that function arguments have the correct types. That is already a valuable result, because it means our application won't crash at run time. More advanced type systems let you describe higher-order abstractions, cover things such as nullability, or can make statements about the lifetime of some object. Type systems themselves are a kind of programming language. Unfortunately, the more expressive and powerful the type system is, the less guarantees you can make about the type system itself. For example, many more advanced type systems are Turing-complete so that the types/the proof is not decidable.

What does this mean for web applications?

You likely don't want to use any formal verification at all.
In particular, meaningful formal verification will have to make loads of assumptions about underlying layers, e.g. the browser, host OS, and the (unreliable) network.
If you do want formal verification, you might want to prove high-level relationships, e.g. “this kind of event leads to that kind of request being made”.
The most reasonable way to start using a minimal amount of formal verification is to start using a type system. E.g. use TypeScript instead of JavaScript. This lets you prove the absence of many classes of runtime errors, though not the correctness of your software in general.

其他提示

Ideally, you want to be able to prove that the specifications and/or implementations of HTTP are correct. But HTTP specifications themselves are not formally defined, so your next best chance is to create your own specifications and prove them.

Have a look at Ur/Web. It's built using formal method techniques to make implementations of HTTP meet specifications that they may not:

Suffer from any kinds of code-injection attacks

Return invalid HTML

Contain dead intra-application links

Have mismatches between HTML forms and the fields expected by their handlers

Include client-side code that makes incorrect assumptions about the "AJAX"-style services that the remote web server provides

Attempt invalid SQL queries

Use improper marshaling or unmarshaling in communication with SQL databases or between browsers and web servers

Ur/Web is implemented as a DSL and library. The DSL specifications, written in Coq (a proof assistant) have formal proofs. The specification is then extracted into language that can be compiled into executable code. The compiler is also specified so the output code remains faithful to the DSL specifications.

The advantage of this approach is users can build their applications using a higher language, i.e. higher than say PHP or Javascript, and relieved from the task of making the code error free because the code will be generated for them.

You are right when you say this is quite low level. But it's also straight forward that to be able to say that an application had implemented "GET returns some data", it also must not crash when processing the requests.

Another example related to HTTP is cryptography. It is critical to have proven cryptography to secure HTTP. This is also another "low level" detail that people don't need to think about but required for HTTP to function.

Since you mentioned TLA+, a "high level" example would be a distributed application which implements the Byzantine fault tolerance.

As for the rough sketch of a proof, at the low level end is inductive type which can be proven by using induction. High level example proof would be the Byzantine fault tolerance.

You don't prove things by testing (except in those rare situations where it's possible to exhaustively test every possible case). You prove by writing the specification of the system in a formal language, and then showing that your code must always act in accordance with that specification.

If your software relies on a third-party application that hasn't been formally tested, then all you can do is prove that your software does the right thing; that it makes the correct calls at the correct times. You have no control over the third-party code and what it does. Whether it's even worth the effort of doing that is debatable.

Testing is just something you can do at the end to give a bit of extra confidence that it's doing the right thing. One possible weakness of formal methods is that they only show that the code meets the specification, not that the specification is what the customer actually wants.

许可以下： CC-BY-SA 和归因

不隶属于 softwareengineering.stackexchange