Question

Well I'm no expert, but as a student, I'm curious about languages and their design patterns / goals.

I'd like to know whether I'm missing anything in the following examples, and why techniques like this are not widely used in popular languages. Why is it better in a real-life program to define all unit tests explicitly instead of using some declarative code? Why don't languages, in general, use some declarations as both tests and implementation? Why split these things into two parts?

Suppose you have the following unit tests:

assertRaises(avg([]), ValueError)
assertEquals(avg([1]), 1)
assertEquals(avg([1, -2]), -.5)
assertEquals(avg([1, 2]), 1.5)

If you take this pseudocode,

def avg(items):
    if len(items) == 0:
        raise ValueError
    elif len(items) == 1:
        return items[0]
    else:
        return sum(items)/len(items)

most parts of avg seem to do nothing, or very little more than declare how to pass some unit tests. In other words, to me it feels a little like code duplication. The first test especially seems superfluous in this example.

I wonder if it would be a good design to get rid of some unit tests this way:

def avg(items) {
    len(items) == 0 => raises ValueError
    len(items) == 1 => items[0]
    len(items) > 1 => sum(items)/len(items)
    (test) items == [1, -2] => $avg == -.5
    (test) items == [1, 2] => $avg == 1.5
    (test) avg(items) <= sum(items)
}

In this example, (test) marks a test case that is not useful as a definition (though such cases might even be used for optimization in rare instances). Of course, this pseudocode is not well designed, because it's less readable than the first implementation, but I think it's a more convenient way to declare unit tests. Actually, the unit tests seem wrong here, since they do nothing but test the definition. And the last test is not trivial to perform. But it could be performed, for example, whenever another test runs code that calls avg, which is likely to supply a useful set of test parameters. The last unit test also has a secondary role. If I have code like this:

if avg(a) <= sum(a)

There is no need to evaluate the complex definition at runtime, since one of the test cases implies the result directly. OK, in this case it's not a real optimization (though an IDE could notify the user that the expression will always be true, which is useful). However, I can imagine many complex examples that could be optimized this way, without making the developer explicitly test for special cases at every single place where they are relevant.

Maybe a better example of this kind of optimization:

def power(a, n) {
    n == 0 => 1
    n == 1 => a
    n > 1 => a * power(a, n - 1)
    (test) for any (x) power(a, n) % x == power(a % x, n) % x
}

It's no secret, I think, that a clever compiler could use this in many cases, and a clever interpreter or just-in-time compiler could use it in many more.
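As a rough illustration, the declared property can at least be checked mechanically on random inputs. The sketch below is only an assumption about how such a check might look: the helper name check_mod_property, the input ranges, and the restriction to integer a, non-negative n, and positive x are all hypothetical choices, not part of the pseudocode above.

```python
import random


def power(a, n):
    """Naive recursive power for n >= 0, following the definition above."""
    if n == 0:
        return 1
    return a * power(a, n - 1)


def check_mod_property(trials=1000, seed=0):
    """Test power(a, n) % x == power(a % x, n) % x on random integers.

    Returns the first counterexample (a, n, x), or None if none is found.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        a = rng.randint(-50, 50)
        n = rng.randint(0, 20)
        x = rng.randint(1, 97)  # assumption: x is a positive integer
        if power(a, n) % x != power(a % x, n) % x:
            return (a, n, x)
    return None


counterexample = check_mod_property()
```

A check like this only samples the input space; it does not prove the property, so a compiler relying on it for optimization would still need the declaration to be trusted or verified some other way.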

To understand tests and languages better, I'm curious whether my thoughts are useful. If not, what am I missing? And if so, why don't popular languages implement such patterns?


Solution

The “sufficiently advanced compiler” has become a common joke when talking about programming languages. Some compilers actually have amazing features, but often this is used as an excuse for sloppy language design, or for the performance of certain dynamic languages which don't have such an advanced compiler.

The way you defined your functions reminds me of pattern matching, a feature most often found in ML-like functional languages, e.g. along the lines of

let avg xs =
  match xs with
  | []      => throw (ValueError "array must contain at least one element")
  | x :: [] => x
  | xs      => (fold (+) xs) / (array_length xs)

This is interesting because we can specify the semantics of some languages as a set of rewriting rules. Such a function definition specifies ways to rewrite an application of the form avg [1, 2, 3]. This declarative phrasing of the solution certainly makes it easier to apply optimizations. But would it be that much harder given the more C-like form?

Num avg(Num[] xs) {
    if (xs.length == 0) throw new ValueError();
    if (xs.length == 1) return xs[0];
    return sum(xs)/xs.length;
}

A compiler that is aware of idioms of the language won't find such a representation any more difficult to reason about.


Now to your main question, on unit tests. The value of tests lies precisely in this repetition. Ideally, tests and implementation are written by two different people, which prevents a bug from being copied and pasted from one representation to the other. In other words: what are the chances that two people failed to understand the problem in the same way, or made the same typo? (For example, avg([-1, -3]) <= sum([-1, -3]) is not correct.)
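The failing property is easy to demonstrate. The following is a minimal sketch, assuming a straightforward Python avg; the helper property_holds and the sample inputs are made up for illustration.

```python
def avg(items):
    """Arithmetic mean of a non-empty list of numbers."""
    if len(items) == 0:
        raise ValueError("avg() requires at least one element")
    return sum(items) / len(items)


def property_holds(items):
    """The property from the question: avg(items) <= sum(items)."""
    return avg(items) <= sum(items)


samples = [[1], [1, 2], [1, -2], [-1, -3]]
# Collect every sample input for which the declared property is false.
failing = [s for s in samples if not property_holds(s)]
print(failing)  # [[1, -2], [-1, -3]]
```

The property only holds when the sum is non-negative, so a declaration like this embedded in the implementation would have carried the same misunderstanding into any optimization built on it.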

The test asserts that a certain interface will be honored, and checks representative inputs to give us good confidence that the implementation works as expected. Simple pieces of code like the ones given here might be obviously correct without many tests. However, we might want to change the internal workings of our implementation. We want to make sure that our code still works as expected after any change, even if the new implementation is something insane that uses reflection or other unoptimizable features and is not obviously correct. Therefore, an implementation cannot be its own test at the same time, and we have to encode the interface in another, second way. Once we've done that, it becomes really easy to keep backwards compatibility, and your users will thank you.
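To make this concrete, here is a small sketch in which one externally written set of checks is run against two interchangeable implementations. The names avg_simple, avg_running, and check_interface are hypothetical; the checks mirror the unit tests from the question, with avg([1, 2]) taken as 1.5.

```python
import math


def avg_simple(items):
    """Straightforward implementation: sum divided by length."""
    if not items:
        raise ValueError("avg() requires at least one element")
    return sum(items) / len(items)


def avg_running(items):
    """Alternative implementation: incrementally updated mean."""
    if not items:
        raise ValueError("avg() requires at least one element")
    mean = 0.0
    for i, x in enumerate(items, start=1):
        mean += (x - mean) / i
    return mean


def check_interface(avg):
    """Interface checks that know nothing about how avg works inside."""
    try:
        avg([])
    except ValueError:
        pass
    else:
        raise AssertionError("avg([]) should raise ValueError")
    assert math.isclose(avg([1]), 1)
    assert math.isclose(avg([1, -2]), -0.5)
    assert math.isclose(avg([1, 2]), 1.5)


# The same checks apply to both implementations.
for impl in (avg_simple, avg_running):
    check_interface(impl)
```

Because check_interface is separate from both function bodies, swapping one implementation for the other does not require touching the tests.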

In the examples you've given, a test can be written on a single line. In other cases, this is most definitely not so. Today I opened a test file that needed 100 lines to set up a simple grammar, all for a single test that checked whether a lexer implementation could interface correctly with this grammar. Placing such a verbose test inside the implementation would make the actual code excessively hard to read because of all the interruptions. Placing the tests in a separate file gets rid of any readability problem, and makes it easier for my testing tool of choice to “prove” the correctness of my implementation.
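For the single-line case, some languages do let short tests live next to the code. Python's standard doctest module is one example; the sketch below assumes a plain avg function and shows the idea, and it also hints at why this stops scaling once a test needs a lot of setup.

```python
import doctest


def avg(items):
    """Return the arithmetic mean of a non-empty list of numbers.

    >>> avg([1])
    1.0
    >>> avg([1, -2])
    -0.5
    >>> avg([])
    Traceback (most recent call last):
        ...
    ValueError: avg() requires at least one element
    """
    if len(items) == 0:
        raise ValueError("avg() requires at least one element")
    return sum(items) / len(items)


if __name__ == "__main__":
    # Runs every example found in the docstrings of this module.
    doctest.testmod()
```

Three one-line examples already take up more space than the function body, which is the readability cost described above.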

Licensed under: CC-BY-SA with attribution