Question

I've encountered JBehave recently and I think we should use it, so I called in our team's tester, and he agrees that we should use it.

With that as a starting point, I asked the tester to write stories for a test application (the Bowling Game Kata of Uncle Bob). At the end of the day we would try to map his tests onto the bowling game.

I was expecting a test like this:

Given a bowling game
When player rolls 5
And player rolls 4
Then total pins knocked down is 9
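
A scenario this explicit maps directly onto code. Below is a minimal sketch of the Bowling Game Kata scoring logic such a scenario would drive; the class and method names are illustrative, not taken from any particular code base, and in a real JBehave setup the steps would be bound to methods via `@Given`/`@When`/`@Then` annotations.

```java
// Sketch of the Bowling Game Kata that the explicit scenario above
// could be mapped onto. Names here are illustrative assumptions.
public class BowlingGame {
    private final int[] rolls = new int[21]; // max rolls in one game
    private int current = 0;

    // "When player rolls 5" / "And player rolls 4" bind to this:
    public void roll(int pins) {
        rolls[current++] = pins;
    }

    // "Then total pins knocked down is 9" asserts on this:
    public int score() {
        int score = 0, i = 0;
        for (int frame = 0; frame < 10; frame++) {
            if (rolls[i] == 10) {                       // strike: next two rolls are bonus
                score += 10 + rolls[i + 1] + rolls[i + 2];
                i += 1;
            } else if (rolls[i] + rolls[i + 1] == 10) { // spare: next roll is bonus
                score += 10 + rolls[i + 2];
                i += 2;
            } else {
                score += rolls[i] + rolls[i + 1];
                i += 2;
            }
        }
        return score;
    }

    public static void main(String[] args) {
        BowlingGame game = new BowlingGame();
        game.roll(5);
        game.roll(4);
        System.out.println(game.score()); // prints 9
    }
}
```

Because the scenario names concrete rolls and a concrete total, the step methods stay trivial and the assertion is unambiguous.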

Instead, the tester came up with 'logical tests'; in other words, he was not being that specific. But in his terms this was a valid test.

Given a bowling game
When player does a regular throw
Then score should be calculated appropriately

My problem with this is ambiguity: what is a 'regular throw'? What is 'appropriately'? What will it mean when one of those steps fails?

However, the tester says that a human does understand it, and that what I was looking for were 'physical tests', which were more cumbersome to write.

I could probably map 'regular' to rolling a 4 twice (still no spare, nor strike), but it feels like I am again doing a translation I don't want to make.

So I wonder, how do you approach this? How do you write your JBehave tests? And do you have any experience when it is not you who writes these tests, and you have to map them to your code?


Solution

The amount of explicitness needed in acceptance criteria depends on the level of trust between the development team and the business stakeholders.

In your example, the business is assuming that the developers/testers understand enough about bowling to determine the correct outcome.

But imagine a more complex domain, like finance. For that, it would probably be better to have more explicit examples to ensure a good understanding of the requirement.

Alternatively, let's say you have a scenario:

Given I try to sign up with an invalid email address
Then I should not be registered

For this, a developer/tester probably has better knowledge of what constitutes a valid or invalid email address than the business stakeholder does. You would still want to test against a variety of addresses, but that can be specified within the step definitions, rather than exposing it at the scenario level.
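
One way to keep the variety inside the step definition is to hold a table of representative invalid addresses there and run the same check over each. The sketch below assumes a deliberately simple validator regex for illustration; it is not a full RFC 5322 implementation, and the class and method names are invented.

```java
import java.util.List;
import java.util.regex.Pattern;

// Sketch: the variety of invalid addresses lives in the step
// definition, not in the scenario text.
public class SignUpSteps {
    // Intentionally simplistic pattern, for illustration only.
    private static final Pattern SIMPLE_EMAIL =
        Pattern.compile("^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$");

    static boolean isValidEmail(String address) {
        return SIMPLE_EMAIL.matcher(address).matches();
    }

    // What a "Given I try to sign up with an invalid email address"
    // step might iterate over internally:
    static final List<String> INVALID_EXAMPLES = List.of(
        "", "plainaddress", "missing@tld",
        "two@@signs.com", "spaces in@mail.com");

    public static void main(String[] args) {
        for (String address : INVALID_EXAMPLES) {
            System.out.println("'" + address + "' valid? " + isValidEmail(address));
        }
    }
}
```

The business-facing scenario stays short, while the developer-owned list of edge cases can grow without touching the feature file.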

OTHER TIPS

His test is valid, but it requires a certain knowledge of the domain, which no framework will have. Automated tests should be explicit; think of them as examples. Writing them costs more than writing "logical tests", but this pays off in the long run, since they can be replayed at will, very quickly, and give immediate feedback.

You should have paired with him when writing the first tests, to set things off in the right direction. Perhaps you could give him your test and ask him to increase the coverage by adding new ones.

I hate such vague words as "appropriately" in expected values. "Appropriately" is an example of a "toxic word" for testing, and if it is not eliminated, this "approach" can spread, effectively killing testing in general. It might "be enough" for a human tester, but such "test cases" are acceptable only in the first attempts at exploratory "smoke testing".

To be reproducible, systematic, and automatable, every test case must be specific. (Not just "should": is the softness of "would" supposed to be allowed? Instead I use the present tense "shall be", or even better the strict "is", as a claim to confirm or refute.) And this rule is absolute once it comes to automation.

What your tester wrote was rather a "test area", a "scenario template", instead of a real test case, because so many possible test results can be produced. You, in your scenario, were specific: that was a very specific, real test case. It is possible to automate your test case, which is nice: you can delegate it to a machine and evaluate it as often as you need, automatically (with the bonus of an automated report from a Continuous Integration server).

But the "empty test scenario template"? It has some value too: it is an empty skeleton prepared to be filled with data. That is why I like to call these situations "DDT": Data-Driven Testing.

Imagine a web form to be tested, with validations on its 10 inputs, plus cross-validations... and the submit button. There can be 10 test cases for every single input:

  • empty;
  • with a char, but still too short anyway;
  • too long for the server, but allowed within the form for copy-paste and further edits;
  • with invalid chars...

The approach I recommend is to prepare a set of to-pass data, or even to generate it (from a DB, or even randomly): whatever you can predict will pass the test, the "happy scenario". Keep the data aside as a data template, use it to initialize and fill the form, and then break a single value to create test cases "to fail". Do that, say, 10 times for each of the 10 inputs (100 test cases even before cross-rules are attempted). Then, after the server has refused the form 100 times, fill the form with the to-pass data, without distorting it, so the form can finally be accepted. (An accepted submit changes state in the server app, so it needs to go last, to test all 101 cases against the same app state.)
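
The "break one value at a time" idea can be sketched mechanically: start from a known-good template row and derive one failing variant per manipulated field. The field names and bad values below are invented for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: derive failing test rows from a happy-path data template,
// breaking exactly one field per derived row.
public class FailCaseGenerator {
    public static List<Map<String, String>> breakOneField(
            Map<String, String> happyRow, Map<String, String> badValues) {
        List<Map<String, String>> cases = new ArrayList<>();
        for (Map.Entry<String, String> bad : badValues.entrySet()) {
            Map<String, String> row = new LinkedHashMap<>(happyRow); // copy template
            row.put(bad.getKey(), bad.getValue());                   // break one value
            cases.add(row);
        }
        return cases;
    }

    public static void main(String[] args) {
        Map<String, String> happy = new LinkedHashMap<>();
        happy.put("name", "Alice");                 // illustrative fields
        happy.put("email", "alice@example.com");

        Map<String, String> bad = new LinkedHashMap<>();
        bad.put("name", "");                        // empty input
        bad.put("email", "not-an-email");           // invalid chars

        breakOneField(happy, bad).forEach(System.out::println);
    }
}
```

Every derived row differs from the template in exactly one field, so a failure points straight at the manipulated value.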

To do your test this way, you need two things:

  • the empty scenario template,
  • and a table of 100 rows of data:
    • 10 columns of input data, with only one value manipulated per row as you work down the table (ever heard of Gray code?),
    • possibly keeping the derivation history in a row description: which row it was derived from and how, via which manipulated value.
    • Also an 11th, "expected result" column (or columns): the expected pass/fail status, the expected error/validation message, and a reference to the requirements for test-coverage tracking (ever seen FitNesse?).
    • And possibly also a column for the actual result detected when the test is performed, to track the history of each single row test case (hence the CI server mentioned already).

To combine the "empty scenario skeleton" on one side with the "data table to drive the test" on the other, some mechanism is indeed needed, and your data need to be imported. You could prepare the rows in Excel, which could in theory be imported directly, but for an easier life I recommend CSV, properties files, XML, or any other machine- and human-readable textual format.
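
Loading such a table can be as simple as parsing one row per line. The sketch below assumes the column layout described above (input columns followed by an expected-result column) and uses naive comma splitting, so it would not handle quoted commas; names are illustrative.

```java
import java.util.Arrays;
import java.util.List;

// Sketch: turn a textual data table into rows that a scenario
// skeleton can be replayed against. Naive CSV parsing, no quoting.
public class DataTableLoader {
    record Row(List<String> inputs, String expected) {}

    static Row parse(String csvLine) {
        String[] cells = csvLine.split(",", -1); // keep empty cells
        return new Row(
            Arrays.asList(cells).subList(0, cells.length - 1),
            cells[cells.length - 1]);            // last column: expected result
    }

    public static void main(String[] args) {
        // Two illustrative rows: input columns..., expected status.
        List<String> lines = List.of(
            "Alice,alice@example.com,PASS",
            ",alice@example.com,FAIL");
        for (String line : lines) {
            Row row = parse(line);
            System.out.println(row.inputs() + " => " + row.expected());
        }
    }
}
```

The same skeleton then loops over the parsed rows, filling the form from `inputs` and asserting against `expected` for each one.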

His 'logical test' has the same information content as the phrase 'test regular bowling score' in a test plan or TODO list. But it is considerably longer, and therefore worse.

Using JBehave at all only makes sense if the test team is responsible for generating tests with more information in them than that. Otherwise, it would be more efficient to take the TODO list and code it up in JUnit.

And I love words like "appropriately" in expected values. You should use Cucumber or other such wrappers as generic documentation. If you're using them to cover and specify every possible scenario, you're probably wasting a lot of your time scrolling through hundreds of feature files.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow