Question

I'm currently working in a project that aims to implement automatic testing of a software package. You can imagine this software is a bit like Excel in that it has a workspace that contains all the data, and a user interface that executes code that works on this workspace data. We are primarily focused on testing "core functionality", rather than the user interface in itself. Previously, a lot of testing has been manual and poorly documented.

One of the problems we're facing is that essentially anything that you might want to test requires a "complete" data set. Due to inherited architectural complications it is not really feasible to unit test anything - the mocking required is too extensive.

Another problem we're facing is that the code is in several different languages - mostly Python and C++. Some of the Python code is being run inside a Python interpreter running in the software, giving it access to C++ code that is otherwise less accessible.

Also, the current level of manual testing is deemed "good enough", and we aim to get at least "the same coverage". However, since manual testing inevitably goes through the UI and since the current manual cases are extremely poorly documented, who knows what we're currently covering with the manual tests?

I'm having a hard time understanding how to achieve good coverage (or even just how to measure coverage intelligently).

A lot of the standard answers I've come across so far don't really apply here. E.g., "write testable code" is a great tip when you start writing code, but refactoring millions of lines of code accumulated over 25 years is not in scope for this project, not a manageable task for a small team, and a political impossibility given the circumstances.

I'm looking for any and all suggestions on how to achieve good test coverage, how to measure coverage, and generally how to tackle the transition from poorly documented manual testing to some sort of sensibly comprehensive automatic testing.

I'm not an expert, so there may be low hanging fruit that I've overlooked, in particular if I happen to not know the relevant search terms - if that's the case, a gentle prod in the right direction may very well be a good start.


Solution

I'm afraid there is no easy solution. "write testable code" really is the only way to do it.

Writing testable code is non-trivial, and retro-fitting tests is hard. Many of the advances in modern coding have been aimed at making code testable, with other benefits as a secondary effect. Writing the tests is the easy bit!

MrSmith's comment is a good strategy:

whenever a piece of code is extended/refactored/added/fixed, add tests for that part. Over time you will improve the test coverage this way, step by step.

This has a few benefits:

  • You don't need to argue with the doubters over whether it's worthwhile rewriting functional code.
  • It targets tests at areas that are most important.
  • You were already going to spend time testing these areas.
  • There's no risk of your refactoring breaking an area that was otherwise unchanged.
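One concrete way to apply this whenever you touch a piece of legacy code is a characterization ("golden master") test: record what the code does today, then refactor against that record. Below is a minimal sketch, assuming pytest and hypothetical helpers (`load_reference_workspace`, `recalculate_sheet`) standing in for whatever your package actually exposes through its embedded Python interpreter:

```python
# Characterization ("golden master") test: pin down what the code does today
# before changing it, so the refactoring can be checked against that record.
# `load_reference_workspace` and `recalculate_sheet` are hypothetical names
# standing in for whatever your package exposes to its embedded interpreter.
import json

from myapp.testdata import load_reference_workspace
from myapp.engine import recalculate_sheet

GOLDEN_FILE = "tests/golden/recalculate_sheet.json"

def test_recalculate_sheet_matches_recorded_output():
    workspace = load_reference_workspace("complete_dataset_v1")
    result = recalculate_sheet(workspace, sheet="Summary")

    with open(GOLDEN_FILE) as fh:
        expected = json.load(fh)

    # An intentional behaviour change means regenerating the golden file in a
    # reviewed commit; an unintentional one fails this test.
    assert result == expected
```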

Other suggestions

You are not alone

It may be redundant to say, but anyone reading this question without facing this kind of trouble should remember that real-life software development is full of multi-year, sub-standard development carried out under heavy time pressure, resulting in messy code bases that nevertheless, on the surface, deliver everything that was actually asked for. When life hands you 25-plus years of accumulated code with no tests, running away is not always an option.

All you have at the moment is your manual tests. While you cannot "magically" find out what they cover, you need to win back the time currently spent on manual testing; that time is probably better spent documenting the tests (or improving them, or writing actual automated ones).

If your only usable "framework" for testing is the actual user interface, I suggest looking into GUI testing, which in practice means "simulating" what a user does when operating the software. You can generate clicks on the various UI elements in the order you know a manual tester would. You may have to overcome problems such as automatically loading your dataset or setting options, but, depending on the code base, you may be able to achieve this in the end.

How easy this is depends strongly on your current GUI framework, but I believe an initial attempt at automating your current testing methodology could save you time and even let you write new automated "manual" tests. Take a look here for a list of GUI testing tools to get an idea of what is around.
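As an illustration only, a UI-level test scripted with a screen-automation library such as pyautogui might look like the sketch below; the image files, keyboard shortcut, and paths are invented placeholders for your own UI elements:

```python
# Minimal sketch of driving the application's UI the way a manual tester would.
# Assumes pyautogui is installed and the application is already running; every
# image file, path, and keyboard shortcut here is a placeholder.
import time
import pyautogui

def open_dataset(path):
    # Locate the "File" menu label on screen and click through File -> Open.
    pyautogui.click(pyautogui.locateCenterOnScreen("images/file_menu.png"))
    pyautogui.click(pyautogui.locateCenterOnScreen("images/open_item.png"))
    pyautogui.write(path, interval=0.02)   # type the dataset path into the dialog
    pyautogui.press("enter")
    time.sleep(5)                          # crude wait for the dataset to load

def test_recalculate_via_ui():
    open_dataset(r"C:\testdata\complete_dataset_v1.wks")
    pyautogui.hotkey("ctrl", "r")          # assumed shortcut for "recalculate"
    # Verify success by checking that a known status indicator appears on screen.
    assert pyautogui.locateOnScreen("images/recalc_done.png") is not None
```

Tools that work at the widget level rather than on screenshots (pywinauto on Windows, for example, or commercial tools such as Squish) tend to be more robust, but the structure of such a test stays much the same.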

Apart from all that, Robin Bennett's answer is very good and timeless advice indeed.

EDIT:

Maybe a "shot in the dark", but if things are very hard, I think I have come across something that might be of assistance. Like I said, one of the only things that can help as to what is being tested, would be a continuous call stack trace... It might be possible that Valgrind's tool named callgrind, a callgraph analyzer (as per Wikipedia), might be useful for that reason. The tool describes itself as:

Callgrind is a profiling tool that records the call history among functions in a program's run as a call-graph. By default, the collected data consists of the number of instructions executed, their relationship to source lines, the caller/callee relationship between functions, and the numbers of such calls. Optionally, cache simulation and/or branch prediction (similar to Cachegrind) can produce further information about the runtime behavior of an application.

Sounds promising and, in the absence of a better alternative, I would suggest giving this (or a similar tool) a shot. A detailed call-graph "trace" captured during each UI-based test would make it possible to discover what is actually going on behind the scenes.

(Special reference goes to this answer, which pointed me in that direction).
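As a rough sketch of how that could be wired up (assuming valgrind and callgrind_annotate are installed, and with an invented binary name and command-line flag), each UI-driven test scenario could be run under callgrind and its call graph archived:

```python
# Sketch: launch the application under callgrind for one test scenario and
# summarise which functions were exercised. "myapp" and the CLI flag are
# placeholders; valgrind and callgrind_annotate must be on the PATH.
import subprocess

def run_scenario_under_callgrind(scenario_name, dataset):
    out_file = f"callgrind.{scenario_name}.out"
    subprocess.run(
        ["valgrind", "--tool=callgrind",
         f"--callgrind-out-file={out_file}",
         "./myapp", "--run-test-script", dataset],   # hypothetical CLI flag
        check=True,
    )
    # callgrind_annotate turns the raw profile into a readable per-function report.
    report = subprocess.run(
        ["callgrind_annotate", out_file],
        capture_output=True, text=True, check=True,
    )
    with open(f"trace_report.{scenario_name}.txt", "w") as fh:
        fh.write(report.stdout)

run_scenario_under_callgrind("open_and_recalculate",
                             "testdata/complete_dataset_v1.wks")
```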

I think Vector Zita starts you out in the right direction with GUI testing; I'll add this answer to address the code coverage issues.


Do what Vector Zita said. You need GUI tests. Depending on which platform this application runs on, you should be able to find tools that allow you to automate the user interface. While you do not want to test the user interface itself, you can still test the core functionality through the user interface; you just do it at a much higher level. Don't think about testing in terms of "user interface" and "core functionality." Think of it in terms of user acceptance tests, negative tests, boundary tests, and black box testing (lots of googleable buzzwords here).

  • User Acceptance Tests cover the main use cases of the application, and can be gleaned from existing documentation. Think "happy paths" through the application. These are great tests to start with. They are smaller in number and allow you to test a wider swath of the code base in a short period of execution time — a "smoke test" of sorts.

  • Negative Tests cover the "unhappy paths" through the application, testing things like "you cannot enter a future date for a birth date." These can be written from existing requirements documentation as well.

  • Boundary Tests are concerned with correct and incorrect values at the edges of a range, for example "the price must be between 0 and 50." Write one automated test each for how you expect the application to behave at -1, 0, 25 (a sensible middle value), 50, and 51 (see the sketch after this list). Again, consult the requirements to know which boundaries need testing.

  • Black Box Tests fill in the gaps where requirements documentation is lacking or non-existent. These tests involve throwing data at the user interface in whatever format and quantity you can think of to see how the application behaves. When proper documentation about how the thing is supposed to work is lacking, write passing black box tests to at least record how it currently works.

Write all of these as automated tests that hit the user interface. It will be a lot of work. The tests will execute slower than unit tests, but at least you will have tests.
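To make the boundary example above concrete, here is a minimal pytest sketch; `set_price_via_ui` and `price_was_accepted` are hypothetical helpers you would build on top of whichever GUI-automation tool you choose:

```python
# Boundary tests for a hypothetical "price must be between 0 and 50" rule,
# driven through the UI via helpers implemented on top of your GUI-automation
# tool of choice.
import pytest

from ui_helpers import set_price_via_ui, price_was_accepted   # hypothetical

@pytest.mark.parametrize("price,expected", [
    (-1, False),   # just below the lower bound: must be rejected
    (0, True),     # the lower bound itself: must be accepted
    (25, True),    # a sensible middle value
    (50, True),    # the upper bound itself: must be accepted
    (51, False),   # just above the upper bound: must be rejected
])
def test_price_boundaries(price, expected):
    set_price_via_ui(price)
    assert price_was_accepted() == expected
```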

You can begin refactoring code to enable proper unit tests after you have all of these automated UI tests written. Refactor that 25-year-old code to use the Single Responsibility Principle here, Separation of Concerns there, and a sprinkle of Dependency Injection over here. Cover that new code properly with unit tests. Validate that you have not broken anything by running the existing automated tests (see Red-Green-Refactor).
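As a tiny, hypothetical sketch of that incremental refactoring: a calculation is pulled out from behind the global workspace so that it depends only on an injected interface, at which point a plain unit test with a fake workspace becomes possible (all names here are made up):

```python
# Before: a routine that reaches into the global workspace directly, which
# makes it impossible to test without the full application. After: the same
# calculation with its data source injected, so a unit test can supply a tiny
# fake workspace instead. All names are hypothetical.

class FakeWorkspace:
    """Test double standing in for the real workspace object."""
    def __init__(self, cells):
        self._cells = cells

    def get_column(self, name):
        return self._cells[name]

def column_total(workspace, column):
    """Core logic: depends only on the injected workspace interface."""
    return sum(workspace.get_column(column))

def test_column_total():
    ws = FakeWorkspace({"price": [10, 20, 15]})
    assert column_total(ws, "price") == 45
```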

Rinse and repeat until you've whittled away the old code and replaced it with new, shiny code. Hopefully this allows someone else in the future to replace your shiny new code with even newer and shinier code at a much quicker pace.

This will not get you the most accurate code coverage metrics in the beginning. I've found code coverage tools report either 0% coverage when running UI tests or 99% code coverage with very little middle ground, depending on how much application code is reused as test code. Only after refactoring enough of the code base to support unit tests will code coverage tools give you more accurate numbers.
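When you do want numbers for the Python side, coverage.py can be started from inside the embedded interpreter while the UI tests run, giving line coverage of the Python layer even though the tests go through the GUI; the C++ side would need a separate mechanism, such as building with gcov/lcov instrumentation. A minimal sketch:

```python
# Sketch: collect Python line coverage while UI-driven tests exercise the
# application. This snippet would run inside the embedded interpreter; the
# C++ side is not covered by it.
import coverage

cov = coverage.Coverage(data_file=".coverage.ui_tests", branch=True)
cov.start()

# ... the UI test harness drives the application here ...

cov.stop()
cov.save()
cov.html_report(directory="coverage_html")   # per-file, per-line report
```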

To start with though, you are not looking to know which lines of code are covered by a test, and which are not. You want to cover as many of the expected input and output cases as you can, and then refactor the code base incrementally as you have time, or as you need to change old features or fix bugs.

Also, keep in mind the expected remaining life of the product itself. If this product, with all its legacy code, is near end of life, it may not matter much. But if it will be maintained, improved, and extended with new features for a long time to come, then there is a real benefit in paying down your technical debt and improving the health of the pipeline.

Licensed under: CC-BY-SA with attribution