Why isn't TDD more popular in universities?

https://softwareengineering.stackexchange.com/questions/403150

06-03-2021
Question

Recently, a person here asked a basic question about how to compute in Python all permutations of elements from a list. As for most questions asked by students, I haven't provided the actual source code in my answer, but rather explained how to approach the problem, which essentially ended up in a basic presentation of test driven development.

Thinking about the problem itself, I remembered all the similar problems I had to solve when I was studying programming at university. We were never taught test-driven development, nor any comparable method, but were nevertheless constantly asked to write algorithms that sort the elements of a list, solve the Towers of Hanoi puzzle, etc. Nor do the students who come to SoftwareEngineering.SE seem to know how TDD (Test Driven Development) would help them solve the problems they are given: if they did, they would have formulated their questions very differently.

The fact is, without TDD, all those problems are indeed quite challenging. I consider myself a professional developer, but I would still spend some time writing a list-ordering algorithm without TDD, if asked, and I would be mostly clueless if I were asked to compute all the permutations of the elements in a list, still without using TDD. With TDD, on the other hand, I solved the permutations problem in a matter of minutes when writing the answer to the student.

What is the reason TDD is either not taught at all or at least very late in universities? In other words, would it be problematic to explain the TDD approach before asking students to write list sorting algorithms and similar stuff?

Following the remarks in two of the answers:

In your question, you appear to present TDD as a "problem solving device" [...] In the past, TDD has been presented as a "solution discovery mechanism," but even Bob Martin (the principal advocate of TDD) concedes that you must bring a significant amount of prior knowledge to the technique.

and especially:

I'm curious why you think TDD makes a tricky algorithm problem with a well-defined spec easier.

I find it necessary to explain a bit more what is, in my opinion, so magical about TDD when it comes to solving problems.

In high school and at university, I didn't have any specific techniques to solve problems, and this applied to both programming and mathematics. In retrospect, I suppose that one such technique is to review the current or last lesson and look for a relation with the exercise. If the lesson is about integrals, chances are that the problem the teacher asked us to solve requires integrals. If the lecture was about recursion, chances are that the puzzle given to the students can be solved using recursion. I'm also sure there are well-formalized approaches to solving problems in mathematics, and those approaches can be applied to programming as well; however, I never learned any.

This means that in practice, my approach was simply to poke the problem around, trying to guess how it should be solved. Back then, if I was given the challenge of generating permutations of elements from a list, I wouldn't start with an empty list as input, but rather with an illustrative example, such as [4, 9, 2], trying to figure out why there are six possible permutations, and how I could generate them through code. From there, I would need a lot of thinking to find a possible way to solve the problem. This is essentially what the author of the original question did, ending up using random. Similarly, when I was a student, no other students of my age would start with []: all would immediately rush to the case with two or three elements, and then remain stuck for half an hour, sometimes ending up with code which doesn't compile or doesn't run.

The TDD approach, for me, appears to be counter-intuitive. I mean, it works very well, but I would never have figured out by myself, before reading a few articles about TDD, that (1) I should start with the simplest case, (2) write the test before writing code, and (3) never rush, trying to fulfil several scenarios in code at once. Looking at how beginner programmers think, I have the impression that I'm not the only one finding it counter-intuitive. It may be, I believe, more intuitive for programmers who have a good understanding of a functional language. I suppose that in Haskell, for example, it would be natural to handle the permutations problem by considering first the case of an empty list, then the case of a list with one element, and then the case of a list with multiple elements. In languages where recursive algorithms are possible, but not as natural as in Haskell, such an approach is, however, much less natural, unless, of course, one practices TDD.

Solution

I am a part-time programming teacher at a local community college.

The first course that is taught at this college is Java Programming and Algorithms. This is a course that starts with basic loops and conditions, and ends with inheritance, polymorphism and an introduction to collections. All in one semester, to students who have never written a line of code before, an activity that is completely exotic to most of them.

I was invited once to a curriculum review board. The board identified a number of problems with the college's CS curriculum:

Too many programming languages taught.
No courses about the Software Development Life Cycle.
No database courses.
Difficulty in getting credits to transfer to state and local universities, partly because these schools can't agree on a uniform definition for the terms "Computer Science," "Information Technology," and "Software Engineering."

My advice to them? Add a two-semester capstone class to the curriculum where students can write a full-stack application, a course that would cover the entire software development life cycle from gathering requirements to deployment. This would make them hireable at local employers (at the apprentice level).

So where does TDD fit into all of this? I honestly don't know.

In your question, you appear to present TDD as a "problem solving device;" I see TDD mostly as a way to improve the design and testability of code. In the past, TDD has been presented as a "solution discovery mechanism," but even Bob Martin (the principal advocate of TDD) concedes that you must bring a significant amount of prior knowledge to the technique.

In other words, you still have to know how to solve problems in code first. TDD just nudges you in the right general direction relative to software design specifics. That makes it an upper-level course, not a lower-level one.

OTHER TIPS

First of all, we have to fundamentally distinguish between Computer Science and Software Engineering. (And maybe to a lesser extent between Software Engineering and Programming or "Coding".)

As one of my CS professors put it: if you need a keyboard, you are not doing CS.

TDD is very much a Software Engineering practice. It doesn't really have much relation to Computer Science. So, if you are asking why CS students don't learn TDD, it's because TDD doesn't really have much to do with CS.

Now, Software Engineering, that is a whole different kettle of fish. I very much agree that TDD should be taught there. I had one Software Engineering class (90 minutes per week for one semester) and in this class we learned Waterfall, V-Model, RUP, CMM, CASE, RAD, Spiral, Iterative, and probably some other stuff that I forgot. Surely, there would have been space to squeeze in TDD, especially if you integrate it with the rest of the courses and apply TDD to all homework and classwork in all courses where programming is required.

To be fair, though, in my case, the Agile Manifesto hadn't been written yet, and TDD was an obscure niche technique.

If you take a look at my personal favorite for teaching programming, How To Design Programs, you will find that they teach very early on that functions should come with usage examples and only slightly later, the book introduces automated unit testing and suggests that those usage examples should be written in the form of tests. They also teach that usage examples and what they call "purpose statements" (you can think of it as the summary line of a JavaDoc comment) should be written before the code.
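To make the "examples and purpose statement first" idea concrete, here is a minimal sketch in Python rather than in the teaching languages HtDP itself uses. The function name and the choice of doctest are assumptions made for illustration; the point is only that the one-line purpose statement and the usage examples are written before the body, and the examples double as automated tests.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature from degrees Celsius to degrees Fahrenheit.

    Usage examples, written down before the function body existed:

    >>> celsius_to_fahrenheit(0)
    32.0
    >>> celsius_to_fahrenheit(100)
    212.0
    >>> celsius_to_fahrenheit(-40)
    -40.0
    """
    # The body is written last, once the examples above pin down the behavior.
    return celsius * 9 / 5 + 32
```

Saved to a file, the examples can be checked with `python -m doctest` followed by the file name, which stays silent when every example passes.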

They don't explicitly teach TDD (they have their own methodology), but this is at least somewhat close. HtDP is designed for high-school students to be taught in less than one year, so I think it is more than feasible to teach it in a single semester in university.

To be honest, though, it would already be a great win if students were simply taught that they can actually run the code they have written with different inputs. (In other words, a crude form of manual testing.) I am amazed at how often students are not able to take this mental step.

For example, I have seen multiple questions on Stack Overflow that essentially amount to "which method do I need to implement to make this code work". Note that this was not about the code of the method (the asker knew how to implement the method), it was purely about the name. However, not one of the askers came up with the idea of simply running the code and looking at the name of the missing method in the NoMethodError exception that will invariably be raised.
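NoMethodError is Ruby's exception; as a rough Python analogue, the sketch below shows the same "just run it and read the error" step using AttributeError. The `Report` class and `publish` function are hypothetical names invented for the illustration.

```python
class Report:
    """A class that is missing a method its caller expects."""

    def __init__(self, title):
        self.title = title


def publish(report):
    # The calling code assumes a render() method that Report does not define.
    return report.render()


try:
    publish(Report("Q3"))
except AttributeError as error:
    # The exception message names the missing method.
    message = str(error)
    print(message)
```

Running the code is enough to learn which method has to be implemented: the message mentions both the `Report` class and the missing `render` attribute.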

TDD is a nice process in "real-world" programming because problems often arrive on our desks underspecified. "Add a feature that does X", "fix the bug where it shows the wrong thing if you do Y", etc. TDD forces us to create a spec before we begin programming.

In CS/programming classes, the problems usually come to us extremely well-specified. "Write a function that takes an array of integers and returns an array with the same integers arranged in non-decreasing order". The spec is handed to us.

I'm curious why you think TDD makes a tricky algorithm problem with a well-defined spec easier. Specifically, I want you to explain this:

I would be mostly clueless if I were asked to compute all the permutations of the elements in a list, still without using TDD. With TDD, on the other hand, I solved the permutations problem in a matter of minutes when writing the answer to the student.

I suspect you'll say because you wrote test cases that asserted P([]) -> [[]], P([1]) -> [[1]], P([1, 2]) -> [[1, 2], [2, 1]], etc, and that caused you to see how the recursion should work. But that's not TDD, that's just... thinking about the problem. You can think about the problem without the process of writing actual unit tests that fail.
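For reference, the cases listed above can be written as plain assertions next to one possible recursive solution. This is a sketch, not the code from the asker's original answer; the function name `permutations` and the choice to return a list of lists are assumptions.

```python
def permutations(items):
    """Return every ordering of items as a list of lists."""
    if not items:
        # P([]) -> [[]]: there is exactly one way to arrange nothing.
        return [[]]
    result = []
    for index, first in enumerate(items):
        # Pick each element in turn as the head, then permute the rest.
        rest = items[:index] + items[index + 1:]
        for tail in permutations(rest):
            result.append([first] + tail)
    return result


# The cases from the paragraph above, smallest first.
assert permutations([]) == [[]]
assert permutations([1]) == [[1]]
assert permutations([1, 2]) == [[1, 2], [2, 1]]
assert len(permutations([4, 9, 2])) == 6
```

Whether these lines are written before the function as failing tests or worked out on paper, the recursion they suggest is the same.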

I suspect it's mostly because writing automated tests is more difficult than writing the code being tested. When you're still struggling with the basic mechanics, it's difficult to add another layer. Also, as other answers have pointed out, TDD is perceived as a software engineering process, even though its practitioners think of it more as a didactic device. Also, knowing what tests to write first is a skill in itself.

If I were teaching an introductory course, I would provide some tests in TDD order, and instruct the students to get the tests passing one at a time in that order. That's a very similar experience to pair programming with an experienced mentor. Personally, I think it would make classes like algorithms easier to teach, which would offset the time spent on the technique, because of the step-by-step progression. It would get students thinking about questions like, "I have some proven working code that will find all the permutations of a two-element list. How do I reuse as much of that code as possible to make it work for 3+ elements?"
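As a sketch of what "tests in TDD order" might look like for a sorting exercise, the snippet below lists hypothetical instructor-supplied cases from simplest to hardest and reports the first one that a student's function does not yet pass. All names here (`ORDERED_TESTS`, `first_failing_step`, `sort_in_progress`) are invented for the illustration.

```python
# Hypothetical instructor-supplied cases, in the order students should tackle them.
ORDERED_TESTS = [
    ("empty list", [], []),
    ("single element", [7], [7]),
    ("two elements already in order", [1, 2], [1, 2]),
    ("two elements out of order", [2, 1], [1, 2]),
    ("three elements", [3, 1, 2], [1, 2, 3]),
    ("duplicates", [2, 1, 2], [1, 2, 2]),
]


def first_failing_step(sort_function):
    """Return the name of the first step that fails, or None if all pass."""
    for name, given, expected in ORDERED_TESTS:
        # Pass a copy so an in-place sort cannot corrupt the test data.
        if sort_function(list(given)) != expected:
            return name
    return None


def sort_in_progress(items):
    """A student's work in progress: handles lists of up to two elements."""
    if len(items) == 2 and items[0] > items[1]:
        return [items[1], items[0]]
    return list(items)
```

Here `first_failing_step(sort_in_progress)` returns `"three elements"`, which tells the student exactly which case to generalize next, while `first_failing_step(sorted)` returns `None`.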

The basic issue is that inventing the tests in the TDD process has nothing to do with computer science or software engineering. It requires knowledge of the application domain.

Of course that is the strength of TDD in real world applications: you can't escape from thinking about what the system you are building is actually going to be used for. But in typical CS and SE activities, the tasks the student is given have minimal context, and exist only to test or illustrate some learning objective.

For example, if the learning objective is to demonstrate understanding of the "case" construct in a programming language, TDD is irrelevant: you can write software that passes all the TDD tests without using "case" at all. If the example was artificial, not using "case" might in fact be a better solution to the problem in the real world, though not in terms of the learning objective that was intended.
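A small sketch of that point, using Python and an if/elif chain as a stand-in for the "case" construct (the functions and the month example are hypothetical, and leap years are ignored). Both implementations pass the same tests, so the tests alone cannot show whether the student used the construct the exercise was about.

```python
def days_in_month_branching(month):
    """Uses explicit branching, standing in for the "case" construct."""
    if month == 2:
        return 28
    elif month in (4, 6, 9, 11):
        return 30
    else:
        return 31


# Table of the months that do not have 31 days.
SHORT_MONTHS = {2: 28, 4: 30, 6: 30, 9: 30, 11: 30}


def days_in_month_lookup(month):
    """Uses a table lookup and no branching construct at all."""
    return SHORT_MONTHS.get(month, 31)


def passes_tests(days_in_month):
    """The same behavioral tests, applied to either implementation."""
    return (
        days_in_month(1) == 31
        and days_in_month(2) == 28
        and days_in_month(4) == 30
        and days_in_month(12) == 31
    )
```

Both `passes_tests(days_in_month_branching)` and `passes_tests(days_in_month_lookup)` evaluate to `True`.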

By all means encourage students to invent their own test cases and run them, but trying to use a formal TDD process is beside the point.

A second issue might be more to do with the way CS and SE are taught, but it is still a practical one: how do you grade the student's collection of tests when using TDD? There is no obvious way to automate that process.

Even if you produce a model solution which says "you should have tested for the following 15 or 20 situations" you still have to manually identify which of the student's tests corresponds (wholly or partially) to each part of your model answer. And except in trivial cases, the student might have produced a good set of tests that are logically structured in a different way from the model answer.

The average student really has a bad overview of what they are expected to know and do. To teach TDD they would need to understand:

How to troubleshoot technical problems. In the beginning they think code is written in one go. It's only later, when they learn basic debugging strategies, that they move over to more incremental writing. Even if you tell them this up front they will still do it this way because they think that's the way things are done (this is, by the way, also why people don't know how to draw, play piano, do math, etc.)
How to introspect your own actions so that you can find repeatable patterns in your own behavior. Once you can do this you can automate your own work. But this is utterly alien to most people until they reach a certain level.
How to write requirements.
Understanding corner cases of your problem space.
Understanding the engineering philosophy of risk management. (ha, but you don't even know you're an engineer, right)
Understanding refactoring and maintenance of code.
Understanding how programming works in a real company.

In fact, I have had colleagues who were introduced to TDD way too soon, and they didn't really benefit all that much. However, it is still possible to take baby steps in this direction. It's just that people need quite some experience behind them to benefit.

So you can teach it, but it is not something you start with. Especially since there is code that does not lend itself to TDD very easily.

TDD is not more popular in universities because (generally speaking) universities do a very poor job of ensuring that the knowledge given to students can be transferred to real-world contexts.

Arguably, that is not the role of a university in the first place; it is more about teaching the fundamentals and core concepts of CS. That is instructive, important, and essential if the students are interested in pursuing a career in academia, but it falls short when the students finish their degrees and enter the industry, working as part of a team that follows development best practices, CI/CD, etc.

Obviously, there are intersections and both can benefit from each other, but, ultimately, university is simply not yet up to speed with what needs to be done to graduate more capable, relevant, and up-to-date software engineers.

I never considered TDD helpful as a way to actually solve a problem and after reading your question I still don't.

TDD merely forces you to look at a problem from one end: the client's end. This can help prevent you from coming up with a solution that does not match the client's question. This is generally a good thing, although it could also limit you: it keeps you from looking at the problem from a wider angle, where there may be a better, more general solution that your client did not consider.

Another way to look at it is that it is the ultimate top down approach. TDD could stand for Top Down Development. You start development at the most coarse level and gradually zoom in.

Either way, it is an abstract concept. You can tell a student about it, provide a definition, but that will be it. You cannot ask that many questions about it in an exam. So in any SE or CS class, although useful in an SE context, it could never be more than a side note.

To answer the question directly:

As someone suggested, there's enough learning for a whole semester. A separate course? Maybe. I could see TDD being combined with software design topics. IMO TDD's greatest value is locking down existing behaviors so we can add new behaviors later, and change the design to be appropriate for those new behaviors. So a software design class?

It would be nice to see the thinking process behind TDD (poorly named, and I once asked Kent Beck if he'd reconsider; he said "that train has left the station") be introduced as early as possible. I've been slinging code since 1976, and I know that TDD felt very unnatural at first. But now it feels quite natural. My time isn't recouped by writing test code, it's recouped later, when I only have to fix a serious defect once per year (The UofM Transplant Center's OTIS2 software, written entirely test-driven, had its last "software emergency" in 2004).

I encourage fresh graduates to try it out for a month. But it would be so much easier on them if they were exposed to TDD much earlier. I find using the unit-test framework useful when I'm learning a new language (True story: "Now how do I get this test to pass in Ruby? MAYBE this will work...HEY IT WORKED!"), or exploring a new library ("The docs are vague about the order these will be returned...I'll write a test...").
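As an illustration of the "I'll write a test to find out" habit, here is a minimal Python sketch of two such exploratory tests. The questions chosen (whether a dict keeps insertion order, whether `sorted` is stable) are stand-ins picked for this example, not the ones the author had in mind; in current Python 3 both behaviors are documented guarantees, so the tests pass.

```python
def test_dict_keeps_insertion_order():
    # Question: in what order do keys come back when iterating a dict?
    inventory = {}
    inventory["pear"] = 3
    inventory["apple"] = 5
    inventory["fig"] = 1
    assert list(inventory) == ["pear", "apple", "fig"]


def test_sorted_is_stable():
    # Question: do records with equal keys keep their original relative order?
    records = [("b", 2), ("a", 2), ("c", 1)]
    by_count = sorted(records, key=lambda record: record[1])
    assert by_count == [("c", 1), ("b", 2), ("a", 2)]


test_dict_keeps_insertion_order()
test_sorted_is_stable()
```

Once written, such tests stay in the suite and flag the day a library upgrade changes the behavior the code relies on.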

I find TDD valuable when implementing an algorithm (NOT inventing one...I'll get to that), or building any mildly complicated business rule from the ground up, as it were. Each test that is passing is "locked in" for all eternity. If the test is discrete enough, it won't ever have to change (but that takes a lot longer than a month to get good at).

So it would be interesting to incorporate TDD into earlier programming classes. Three points of improvement:

We could focus less on debugging and break-fix, which most developers think is just the nature of their job. It ain't. It doesn't have to be that painful or time-consuming.
Unit test frameworks are a simple replacement for #ifdef DEBUG / printf("...") / #endif - so we've been doing something similar (but more risky) since the beginning of time.
When we hear "you MUST write the test first" we're being misled. I want to write down my expectations for the code I'm about to write. I want to leave that running while I go focus on the next behavior, or on cleaning up the design.
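The second point can be sketched in Python, with a `DEBUG` flag and `print` standing in for `#ifdef DEBUG` and `printf`. The `average` function is a hypothetical example; the contrast is between output a human has to read on every run and an expectation the machine checks by itself.

```python
DEBUG = False


def average_with_debug_output(values):
    """Debug-print style: someone has to switch the flag on and read the output."""
    total = sum(values)
    if DEBUG:
        print("total =", total, "count =", len(values))
    return total / len(values)


def average(values):
    """Same logic, with the expectations moved into a test instead."""
    return sum(values) / len(values)


def test_average():
    # The expectations are written down once and re-checked on every run.
    assert average([2, 4, 6]) == 4
    assert average([5]) == 5


test_average()
```

The test version leaves no diagnostic code in the production function and keeps checking the behavior while attention moves on to the next feature.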

Another thing that might be useful is a textbook or, at the very least, a condensed no-nonsense curriculum with labs designed to convey various aspects of the TDD practice. I have a suggestion for that (see experience/disclaimer later).

To answer the critics:

I'm new here, but I note that some "answers" don't answer the question, but are fair critiques of TDD. So I'm guessing it's okay to address some of those concerns?

It's true, studies that have been done comparing TDD to unit-test-after (UTA?) show that TDD was initially slower. However, the UTA group had much lower test-coverage, and more defects. I think it was this study: Longitudinal Study of the Use of a Test-Driven Development Practice in Industry

If developers working on a product for more than a few months were really spending the majority of their time writing new features, then this would be a negative. But the six TDD teams that I worked on as a developer did spend their time that way, whereas the decade I spent slinging code prior to TDD was spent mostly in the debugger, or cautiously copying/pasting/modifying so I wouldn't break older/someone else's code. The benefits of TDD aren't all instantaneous, but they've been remarkably reliable and amazingly valuable as measured by lower developer overtime (and stress) and increased revenues. Bottom line: I always looked forward to going to work with those TDD teams. (Once I'd had some tea.)

It's also true that TDD cannot invent a new algorithm. I think it was Ron Jeffries who tried to create a Sudoku solver using TDD, and tried not to let his own Sudoku skills interfere with the implementation. He failed. I'm not surprised. TDD helps break down, simplify, and implement business rules; helps confirm we got an algorithm right (e.g., a complex encryption algorithm that may not be reducible); helps us reshape code later so that we can safely add new features; and does this by pinning down the investment we've already made in the behaviors of the product.

When you write a "test" (aka specification, aka scenario) you have to know the answer! Also, you want to write a test that is a small enough bite out of the solution that you already have some notion (prior knowledge) of how you are going to implement it.

What is often a surprise is how clear and simple the design can become if we pay attention to code-smells and we refactor as needed whenever all the tests related to that code are passing.

No one is suggesting we throw out our good knowledge of design patterns, or language idioms, or SOLID principles, or how to play Sudoku, or how to write a heap-sort.

TDD is about creating a safety-net that gives us confidence to rapidly add new features and alter the design to fit those new features without breaking anything else.

UTA is patching the safety net after someone falls through a hole to their death. (Interesting side note: The OTIS2 project is a life-critical system, so the risk of patient death wasn't a joke to us. Over 20,000 tests all running in 10 minutes, now that is a safety net!)

Experience/disclaimers:

I did TDD pretty much full time from 1998 to 2004 and with ~6 different teams. All were at least 6 months of development. Before that, I had spent 13 years writing software the other way, and came to hate debuggers and manual testing with printf().

I started teaching software development practices in 2001, now including TDD, BDD, and SA-CSD courses. Sure, it's a good living, but I also know that TDD is a straightforward way to make team-centric software development sane and enjoyable again. So...

I'm writing what I hope will become a college or high-school textbook (with online repos, labs, videos, etc, of course): Essential Test-Driven Development

I was taught computing science in the 1970s, long before the acronym TDD was invented, and of course we were never explicitly taught the technique. But for one of the programming exercises we were given, we were supplied with a set of test cases and were told that our solution had to pass the tests. So I learnt the value of the approach without ever being taught it explicitly, and have used it ever since.

While writing tests is an essential skill for software development, the scientific evidence does not indicate that TDD ends up producing any better software than iterative test-last (ITL) development (OTOH, it also isn't worse).

For evidence, you can see Davide Fucci et al., "An External Replication on the Effects of Test-driven Development Using a Multi-site Blind Analysis Approach", and Turhan et al.'s meta-analysis in Making Software.

So, unless we start seeing evidence that TDD does, in fact, provide measurable gains, it's less important that TDD is taught as a specific practice than that the good habit of writing tests at some point in the development process is instilled.

but I would still spend some time writing a list-ordering algorithm without TDD, if asked, and I would be mostly clueless if I were asked to compute all the permutations of the elements in a list, still without using TDD. With TDD, on the other hand, I solved the permutations problem in a matter of minutes

This confuses me seriously.
To me, test-driven development means thinking about tests early, taking the time to implement them, and keeping them up to date with the code.
Of course, thinking about what to test is a good chance to elaborate on the problem to solve and, in some cases, to get a good picture of what the pitfalls are.
But not doing TDD, not concentrating on tests first, doesn't prevent you from thinking about the underlying problem anyway. You should always think about what you are doing and make the situation clear, whether or not you want to go one step further and implement tests right now.

would it be problematic to explain the TDD approach before asking students to write list sorting algorithms and similar stuff?

It depends on the extent to which you would explain TDD.
It's helpful to hear about something. But students hear a lot of stuff they can't yet interpret because other basics are still missing. So you shouldn't go too deep into a concept that, without knowledge of the basics, would be a useless and alienating mess of unknown terms to them.
Besides, many programmers probably know what it feels like to look for a how-to on some problem and find examples that do show a solution, but only after you have sorted out all kinds of irrelevant stuff the author added.

So to answer this question: if I were a student, I'd like to have things separated. If you announce that you will explain how to sort a list, then please tell us how to do that, and don't leave the path to fill in testing stuff. If you want to start by explaining tests, then don't announce that we will implement sorting, because that won't happen for a long time.
Again, some thinking is required before we start to write the sorting code or the tests. What do lists look like before and after sorting? Make examples, think about pitfalls, what must be considered to avoid failing when a list is empty, and so on. All of this must be considered before a single line of tests is written.

Generally I'd say you are mixing up two things that should be kept separate.

Thinking about the nature of the problem you want to solve.
Thinking about input to your code and what output should be given.

Thinking about the problem is very different from focusing on writing test cases.

How to do it badly

Once I came across an example of TDD by someone who was sort of obsessed with TDD.
Unfortunately it appeared to me the author was too fond of his tutorial so he didn't realize what crap it actually was.

The author only concentrated on test cases, not on the problem their code should handle. As you never can find all input permutations, you must have an overview of what you are actually doing, you must see the entirety of the problem. You can't always start with empty input, then subsequently add some more input and always append code to handle new input correctly.
You must have an idea what you're actually dealing with in general. But the author didn't.

If I try to translate it to sorting a list, we'd start with an empty list, then one element that can be returned as it is, a list with two elements that perhaps need to be swapped and perhaps end up in a recursion because three elements is like two (we already solved that) plus one more with one extra step...
A horrible scenario for a sorting algorithm, but no one will realize it because we only concentrate on our test cases.
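A sketch of where that case-by-case path might lead, in Python. The function names are hypothetical and this is only one way the "n elements is n-1 elements plus one more" idea could turn out: a recursive insertion sort that passes every small test case.

```python
def insert_sorted(sorted_items, item):
    """Insert item into an already sorted list, returning a new list."""
    for index, current in enumerate(sorted_items):
        if item < current:
            return sorted_items[:index] + [item] + sorted_items[index:]
    return sorted_items + [item]


def incremental_sort(items):
    """Sort by handling n elements as 'n - 1 elements plus one more'."""
    if len(items) <= 1:
        # Empty and single-element lists are returned as they are.
        return list(items)
    # Sort everything but the last element, then insert the last one.
    return insert_sorted(incremental_sort(items[:-1]), items[-1])


# All the small cases pass.
assert incremental_sort([]) == []
assert incremental_sort([1]) == [1]
assert incremental_sort([2, 1]) == [1, 2]
assert incremental_sort([3, 1, 2]) == [1, 2, 3]
```

None of these tests reveals that the running time grows quadratically with the length of the list, or that the recursion goes one level deeper per element, so with CPython's default recursion limit a list of a few thousand elements would raise RecursionError.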

Conclusion

The comments make me write a little more about my opinion.

I think the term "test driven development" is wrong. It should be "test-supported development", meaning we don't just code and hope but we think about testing too and we know it's always good to know early when something goes wrong.

If development is said to be driven by tests, this could imply that everything depends only on tests and that we are done as soon as a couple of test cases are satisfied. This requirement would be met even by very insufficient code that never tried to see the problem as a whole but got hacked back and forth until all test cases accidentally turned green, and that then fails in real operation.
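A deliberately bad sketch of that failure mode in Python (function names are hypothetical): the "implementation" below special-cases exactly the inputs that appear in the tests, so the suite is green even though nothing is actually being sorted.

```python
def hacked_sort(items):
    """Special-cases the tested inputs instead of sorting anything."""
    if items == [2, 1]:
        return [1, 2]
    if items == [3, 1, 2]:
        return [1, 2, 3]
    # Everything else is returned untouched.
    return items


def small_suite_passes(sort_function):
    """A couple of test cases, as in the paragraph above."""
    return (
        sort_function([]) == []
        and sort_function([1]) == [1]
        and sort_function([2, 1]) == [1, 2]
        and sort_function([3, 1, 2]) == [1, 2, 3]
    )
```

`small_suite_passes(hacked_sort)` is `True`, yet `hacked_sort([3, 2, 1])` returns `[3, 2, 1]` unchanged, which is the "fails in real operation" part.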

You've updated your question to be more "Why isn't TDD taught as a core learning tool?". Other answers already explain well enough why TDD isn't a good topic for coding 101, but the main answer is really just that TDD, at its core, is not a problem solving tool. It can be used for that purpose, but like any tool you have to first understand when and how to use it.

TDD is a testing process, and thus most naturally will be taught either as part of a Development Processes course, or as part of a Software Testing course. In coding 101 courses, the goal isn't for students to solve problems, it's to teach them how to use various programming concepts. Generally, most coding 101 and 102 projects will be very explicit about how to solve the problem, the students just need to understand what they have learned to do the task in a non copy-paste kinda way.

Every student learns in different ways. Some students need to read about it, others need it verbally explained to them, and others will just never get it unless they get neck deep in code. Teaching TDD to aid in the learning process will not actually help most students, and the ones it does help? Teachers will have to decide if the time to teach TDD was worth the extra learning speed. On a net whole, teaching any learning method will not be worth the class time that could be spent on actual course specific topics. (In general, learning and problem solving skills are usually left to students to learn themselves, since only the student can identify what works best for them)

TL;DR: Different people have different effective processes. Universities don't prescribe how you should do anything; they just give you the tools so you can do what works best for you.

TDD is a fantastic implementation tool, and I think you are correct about it being beneficial to those who wish to write software.

What is the reason TDD is either not taught at all or at least very late in universities? In other words, would it be problematic to explain the TDD approach before asking students to write list sorting algorithms and similar stuff?

Probably the biggest reason is that the professors teaching these programs rarely know how to develop software, since that isn't their field of expertise. As other answers have mentioned, computer science and software engineering are different disciplines, and I would compare the expectation that computer science students learn software engineering to physics students learning how to design cars. TDD is a skill that takes a decent amount of practice to really be able to teach effectively, and computer science professors spend the bulk of their career working on computer science, so expecting the computer science faculty to really be able to teach TDD in a way that won't merely confuse their students is pretty unrealistic in my opinion.

We need to treat computer science and professional software development as the distinctly separate fields that they are. If your goal is to learn computer science, you should not be burdened with paying thousands of dollars learning how to make websites in React incorrectly from someone who has spent the last 30 years of their career doing graph theory on a chalk board. Likewise, if your goal is to become a software engineer, I don't know why you would want to spend 4 years and tens of thousands of dollars learning what is essentially just a specific field of mathematics. It's good to have a foundational understanding in the field, just like how someone designing an exhaust manifold needs to understand some amount of physics, however the person designing the exhaust manifold doesn't need too in-depth of an understanding in quantum mechanics, special relativity, and electromagnetism to do their job.

If you want to be an accountant, you can go get a degree in accounting and your professors will likely all have been CPAs at one point or another. If you want to be a mechanical engineer, you can get a degree in mechanical engineering and your professors will likely all have been licensed engineers at one point or another. But if you want to be a software engineer, any degree in software engineering will actually just be a computer science degree with some of your electives already chosen for you, and almost none of your professors will ever have been professional software developers. The accounting degree isn't part of the math department, and the mechanical engineering degree isn't part of the physics department, but the software engineering degree will almost always be part of the computer science department. Until academia as a whole completely separates these two fields into different departments run by different staff, I think there will always be a long list of things like TDD that are not taught to students aspiring to be software engineers.

Licensed under: CC-BY-SA with attribution