Question

It seems like old iron is rock-solid software. Why is that? Is it because the software is so mature that all the bugs have been worked out? Or is it because people have gotten so used to the bugs that they don't even recognize them and just work around them? Were the software specs perfect from day one, so that once the software was written, everything just worked? I'm trying to understand how we've gone from the mainframe computing days, which everyone now holds up as "just working," to the feeling that TDD is the way to go.

Solution

Why on Earth do you think they don't have bugs?

IBM has a vast support infrastructure for bug reporting and resolution (PMRs, APARs and PTFs - problem management records, authorized program analysis reports and program temporary fixes), which is heavily used.

Mainframe software which hasn't been touched for many years will certainly be well understood (at least in terms of its idiosyncrasies) and will likely have had many bugs either fixed or worked around. All of the new stuff being developed nowadays actually plans for a certain number of bugs and patches from GA (general availability) to at least GA + 36 months. In fact, an ex-boss of mine at IBM used to baulk at being forced to provide figures for planned bugs with the line: "We're not planning to have any bugs".

The mainframe espouses RAS principles (reliability, availability and serviceability) beyond what most desktop hardware and software could ever aspire to - that's only my opinion of course, but I'm right :-)

That's because IBM knows all too well that the cost of fixing bugs increases a great deal as you move through the development cycle - it's a lot cheaper to fix a bug in unit testing than it is to fix one in production, in terms of both money and reputation.

A great deal of effort and money is spent on trying to release only bug-free software, but even they don't get it right all the time.

OTHER TIPS

There are no bugs in mainframe software, only features.

I used to work on mainframe apps. The earlier apps didn't have many bugs because they didn't do much. We wrote hundreds if not thousands of lines of FORTRAN to do what you'd do with a couple of formulas in Excel now. But when we went from programs that got their input by putting one value in columns 12-26 of card 1, and another value in columns 1-5 of card 2, etc, to ones that took input from an interactive ISPF screen or a light pen and output on a Calcomp 1012 plotter or a Tektronix 4107 terminal, the bug count went up.
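To give a feel for that fixed-column style of input, here is a rough Python sketch. The column positions follow the example above, but the field names and card contents are invented purely for illustration:

```python
# Hypothetical sketch of reading fixed-column "card image" input, in the spirit
# of the old batch programs described above. Field names and contents are
# invented; only the column positions follow the example in the text.

def parse_cards(card1: str, card2: str) -> dict:
    """Pull fixed-width fields out of two 80-column card images."""
    # Card columns are 1-based; Python string slices are 0-based.
    first_value = card1[11:26].strip()   # card 1, columns 12-26
    second_value = card2[0:5].strip()    # card 2, columns 1-5
    return {"first_value": first_value, "second_value": second_value}


if __name__ == "__main__":
    card1 = " " * 11 + "SAMPLE VALUE 1".ljust(69)   # pad to 80 columns
    card2 = "00042".ljust(80)                       # pad to 80 columns
    print(parse_cards(card1, card2))
```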

There are PLENTY of bugs in mainframe software; they are just not publicized as much, due to the relatively small group of developers affected. Just ask someone who does mainframe development how many ABENDs they see on a daily basis!

I learned to use debuggers and analyse core dumps on big-iron mainframes. Trust me, those only came about because of bugs. You're just plain wrong.

However, mainframe architectures have been designed for stability under high stress (well, compared to, say, non-mainframe systems), so maybe you can argue they are better in that way. But code-wise? Nah, bugs are still there...

My experience with mainframe application software (as opposed to operating systems) is pretty out of date, but my recollection is that the majority of applications are batch applications that are, logically, very simple:

a) Read an input file
b) Process each record (if you are feeling daring, update a database)
c) Write an output file

No user input events to worry about, a team of qualified operators to monitor the job as it runs, little interaction with external systems, etc, etc.

Now the business logic may be complex (especially if it's written in COBOL 68 and the database isn't relational), but if that's all you have to concentrate on, it's easier to make reliable software.
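For what it's worth, here is a minimal Python sketch of that read-process-write shape. The file names and the record "business logic" are made up purely for illustration; a real job would typically be COBOL driven by JCL:

```python
# Minimal sketch of the classic batch read-process-write job described above.
# File names and the record transformation are invented for illustration.

def process_record(record: str) -> str:
    """Apply the (toy) business logic to one input record."""
    # Real jobs would apply far more complex rules, and perhaps
    # update a database here.
    return record.rstrip("\n").upper()


def run_batch(input_path: str, output_path: str) -> int:
    """Read every record, process it, write the result; return the count."""
    count = 0
    with open(input_path) as infile, open(output_path, "w") as outfile:
        for record in infile:                             # a) read the input file
            outfile.write(process_record(record) + "\n")  # b) process, c) write
            count += 1
    return count


if __name__ == "__main__":
    # Create a tiny sample input so the sketch runs end to end.
    with open("input.dat", "w") as sample:
        sample.write("record one\nrecord two\n")
    print(run_batch("input.dat", "output.dat"), "records processed")
```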

I've never worked on software for mainframes myself, but my dad was a COBOL programmer in the 1970s.

When you wrote software in those days, finding bugs was not as simple as compiling your source code and looking at the error messages the compiler spat back at you, or running your program and seeing what it was doing wrong. A typist had to punch the program onto punch cards, which would then be read into the computer, which would print out the results of your program.

My dad told me that one day someone came with a cart full of boxes of paper and put them next to the door of the room where he was working. He asked "What's that?!", and the guy told him "That's the output of your program". My dad made a mistake which caused the program to print out a huge amount of gibberish on a stack of paper that could have used up a whole tree.

You learn from your mistakes quickly that way...

Oh, they definitely have bugs; see thedailywtf.com for some of the more entertaining examples. That said, most of the "mainframe" applications one sees today have had 30 years to get all the kinks worked out, so they have a bit of an advantage over most applications created in the last few years.

While I don't have experience with mainframes, I'm guessing it's the first point you made: the software has been around for decades. Most remaining bugs will have been worked out.

Besides, don't forget fiascos like Y2K. All of the bugs people have stumbled on have been worked out, and in 20 years most situations will probably have occurred. But every once in a while, a new situation does manage to come along that makes even 20-year-old software stop working.

(Another interesting example of this is a bug found in, I believe, BSD Unix. It was found a year or so ago, after having been around for 20 years without anyone running into it.)

I think programming used to be an advanced field that only selected engineers could work in. The world of programming now is much, much bigger, with lower entry barriers in every respect.

I think it's a few things. First, the fix-a-bug-and-recompile cycle was usually more expensive on mainframes. This meant the programmer couldn't just slop out code and "see if it works". By doing compile and runtime simulations in your head, you can spot more bugs than by letting the compiler catch them.

Second, not everybody and their brother was a "programmer." They were usually highly trained specialists. Now programs come from guys sitting in their basement with a high school diploma. Nothing wrong with that!!! But it does tend to produce more bugs than the work of an engineer who's been doing it professionally for 20 years.

Third, mainframe programs tend to have less interaction with their neighbors. In Windows, for example, a bad app can crash the one next to it or the entire system. Mainframes usually have segmented memory, so all a bad program can crash is itself. Given the tons of things running on a typical desktop system from all kinds of marginally reliable sources, any program tends to become flaky to some degree.

Maturity is definitely a factor. A COBOL credit-card processing program that was written 20 years ago and has been refined over and over to eliminate bugs is less likely to have a problem than a 0.1 version of any program. Of course, there is the issue that these old, endlessly rewritten programs usually end up as spaghetti code that's nearly impossible to maintain.

Like anything, it depends mostly on the programmer(s) and their methodology. Do they do unit testing? Do they document and write clean code? Do they just slop-and-drop code into the compiler to see if there are any bugs (hoping the compiler can catch them all)?
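For contrast with the slop-and-drop approach, a unit test for something like the toy process_record function sketched earlier might look like this (again, purely hypothetical):

```python
# A minimal unit-test sketch, using Python's standard unittest module, for the
# hypothetical process_record function from the earlier batch sketch.
import unittest


def process_record(record: str) -> str:
    """Same toy business logic as in the batch sketch: uppercase the record."""
    return record.rstrip("\n").upper()


class ProcessRecordTest(unittest.TestCase):
    def test_uppercases_record(self):
        self.assertEqual(process_record("record one\n"), "RECORD ONE")

    def test_handles_empty_record(self):
        self.assertEqual(process_record("\n"), "")


if __name__ == "__main__":
    unittest.main()
```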

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow