Question

I'm reading the book Clean Code by Uncle Bob. I'm also enrolled in a data structures & performance course and reading several algorithms and data structures books.

One immediately apparent difference is that in Clean Code, Uncle Bob criticizes non-semantic variable names such as "k", "x", "j", etc. for general use in a program (other than, for example, as a loop counter). In the algorithms books, on the other hand, every single variable has a one-letter or otherwise non-semantic name, which makes it much more difficult for me to comprehend what the code is doing, despite the logic itself being rather simple.

What are some strategies for refactoring code with such non-semantic variable names for the purposes of learning, and why the drastic difference in engineering vs. data structure code presentation? Thank you.


Solution

why the drastic difference in engineering vs. data structure code presentation?

It is an unfortunate intersection between mathematics and computer science.

In mathematics, each variable is usually denoted by a single symbol. There are several reasons for that; among them is avoiding the ambiguity that multi-letter variable names would introduce when symbols are written next to each other to denote multiplication.

For example, when one writes abc in a formula, it is unambiguous only under the convention that each of a, b, c is its own symbol; if multi-letter names were allowed, abc could just as well be a single variable.

Meanwhile, mathematics strives for consistency. Each symbol, like x, y, or a double-struck letter such as ℝ, has a conventional meaning, and misusing it is frowned upon.

Unfortunately, no such consistency is followed in software programming.

With respect to educational materials, it would be trivial for editors to do a search-and-replace to convert single-letter variable names into more meaningful ones. An integrated development environment (IDE) with refactoring support can often do this with a single command. As a programmer, you can do the same to the source code you write.
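As a minimal sketch of what such a rename pass looks like (the snippet and all names, such as f and geometric_series_sum, are hypothetical and not taken from any particular book):

```python
# Textbook-style snippet with single-letter names (hypothetical example).
def f(a, r, n):
    s = 0
    for k in range(n):
        s += a * r ** k
    return s

# The same logic after renaming every variable to say what it means.
def geometric_series_sum(first_term, ratio, term_count):
    partial_sum = 0
    for term_index in range(term_count):
        partial_sum += first_term * ratio ** term_index
    return partial_sum
```

The behavior is identical; only the names changed, yet the second version tells you at a glance that it sums a geometric series.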

What are some strategies for learning such systems?

If you are referring to the question of how to learn computer algorithms and data structures in general, that is unfortunately too broad a question to answer here.

OTHER TIPS

Different people have different styles; that is why you see such differences in presentation. People also have different learning styles.

One way of learning algorithms that works for some people is reproducing the algorithm yourself, bit by bit, in your own words. That way you can replace less meaningful variable names with meaningful ones.
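As an illustration, rewriting a classic algorithm in your own words might look like the following sketch in Python (the l/r/m names being replaced reflect a common textbook convention; the function name binary_search and the comments are my own additions):

```python
def binary_search(sorted_values, target):
    """Return the index of target in sorted_values, or -1 if it is absent."""
    low = 0                         # textbooks often call this l
    high = len(sorted_values) - 1   # ... and this r
    while low <= high:
        mid = (low + high) // 2     # ... and this m
        if sorted_values[mid] == target:
            return mid
        elif sorted_values[mid] < target:
            low = mid + 1           # target can only be in the upper half
        else:
            high = mid - 1          # target can only be in the lower half
    return -1
```

Renaming the variables forces you to articulate what each one tracks, which is exactly the kind of active reading that makes the algorithm stick.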

A simple reason for names like x, y, z comes from the past: in the days of punch cards, code had to be as concise as possible. The habit persists because long names carry a higher risk of typos and take more time to type, so from an economic point of view it is understandable.

However, when you look at the time spent maintaining code, this habit becomes more and more uneconomical. Code with comprehensible names is far easier to understand than code with cryptic names (as you noticed). Furthermore, every modern IDE offers code completion, which makes it easy to use a long name in the first place and to re-use it afterwards.

Now, when it comes to refactoring, things can get complicated. Replacing cryptic names with readable ones is only a first, but important, step. A good IDE can support this by recognizing tokens within their scope, so that only the right occurrences are replaced. When you give things the right name, you also learn about their purpose. Or, better, the other way around: once you have deciphered the meaning of a variable or function, give it a meaningful name.
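To see why scope awareness matters, consider this hypothetical snippet (not from any particular codebase): a plain text search-and-replace of n would rename both occurrences, even though they mean different things, whereas an IDE's scope-aware rename lets you rename each one independently, say to sample_count and index.

```python
# The letter n means two different things in two different scopes.
def mean(values):
    n = len(values)           # here n is the number of samples
    return sum(values) / n

def nth_triangular(n):        # here n is the index of the triangular number
    return n * (n + 1) // 2
```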

By the way: a long time ago I inherited just such a large piece of (FORTRAN) code. I worked through it by copying each file one by one and rewriting it, loop by loop, into readable form. It was a lot of work, but it paid off immediately. How much happier I would have been back then if git had already existed; I probably would have floated 1 cm above the floor the whole time xD

Licensed under: CC-BY-SA with attribution