Question

There are many discussions on variable naming, but I would like to address one specific aspect. I am a data scientist and deal with dozens of features, variables, or columns, whatever you want to call them.

My work benefits from names that:

  • contain information about the variable's ontological background; this means that if a group of variables belongs together, they could be prefixed with a certain marker/symbol.

  • have a fixed length, which is very useful because it makes it easy to create consistent plots.

Given these thoughts, what naming conventions would you suggest from the perspective of a software engineer who has to deal with these variables in the source code as well?

One example: Let's say you run an ice cream business and have variables for customer data, flavours, etc. Then I would suggest something like: FLAV_CHOCLT, FLAV_MINTXX, FLAV_STRBRY, CUST_PHONEX, CUST_STREET, ...

I would love to hear your thoughts!

Solution

Vague apologies if this comes across as a bit harsh. One thing to remember here: MATLAB is a tool developed by engineers for engineers; it has never really focused on top-quality software engineering, so it definitely encourages what would be considered bad software engineering practices.

My work benefits from names that contain information about the variable's ontological background; this means that if a group of variables belongs together, they could be prefixed with a certain marker/symbol.

From a software engineering point of view, this is a very primitive way to do things. Just about every "serious" software engineering language since the 1960s(? - my knowledge before the 1970s is pretty sparse) has supported composite types, which allow variables that belong together to actually be grouped together rather than relying on naming conventions. MATLAB seems to support structures, which would on the surface seem to be a much better way of handling this.
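To make the composite-type idea concrete, here is a minimal sketch. It is written in Python rather than MATLAB purely for illustration, and every name in it (`Flavours`, `Customer`, `Record`, the field names and the values) is hypothetical, loosely based on the ice cream example from the question. A MATLAB struct would play the same role.

```python
from dataclasses import dataclass, fields


@dataclass
class Flavours:
    # Hypothetical per-flavour measurements (e.g. units sold).
    chocolate: float
    mint: float
    strawberry: float


@dataclass
class Customer:
    # Hypothetical customer attributes.
    phone: str
    street: str


@dataclass
class Record:
    # The grouping lives in the type, not in a name prefix.
    flavours: Flavours
    customer: Customer


record = Record(
    flavours=Flavours(chocolate=12.0, mint=3.5, strawberry=7.25),
    customer=Customer(phone="555-0100", street="1 Example Road"),
)

# "All variables in the flavour group" becomes a lookup on the type
# instead of a string match on a FLAV_ prefix.
flavour_names = [f.name for f in fields(record.flavours)]
```

With this layout, `record.flavours.chocolate` reads as a full word, and the group membership that the `FLAV_` prefix was encoding is carried by the structure itself.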

fixed-length variables are absolutely useful because they allow to easily create consistent plots.

This is putting the cart before the horse. Don't force your code to use unreadable names because they make your plots look better; find a way to make your plots use labels that are not necessarily tied directly to the variable names. Every other programming/plotting language I've used has the ability to do this, so I'm sure MATLAB does as well.
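One way to separate plot labels from variable names is a small lookup table. The sketch below is again in Python for illustration only; `PLOT_LABELS`, `plot_label`, the variable names and the default width of 10 are all assumptions made for this example, not part of any library.

```python
# Hypothetical mapping from readable variable names to display labels.
PLOT_LABELS = {
    "chocolate_sales": "Chocolate",
    "mint_sales": "Mint",
    "strawberry_sales": "Strawberry",
}


def plot_label(variable_name: str, width: int = 10) -> str:
    """Return a display label truncated or padded to exactly `width` characters.

    Falls back to the variable name itself when no label is registered.
    """
    label = PLOT_LABELS.get(variable_name, variable_name)
    return label[:width].ljust(width)


# Fixed-width labels for every registered variable, in insertion order.
labels = [plot_label(name) for name in PLOT_LABELS]
```

The resulting strings can be handed to whatever plotting tool is in use as axis or legend labels, so the fixed-length requirement is met at the presentation layer while the code keeps readable names.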

Then I would suggest something like: FLAV_CHOCLT, FLAV_MINTXX, FLAV_STRBRY, CUST_PHONEX, CUST_STREET, ...

Put very bluntly: if you submit code like that to me for review, I'll immediately reject it and tell you to write some readable code.

Licensed under: CC-BY-SA with attribution