PLF C++ Library - Blog

3-12-2017 - 'auto': Dangerous Gimmick, or merely the innocuous result of too many programmers huffing the 'terse code' paint?

For 30 years, comp sci teachers have been trying to educate programmers out of using meaninglessly terse variable, class and object names. But now the battle appears to be getting lost, as too many programmers fail to recognise that typenames are not an exception.

For the sake of argument, let's say I'm stupid. I like to write code like "int i = 10;". 'i' has no inherent meaning. Sure, I can look at the code and infer the meaning, but that increases cognitive load - or maybe I'm the kind of programmer that takes a perverse pride in my ability to rapidly infer meaning out of meaningless code. If the latter, I'm even dumber than I thought, because I failed to realise that those brain cycles could be better spent on other things.

Let's separate this out into 4 concepts: "terseness", "aesthetics", "readability" and "meaningfulness". Compare and contrast the following two paragraphs:

"Colony is a container with fast insertion and erasure times, which doesn't invalidate pointers to non-erased elements."

versus

"Colony is a container with fast insertion and erasure times, which doesn't invalidate pointers to non-erased elements. This makes it beneficial for programs where there is substantial interlinking between different sets of elements and where elements are being erased and inserted continuously in real time."

The first paragraph is more terse. Aesthetically, there is no substantial difference between the two, unless you aesthetically prefer shorter writings - this is because aesthetics are largely subjective and meaningless, typically based more on familiarity than logic. The readability of both paragraphs is the same - it's just that the second paragraph makes you read more. But the meaningfulness of the second paragraph is greater. The second sentence in the paragraph *can* be inferred from the first, but doing so (a) increases the cognitive load of the reader and (b) increases the chance they might create an incorrect inference in their own minds, purely through the flexibility of the human brain and thinking.

So we can see that readability and aesthetics do not necessarily have anything to do with terseness or meaningfulness, but terseness and meaningfulness can be opposed. And while there are certainly ways of writing which are both more terse and more meaningful at the same time, often there's a tradeoff. Many programmers appear to side with the 'fewer keystrokes' way of thinking, but this is a mistake if you understand that code is always read more than it is written, even if it's a program that only you yourself will ever read. Going over and over your code is made more tedious and more time-consuming if you waste cycles decrypting the semantics of it.

For this reason, teachers of comp sci will tell you to give your variables and classnames real names, ones which indicate their purpose. Even if "i" is a counter in a loop, just calling it "counter" negates other possible meanings of the variable, and makes the code more meaningful as a result. So why, now, are programmers in multiple languages opting for code similar to "auto i = 0;"?

Let's, again, break this down into multiple categories. The two I'm familiar with are "type cascading" and "terseness". We already understand how terseness is problematic, but to go into more depth, look at the following two sections of C++ code:

for (auto i = some_stuff.begin(); i != some_stuff.end(); ++i)
{
   // do stuff with 'i'
}

vs

for (vector<int>::iterator current_stuff = some_stuff.begin(); current_stuff != some_stuff.end(); ++current_stuff)
{
   // do stuff with 'current_stuff'
}

So, we're forcing the reader to infer three things in the first section: the type of 'some_stuff', the type of 'i' (iterator or pointer to an element, presumably) and the meaning of 'i' in the enclosed code. Now, if the loop is reasonably short and uncomplicated, perhaps the omission of a meaningful name for 'i' is not that important - it's right there, after all. If the loop contains a lot of code however, 'i' makes it less meaningful, particularly if there are other meaningless variable names like 'j'. But the worse of the two problems is the type inference. Not only does the reader have to infer the type of 'i' by going back through the code and finding out what 'some_stuff' refers to, but they must now also infer whether 'i' is an iterator or a pointer.

So while we save keystrokes with the first section, and from a certain point of view it could be considered more aesthetically "pleasing", it is also much less meaningful, and creates more cognitive load, rather than less. This is bad code: not because it's terse, but because you waste more human energy and brain cycles in the code's lifetime.
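To make the iterator-or-pointer ambiguity concrete, here is a minimal sketch. 'raw_buffer' is a hypothetical container invented for this illustration (it is not part of plf:: or any real library), and the function names are likewise assumptions. The loop line is character-for-character identical in both functions, yet 'i' is a class-type iterator in one and a raw pointer in the other; the static_asserts are only there to prove what the compiler deduced.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical container (illustration only) whose begin()/end()
// return raw pointers rather than a class-type iterator.
struct raw_buffer
{
   int data[3] = {1, 2, 3};
   int *begin() { return data; }
   int *end() { return data + 3; }
};

// 'auto' loop over a vector: nothing on the loop line says what 'i' is.
int sum_vector(std::vector<int> &some_stuff)
{
   int total = 0;
   for (auto i = some_stuff.begin(); i != some_stuff.end(); ++i)
   {
      static_assert(std::is_same<decltype(i), std::vector<int>::iterator>::value,
                    "here 'i' is a vector iterator");
      total += *i;
   }
   return total;
}

// Identical loop text, but 'i' is now a plain pointer to int.
int sum_buffer(raw_buffer &some_stuff)
{
   int total = 0;
   for (auto i = some_stuff.begin(); i != some_stuff.end(); ++i)
   {
      static_assert(std::is_same<decltype(i), int *>::value,
                    "here 'i' is a raw pointer");
      total += *i;
   }
   return total;
}
```

Both functions compile and behave the same way, which is exactly the point: the reader learns nothing about 'i' from the loop itself and has to go and find the declaration of 'some_stuff' to know which case they are in.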

Type-cascading is another anti-pattern. If you're using 'auto' across multiple functions which call each other in a hierarchy, where the initial type is specified at point 'A', all that means is that at point 'B' you have to navigate back through multiple function calls to find the actual type - wasting time and human energy. It's something that can be done more meaningfully using typedefs, but overall, in terms of code meaningfulness, it should be avoided where possible. Templates are an example of useful type cascading, but it can be taken to extremes, and when it is, the signal-to-noise ratio becomes poor.
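As a rough sketch of what that cascade can look like, assume a small chain of functions with invented names (none of this comes from a real codebase), compiled as C++14 or later so that 'auto' return types are allowed. The real type is written once at point 'A'; by point 'B' the reader has to walk back through two calls to learn it. The typedef version at the end is one possible alternative, not the only one.

```cpp
#include <vector>

// Point 'A': the only place where the real type is spelled out.
std::vector<float> make_samples()
{
   return std::vector<float>(4, 0.5f);
}

// Cascaded 'auto': the return type is whatever make_samples() returns.
auto load_samples()
{
   auto samples = make_samples();
   return samples;
}

// Point 'B': is this a float, a double, an int? The reader has to
// navigate back through load_samples() to make_samples() to find out.
auto first_sample_auto()
{
   auto samples = load_samples();
   return samples.front();
}

// typedef alternative: the names carry the intent, and the actual
// types are one lookup away rather than a chain of function calls.
typedef std::vector<float> sample_list;
typedef sample_list::value_type sample_type;

sample_type first_sample_typedef()
{
   sample_list samples = load_samples();
   return samples.front();
}
```

Both versions return the same value; the difference is only in how much the reader has to chase before knowing what they are holding.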

Now, I'm not saying 'auto' is completely worthless or that there couldn't be some scenario where it's even necessary - but the way it's used in general in C++ is wrong - it increases terseness at the cost of meaningfulness, and sometimes it doesn't even increase terseness (see 'int', 'char')! It comes across to me as a gimmick, something to make the language more attractive to programmers who're used to the same feature being present in another language. But it's a mistake to think that, after all those years of promoting meaningful naming in code, we should leave type definitions out of that endeavour.

plf:: library and this page Copyright (c) 2017, Matthew Bentley