Need a value? Pass by value

23rd December 2014

C++ experts have a tendency to over-complicate the language's best practices through rigorous academic discussions about the optimality of every possible line of code. This makes it difficult to learn C++ because the rules are complicated and a consensus is difficult to find.

One particular case that I have come across a lot recently is the choice of whether to take a parameter by value or by const reference. Both can be used for the same purpose: to get the value of the passed argument. After all, either will accept any object of the correct type (whether temporary or not). For a long time, passing by const reference was popular for anything that wasn't a primitive type, to avoid expensive copies. Since C++11 and the introduction of move semantics, the commonly accepted practice is to pass by value only when your function is going to need a copy anyway, and to pass by const reference otherwise. Passing by value in that case allows the move constructor to be used when the caller passes a temporary object and allows the function to use std::move internally where appropriate. More recently, Herb Sutter gave evidence that using a const reference might actually be more efficient in certain situations. I couldn't blame a person for being confused by all these different ideas about how to achieve optimal code.
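To make the competing conventions concrete, here is a minimal sketch. Everything in it (the Counted type, the global counters, and the function and struct names) is hypothetical and exists only to count how many copies and moves each signature triggers; it is an illustration under those assumptions, not a benchmark.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical instrumentation, for illustration only.
int g_copies = 0;
int g_moves = 0;

// A small type that records how it was constructed.
struct Counted {
    std::string text;
    explicit Counted(std::string t) : text(std::move(t)) {}
    Counted(const Counted& other) : text(other.text) { ++g_copies; }
    Counted(Counted&& other) noexcept : text(std::move(other.text)) { ++g_moves; }
};

// Read-only use through a const reference: never copies.
std::size_t size_by_const_ref(const Counted& c) { return c.text.size(); }

// Read-only use by value: copies an lvalue argument; a temporary is
// moved from, or the move is elided altogether.
std::size_t size_by_value(Counted c) { return c.text.size(); }

// A "sink" that needs its own copy, written the post-C++11 way:
// take by value, then move the parameter into place.
struct KeepsByValue {
    Counted kept;
    explicit KeepsByValue(Counted c) : kept(std::move(c)) {}
};

// The same sink taking a const reference: it always copies,
// even when the caller passes a temporary.
struct KeepsByConstRef {
    Counted kept;
    explicit KeepsByConstRef(const Counted& c) : kept(c) {}
};
```

Given a named object, KeepsByValue costs one copy plus one move, while KeepsByConstRef costs one copy. Given a temporary, KeepsByValue needs no copy at all, while KeepsByConstRef still copies. That asymmetry is the reasoning behind the "pass by value only when you need a copy anyway" advice.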

However, there are a few problems with these approaches:

We now have two things, value types and const references, that say the same thing to the caller: I need the value of your object. The choice of which is used often depends on what the function does to that value internally, thereby leaking internal details that the caller shouldn't need to care about. A new C++ programmer has to understand this distinction to avoid confusion and to know when to use each approach. A beginner shouldn't have to know the details of move semantics and copy optimizations just to write some function parameters.
A const reference doesn't succinctly express its intention. It gets access to the caller's object and only transitively gets access to that object's value, which is more information than is required. A const reference could mean something else. Is the function going to keep this reference? Do you need to ensure an extended lifetime for your object? This isn't clear from the parameter types alone. Passing by value, on the other hand, is very expressive: it cares only about the value provided.
These approaches attempt to optimize before we even know that we need to. This is the very definition of premature optimization, which we all know is the "root of all evil". Most copies have minimal costs, which are often mitigated by copy elision and move semantics anyway. Unless you're copying very frequently in a time- or space-critical part of your program, do you really need to complicate your interfaces for the sake of unnecessary optimizations? For the most part, optimization should be the compiler's job; the programmer should only step in when necessary.
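The second problem, that a const reference sees the caller's object rather than just its value, can be shown with a small aliasing sketch. The two functions below are hypothetical and differ only in how they take the suffix parameter; the point is that their results can diverge when the argument happens to live inside the container being modified.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Takes the suffix by const reference: the function observes the caller's
// object, so it also observes any change made to that object during the call.
void append_to_all_by_const_ref(std::vector<std::string>& items,
                                const std::string& suffix) {
    for (auto& item : items) {
        item += suffix;
    }
}

// Takes the suffix by value: the function works with a snapshot that
// nothing else can change.
void append_to_all_by_value(std::vector<std::string>& items,
                            std::string suffix) {
    for (auto& item : items) {
        item += suffix;
    }
}
```

Starting from {"a", "b"} and passing the first element as the suffix, the by-value version yields {"aa", "ba"}, as most callers would expect. The const reference version yields {"aa", "baa"}, because the suffix changed as soon as the first element was modified. Nothing in the signature warns the caller about this.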

For these reasons, I consider a const reference argument being used for the sake of optimization a messy optimization. It's an optimization that trades away the simplicity and readability of your interfaces, so it had better be worth it. Simplicity and readability come first — optimize later when you measure that there's a performance problem.

So what if you do measure that there's a problem? Are you copying objects around that don't really need to be copied? Are those copies expensive? If so, I don't think that const references should be your first port of call. First try something like the copy-on-write mechanism, in which the internals of an object are only copied when written to. This way, an expensive copy is only performed when necessary and your function interfaces remain exactly the same (the copy-on-write is hidden from the user). If copy-on-write doesn't help — and there is evidence that it might lead to bad performance when implemented for multi-threaded environments — then try something else. Sure, const references might be appropriate at this point. However, you should only use them when necessary, isolate them to a specific region of your code, and document their use. A reference parameter intrudes on the caller's space. I don't want to have to wonder why your function wants a reference to my object — just tell me.
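For readers who haven't seen copy-on-write before, here is a minimal single-threaded sketch of the idea built on std::shared_ptr. The CowText class and the with_suffix function are hypothetical names chosen for this illustration, and the use_count check is not safe if copies are shared across threads, which is one source of the multi-threading cost mentioned above.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical copy-on-write wrapper: copies share one buffer until
// one of them is written to. Single-threaded use is assumed.
class CowText {
public:
    explicit CowText(std::string s)
        : data_(std::make_shared<std::string>(std::move(s))) {}

    // Reading never copies the underlying string.
    const std::string& get() const { return *data_; }

    // Writing first makes sure this object owns its buffer exclusively.
    void append(std::string more) {
        detach();
        *data_ += more;
    }

    bool shares_storage_with(const CowText& other) const {
        return data_ == other.data_;
    }

private:
    void detach() {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::string>(*data_);
        }
    }

    std::shared_ptr<std::string> data_;
};

// The interface stays "pass by value": copying a CowText only copies a
// pointer, and the expensive copy happens inside append, once.
CowText with_suffix(CowText text, std::string suffix) {
    text.append(std::move(suffix));
    return text;
}
```

A caller can write `CowText longer = with_suffix(original, "!");` and the original is left untouched. The deep copy is deferred until the write inside the function, and a function that only reads its CowText parameter never pays for one.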

This effectively leaves only one use for const references: when we actually require immutable access to the caller's object. That is, when the caller's object itself, not only its value, is important to us. In these cases, we typically want to track changes to the value of that particular object. Alternatively, we might want access to an object that we really can't copy, perhaps because copying doesn't make sense for that type or because we're working in a very low-memory environment. These situations occur significantly less often than we use const references today.
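As a rough sketch of those two remaining cases, the code below shows a watcher that tracks one particular object and a function that reads from a type that cannot be copied. Sensor, SensorWatcher, LogFile, and describe are all hypothetical names; the deleted rvalue overload is one possible way, assumed here, to stop a temporary from being watched after it has been destroyed.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Sensor {
    double reading = 0.0;
};

// Case 1: the identity of the caller's object matters. The watcher must
// see later changes to that specific Sensor, so a copy would be useless.
// The caller is responsible for keeping the Sensor alive.
class SensorWatcher {
public:
    explicit SensorWatcher(const Sensor& sensor)
        : sensor_(&sensor), last_seen_(sensor.reading) {}

    // Reject temporaries, which would leave the watcher dangling.
    explicit SensorWatcher(const Sensor&&) = delete;

    bool changed_since_last_check() {
        const bool changed = sensor_->reading != last_seen_;
        last_seen_ = sensor_->reading;
        return changed;
    }

private:
    const Sensor* sensor_;
    double last_seen_;
};

// Case 2: copying makes no sense for the type, so by value is impossible.
class LogFile {
public:
    explicit LogFile(std::string path) : path_(std::move(path)) {}
    LogFile(const LogFile&) = delete;
    LogFile& operator=(const LogFile&) = delete;

    const std::string& path() const { return path_; }

private:
    std::string path_;
};

std::string describe(const LogFile& file) {
    return "log at " + file.path();
}
```

In both cases the reference in the signature carries real meaning: the function genuinely needs the caller's object, not just a value.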

So let's keep it simple: Need a value? Pass by value. It's a rule that is easy to understand, simple to teach, and gives us safe interfaces and readable code. Messy optimizations should only be used when necessary.