A tour of the C++ standard
Perhaps more than any other language, C++ demands a certain level of intimacy from its programmers. Anybody who feels that they're getting serious with C++ should probably have a copy of the standard at hand. There are two reasons for this: the C++ language and library are fairly complex, so the standard serves as the ultimate reference (when cppreference.com doesn't suffice); and simply compiling a program and running it is often not enough to tell you whether the program is actually correct. Familiarizing yourself with the standard will help you know when to question the correctness of your code.
Getting the standard
Published C++ standards can be purchased from the ISO and ANSI stores. The latest publication, colloquially known as C++11, is formally titled ISO/IEC 14882:2011.
For many, purchasing the standard is both inconvenient and unnecessary. Many papers that come out of the standard committee (known as N-documents), including drafts of the standard, are available for free. The drafts on either side of a publication are usually very similar to it, save for a few small changes. For example, draft N3337 is the same as the C++11 standard but with a few typographical corrections. The official ISO C++ committee website also maintains a link to the latest draft. The LaTeX source is also available on GitHub in the cplusplus/draft repository.
This tour will be based on the final draft of C++14, N3936, which is the latest draft available at the time of writing. Much of this general overview will be the same as for earlier standards.
References
Sections in the C++ standard are both numbered and labelled. Either can be used for referencing, where §5.2.2 and [expr.call], for example, both refer to the “Function call” section. Labels tend to be easier to recognize and stay consistent as the content of the standard changes, but numbers are easier to locate. Sometimes it's useful to provide both. Paragraphs are also numbered, so §5.2.2/3 is typically used to refer to the 3rd paragraph of that section.
Language and library
The C++ standard has 30 chapters, which can seem a little daunting at first. It could, however, easily be split into two major parts. Chapters 2–16 constitute the language itself, defining how a compiled program should behave, while Chapters 17–30 define all the types and functions made available in the standard library.
Chapter 1 is an introduction to the standard, defining a few common terms and introducing some fundamental concepts, such as the C++ object and memory models. Section 1.4 [intro.compliance] describes what it means for an implementation to be compliant. In simple terms, an implementation needs to produce programs that meet all of the language rules defined in Chapters 2–16. The language rules are generally independent of the standard library, so a freestanding implementation need only provide a minimal subset of the library (as described in §17.6.1.3 [compliance]). However, most implementations are hosted and aim to provide C++ in its entirety.
Lexing and preprocessing
Chapter 2 [lex] describes how we take a source file and split the content up into a series of tokens. For example, keywords and literals are all tokens. If you wanted to see a list of keywords or which characters are allowed in an identifier, this is where you'd look.
Sections 2.1 [lex.separate] and 2.2 [lex.phases] give an overview of the full process of compiling and linking a source file. Before compilation, the source file is broken down into preprocessing tokens, which include preprocessing directives (like #include and #define). The behavior of these preprocessing directives is described in Chapter 16 [cpp]. I always thought this chapter should go near the beginning but I suppose it's considered a kind of appendix to the language chapters.
The basics
Chapter 3 [basic] introduces some basic concepts, such as the definition of a name, an object, a variable, and so on. It describes the properties of an object, such as its type, linkage, and storage duration. It also describes how a name is looked up to find its declaration.
Chapter 4 [conv] defines the standard conversions between built-in types. For example, if you want to know about implicit conversions between integer types, this is where you want to look. A commonly cited section, due to the mass confusion about how arrays and pointers interact, is §4.2 [conv.array].
Core language
Chapters 5–8 are what I think of as the core of the language. Every bit of code you write is made up of these things. They describe expressions, statements, declarations, and declarators respectively.
After a source file has been preprocessed, at the very basic level it is simply a sequence of declarations, including variables, namespaces, and classes. Some of those declarations will be function definitions. Function definitions have a function body, which is a sequence of statements within curly braces. Statements are {…} blocks, if, switch, while, for, etc. - all those things that you see in the procedural parts of your code. Some declarations are also valid statements, which is how we're able to declare variables and other things within functions.
An expression is what performs operations on objects. Expressions are made up of operators being applied to the results of subexpressions. For example, *p + 5 is an expression which applies the addition operator to the results of the subexpressions *p and 5. Expressions are also valid statements, which is why we can write something like x = 5 * y; in a function body. Chapter 5 [expr] is where you'll find everything about expressions. If you want to know what an operator does, what it returns, or whether an expression is an lvalue or rvalue, this is where you look.
    int x; // declaration
    struct y; // declaration
    void foo() // declaration (a function definition)
    {
        if (...) { ... } // statement
        int z = 5; // a statement which is a declaration
        x = z; // a statement which is an expression
    }
The difference between a declaration (§7) and a declarator (§8) may not be immediately clear - the terminology certainly doesn't help. Most variable and function declarations have at least one declarator. The declaration int x; has one declarator, while int x, y; has two. That is, each declarator introduces an entity to your program. You may have seen something like int* p1, p2; and found out that only p1 is a pointer, while p2 is a plain int. That's because the * character is part of the declarator of p1. If you wanted both to be pointers, you'd need int *p1, *p2;. In fact, in a more complex variable declaration like const short& (*f)(int, char);, only const short belongs to the entire declaration. The rest belongs to the declarator of f.
Classes
Chapters 9 through 12 cover everything to do with classes. Chapter 9 [class] describes how a class definition is formed. Chapter 10 [class.derived] gives the syntax and rules of inheritance, including how member names are looked up, as this process is complicated by inheritance. Chapter 11 [class.access] describes member access control using private, protected, and public and how to determine whether a particular member is accessible within a particular context. Finally, Chapter 12 [special] describes special member functions, such as constructors, destructors, conversion operators, and so on. The intriguing rules of copy/move elision are described here in §12.8 [class.copy].
Overload resolution
Chapter 13 [over] gives the rules of overloading. That is, it describes when multiple declarations with the same name are valid together, and how a particular declaration will be chosen when more than one is available.
Templates
Chapter 14 [temp] is the largest language chapter in the standard because templates are such a complex topic. This chapter defines the template syntax for functions and classes, describes how template arguments bind to template parameters, how name resolution is affected in the context of a template, template specialization, and more.
Exceptions
Chapter 15 [except] describes the exception handling mechanisms supported by C++. It defines the syntax for throwing and catching exceptions and how the propagation of exceptions is performed.
The Standard Library
As described earlier, the latter half of the standard (Chapters 17 through 30) is devoted to the standard library. That is, these chapters define every class, function, and template you find in the std namespace. Chapter 17 [library] serves as an introduction and introduces some important requirements that are enforced throughout the library. For example, §17.6.4.2.1 [namespace.std] tells you that you generally cannot add declarations to std (certain template specializations for user-defined types are the exception). The library also defines syntactic and semantic requirements on types, and §17.6.3 [utility.requirements] introduces some common ones, such as what it means to be EqualityComparable or to be an Allocator.
The rest of these chapters are all named “… library” and all have pretty much the same structure. They introduce a few related requirements, a synopsis of the contents of a particular header, and then define the operation of types and functions in that header in terms of effects, preconditions, and postconditions.
Let's look at §21 [strings], the “Strings library” chapter. It first describes character traits and the requirements for a traits type. It then gives an overview of the <string> header in code form, so you can quickly take a look at all the declarations introduced by this header. The remaining sections in this chapter then define the behavior of these declarations. For example, §21.4 [basic.string] defines the std::basic_string template (of which std::string is a particular instantiation) and it says, for example, that a_string.front() is equivalent to a_string[0]. This is where you would look if you wanted to know how a particular function behaves, including whether or not it might throw.
The remaining chapters each cover the various other headers in the standard library:
- Language support: Various headers containing functions and types that may be used implicitly by a program, such as std::initializer_list and the storage allocation functions that are used by new and delete expressions. The language depends on many of these entities, so they are required by all conforming implementations.
- Diagnostics: Entities that provide error handling, exceptions, and assertion functionality.
- General utilities: Entities that are likely to be useful in general, such as smart pointers, std::tuple, std::pair, and the <chrono> time utilities.
- Strings: The <string> header provides std::basic_string, string manipulation functions, and support for using std::string with the I/O streams.
- Localization: The <locale> header with support for localizing numerical and time values.
- Containers: Provides the many container headers, such as <vector>, <array>, <list>, and so on, for storing sequences of elements.
- Iterators: Defines the requirements on different iterator categories and some useful iterator types in <iterator>.
- Algorithms: The <algorithm> header contains various generic algorithms that can be applied to sequences of elements, such as std::for_each, std::find, and std::transform.
- Numerics: Defines the <complex> header for complex numbers, <random> for high quality random number generation, and other useful numeric utilities. Also includes the <cmath> C library header for common math functions.
- Input/output: Provides various I/O stream headers, including the formatted streams (like std::cin and std::cout) and file streams.
- Regular expressions: The <regex> header provides functionality for matching and searching for regular expressions within strings.
- Atomic operations: Types and functions in <atomic> are used to ensure atomic access, particularly useful with the threads library.
- Thread support: Provides the <thread> header for platform-independent thread support for multitasking and concurrent execution.
Finding your way around
You hopefully have a reasonable idea of how the C++ standard is structured now, but it's still going to be quite hard to navigate your way around. It can sometimes be pretty difficult to find what you're looking for.
For example, let's say you want to find out why you can't have a std::vector of reference type elements (T&). You might be inclined to look at §23.3.6 [vector], which defines std::vector, but you won't find the answer there. Perhaps it's a requirement on a particular member function? Or maybe it's a requirement on containers in general? If you follow the cross-references for long enough, what you'll find is that std::vector meets the requirements of an allocator-aware container, which means that it allocates its contents using an allocator. If you look at the requirements for an allocator (§17.6.3.5 [allocator.requirements]), you'll see that an allocator is only defined for non-const object types. If you then take a look at §3.9 [basic.types], you'll see that object types do not include references. There's your answer. Phew.
The standard can sometimes be a bit maze-like. The only advice I have is to get used to it by reading it often. You'll soon figure out the best way to find what you want. If you can't find something, you can always write some example code that exhibits whatever you don't understand and ask on Stack Overflow.
Keeping up to date
Before and after each C++ committee meeting, papers that have been written recently are mailed out to everybody. These mailings are available on the committee website. Every now and then, one of those papers is a working draft of the C++ standard. Subscribe to the Standard C++ RSS feed to find out when new papers are mailed out.