C++ Compilation and Linkage
- Description: A note on how C++ programs are built — the compilation pipeline, translation units, the One Definition Rule, linkage, headers, the preprocessor, and static vs dynamic linking
- My Notion Note ID: K2A-B1-2
- Created: 2018-09-15
- Updated: 2026-02-28
- License: Reuse is very welcome. Please credit Yu Zhang and link back to the original on yuzhang.io
Table of Contents
- 1. The Compilation Pipeline
- 2. Translation Units
- 3. The One Definition Rule (ODR)
- 4. Linkage: Internal vs External
- 5. Header Files and Header Guards
- 6. The Preprocessor
- 7. Static vs Dynamic Linking
- 8. Name Mangling
1. The Compilation Pipeline
- 4 phases from .cpp to executable:
- Preprocessing: #include, #define, conditional compilation. Output: a single text translation unit (macros expanded, headers inlined).
- Compilation: TU → assembly. Each TU is compiled independently; the compiler can't see other TUs.
- Assembly: asm text → machine code in an object file (.o/.obj). Usually integrated with the compile step.
- Linking: object files + libraries → executable / shared lib. Resolves cross-TU symbol references.
foo.cpp ──[preprocess]──► foo.i ──[compile]──► foo.o ──┐
bar.cpp ──[preprocess]──► bar.i ──[compile]──► bar.o ──┼─[link]─► program
                                            stdlib.a ──┘
- Each step matters: macro hygiene (preprocessor), inline/templates (compile), static/extern/lib order (link).
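A minimal sketch of what each phase touches, kept in one file so it compiles on its own. The file name, `SCALE_FACTOR`, `scaled`, and `total` are made up for illustration; the `-E`, `-S`, and `-c` flags in the comments are the usual GCC/Clang way to stop after a given phase.

```cpp
// pipeline_demo.cpp (hypothetical file name)
// Stopping GCC/Clang after each phase:
//   g++ -E pipeline_demo.cpp -o pipeline_demo.i   # preprocess only
//   g++ -S pipeline_demo.cpp -o pipeline_demo.s   # compile to assembly text
//   g++ -c pipeline_demo.cpp -o pipeline_demo.o   # compile + assemble, no link

#define SCALE_FACTOR 2   // preprocessing: each SCALE_FACTOR below becomes the literal 2

int scaled(int x);       // compilation: a declaration is enough to emit a call to scaled()

int total() {
    return scaled(5) + SCALE_FACTOR;
}

// In a multi-file program this definition could sit in another .cpp;
// the linker would then resolve the call made in total().
int scaled(int x) {
    return x * SCALE_FACTOR;
}
```

Looking at the `-E` output is a quick way to confirm that no trace of `SCALE_FACTOR` survives preprocessing.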
2. Translation Units
- TU = preprocessed source: one source file + all #included headers, macros expanded. The unit the compiler sees.
Key consequences:
- Each TU compiled independently. Compiler doesn't know what's in other TUs.
- Headers #included in many TUs are reparsed each time → slow C++ builds (modules fix this).
- static at namespace scope = internal to this TU. Invisible to other TUs.
- Templates instantiated in a TU live in that TU → template definitions go in headers.
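The template point can be sketched as follows. `clamp_to`, `clamp_percent`, and `clamp_unit` are hypothetical names; in a real project the template would sit in a header so that every TU using it sees the full definition.

```cpp
// Would normally live in a header (say, a hypothetical "clamp_to.h"),
// because the compiler needs the full definition wherever it instantiates it.
template <typename T>
T clamp_to(T value, T lo, T hi) {
    return value < lo ? lo : (hi < value ? hi : value);
}

// Each TU instantiates only the specializations it actually uses.
int clamp_percent(int v) {
    return clamp_to(v, 0, 100);      // instantiates clamp_to<int> in this TU
}

double clamp_unit(double v) {
    return clamp_to(v, 0.0, 1.0);    // instantiates clamp_to<double> in this TU
}
```

If only a declaration of `clamp_to` were visible here, the compiler could not generate either instantiation and the link would typically fail with an undefined symbol.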
3. The One Definition Rule (ODR)
- Most important linking rule:
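Every variable, function, class type, enumeration, and template may be defined at most once per translation unit, and every non-inline function or variable that is used must have exactly one definition across the entire program.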
Nuances:
- Declarations unrestricted: declare in 100 headers, define once.
- inline functions, inline variables (C++17), classes, templates: multi-TU definitions OK, but every definition must be token-for-token identical. Linker picks one.
- Anonymous namespaces and static: TU-local. Not subject to cross-TU "exactly one" rule.
Common violations:
// foo.h
int x = 42; // BUG: definition in header
inline int safe_x = 42; // OK: 'inline' allows multi-TU definition
void f() { /* ... */ } // BUG: non-inline definition in header
inline void g() { /* ... */ } // OK
- 2 non-inline defs of f() → linker "multiple definition" error.
- 2 different inline definitions (e.g., compiled with different macros) → UB; linker won't detect.
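One common way to fix the header bugs above is to declare in the header and define once in a .cpp, or to mark the definition inline. A sketch under assumptions: the file names in the comments and the names `request_count`, `max_requests`, and `remaining` are hypothetical, the two files are shown in one block so it compiles on its own, and inline variables need C++17.

```cpp
// ---- would live in a hypothetical "counter.h" ----
extern int request_count;        // declaration only: safe to include in every TU
inline int max_requests = 100;   // C++17 inline variable: a definition is allowed in every TU

inline int remaining() {         // inline function: same rule, same entity in all TUs
    return max_requests - request_count;
}

// ---- would live in a hypothetical "counter.cpp" ----
int request_count = 0;           // the single non-inline definition
```

Every TU that includes the header then refers to the same `request_count` and the same `max_requests`.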
4. Linkage: Internal vs External
- Linkage = whether a name refers to the same entity from a different scope. 3 kinds:
| Linkage | Visibility | How to declare |
|---|---|---|
| No linkage | Block scope only (locals) | Default for local variables |
| Internal | One TU only | static at namespace scope, anonymous namespace, const namespace-scope variables (without extern) |
| External | All TUs in the program | Default for non-static namespace-scope names; extern is implicit; functions are external by default |
// translation unit 1
static int counter = 0; // internal — invisible to TU 2
namespace { void helper(); } // also internal (preferred modern style)
int g_count = 0; // external — TU 2 can extern-declare it
void g_init(); // external: function declarations are implicitly extern
- Modern idiom for internal linkage: unnamed namespace, not file-scope static:
namespace {
int helper_count = 0;
void helper() { /* ... */ }
}
- Unnamed namespaces also work for types: static is a storage-class specifier (objects/functions only, not types).
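To illustrate the point about types, here is a small sketch in which a helper struct is kept TU-local through an unnamed namespace while a single function stays visible to other TUs. `DepthTracker` and `nesting_depth` are hypothetical names chosen for the example.

```cpp
namespace {

// A helper type only this TU can name; 'static struct' is not an option.
struct DepthTracker {
    int current = 0;
    int deepest = 0;
    void open()  { if (++current > deepest) deepest = current; }
    void close() { if (current > 0) --current; }
};

}  // namespace

// External linkage: the only name other TUs can refer to.
int nesting_depth(const char* text) {
    DepthTracker tracker;
    for (const char* p = text; *p != '\0'; ++p) {
        if (*p == '(') tracker.open();
        else if (*p == ')') tracker.close();
    }
    return tracker.deepest;
}
```

Another TU could define its own unrelated `DepthTracker` in its own unnamed namespace without any ODR conflict.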
5. Header Files and Header Guards
- Headers hold declarations + inline/template/class definitions shared across TUs.
- Without guards → double-include causes redefinition errors.
Two equivalent guards:
// header.h — using #pragma once (non-standard but widely supported)
#pragma once
void foo();
// header.h — using include guards (portable, standard)
#ifndef MYPROJECT_HEADER_H
#define MYPROJECT_HEADER_H
void foo();
#endif // MYPROJECT_HEADER_H
- #pragma once: shorter, no macro collision risk. All major compilers support it. Use unless targeting exotic toolchain.
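Repeating a plain declaration such as void foo(); is harmless, so the redefinition errors show up once a header contains a definition, for example a class. The sketch below simulates the text of a hypothetical point.h being pasted into one TU twice; `Point`, `manhattan`, and the guard macro name are assumptions made for illustration.

```cpp
// First inclusion of the hypothetical "point.h": guard macro not yet defined.
#ifndef MYPROJECT_POINT_H
#define MYPROJECT_POINT_H
struct Point { int x; int y; };
#endif  // MYPROJECT_POINT_H

// Second inclusion (e.g. pulled in indirectly by another header):
// the macro is already defined, so the preprocessor skips the body.
#ifndef MYPROJECT_POINT_H
#define MYPROJECT_POINT_H
struct Point { int x; int y; };   // never reaches the compiler
#endif  // MYPROJECT_POINT_H

int manhattan(Point p) {
    int ax = p.x < 0 ? -p.x : p.x;
    int ay = p.y < 0 ? -p.y : p.y;
    return ax + ay;
}
```

Deleting the guard lines turns the second struct Point into a redefinition error within the same TU.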
What goes in headers
| Goes in headers | Stays in .cpp |
|---|---|
| Function declarations | Function definitions (non-inline) |
| Class definitions | Implementation details |
| inline functions | Static globals (file-scope state) |
| Templates | Anonymous namespace contents |
| inline variables (C++17) | Mutable globals |
| constexpr functions and variables | |
| Type aliases (using, typedef) | |
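The split in the table might look like this in practice. A sketch only: the file names in the comments and the names `Readings`, `kFreezingC`, `Thermometer`, and `is_below_freezing` are hypothetical, and both files are shown in one block so it compiles on its own.

```cpp
// ---- would live in a hypothetical "temperature.h" ----
#include <vector>

using Readings = std::vector<double>;   // type alias
constexpr double kFreezingC = 0.0;      // constexpr variable

class Thermometer {                     // class definition
public:
    void record(double celsius);        // member declarations only
    int count_below_freezing() const;
private:
    Readings readings_;
};

// ---- would live in a hypothetical "temperature.cpp" ----
namespace {
// Implementation detail: TU-local, never exposed through the header.
bool is_below_freezing(double celsius) { return celsius < kFreezingC; }
}  // namespace

void Thermometer::record(double celsius) { readings_.push_back(celsius); }

int Thermometer::count_below_freezing() const {
    int count = 0;
    for (double celsius : readings_) {
        if (is_below_freezing(celsius)) ++count;
    }
    return count;
}
```

Changing `is_below_freezing` then only rebuilds the one .cpp, while changing anything in the header rebuilds every TU that includes it.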
6. The Preprocessor
- Runs before the compiler. Pure text substitution. No concept of types/scope.
#include <header> // include another file
#include "local.h"
#define MAX 100 // macro: text substitution
#define SQUARE(x) ((x) * (x)) // function-like macro (note the parens!)
#ifdef DEBUG // conditional compilation
log("debug");
#elif defined(RELEASE)
log("release");
#else
log("unknown");
#endif
#if __cplusplus >= 202002L
// C++20 and later
#endif
#error "unsupported config" // compile-time error
#warning "deprecated" // compile-time warning (standardized in C++23; widely supported as an extension before)
#pragma once
Predefined macros
| Macro | Meaning |
|---|---|
| __cplusplus | C++ standard version (e.g. 202002L for C++20) |
| __FILE__, __LINE__ | Current file path and line |
| __func__ | Current function name (C99/C++11); a predefined identifier rather than a true macro |
| __DATE__, __TIME__ | Build timestamp |
| _WIN32, __linux__, __APPLE__ | Platform |
| __GNUC__, _MSC_VER, __clang__ | Compiler |
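A short sketch of how these are typically used. `SOURCE_LOCATION`, `where_am_i`, `platform_name`, and `standard_version` are hypothetical helpers written for this note; only the predefined names from the table are real.

```cpp
#include <string>

// Builds "file:line" for the place where the macro is expanded.
#define SOURCE_LOCATION() (std::string(__FILE__) + ":" + std::to_string(__LINE__))

std::string where_am_i() {
    // __func__ is the name of the enclosing function, here "where_am_i".
    return std::string(__func__) + " at " + SOURCE_LOCATION();
}

// Platform detection through conditional compilation.
const char* platform_name() {
#if defined(_WIN32)
    return "windows";
#elif defined(__APPLE__)
    return "apple";
#elif defined(__linux__)
    return "linux";
#else
    return "unknown";
#endif
}

// Language standard the TU is being compiled as (e.g. 201703L for C++17).
long standard_version() { return __cplusplus; }
```

The location has to come from a macro because `__LINE__` is substituted at the point of expansion; inside an ordinary function it would always report the helper's own line.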
Macros are dangerous
- No type safety, and arguments may be evaluated more than once: SQUARE(x++) expands x++ twice (undefined behavior).
- No scoping: #define MIN in a header pollutes every TU including it.
- Hard to debug: debugger sees post-expansion text, not the macro name.
- Modern C++ replaces macros with constexpr (constants), inline (functions), templates (generics). Reserve macros for include guards, conditional compilation, platform abstraction.
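The double-evaluation problem can be observed safely by passing a function call with a visible side effect instead of x++ (modifying x twice in one expression is undefined behavior, so it makes a poor demo). `square`, `next_value`, `macro_calls`, and `function_calls` are hypothetical names for this sketch.

```cpp
#define SQUARE(x) ((x) * (x))                   // the macro from the section above

constexpr int square(int x) { return x * x; }   // typed replacement; argument evaluated once

static_assert(square(4) == 16, "square is usable in constant expressions");

int calls = 0;                                  // counts how often next_value() runs
int next_value() { ++calls; return 3; }

// Returns how many times the argument expression was evaluated by the macro.
int macro_calls() {
    calls = 0;
    int result = SQUARE(next_value());          // expands to ((next_value()) * (next_value()))
    (void)result;
    return calls;
}

// Same measurement for the constexpr function.
int function_calls() {
    calls = 0;
    int result = square(next_value());
    (void)result;
    return calls;
}
```

Here `macro_calls()` reports two evaluations and `function_calls()` reports one, which is the practical argument for preferring the constexpr function.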
7. Static vs Dynamic Linking
| Aspect | Static linking | Dynamic linking |
|---|---|---|
| File extension | .a (Unix), .lib (Windows) | .so (Linux), .dylib (macOS), .dll (Windows) |
| What gets into your binary | Library code is copied in | Just a reference; OS loads the shared library at runtime |
| Binary size | Larger | Smaller |
| Startup time | Faster (no DSO load) | Slower (dynamic loader resolves symbols at startup) |
| Updates | Need to relink to upgrade | Drop-in replacement of .so (if ABI-compatible) |
| Symbol conflicts | Can hide internal symbols | Whole-library symbol table exposed |
| Distribution | Self-contained | Need to ship/install the .so |
- Static — CLI tools, small binaries.
- Dynamic — OS-shipped libs (libc, OpenSSL), plugin systems.
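On the symbol-exposure row: shared libraries often limit what they export with a per-platform macro. The sketch below assumes a hypothetical library with a made-up `MYLIB_API` macro and functions `mylib_version` and `build_number`. The attribute and `__declspec` spellings are the usual GCC/Clang and MSVC ones, and on GCC/Clang the pattern is normally combined with building under `-fvisibility=hidden`.

```cpp
// Export macro for a hypothetical shared library.
#if defined(_WIN32)
#  define MYLIB_API __declspec(dllexport)                     // Windows DLL export
#elif defined(__GNUC__)
#  define MYLIB_API __attribute__((visibility("default")))    // GCC/Clang: keep symbol visible
#else
#  define MYLIB_API                                           // unknown toolchain: no annotation
#endif

namespace {
// Internal linkage: never appears in the library's exported symbol table.
int build_number() { return 7; }
}  // namespace

// Part of the public interface: explicitly exported.
MYLIB_API int mylib_version() { return 100 + build_number(); }
```

This is simplified: a real Windows header would usually switch between `dllexport` when building the library and `dllimport` when consuming it.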
8. Name Mangling
- Linker knows only by name → compiler mangles to make overloads/namespaces unique. Encodes signature into symbol name.
namespace ns {
int add(int, int);
double add(double, double);
}
// gcc/clang mangled names:
// _ZN2ns3addEii ns::add(int, int)
// _ZN2ns3addEdd ns::add(double, double)
Demangling:
echo "_ZN2ns3addEii" | c++filt # → ns::add(int, int)
nm --demangle obj.o # list demangled symbols
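The same demangling is available from code on GCC and Clang through `abi::__cxa_demangle` in `<cxxabi.h>`. This is a toolchain-specific sketch (the header is not part of the C++ standard and is not available with MSVC), and the `demangle` wrapper is a hypothetical helper.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang only)
#include <cstdlib>    // std::free
#include <string>

// Returns the human-readable form of an Itanium-ABI mangled name,
// or the input unchanged if it cannot be demangled.
std::string demangle(const char* mangled) {
    int status = 0;
    // With null buffer arguments the function allocates the result with malloc.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return mangled;
    }
    std::string result(text);
    std::free(text);   // the caller owns the malloc'd buffer
    return result;
}
```

For example, `demangle("_ZN2ns3addEii")` gives `ns::add(int, int)`, matching the c++filt output above.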
- Mangling = why C++ libs aren't directly C-callable. extern "C" → stable un-mangled name:
extern "C" {
void my_api(int); // symbol name: my_api (not mangled)
}
- Cost: no overloading and no namespace in the symbol name, so name collisions become possible. Still the standard way to provide a C-callable interface to a C++ library.
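A sketch of how the overloaded ns::add functions from earlier could be exposed to C. Because extern "C" names cannot be overloaded, each overload gets its own wrapper; the wrapper names `ns_add_int` and `ns_add_double` are hypothetical, with the namespace folded into a prefix by hand.

```cpp
namespace ns {
int add(int a, int b) { return a + b; }            // mangled: _ZN2ns3addEii
double add(double a, double b) { return a + b; }   // mangled: _ZN2ns3addEdd
}  // namespace ns

extern "C" {
// One uniquely named wrapper per overload; the symbols are exactly
// ns_add_int and ns_add_double, so a C caller can declare and call them.
int ns_add_int(int a, int b) { return ns::add(a, b); }
double ns_add_double(double a, double b) { return ns::add(a, b); }
}
```

A matching C header would declare the two wrappers, usually wrapped in #ifdef __cplusplus so the same file works from both languages.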