- Description: A note on C-style strings (char*), std::string, std::string_view (C++17), string conversions, and modern formatting (std::format, std::print)
- My Notion Note ID: K2A-B1-4
- Created: 2020-04-10
- Updated: 2026-02-28
- License: Reuse is very welcome. Please credit Yu Zhang and link back to the original on yuzhang.io
1. C-Style Strings (char*)
- C-style string = contiguous chars terminated by '\0'. The null terminator is what distinguishes it from a generic char*: it tells C string fns where the text ends.
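#include <cstring>

const char* s = "hello";    // points at a read-only string literal
char buf[10] = "hi";        // writable array; unused bytes are zero-filled

std::strlen(s);             // 5: scans until '\0'
std::strcmp(s, "hi");       // negative: "hello" sorts before "hi"
std::strcpy(buf, "world");  // fits here (6 bytes incl. '\0'), but strcpy never checks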
- char* = pointer to one or more chars. Whether it's "a string" depends on null termination:
  - nullptr: no string. Just a null pointer.
  - char* pointing to bytes with '\0' inside: C string. Length = position of the first '\0'.
  - char* pointing to bytes with no '\0': not a string. strlen on it = UB.
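The three cases above can be told apart safely only when the buffer's capacity is known. A minimal sketch, assuming a hypothetical helper named bounded_length (not a standard function) that uses std::memchr so it never reads past the buffer:

```cpp
#include <cstddef>
#include <cstring>
#include <optional>

// Hypothetical helper: returns the C-string length if a '\0' occurs within
// the first `capacity` bytes, or std::nullopt if there is no string there.
std::optional<std::size_t> bounded_length(const char* data, std::size_t capacity) {
    if (data == nullptr) return std::nullopt;             // null pointer: no string
    const void* end = std::memchr(data, '\0', capacity);  // bounded search
    if (end == nullptr) return std::nullopt;              // no terminator: not a C string
    return static_cast<std::size_t>(static_cast<const char*>(end) - data);
}
```

Unlike strlen, this stays within the bytes the caller vouches for, so an unterminated buffer yields "no value" instead of UB.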
Issues with C-style strings
- No size info: every op scans to the null terminator. strlen is O(n) every time.
- No bounds checking: strcpy, strcat, gets are infamous for buffer overflows (gets was removed in C11 and C++14).
- No ownership: who owns the buffer? Manual tracking.
- String literals are read-only: char* p = "hello"; is ill-formed since C++11 (many compilers only warn), and p[0] = 'H'; is UB. Use const char*.
- Dynamic strings need manual malloc/free.
- Still essential for C API interop. Use std::string::c_str() to get a null-terminated const char* from std::string.
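When a C API insists on a fixed char buffer, the bounds problem can be contained at the boundary. A minimal sketch, assuming a hypothetical helper copy_bounded (the name and signature are illustrative) built on std::snprintf, which never writes more than the given capacity:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: copies `src` into `dst` (capacity `cap` bytes) and
// always null-terminates. Returns true only if nothing was truncated.
bool copy_bounded(char* dst, std::size_t cap, const std::string& src) {
    if (dst == nullptr || cap == 0) return false;
    // snprintf writes at most cap - 1 chars plus '\0', and returns the
    // length the full output would have needed.
    const int needed = std::snprintf(dst, cap, "%s", src.c_str());
    return needed >= 0 && static_cast<std::size_t>(needed) < cap;
}
```

The text stays in a std::string for as long as possible; c_str() is used only at the point where the C function needs a const char*.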
2. std::string
- C++ way to handle text. Owns memory, knows size, grows dynamically.
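#include <iostream>
#include <string>

std::string s = "hello";
s += " world";                    // "hello world"
s.size();                         // 11
s.length();                       // same as size()
s.empty();                        // false
s.substr(6, 5);                   // "world" (a new std::string)
s.find("world");                  // 6; std::string::npos if not found
s.replace(6, 5, "C++");           // s is now "hello C++"
for (char c : s) std::cout << c;  // iterate char by char
const char* cstr = s.c_str();     // null-terminated; invalidated once s is modified or destroyed
std::string s2 = cstr;            // copies the chars into a new string
if (s == "hello C++") { }         // compares contents, not pointers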
Small String Optimization (SSO)
- Most impls store short strings inline within the string object, no heap allocation. The inline capacity is impl-specific: typically 15 chars in libstdc++ and MSVC, 22 in libc++ (64-bit builds).
- Copying a short string needs no allocation, so passing one by value is cheap.
- For very short strings → prefer std::string over char*. SSO makes it nearly as efficient + you get safety, ownership, length tracking.
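One way to observe SSO is to check whether a string's character data lives inside the std::string object itself. The sketch below assumes a hypothetical helper stored_inline; what it reports for short strings is implementation-specific and nothing in the standard guarantees it.

```cpp
#include <functional>
#include <string>

// Hypothetical helper: true if the character data sits inside the
// std::string object (the implementation is using its inline buffer).
bool stored_inline(const std::string& s) {
    const char* begin = reinterpret_cast<const char*>(&s);
    const char* end = begin + sizeof(std::string);
    std::less<const char*> before;  // total order, even for unrelated pointers
    return !before(s.data(), begin) && before(s.data(), end);
}
```

A string longer than sizeof(std::string) can never be inline, so stored_inline(std::string(100, 'x')) is false everywhere. For something like std::string("hi"), mainstream implementations are expected to return true.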
3. std::string_view (C++17)
- std::string_view = non-owning view of a string: a (const char*, size_t) pair.
- Modern way to write "takes any kind of string" param without copying.
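#include <iostream>
#include <string>
#include <string_view>

void print(std::string_view sv) {    // pass by value: it's just pointer + length
    std::cout << sv;
}

print("literal");                    // from a string literal
print(std::string("dynamic"));       // from std::string (temporary outlives the call)
char buf[] = "buffer";
print(buf);                          // from a char array
std::string_view sv = "abcdef";
print(sv.substr(1, 3));              // "bcd": substr on a view is O(1), no copy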
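Because a view is only a pointer and a length, parsing code can slice text without copying any characters. A minimal sketch, assuming a hypothetical split helper (not part of the standard library):

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper: splits `text` on `delim` without copying characters.
// The returned views point into the caller's buffer, which must outlive them.
std::vector<std::string_view> split(std::string_view text, char delim) {
    std::vector<std::string_view> parts;
    while (true) {
        const std::size_t pos = text.find(delim);
        if (pos == std::string_view::npos) {
            parts.push_back(text);             // last (possibly empty) field
            return parts;
        }
        parts.push_back(text.substr(0, pos));  // O(1): pointer + length only
        text.remove_prefix(pos + 1);           // step past the delimiter
    }
}
```

For example, split("a,bc,,d", ',') yields four views: "a", "bc", "" and "d". The only allocation is the vector itself.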
Lifetime trap
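- string_view doesn't own the underlying chars. Keeping one beyond the source's lifetime = use-after-free:
// BUG (intentional): returns a view into a local that is destroyed on return
std::string_view make_view() {
    std::string s = "temporary";
    return s;   // implicit std::string -> string_view; the view dangles
}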
Rules of thumb:
- string_view for parameters.
- Don't store as class member / return from fn unless lifetime obvious + documented.
- std::string for owned text storage.
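These rules can be sketched as follows. The User class and make_greeting function are hypothetical names chosen for illustration: storage is an owning std::string, parameters are views, and a view is returned only where its lifetime is tied to an object the caller already holds.

```cpp
#include <string>
#include <string_view>
#include <utility>

// Hypothetical class: owns its text; the accessor returns a view that is
// valid while the User object is alive and unmodified.
class User {
public:
    explicit User(std::string name) : name_(std::move(name)) {}
    std::string_view name() const { return name_; }
private:
    std::string name_;  // owned storage
};

// Takes a view as parameter, returns an owning string: nothing can dangle.
std::string make_greeting(std::string_view who) {
    std::string out = "hello, ";
    out += who;
    return out;
}
```

Returning std::string from make_greeting costs one allocation for long results, but the caller never has to reason about lifetimes.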
4. String Conversions
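#include <charconv>
#include <cstddef>
#include <string>
#include <system_error>

// Number -> string
std::to_string(42);                 // "42"
std::to_string(3.14);               // "3.140000" (printf "%f" style through C++23)

// String -> number; throws std::invalid_argument / std::out_of_range on failure
int n = std::stoi("42");
double d = std::stod("3.14");
std::size_t pos;
int x = std::stoi("42abc", &pos);   // x == 42, pos == 2 (chars consumed)

// from_chars: no allocation, no exceptions, no locale
const char* str = "42";
int value;
auto [ptr, ec] = std::from_chars(str, str + 2, value);
if (ec == std::errc{}) {
    // success: value == 42, ptr points one past the last parsed char
}

// to_chars: writes digits only, does NOT null-terminate
char buf[16];
auto [end, ec2] = std::to_chars(buf, buf + sizeof buf, 42);
*end = '\0';                        // safe here: 2 digits in a 16-byte buffer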
- from_chars / to_chars: fastest string ↔ number in stdlib (no allocation, no exceptions, locale-independent). Use for hot paths + anywhere you'd reach for sprintf/atoi.
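from_chars reports success even when trailing characters are left unparsed, so a strict parser has to check both the error code and the end pointer. A minimal sketch, assuming a hypothetical wrapper named parse_int:

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical wrapper: returns the parsed int only if the WHOLE input is a
// valid in-range decimal integer; otherwise std::nullopt.
std::optional<int> parse_int(std::string_view text) {
    int value = 0;
    const char* first = text.data();
    const char* last = first + text.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{}) return std::nullopt;  // not a number, or out of range
    if (ptr != last) return std::nullopt;        // trailing junk such as "42abc"
    return value;
}
```

Note that from_chars skips no leading whitespace and accepts no leading '+', so " 42" and "+42" are both rejected, whereas std::stoi accepts both.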
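5. Modern Formatting (std::format, std::print)
- std::format (C++20, <format>): type-safe, Python-like formatter that returns a std::string.
- std::print / std::println (C++23, <print>): write formatted output directly to stdout, or to a given FILE* / std::ostream.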
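#include <format>
#include <print>
#include <string>

std::string name = "world";   // placeholder values for the examples
int n = 3;

std::string s = std::format("Hello, {}!", name);      // "Hello, world!"
std::string t = std::format("{:>10}", 42);            // "        42"
std::string u = std::format("{:.3f}", 3.14159);       // "3.142"
std::string v = std::format("{0} and {0}", "twice");  // "twice and twice"
std::print("Hello, {}!\n", name);
std::println("count = {}", n);                        // appends '\n'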
- Format specifiers loosely follow Python:
| Spec | Meaning |
| --- | --- |
| {} | Default formatting |
| {:>10} / {:<10} / {:^10} | Right / left / center align in width 10 |
| {:0>5} | Right-align in width 5 with fill char 0 (zero-pad) |
| {:.3f} | Fixed notation, 3 decimal places |
| {:#x} | Hex with 0x prefix |
| {:b} | Binary |
| {0}, {1} | Positional arguments |
- Custom types: specialize std::formatter<T>:
struct Point { int x, y; };

// Inheriting from formatter<std::string> reuses its parse(), so width and
// alignment specs such as {:>12} also work for Point.
template <>
struct std::formatter<Point> : std::formatter<std::string> {
    auto format(Point p, std::format_context& ctx) const {
        return std::formatter<std::string>::format(
            std::format("({}, {})", p.x, p.y), ctx);
    }
};

std::print("p = {}\n", Point{1, 2});   // p = (1, 2)
- vs printf: type-safe (no format-string mismatches), extensible (custom formatters).
- vs <iostream>: faster, less verbose, supports positional args.
6. Wide and Unicode Strings
- Several string types exist for different encodings. Avoid the wide / UTF-16 / UTF-32 variants in modern code unless doing platform-specific work; default to std::string holding UTF-8.
| Type | Underlying char | Typical use |
| --- | --- | --- |
| std::string | char (8-bit) | UTF-8 (recommended), or platform default |
| std::wstring | wchar_t (16-bit on Windows, 32-bit on Linux) | Windows API interop |
| std::u8string (C++20) | char8_t | Explicit UTF-8 |
| std::u16string | char16_t | UTF-16 |
| std::u32string | char32_t | UTF-32 (each element is one Unicode code point) |
- stdlib historically weak on Unicode-aware text handling (case folding, normalization, segmentation, collation). Use ICU or dedicated lib for real Unicode ops.
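A concrete consequence of storing UTF-8 in std::string: size() counts bytes, not characters. The sketch below assumes a hypothetical count_code_points helper and well-formed UTF-8 input. It counts code points only, not user-perceived characters (grapheme clusters), which still need a library such as ICU.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: counts Unicode code points in well-formed UTF-8 by
// counting every byte that is NOT a continuation byte (bit pattern 10xxxxxx).
std::size_t count_code_points(std::string_view utf8) {
    std::size_t n = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) ++n;  // lead byte or ASCII
    }
    return n;
}
```

For "h\xC3\xA9llo" (the word "héllo" with é encoded as two bytes), size() is 6 while count_code_points returns 5.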