Description: A note on std::stringstream, std::ostringstream, std::istringstream for in-memory I/O, and the <regex> library for pattern matching
My Notion Note ID: K2A-B1-21
Created: 2018-12-30
Updated: 2026-02-28
License: Reuse is very welcome. Please credit Yu Zhang and link back to the original on yuzhang.io

1. String Streams Overview

The <sstream> header provides three stream types backed by std::string:

Type	Direction	Use for
`std::ostringstream`	Output (write)	Building a string from heterogeneous values
`std::istringstream`	Input (read)	Parsing a string into typed values
`std::stringstream`	Both	Bidirectional in-memory buffer

They use the same << / >> operators as std::cout / std::cin (output streams support <<, input streams support >>), plus .str() to access the underlying string.

2. `std::ostringstream` — Building Strings

#include <sstream>
#include <iomanip>
#include <string>

std::ostringstream oss;
oss << "x=" << 42 << ", y=" << 3.14;
std::string s = oss.str();           // "x=42, y=3.14"

// With manipulators
std::ostringstream oss2;
oss2 << std::hex << std::uppercase << 255 << " "
     << std::fixed << std::setprecision(2) << 3.14159;
// "FF 3.14"

// Reset for reuse
oss.str("");                         // clear contents
oss.clear();                         // clear error flags
oss << "fresh";

In modern C++, std::format (C++20, see K2A-B1-4 § 5) is usually a cleaner choice for building strings:

#include <format>
auto s = std::format("x={}, y={}", 42, 3.14);   // shorter, type-safe, usually faster

ostringstream remains useful when:

The composition is conditional (write some pieces only if a condition holds).
You need fine-grained stream-state control (locale, manipulators).
The codebase predates C++20, so std::format is unavailable.
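
The conditional case above can be sketched roughly as follows. The helper name describe_endpoint and its parameters are hypothetical, chosen only for illustration; the point is that optional pieces are streamed only when present.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: builds "host[:port][/path]", emitting the optional
// pieces only when they are present.
std::string describe_endpoint(const std::string& host, int port,
                              const std::string& path) {
    std::ostringstream oss;
    oss << host;
    if (port > 0) {
        oss << ':' << port;      // written only for a positive port
    }
    if (!path.empty()) {
        oss << '/' << path;      // written only when a path is given
    }
    return oss.str();
}
```

Calling describe_endpoint("example.com", 8080, "api") yields "example.com:8080/api", while a zero port and empty path yield just the host.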

3. `std::istringstream` — Parsing Strings

#include <iostream>
#include <sstream>
#include <string>

std::istringstream iss{"42 3.14 hello"};

int    n;
double d;
std::string word;

iss >> n >> d >> word;       // n=42, d=3.14, word="hello"

// Read all words
std::istringstream words{"alpha beta gamma"};
std::string token;
while (words >> token) {
    std::cout << token << "\n";
}

// Line-by-line parsing
std::istringstream multi{"line one\nline two\n"};
std::string line;
while (std::getline(multi, line)) {
    // process line
}

// Detect parse failure
std::istringstream bad{"not a number"};
int v;
if (!(bad >> v)) {
    std::cerr << "parse failed\n";
}

For high-performance number parsing, std::from_chars (C++17) is faster and locale-independent (see K2A-B1-4 § 4). istringstream is more flexible but heavier.
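
A minimal sketch of the std::from_chars alternative for integers, assuming C++17. The helper name parse_int is hypothetical; it treats trailing characters as an error, which is a design choice rather than something from_chars enforces.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: parses the whole of `text` as a base-10 int.
// Returns std::nullopt on malformed input, overflow, or trailing characters.
std::optional<int> parse_int(std::string_view text) {
    int value = 0;
    const char* first = text.data();
    const char* last  = text.data() + text.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{} || ptr != last) {
        return std::nullopt;     // parse error, out of range, or leftover input
    }
    return value;
}
```

Unlike operator>>, from_chars does not skip leading whitespace and never consults the locale, so the caller has to trim input first if that matters.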

4. `std::stringstream` — Bidirectional

#include <sstream>

std::stringstream ss;

ss << 42 << " " << 3.14;        // write

int    n;
double d;
ss >> n >> d;                    // read

ss.str();                        // current contents

stringstream is rarely the right choice — bidirectional buffering is awkward, and the read/write positions interact in subtle ways. Pick ostringstream or istringstream for clarity.
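
One of the subtle interactions, sketched under the assumption of a default-constructed std::stringstream: reading up to the end of the buffer sets eofbit, and any later write is silently dropped until clear() is called. The two function names are hypothetical and exist only to contrast the cases.

```cpp
#include <sstream>
#include <string>

// Writes after the stream has hit end-of-input, without clearing the state.
std::string write_after_eof_without_clear() {
    std::stringstream ss;
    ss << 42 << " " << 3.14;
    int n = 0;
    double d = 0.0;
    ss >> n >> d;        // reading 3.14 reaches the end: eofbit is set
    ss << " more";       // stream is not good(), so nothing is written
    return ss.str();     // str() returns the whole buffer, ignoring the read position
}

// Same sequence, but the state flags are reset before writing again.
std::string write_after_eof_with_clear() {
    std::stringstream ss;
    ss << 42 << " " << 3.14;
    int n = 0;
    double d = 0.0;
    ss >> n >> d;
    ss.clear();          // reset eofbit so output works again
    ss << " more";       // appended at the write position
    return ss.str();
}
```

The first function returns "42 3.14" and the second "42 3.14 more", which is easy to miss because the failed write reports nothing unless the stream state is checked.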

5. `<regex>` Basics

The <regex> library (C++11) provides pattern matching with an ECMAScript-like dialect by default.

#include <regex>
#include <string>
#include <iostream>

std::string s = "user42@example.com";
std::regex pattern{R"((\w+)@(\w+\.\w+))"};   // raw string for backslashes

// Test if any match exists
if (std::regex_search(s, pattern)) {
    std::cout << "found\n";
}

// Extract submatches
std::smatch m;
if (std::regex_search(s, m, pattern)) {
    std::cout << "full: "   << m[0] << "\n";   // user42@example.com
    std::cout << "user: "   << m[1] << "\n";   // user42
    std::cout << "domain: " << m[2] << "\n";   // example.com
}

// Whole-string match (not just contains)
std::regex_match(s, m, pattern);

// Replace
std::string masked = std::regex_replace(s, pattern, "[REDACTED]");
// "[REDACTED]"

// Iterate all matches
auto begin = std::sregex_iterator{s.begin(), s.end(), pattern};
auto end   = std::sregex_iterator{};
for (auto it = begin; it != end; ++it) {
    std::cout << it->str() << "\n";
}

Functions

Function	Purpose
`std::regex_search`	Find first match anywhere in the string
`std::regex_match`	Match the entire string
`std::regex_replace`	Substitute matches with a replacement
`std::sregex_iterator`	Iterate all matches
`std::sregex_token_iterator`	Tokenize (split by pattern or capture)
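
The tokenizing use of std::sregex_token_iterator from the table can be sketched like this. The helper name split_csv_fields and the separator pattern are assumptions for illustration; passing -1 as the submatch index asks for the text between matches rather than the matches themselves.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` on a comma followed by optional whitespace.
std::vector<std::string> split_csv_fields(const std::string& text) {
    static const std::regex separator{R"(,\s*)"};
    // Submatch index -1 selects the pieces between separator matches.
    std::sregex_token_iterator first{text.begin(), text.end(), separator, -1};
    std::sregex_token_iterator last;
    return std::vector<std::string>(first, last);
}
```

For example, split_csv_fields("a, b,c") produces the three fields "a", "b", and "c". This is not a real CSV parser: quoted fields and escaped commas are not handled.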

Match types

Type	Holds
`std::smatch`	Match results over a `std::string`
`std::cmatch`	Match results over a C-string
`std::wsmatch` / `std::wcmatch`	Wide-string variants
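
The match type has to correspond to the kind of input being searched. A small sketch with a C-string and std::cmatch follows; the helper name extract_id and the "id=" format are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: extracts the digits after "id=" from a C-string.
// Returns an empty string when there is no match.
std::string extract_id(const char* text) {
    static const std::regex pattern{R"(id=(\d+))"};
    std::cmatch m;                       // const char* input pairs with cmatch
    if (std::regex_search(text, m, pattern)) {
        return m[1].str();               // first capture group
    }
    return {};
}
```

Match results hold iterators into the searched text, so that text must outlive them. Since C++14 the std::smatch overloads reject a temporary std::string at compile time for this reason.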

6. Regex Patterns

The default ECMAScript dialect supports the usual constructs:

Pattern	Matches
`.`	Any character (except newline by default)
`\d` `\D`	Digit / non-digit
`\w` `\W`	Word char (`[A-Za-z0-9_]`) / non-word
`\s` `\S`	Whitespace / non-whitespace
`[abc]`	Any of `a`, `b`, `c`
`[^abc]`	Anything except `a`, `b`, `c`
`[a-z]`	Range
`*` `+` `?`	0+, 1+, 0-or-1 of preceding
`{n}` `{n,}` `{n,m}`	Exactly `n`, `n+`, `n` to `m`
`*?` `+?` `??`	Lazy (non-greedy) variants
`^` `$`	Start / end of string (or line in multiline mode)
`\b`	Word boundary
`(...)`	Capture group
`(?:...)`	Non-capturing group
`|`	Alternation
`\1` `\2`	Backreference to capture group N
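
Two of the less obvious constructs from the table, backreferences and lazy quantifiers, sketched with hypothetical helper names (first_repeated_word, first_tag) and made-up inputs:

```cpp
#include <regex>
#include <string>

// Backreference: returns the first word that is immediately repeated,
// or an empty string if there is none.
std::string first_repeated_word(const std::string& text) {
    static const std::regex repeated{R"(\b(\w+)\s+\1\b)"};
    std::smatch m;
    return std::regex_search(text, m, repeated) ? m[1].str() : std::string{};
}

// Greedy vs lazy: returns the first match of `<.+>` (greedy) or `<.+?>` (lazy).
std::string first_tag(const std::string& text, bool lazy) {
    static const std::regex greedy_tag{R"(<.+>)"};
    static const std::regex lazy_tag{R"(<.+?>)"};
    std::smatch m;
    return std::regex_search(text, m, lazy ? lazy_tag : greedy_tag)
               ? m[0].str()
               : std::string{};
}
```

On "this is is a test" the backreference pattern captures "is". On "<b>x</b>" the greedy pattern matches the whole string, while the lazy one stops at "<b>".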

Use raw string literals

Always wrap patterns in R"(...)" so backslashes don't need to be doubled:

std::regex bad{"(\\d+)\\.(\\d+)"};    // hard to read
std::regex good{R"((\d+)\.(\d+))"};   // much better

Other dialects

std::regex p1{"a.b", std::regex::extended};       // POSIX extended
std::regex p2{"a.b", std::regex::basic};          // POSIX basic
std::regex p3{"a.b", std::regex::ECMAScript};     // default
std::regex p4{"a.b", std::regex::icase};          // case-insensitive
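
Flags are a bitmask and can be combined with |. A minimal sketch of a case-insensitive search with the grammar spelled out explicitly; the helper name mentions_error is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: case-insensitive search for the literal word "error".
bool mentions_error(const std::string& text) {
    // Grammar flag and option flag combined; ECMAScript is also what
    // you get when no grammar flag is given.
    static const std::regex pattern{
        "error", std::regex::ECMAScript | std::regex::icase};
    return std::regex_search(text, pattern);
}
```

With this, mentions_error("Fatal ERROR: disk") is true and mentions_error("all good") is false.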

7. Regex Performance and When to Avoid

Standard <regex> has a reputation for being slow. Major implementations compile patterns into NFA-based matchers, which are correct but often several times slower than RE2 or PCRE2, depending on the pattern and the input.

Don't use regex when:

The pattern is fixed and simple. A plain std::string find, starts_with, or ends_with call is much faster.
You're scanning a large file. Use a faster engine — re2 (Google), boost::regex, or ctre (compile-time-compiled regex).
Performance is critical. A handwritten state machine or std::ranges filter often beats regex.
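
For the fixed-and-simple case, a sketch of regex-free checks follows. The helper names are hypothetical, and the bodies assume only C++17 std::string_view; with C++20 the starts_with and ends_with member functions replace the first two helpers directly.

```cpp
#include <string_view>

// Hypothetical helpers for fixed patterns; no regex engine involved.
bool has_prefix(std::string_view s, std::string_view prefix) {
    return s.size() >= prefix.size() &&
           s.compare(0, prefix.size(), prefix) == 0;
}

bool has_suffix(std::string_view s, std::string_view suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

bool has_substring(std::string_view s, std::string_view needle) {
    return s.find(needle) != std::string_view::npos;
}
```

A check like has_suffix(name, ".csv") does the same job as matching against R"(.*\.csv)" without compiling or running a pattern.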

Do use regex when:

The pattern is genuinely complex (alternations, groups, anchors).
The pattern needs to be configurable at runtime (read from config / user input).
You need a quick prototype and the throughput isn't a concern.
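
When the pattern comes from config or user input, it may be malformed, and the std::regex constructor reports that by throwing std::regex_error. A minimal sketch of guarding against it; the helper name compile_user_pattern is hypothetical.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper: compiles a pattern supplied at runtime.
// Returns std::nullopt if the pattern is not valid in the default grammar.
std::optional<std::regex> compile_user_pattern(const std::string& source) {
    try {
        return std::regex{source};
    } catch (const std::regex_error&) {
        return std::nullopt;     // e.what() and e.code() carry the details
    }
}
```

Compiling once at load time and keeping the resulting std::regex also avoids paying the construction cost on every match.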

Common gotcha: pattern construction cost

Compiling a std::regex is expensive. Cache it; don't construct it inside a loop.

// BAD: recompiles regex on every call
bool is_email(const std::string& s) {
    return std::regex_match(s, std::regex{R"(\w+@\w+\.\w+)"});
}

// GOOD: compile once
bool is_email(const std::string& s) {
    static const std::regex pattern{R"(\w+@\w+\.\w+)"};
    return std::regex_match(s, pattern);
}

C++ String Streams and Regex
