diff --git a/doc/research/dispatch.md b/doc/research/dispatch.md index 428ada27..6dc76db3 100644 --- a/doc/research/dispatch.md +++ b/doc/research/dispatch.md @@ -1,617 +1,81 @@ -# The Dispatch Problem: Symmetric Transfer, Stack Overflow, and Async Mutex Correctness +# Dispatch Design: Symmetric Transfer for Coroutine Resumption -## Executive Summary +## Principle -Corosio's `executor_type::dispatch()` returns a `std::coroutine_handle<>`, enabling symmetric transfer from I/O completion paths. This design causes: +Every coroutine resumption must go through either symmetric transfer or the scheduler queue -- never through an inline `resume()` or `dispatch()` that creates a frame below the resumed coroutine. -1. **Stack overflow** (`STATUS_STACK_BUFFER_OVERRUN` on Windows) when I/O operations complete synchronously in tight loops -2. **Async mutex correctness failures** where coroutine chains holding mutexes break due to improper stack unwinding -3. **Non-returning dispatch calls** when symmetric transfer chains don't terminate properly +## Design -The solution is to change `dispatch(coro)` to return `void` and call `h.resume()` as a normal function call when running in the same thread. This aligns with Boost.Asio's proven approach while preserving symmetric transfer for coroutine composition (task-to-task transfers via `final_suspend`). - ---- - -## Table of Contents - -1. [Background: Coroutine Resumption Models](#background-coroutine-resumption-models) -2. [How Asio Handles Coroutine Resumption](#how-asio-handles-coroutine-resumption) -3. [How Corosio Gets It Wrong](#how-corosio-gets-it-wrong) -4. [The Stack Overflow Problem](#the-stack-overflow-problem) -5. [The Async Mutex Problem](#the-async-mutex-problem) -6. [The Solution](#the-solution) -7. [Why We Don't Need Asio's Pump](#why-we-dont-need-asios-pump) -8. [Implementation Changes Required](#implementation-changes-required) -9. 
[Verification Criteria](#verification-criteria) - ---- - -## Background: Coroutine Resumption Models - -### The Coroutine Pump - -The "pump" is the event loop that drives coroutine execution: - -```cpp -// Simplified io_context::run() -while (has_work()) { - auto completion = dequeue_completion(); // Wait on IOCP/epoll - completion.handler(); // Resume suspended coroutine -} -``` - -When an I/O operation completes, the suspended coroutine must be resumed. The question is *how* that resumption happens. - -### Symmetric Transfer - -C++20 coroutines support **symmetric transfer**: when `await_suspend` returns a `coroutine_handle`, the compiler generates a tail call to that handle's `resume()`. This avoids stack growth: - -```cpp -auto await_suspend(std::coroutine_handle<> h) { - return other_handle; // Tail call to other_handle.resume() -} -``` - -The key property: a tail call **replaces** the current stack frame rather than pushing a new one. - -### Normal Function Calls - -A normal function call pushes a new stack frame: - -```cpp -void dispatch(coro h) { - h.resume(); // Normal call - pushes frame, will return here -} -``` - -The call will return when the coroutine suspends (returns `noop_coroutine` from its next `await_suspend`). - ---- - -## How Asio Handles Coroutine Resumption - -### Asio's Architecture - -Asio uses a **completion token** model where asynchronous operations accept a token that determines how completions are delivered. For coroutines, `use_awaitable` transforms operations into awaitables. - -### The `awaitable_thread` and Explicit Frame Stack - -Asio maintains an **explicit stack of coroutine frames** in `awaitable_thread`: - -```cpp -// From boost/asio/impl/awaitable.hpp -class awaitable_thread { - awaitable_frame_base* top_of_stack_; - // ... - - void pump() { - do - bottom_of_stack_.frame_->top_of_stack_->resume(); - while (bottom_of_stack_.frame_ && bottom_of_stack_.frame_->top_of_stack_); - } -}; -``` - -Key observations: - -1. 
**`pump()` calls `resume()` as a normal function call** - not symmetric transfer -2. **The loop continues** until the stack is empty or coroutine suspends for I/O -3. **`final_suspend` doesn't transfer** - it just pops the frame and returns - -### Asio's `final_suspend` +`dispatch` returns `std::coroutine_handle<>`: ```cpp -// Asio's awaitable_frame final_suspend -auto await_suspend(coroutine_handle<>) noexcept { - this->this_->pop_frame(); // Adjust stack pointers - return noop_coroutine(); // Don't transfer anywhere -} -``` - -When a child coroutine completes: -1. `final_suspend` pops itself from the explicit stack -2. Returns `noop_coroutine()` (suspend, don't transfer) -3. `resume()` returns to the pump loop -4. Pump loop sees parent is now on top, calls `resume()` on parent - -### Why This Works - -Asio **never uses symmetric transfer** for I/O completions or coroutine composition. Everything goes through the pump loop as normal function calls. This guarantees: - -- Stack always unwinds properly -- No unbounded stack growth -- Nested dispatch calls return correctly -- Async mutex operations work correctly - ---- - -## How Corosio Gets It Wrong - -### Corosio's Current `dispatch` Signature - -```cpp -// basic_io_context.hpp, executor_type::dispatch -capy::coro dispatch(capy::coro h) const { +std::coroutine_handle<> +dispatch(std::coroutine_handle<> h) const +{ if (running_in_this_thread()) - return h; // Return handle for symmetric transfer - ctx_->sched_->post(h); - return std::noop_coroutine(); -} -``` - -This returns `h` when running in the same thread, enabling the caller to use it for symmetric transfer. - -### Usage in I/O Completion Paths - -When an I/O operation completes immediately, the completion handler does: - -```cpp -// overlapped_op.hpp -std::coroutine_handle<> complete_immediate() { - // ... setup ... 
- return d.dispatch(h); // Returns h for symmetric transfer -} -``` - -Or in `await_suspend`: - -```cpp -auto await_suspend(std::coroutine_handle<> h) { - initiate_io(...); - if (immediate_completion) - return dispatch(h); // Symmetric transfer back to h + return h; // symmetric transfer + post(h); return std::noop_coroutine(); } ``` -### The Fundamental Problem - -When `dispatch` returns `h` and the caller uses it for symmetric transfer, the compiler generates: - -```cpp -// What the compiler generates for await_suspend returning h: -goto h.resume(); // Tail call - doesn't push frame, doesn't return -``` - -This creates several problems detailed below. - ---- - -## The Stack Overflow Problem - -### Scenario: Tight Loop with Immediate Completions - -```cpp -task<> client(tcp_socket& socket) { - for (int i = 0; i < 1000000; i++) { - co_await socket.async_read(...); // Completes immediately - } -} -``` - -### What Happens (Current Implementation) - -1. Coroutine does `co_await async_read()` -2. `await_suspend` initiates I/O, completes immediately -3. `await_suspend` returns `dispatch(h)` which returns `h` -4. Compiler generates tail call to `h.resume()` -5. **But if the compiler doesn't generate a proper tail call...** - -If the compiler generates a normal call instead of a tail call: - -``` -coroutine frame - -> await_suspend returns h - -> h.resume() // NOT a tail call - pushes frame! - -> coroutine continues to next iteration - -> await_suspend returns h - -> h.resume() // Another frame pushed! - -> next iteration - -> h.resume() // Stack grows unboundedly - ... STATUS_STACK_BUFFER_OVERRUN -``` - -### Why Tail Calls Fail - -Symmetric transfer requires the compiler to generate an actual tail call. This can fail due to: - -1. **Compiler limitations** - older compilers may not optimize correctly -2. **Debug builds** - optimizations disabled -3. **ABI constraints** - calling conventions may prevent tail calls -4. 
**Inlining decisions** - complex call chains may prevent optimization - -### Observed Symptom - -On Windows: `STATUS_STACK_BUFFER_OVERRUN` - the /GS security check detects stack corruption when the stack grows into the guard page or overwrites the security cookie. - ---- - -## The Async Mutex Problem - -### Scenario: Coroutine Holds Mutex During I/O - -```cpp -task<> worker(async_mutex& mutex, tcp_socket& socket) { - auto lock = co_await mutex.lock(); - co_await socket.async_write(data); // Completes immediately - // lock released here -} -``` - -### What Should Happen - -1. Worker A holds mutex -2. Worker B waiting for mutex -3. A's write completes immediately -4. A continues, releases mutex -5. Mutex wakes B -6. A continues to completion -7. B runs - -### What Actually Happens (Current Implementation) - -1. Worker A holds mutex -2. A's write completes immediately -3. Completion path does `dispatch(A)` returning A's handle -4. Symmetric transfer to A (tail call - no return!) -5. A continues, releases mutex -6. Mutex calls `dispatch(B)` to wake B -7. **dispatch returns B's handle for symmetric transfer** -8. Tail call to B.resume() - **A's stack frame is gone** -9. B runs, but A never gets to continue! - -### The Core Issue - -When `dispatch(B)` returns B's handle and the caller does symmetric transfer: - -```cpp -void async_mutex::unlock() { - auto next_waiter = waiters_.pop(); - dispatch(next_waiter).resume(); // If dispatch returns handle... - // This line never executes if .resume() is a tail call! -} -``` - -The symmetric transfer replaces the current frame. The code after the transfer never runs. - -### Current Workaround in Corosio - -The codebase has explicit comments about this: - -```cpp -// sockets.cpp -// Immediate error - must use post(), not complete_immediate(). 
-// Using symmetric transfer (complete_immediate) here breaks -// coroutine chains that hold async mutexes: the resumed -// coroutine releases its lock and tries to wake the next -// waiter, but the symmetric transfer chain doesn't return -// control to io_context properly. -``` - -This workaround (always posting) is pessimistic and adds unnecessary latency. - ---- - -## The Solution - -### Change `dispatch` to Return `void` - -```cpp -// NEW: basic_io_context.hpp, executor_type::dispatch -void dispatch(capy::coro h) const { - if (running_in_this_thread()) - h.resume(); // Normal function call - will return - else - ctx_->sched_->post(h); -} -``` - -### Why This Works - -**Normal function calls return.** When `dispatch` calls `h.resume()`: - -1. Coroutine runs until it suspends -2. Coroutine's `await_suspend` returns `noop_coroutine()` -3. `resume()` returns -4. `dispatch()` returns -5. Caller continues - -The stack unwinds properly. Nested dispatch calls work correctly: - -```cpp -void async_mutex::unlock() { - auto next_waiter = waiters_.pop(); - dispatch(next_waiter); // Normal call - returns when waiter suspends - // This line DOES execute! -} -``` - -### Stack Overflow Prevention - -With immediate completions: - -``` -io_context::run() - -> dequeue completion - -> dispatch(h) - -> h.resume() // Normal call - -> coroutine runs one iteration - -> co_await next I/O - -> await_suspend returns noop_coroutine() - <- h.resume() RETURNS - <- dispatch() returns - -> dequeue next completion (or same one if it completed immediately) - ... stack stays flat -``` - -Each iteration returns to the run loop. No unbounded stack growth. - ---- - -## Why We Don't Need Asio's Pump - -### Asio's Pump Exists Because Asio Doesn't Use Symmetric Transfer - -Asio's `awaitable_thread::pump()` maintains an explicit stack because: - -1. `final_suspend` doesn't transfer - just pops frame and returns -2. Pump must manually resume the parent -3. 
Everything goes through the pump loop - -### Corosio Can Keep Symmetric Transfer for Task Composition - -With dispatch returning void, we can still use symmetric transfer for **task-to-task composition**: - -```cpp -// task's final_suspend - UNCHANGED -auto await_suspend(coroutine_handle<>) noexcept { - return continuation_; // Symmetric transfer to parent task -} -``` - -This is safe because: - -1. `final_suspend` transfers to exactly one place (the parent) -2. No "wake B AND continue A" scenario -3. Parent continues, may do I/O, returns `noop_coroutine()` -4. The chain terminates - -### The Key Insight: `noop_coroutine()` Terminates Chains - -When a coroutine's `await_suspend` returns `noop_coroutine()`: - -1. Compiler generates "tail call" to `noop_coroutine().resume()` -2. `noop_coroutine().resume()` is a no-op that returns immediately -3. The symmetric transfer chain terminates -4. Control returns to whoever called `resume()` (i.e., dispatch) - -Trace: -``` -dispatch(parent) - [1] parent.resume() // Normal call from dispatch - [2] parent -> child (symmetric transfer via await_suspend) - [3] child -> grandchild (symmetric transfer) - [4] grandchild does I/O, returns noop_coroutine() - [4] "transfer" to noop - returns immediately - [3] returns - [2] returns - [1] parent.resume() returns -dispatch returns -``` - ---- - -## Implementation Changes Required - -### 1. Change `executor_type::dispatch` Signature - -**File:** `include/boost/corosio/basic_io_context.hpp` - -```cpp -// BEFORE -capy::coro dispatch(capy::coro h) const { - if (running_in_this_thread()) - return h; - ctx_->sched_->post(h); - return std::noop_coroutine(); -} +- Same thread: returns `h` for symmetric transfer +- Different thread: posts to queue, returns `std::noop_coroutine()` +- Never calls `h.resume()` internally -// AFTER -void dispatch(capy::coro h) const { - if (running_in_this_thread()) - h.resume(); - else - ctx_->sched_->post(h); -} -``` +`post` returns `void` -- it always queues. 
-### 2. Update `resume_coro` Helper +## Call Site Patterns -**File:** `src/corosio/src/detail/resume_coro.hpp` +### From coroutine machinery (await_suspend, final_suspend) -The `resume_coro` helper includes a **memory barrier** that must be preserved. This acquire fence ensures that I/O results (buffer contents, error codes, bytes transferred) written by other threads are visible to the resumed coroutine before it continues execution. +Return the handle for symmetric transfer: ```cpp -// BEFORE -inline void -resume_coro(capy::executor_ref d, capy::coro h) +std::coroutine_handle<> +await_suspend(std::coroutine_handle<> h) noexcept { - std::atomic_thread_fence(std::memory_order_acquire); // KEEP THIS - auto resume_h = d.dispatch(h); - if (resume_h.address() == h.address()) - resume_h.resume(); -} - -// AFTER -inline void -resume_coro(capy::executor_ref d, capy::coro h) -{ - std::atomic_thread_fence(std::memory_order_acquire); // PRESERVED - d.dispatch(h); // dispatch now handles resume internally -} -``` - -**Why the fence matters:** - -When an I/O operation completes: -1. The OS (or an internal worker thread) writes results to buffers -2. The completion is signaled to the `io_context` thread -3. `resume_coro` is called to resume the waiting coroutine -4. The coroutine reads the results from those buffers - -Without the acquire fence, the coroutine might see stale data due to CPU memory reordering. The fence ensures all writes from step 1 are visible before step 4. - -**Note:** The fence is conservative — it always executes even when not strictly necessary (e.g., same-thread immediate completions, or when IOCP/epoll already provides synchronization). This is intentional for safety. - -### 3. Update `complete_immediate` - -**File:** `src/corosio/src/detail/iocp/overlapped_op.hpp` - -```cpp -// BEFORE -std::coroutine_handle<> complete_immediate() { // ... - return d.dispatch(h); -} - -// AFTER -void complete_immediate() { - // ... 
- d.dispatch(h); // Returns void, resumes inline -} -``` - -### 4. Update All I/O Awaitable `await_suspend` Methods - -Any `await_suspend` that currently returns `dispatch(h)` must change to return `noop_coroutine()` and let the completion handler path call `dispatch`. - -**Example pattern:** - -```cpp -// BEFORE -auto await_suspend(std::coroutine_handle<> h) { - initiate_io(); - if (immediate_completion) - return ex_.dispatch(h); - return std::noop_coroutine(); -} - -// AFTER -auto await_suspend(std::coroutine_handle<> h) { - initiate_io(); - // Immediate completions go through completion handler - // which will call dispatch(h) - return std::noop_coroutine(); + return caller_env.executor.dispatch(cont); } ``` -### 5. Remove Pessimistic `post()` Workarounds - -**File:** `src/corosio/src/detail/iocp/sockets.cpp` +### From the event loop pump (scheduler/reactor handlers) -Remove comments and code that forces `post()` for immediate completions: +The one place where `.resume()` is called directly: ```cpp -// BEFORE (pessimistic) -// Immediate error - must use post(), not complete_immediate() -op->post(); - -// AFTER (can use dispatch) -op->complete_immediate(); // Now safe +// In scheduler completion handler +dispatch_coro(ex, h).resume(); ``` -### 6. Update Capy's Executor Concept (if applicable) - -If Capy defines an executor concept that requires `dispatch` to return a handle, that concept needs updating to allow `void` return. - ---- +### Launching concurrent work (when_all, when_any) -## Verification Criteria - -### 1. 
Stack Overflow Test +Use `post` instead of `dispatch` since you cannot symmetric-transfer to multiple handles: ```cpp -task<> stack_test(tcp_socket& socket) { - std::array buf; - for (int i = 0; i < 1000000; i++) { - // Use loopback socket that completes immediately - co_await socket.async_read(buffer(buf)); - } -} +// Launch all runners via post +for (auto& handle : runner_handles) + caller_env.executor.post(handle); ``` -**Pass criteria:** No stack overflow, no `STATUS_STACK_BUFFER_OVERRUN` +## dispatch_coro Helper -### 2. Async Mutex Correctness Test +Corosio provides `dispatch_coro` as an optimized wrapper that skips executor dispatch overhead for the native `io_context` executor: ```cpp -async_mutex mutex; -int counter = 0; - -task<> increment(async_mutex& m, tcp_socket& s) { - for (int i = 0; i < 1000; i++) { - auto lock = co_await m.lock(); - counter++; - co_await s.async_write(...); // May complete immediately - } +inline std::coroutine_handle<> +dispatch_coro( + capy::executor_ref ex, + std::coroutine_handle<> h) +{ + if (&ex.type_id() == &capy::detail::type_id< + basic_io_context::executor_type>()) + return h; + return ex.dispatch(h); } - -// Run N concurrent incrementers -// Verify counter == N * 1000 ``` -**Pass criteria:** Final counter value is exactly N * 1000 - -### 3. Nested Dispatch Test - -```cpp -task<> nested_test() { - async_mutex m1, m2; - - auto lock1 = co_await m1.lock(); - { - auto lock2 = co_await m2.lock(); - co_await async_op(); // Immediate completion - } // lock2 released, may wake waiter - co_await async_op(); -} // lock1 released -``` - -**Pass criteria:** All waiters wake correctly, no hangs - -### 4. 
Performance Comparison - -Measure latency of immediate completions: -- Before: Always `post()` (queue + context switch overhead) -- After: Inline `resume()` (direct execution) - -**Expected improvement:** Significant latency reduction for immediate completions - ---- - -## Summary - -| Aspect | Asio | Corosio (Current) | Corosio (Fixed) | -|--------|------|-------------------|-----------------| -| `dispatch` returns | N/A (uses handlers) | `coroutine_handle` | `void` | -| I/O resumption | Handler invocation | Symmetric transfer | Normal call | -| Task composition | Explicit pump | Symmetric transfer | Symmetric transfer | -| Stack behavior | Always unwinds | Can overflow | Always unwinds | -| Async mutex | Works | Broken | Works | -| Immediate completions | Handler path | Can inline (broken) | Can inline (fixed) | -| Memory barrier | In handler path | In `resume_coro` | In `resume_coro` (preserved) | - -The fix is conceptually simple: **dispatch must be a normal function call, not an enabler of symmetric transfer.** Symmetric transfer remains available for task-to-task composition via `final_suspend`, where it's safe and efficient. - ---- - -## References +## Audience -- Boost.Asio source: `boost/asio/impl/awaitable.hpp` -- Lewis Baker: "Understanding Symmetric Transfer" -- P2300: `std::execution` (senders/receivers) -- Corosio source files: - - `include/boost/corosio/basic_io_context.hpp` - - `src/corosio/src/detail/resume_coro.hpp` - - `src/corosio/src/detail/iocp/overlapped_op.hpp` - - `src/corosio/src/detail/iocp/sockets.cpp` +Ordinary users writing coroutine tasks do not interact with `dispatch` and `post` directly. These operations are used by authors of coroutine machinery -- `promise_type` implementations, awaitables, `await_transform` -- to implement asynchronous algorithms such as `when_all`, `when_any`, `async_mutex`, channels, and similar primitives. 
diff --git a/doc/research/tcp-ip-tutorial.md b/doc/research/tcp-ip-tutorial.md deleted file mode 100644 index f8c1fa34..00000000 --- a/doc/research/tcp-ip-tutorial.md +++ /dev/null @@ -1,957 +0,0 @@ -# Learn TCP/IP Networking From the Ground Up - -A complete beginner's guide to understanding how computers talk to each other. - ---- - -## Part 1: The Big Picture - -### What Is Networking? - -Imagine you want to send a letter to a friend in another city. You write the letter, put it in an envelope, add your friend's address, drop it in a mailbox, and somehow—through a system you don't fully understand—it arrives at your friend's door days later. - -Computer networking works the same way. When you load a webpage, your computer sends a "letter" (a request) to another computer somewhere in the world, and that computer sends back another "letter" (the webpage content). This happens billions of times per second across the planet. - -TCP/IP is the set of rules that makes this possible. It's the postal system of the internet. - -### Why Should You Care? - -If you're building software that communicates over a network—a web server, a chat application, a game, or anything that sends data between computers—understanding TCP/IP helps you: - -- Debug mysterious connection problems -- Write faster, more efficient code -- Understand why things sometimes fail -- Make better design decisions - -Let's start from the very beginning. - ---- - -## Chapter 1: Introduction to Networking Concepts - -### 1.1 Layering: Dividing a Complex Problem - -Sending data across a network is complicated. Rather than trying to solve everything at once, engineers divided the problem into layers. Each layer handles one specific job and relies on the layer below it. - -Think of it like mailing a package internationally: - -1. **You** write a letter and put it in a box -2. **The shipping company** adds tracking labels and handles domestic transport -3. **Customs** handles crossing borders -4. 
**The local postal service** delivers to the final address - -Each step doesn't need to know how the others work. The customs officer doesn't care what's in your letter. The local mail carrier doesn't know it crossed an ocean. This separation makes the system manageable. - -```mermaid -graph TB - subgraph "Your Computer" - A[Application Layer
Your Program] - B[Transport Layer<br/>TCP or UDP] - C[Network Layer<br/>IP] - D[Link Layer<br/>Ethernet/WiFi] - end - - D --> E[Physical Network 
Cables, Radio Waves] - - subgraph "Remote Computer" - F[Link Layer] - G[Network Layer] - H[Transport Layer] - I[Application Layer] - end - - E --> F - F --> G - G --> H - H --> I - - style A fill:#e1f5fe - style I fill:#e1f5fe - style E fill:#fff9c4 -``` - -### 1.2 The TCP/IP Layer Model - -TCP/IP uses four layers. From top to bottom: - -| Layer | What It Does | Real-World Analogy | -|-------|--------------|-------------------| -| **Application** | Your program's logic | Writing the letter's content | -| **Transport** | Reliable or fast delivery | Choosing registered mail vs. postcard | -| **Network (IP)** | Routing across the internet | The postal routing system | -| **Link** | Physical transmission | The mail truck driving down the street | - -When you send data: -- Your application creates the message -- The transport layer packages it for delivery -- The network layer addresses it for routing -- The link layer puts it on the wire - -When you receive data, the process reverses. - -### 1.3 Internet Addresses (IP Addresses) - -Every device on a network needs a unique address, just like every house needs a street address for mail delivery. - -An **IPv4 address** looks like this: `192.168.1.100` - -It's four numbers (0-255) separated by dots. That's 32 bits total, allowing for about 4.3 billion unique addresses. - -Some addresses are special: -- `127.0.0.1` — "localhost," your own computer talking to itself -- `192.168.x.x`, `10.x.x.x` — private addresses used inside homes and offices -- `8.8.8.8` — Google's public DNS server (we'll explain DNS later) - -**IPv6** addresses are longer (128 bits) and look like `2001:0db8:85a3:0000:0000:8a2e:0370:7334`. They exist because we're running out of IPv4 addresses, but IPv4 is still dominant. 
- -### 1.4 Encapsulation: Wrapping Data in Layers - -When data travels down through the layers, each layer wraps the data in its own header—like putting a letter in an envelope, then putting that envelope in a shipping box, then putting that box in a cargo container. - -```mermaid -graph LR - subgraph "Application Layer" - A[Data] - end - - subgraph "Transport Layer" - B[TCP Header] --> A2[Data] - end - - subgraph "Network Layer" - C[IP Header] --> B2[TCP Header] --> A3[Data] - end - - subgraph "Link Layer" - D[Ethernet Header] --> C2[IP Header] --> B3[TCP Header] --> A4[Data] --> E[Ethernet Trailer] - end - - style A fill:#c8e6c9 - style A2 fill:#c8e6c9 - style A3 fill:#c8e6c9 - style A4 fill:#c8e6c9 -``` - -Each header contains information that layer needs: -- **Ethernet header**: Which device on the local network? -- **IP header**: Which computer on the internet? -- **TCP header**: Which application on that computer? -- **Data**: The actual content - -When the data arrives, each layer strips off its header, reads the information, and passes the rest upward. - -### 1.5 Demultiplexing: Delivering to the Right Application - -Your computer might be running a web browser, an email client, a chat program, and a game—all at once, all using the network. When a packet arrives, how does the operating system know which program should receive it? - -This is **demultiplexing**. Each layer uses its header information to make routing decisions: - -1. The link layer checks: "Is this for my hardware address?" -2. The IP layer checks: "Is this for my IP address?" -3. The transport layer checks: "Which port number?" - -The port number is the key. It identifies which application gets the data. 
- -### 1.6 The Client-Server Model - -Most network communication follows a pattern: - -- A **server** waits, listening for incoming requests -- A **client** initiates contact, sending a request -- The server responds - -When you browse the web: -- Your browser is the client -- The website's computer is the server - -```mermaid -sequenceDiagram - participant Client as Your Browser
(Client) - participant Server as Web Server - - Client->>Server: "Please send me the homepage" - Server->>Client: "Here's the HTML content" - Client->>Server: "Now send me the logo image" - Server->>Client: "Here's the image data" -``` - -A single server can handle thousands of clients simultaneously. Your home computer can be a client to many different servers at once. - -### 1.7 Port Numbers: Apartment Numbers for Computers - -If an IP address is like a street address, a port number is like an apartment number. The IP address gets the data to the right building (computer), and the port number gets it to the right unit (application). - -Port numbers range from 0 to 65535. Some are well-known: - -| Port | Service | -|------|---------| -| 80 | HTTP (web) | -| 443 | HTTPS (secure web) | -| 22 | SSH (remote login) | -| 25 | SMTP (email sending) | -| 53 | DNS (name lookups) | - -When your browser connects to `www.example.com`, it's really connecting to something like `93.184.216.34:443`—an IP address plus a port number. - -Ports 0-1023 are "privileged" and typically reserved for system services. Your applications usually get assigned random high-numbered ports (like 52431) for outgoing connections. - -### 1.8 Application Programming Interfaces (APIs) - -You don't need to understand every detail of TCP/IP to use it. Operating systems provide **sockets**—a programming interface that hides the complexity. - -With sockets, your code simply says: -- "Connect to this address and port" -- "Send this data" -- "Receive data" -- "Close the connection" - -The operating system handles all the layering, headers, routing, and retransmission. You just work with a stream of bytes. - ---- - -## Chapter 2: The Link Layer - -### 2.1 MTU: Maximum Transmission Unit - -Different types of networks have different limits on how much data can be sent in a single frame. This limit is called the **Maximum Transmission Unit (MTU)**. 
- -For Ethernet (the most common wired network), the MTU is typically **1500 bytes**. - -Why does this matter? If you try to send a 5000-byte message, it won't fit in one frame. Something has to break it into smaller pieces—this is called **fragmentation**. - -Think of it like shipping a grand piano. It doesn't fit through a standard doorway, so you might need to disassemble it, ship the pieces separately, and reassemble at the destination. - -### 2.2 Path MTU: The Smallest Link in the Chain - -Data often travels through many networks to reach its destination. Each network might have a different MTU. - -```mermaid -graph LR - A[Your Computer
MTU: 1500] --> B[Home Router<br/>MTU: 1500] - B --> C[ISP Network<br/>MTU: 1500] - C --> D[VPN Tunnel<br/>MTU: 1400] - D --> E[Data Center 
MTU: 9000] - E --> F[Web Server] - - style D fill:#ffcdd2 -``` - -The **Path MTU** is the smallest MTU along the entire route. In the diagram above, even though most links support 1500 bytes, the VPN tunnel only supports 1400. That becomes the Path MTU. - -If you send packets larger than the Path MTU, they'll need to be fragmented somewhere along the way—which adds overhead and can cause problems. Modern systems try to discover the Path MTU and avoid fragmentation entirely. - ---- - -## Chapter 3: The Internet Protocol (IP) - -### 3.1 The IP Header: Your Packet's Shipping Label - -Every IP packet has a header containing routing information. The most important fields: - -| Field | Purpose | -|-------|---------| -| Version | IPv4 or IPv6? | -| Total Length | Size of the entire packet | -| TTL (Time To Live) | How many hops before giving up | -| Protocol | What's inside? (TCP=6, UDP=17) | -| Source Address | Where this came from | -| Destination Address | Where it's going | - -The **TTL** field prevents packets from circling forever if there's a routing loop. Each router decrements it by one. When it hits zero, the packet is discarded. - -```mermaid -graph LR - subgraph "IP Packet" - direction TB - H[Header
Version, Length, TTL<br/>Source IP, Dest IP] - P[Payload 
TCP/UDP + Data] - end -``` - -### 3.2 Subnet Addressing: Organizing Networks - -Large organizations don't give every computer a completely different address. Instead, they divide their address space into **subnets**. - -Think of it like a large apartment complex. The building has one main address (123 Main Street), but inside there are separate wings (A, B, C) and apartments within each wing. - -An IP address has two parts: -- **Network portion**: Which network is this? -- **Host portion**: Which computer on that network? - -### 3.3 Subnet Masks: Drawing the Line - -A **subnet mask** defines where the network portion ends and the host portion begins. - -For example: -- IP Address: `192.168.1.100` -- Subnet Mask: `255.255.255.0` - -The mask `255.255.255.0` means the first three numbers (`192.168.1`) identify the network, and the last number (`100`) identifies the specific host. - -You'll often see this written as `192.168.1.100/24`—the `/24` means the first 24 bits are the network portion. - -This helps routers make decisions quickly: "Is this address on my local network, or do I need to forward it elsewhere?" - ---- - -## Chapter 4: UDP—The Simple, Fast Protocol - -### 4.1 What UDP Does - -**UDP (User Datagram Protocol)** is the simpler of the two main transport protocols. It offers: - -- Fast delivery -- No connection setup -- No guarantee of delivery -- No guarantee of order - -It's like sending a postcard: quick and easy, but if it gets lost, nobody tells you. - -### 4.2 The UDP Header - -UDP's header is tiny—just 8 bytes: - -| Field | Size | Purpose | -|-------|------|---------| -| Source Port | 2 bytes | Which app sent this | -| Destination Port | 2 bytes | Which app should receive | -| Length | 2 bytes | Total size of UDP packet | -| Checksum | 2 bytes | Error detection | - -That's it. No sequence numbers, no acknowledgments, no flow control. - -```mermaid -graph LR - subgraph "UDP Packet" - A[Source Port
2 bytes] - B[Dest Port
2 bytes] - C[Length
2 bytes] - D[Checksum
2 bytes] - E[Data
Variable] - end -``` - -### 4.3 The UDP Checksum - -The checksum catches transmission errors. The sender computes a mathematical summary of the data; the receiver computes the same thing and compares. If they don't match, the packet was corrupted and gets discarded. - -But UDP doesn't request retransmission—it just throws away bad packets. - -### 4.4 IP Fragmentation - -What happens when you send a UDP datagram larger than the Path MTU? - -The IP layer **fragments** it—breaks it into smaller pieces that each fit within the MTU. Each fragment travels separately and gets reassembled at the destination. - -```mermaid -graph TB - A[Original UDP Packet
3000 bytes] --> B[Fragment 1
1500 bytes] - A --> C[Fragment 2
1500 bytes] - - B --> D[Reassembled Packet
3000 bytes] - C --> D -``` - -Fragmentation has downsides: -- If any fragment is lost, the entire datagram is lost -- More overhead (each fragment needs its own IP header) -- Some firewalls block fragments - -Modern applications try to avoid fragmentation by keeping their UDP datagrams under the Path MTU. - -### 4.5 Path MTU Discovery with UDP - -Your application can discover the Path MTU by: -1. Sending packets with the "Don't Fragment" flag set -2. Listening for "Fragmentation Needed" error messages -3. Reducing packet size until they get through - -This lets you send the largest possible packets without fragmentation. - -### 4.6 Maximum UDP Datagram Size - -Theoretically, a UDP datagram can be up to 65,535 bytes (limited by the 16-bit length field). - -Practically, you should stay much smaller: -- Over the internet: ~1472 bytes (1500 MTU minus headers) -- For reliability: even smaller, around 512-1400 bytes - -Larger datagrams have a higher chance of fragmentation and loss. - -### 4.7 UDP Server Design - -A UDP server is simple: -1. Create a socket bound to a port -2. Wait for incoming datagrams -3. Process each one independently -4. Send responses back - -Because there's no connection state, a single UDP socket can communicate with thousands of clients simultaneously. - -```mermaid -graph TB - subgraph "UDP Server" - S[Single Socket
Port 53] - end - - C1[Client A] -->|Request| S - C2[Client B] -->|Request| S - C3[Client C] -->|Request| S - - S -->|Response| C1 - S -->|Response| C2 - S -->|Response| C3 -``` - -UDP is great for: -- DNS lookups (fast, short queries) -- Video streaming (old data is useless, just show the latest) -- Gaming (quick updates more important than perfect reliability) -- Voice/video calls (latency matters more than occasional glitches) - ---- - -## Chapter 5: DNS—Turning Names into Addresses - -### 5.1 DNS Basics - -Humans remember names like `www.google.com`. Computers need numbers like `142.250.80.4`. **DNS (Domain Name System)** translates between them. - -When you type a URL in your browser: -1. Your computer asks: "What's the IP address for www.google.com?" -2. A DNS server responds: "It's 142.250.80.4" -3. Your browser connects to that IP address - -```mermaid -sequenceDiagram - participant B as Your Browser - participant R as DNS Resolver - participant D as DNS Server - participant W as Web Server - - B->>R: What's the IP for www.example.com? - R->>D: Query: www.example.com - D->>R: Answer: 93.184.216.34 - R->>B: It's 93.184.216.34 - B->>W: Connect to 93.184.216.34 -``` - -DNS is hierarchical. To find `www.example.com`: -1. Ask a root server: "Who handles `.com`?" -2. Ask the `.com` server: "Who handles `example.com`?" -3. Ask the `example.com` server: "What's `www`?" - -In practice, caching makes this much faster. - -### 5.2 DNS Caching - -DNS answers include a **TTL (Time To Live)**—how long the answer can be cached. - -Your computer caches DNS responses. Your home router caches them. Your ISP caches them. This dramatically reduces DNS traffic and speeds up lookups. - -The downside: if a website changes its IP address, old cached entries might point to the wrong place until they expire. - -### 5.3 DNS: UDP or TCP? - -DNS typically uses **UDP** on port 53. Most queries and responses fit in a single packet, so UDP's speed advantage outweighs TCP's reliability. 
- -DNS switches to **TCP** when: -- The response is too large for one UDP packet (over ~512 bytes) -- Zone transfers between DNS servers -- Higher reliability is required - -Modern DNS extensions (like DNSSEC) often produce larger responses, so TCP is becoming more common. - ---- - -## Chapter 6: TCP—The Reliable Workhorse - -### 6.1 What TCP Provides - -**TCP (Transmission Control Protocol)** is the backbone of most internet communication. It provides: - -- **Reliable delivery**: Lost data is automatically retransmitted -- **Ordered delivery**: Data arrives in the order it was sent -- **Flow control**: Sender won't overwhelm a slow receiver -- **Congestion control**: Won't flood the network - -It's like sending registered mail with tracking and confirmation. - -### 6.2 The TCP Header - -TCP's header is more complex than UDP's (20 bytes minimum): - -| Field | Purpose | -|-------|---------| -| Source Port | Sender's application | -| Destination Port | Receiver's application | -| Sequence Number | Position of this data in the stream | -| Acknowledgment Number | What we've received so far | -| Flags | SYN, ACK, FIN, RST, etc. | -| Window Size | How much data we can accept | -| Checksum | Error detection | - -The sequence and acknowledgment numbers enable reliability. The window size enables flow control. The flags control connection state. - ---- - -## Chapter 7: TCP Connections—Starting and Stopping - -### 7.1 The Three-Way Handshake - -TCP is **connection-oriented**. Before sending data, both sides must agree to communicate. This is the three-way handshake: - -```mermaid -sequenceDiagram - participant C as Client - participant S as Server - - Note over S: Listening on port 80 - - C->>S: SYN (seq=100)
"I'd like to connect" - S->>C: SYN-ACK (seq=300, ack=101)
"OK, I acknowledge" - C->>S: ACK (seq=101, ack=301)
"Great, let's go" - - Note over C,S: Connection Established -``` - -1. **Client sends SYN**: "I want to start a conversation. My starting sequence number is 100." -2. **Server sends SYN-ACK**: "I heard you. My starting sequence number is 300. I'm ready for your byte 101." -3. **Client sends ACK**: "I heard you. I'm ready for your byte 301." - -Now both sides know the other is listening and have agreed on sequence numbers. - -### 7.2 Connection Establishment Timeout - -What if the server doesn't respond? The client retries the SYN packet several times, with increasing delays between attempts: - -- First retry: 3 seconds -- Second retry: 6 seconds -- Third retry: 12 seconds -- ...and so on - -If all retries fail, the connection attempt times out. This typically takes about 75 seconds. - -### 7.3 Maximum Segment Size (MSS) - -During the handshake, each side advertises its **Maximum Segment Size**—the largest chunk of data it can receive in one TCP segment. - -This is usually MTU minus 40 bytes (for IP and TCP headers). On Ethernet: 1500 - 40 = **1460 bytes**. - -By knowing each other's MSS, both sides can send appropriately-sized segments and avoid fragmentation. - -### 7.4 TCP Half-Close - -TCP connections are **bidirectional**—data flows both ways. Each direction can be closed independently. - -A **half-close** means: "I'm done sending, but I'll still accept your data." - -This is useful when a client sends a complete request and then waits for a lengthy response. The client can close its sending direction while keeping the receiving direction open. - -```mermaid -sequenceDiagram - participant C as Client - participant S as Server - - C->>S: Data (request) - C->>S: FIN "I'm done sending" - S->>C: ACK - Note over C: Can't send anymore
Still receiving - S->>C: Data (response part 1) - S->>C: Data (response part 2) - S->>C: FIN "I'm done too" - C->>S: ACK - Note over C,S: Fully Closed -``` - -### 7.5 TCP State Transition Diagram - -A TCP connection moves through states: - -```mermaid -stateDiagram-v2 - [*] --> CLOSED - CLOSED --> LISTEN: Server starts listening - LISTEN --> SYN_RECEIVED: Receive SYN - SYN_RECEIVED --> ESTABLISHED: Receive ACK - - CLOSED --> SYN_SENT: Client sends SYN - SYN_SENT --> ESTABLISHED: Receive SYN-ACK - - ESTABLISHED --> FIN_WAIT_1: Send FIN - FIN_WAIT_1 --> FIN_WAIT_2: Receive ACK - FIN_WAIT_2 --> TIME_WAIT: Receive FIN - TIME_WAIT --> CLOSED: Timeout (2×MSL) - - ESTABLISHED --> CLOSE_WAIT: Receive FIN - CLOSE_WAIT --> LAST_ACK: Send FIN - LAST_ACK --> CLOSED: Receive ACK -``` - -Understanding these states helps when debugging connection problems. If you see many connections stuck in `TIME_WAIT`, for example, you might be opening and closing connections too rapidly. - -### 7.6 Reset Segments (RST) - -Sometimes a connection needs to be terminated immediately, without the normal graceful close. A **RST (Reset)** segment says: "This connection is invalid. Stop immediately." - -RST is sent when: -- A packet arrives for a connection that doesn't exist -- An application crashes without closing properly -- A firewall decides to kill a connection -- Something seriously wrong is detected - -When you receive RST, the connection is dead. No acknowledgment is sent. - -### 7.7 TCP Options - -TCP's header can include options for additional features: - -| Option | Purpose | -|--------|---------| -| MSS | Maximum Segment Size | -| Window Scale | Larger window sizes | -| Timestamp | Better RTT measurement | -| SACK | Selective acknowledgment | - -These are negotiated during the handshake. Both sides must support an option to use it. 
- -### 7.8 TCP Server Design - -A TCP server follows this pattern: - -```mermaid -graph TB - A[Create Socket] --> B[Bind to Port] - B --> C[Listen for Connections] - C --> D{Connection Request?} - D -->|Yes| E[Accept Connection] - E --> F[Handle Client
in New Thread/Coroutine] - F --> G[Read/Write Data] - G --> H[Close Connection] - D -->|No| D - H --> D -``` - -Unlike UDP, each TCP client gets its own connection. The server must manage multiple simultaneous connections, typically using threads, processes, or asynchronous I/O. - ---- - -## Chapter 8: TCP Interactive Data Flow - -### 8.1 Delayed Acknowledgments - -TCP requires acknowledgment of received data. But sending an ACK for every single packet would be wasteful. - -**Delayed acknowledgments** wait briefly (typically 200ms) before sending an ACK, hoping to combine it with outgoing data. If the application sends a response, the ACK piggybacks on it for free. - -```mermaid -sequenceDiagram - participant C as Client - participant S as Server - - C->>S: Data "GET /page" - Note over S: Wait 200ms for app response
or more data - S->>C: ACK + Data "Here's the page" -``` - -This reduces traffic but adds slight latency for one-way data flows. - -### 8.2 The Nagle Algorithm - -When an application sends data byte-by-byte (like typing in a terminal), sending each byte in its own packet would be incredibly wasteful—a 1-byte payload with 40 bytes of headers! - -The **Nagle algorithm** buffers small writes: -- If there's unacknowledged data in flight, hold new small writes -- Combine them into a larger segment -- Send when an ACK arrives or enough data accumulates - -```mermaid -graph TB - subgraph "Without Nagle" - A1[H] --> B1[Packet 1] - A2[e] --> B2[Packet 2] - A3[l] --> B3[Packet 3] - A4[l] --> B4[Packet 4] - A5[o] --> B5[Packet 5] - end - - subgraph "With Nagle" - C1[H] --> D1[Buffer] - C2[e] --> D1 - C3[l] --> D1 - C4[l] --> D1 - C5[o] --> D1 - D1 --> E1[Single Packet: Hello] - end -``` - -Nagle is great for interactive traffic but can add latency. Applications that need every byte sent immediately (like games) often disable it. - -### 8.3 Window Size Advertisements - -The receiver tells the sender how much buffer space is available using the **window size** field. This prevents the sender from overwhelming a slow receiver. - -If the receiver's application is slow to read data, the window shrinks. The sender must slow down. When the receiver catches up, the window grows again. - ---- - -## Chapter 9: TCP Bulk Data Flow - -### 9.1 Normal Data Flow - -When transferring large files, TCP sends many segments in a row without waiting for individual acknowledgments. The sender maintains a "window" of unacknowledged data in flight. 

### 9.2 Sliding Windows

The **sliding window** is the core of TCP's efficiency:

```mermaid
graph LR
    subgraph "Sender's View"
        A[Sent & ACKed]
        B[Sent, waiting ACK]
        C[Can send]
        D[Can't send yet]
    end

    style A fill:#c8e6c9
    style B fill:#fff9c4
    style C fill:#bbdefb
    style D fill:#ffcdd2
```

- **Green**: Data sent and acknowledged. Done.
- **Yellow**: Data sent, waiting for acknowledgment.
- **Blue**: Space available to send more data.
- **Red**: Must wait for acknowledgments before sending.

As ACKs arrive, the window "slides" forward, allowing more data to be sent.

### 9.3 Window Size

The window size is the smaller of:
- **Receiver's advertised window**: How much buffer space they have
- **Congestion window**: How much the network can handle (more on this later)

Larger windows mean more data in flight, which means higher throughput—especially on high-latency connections.

### 9.4 The PUSH Flag

The **PSH (Push)** flag tells the receiver: "Don't buffer this—deliver it to the application immediately."

TCP typically buffers incoming data for efficiency. PSH overrides this for latency-sensitive data.

### 9.5 Slow Start

TCP doesn't blast data at full speed immediately. It starts slowly and ramps up.

**Slow start** begins with a small congestion window (typically 1-10 segments). Each arriving ACK grows the window by one segment, which doubles it every round trip. This exponential growth quickly finds the available bandwidth.

```mermaid
graph LR
    A[Start: 1 segment] --> B[ACK: 2 segments]
    B --> C[ACK: 4 segments]
    C --> D[ACK: 8 segments]
    D --> E[ACK: 16 segments]
    E --> F[...and so on]
```

Once packet loss occurs (indicating congestion), TCP backs off and switches to more conservative growth.

### 9.6 Bulk Data Throughput

For large transfers, throughput depends on:
- **Bandwidth**: How fast the link is
- **Latency**: Round-trip time affects how fast the window can grow
- **Window size**: Limits data in flight
- **Packet loss**: Triggers slowdowns

The formula: `Throughput ≤ Window Size / Round-Trip Time`

A 64KB window with 100ms RTT: `65536 / 0.1 = 655KB/s` maximum, regardless of bandwidth.

---

## Chapter 10: TCP Timeout and Retransmission

### 10.1 Round-Trip Time Measurement

TCP must decide how long to wait before assuming a packet was lost. Too short: unnecessary retransmissions. Too long: wasted time.

TCP continuously measures **Round-Trip Time (RTT)**—how long until an ACK returns. It maintains a smoothed average and variance, adapting to changing network conditions.

### 10.2 Congestion

When too many packets flood a network, routers start dropping them. This is **congestion**.

Signs of congestion:
- Packets being dropped
- RTT increasing dramatically
- Timeout-based retransmissions

TCP interprets packet loss as a signal to slow down.

### 10.3 Congestion Avoidance

After slow start detects the network's limit, TCP switches to **congestion avoidance**—linear growth instead of exponential:

```mermaid
graph LR
    subgraph "Slow Start"
        A[1] --> B[2] --> C[4] --> D[8] --> E[16]
    end

    E -->|"Loss detected"| F[Threshold set to 8]

    subgraph "Congestion Avoidance"
        F --> G[9] --> H[10] --> I[11] --> J[12]
    end
```

When loss occurs:
1. The threshold is set to half the current window
2. The window drops dramatically
3. Growth continues linearly

### 10.4 Fast Retransmit and Fast Recovery

Waiting for a timeout is slow. **Fast retransmit** detects loss earlier using duplicate ACKs.

If the receiver gets packet 1, then packet 3, it knows 2 is missing. It keeps sending ACK 2 (asking again for the packet it's still waiting for).
Three duplicate ACKs trigger immediate retransmission without waiting for timeout. - -**Fast recovery** avoids resetting to slow start. Instead, the window is halved and growth continues. - -```mermaid -sequenceDiagram - participant S as Sender - participant R as Receiver - - S->>R: Packet 1 - S->>R: Packet 2 (lost!) - S->>R: Packet 3 - R->>S: ACK 2 (duplicate) - S->>R: Packet 4 - R->>S: ACK 2 (duplicate) - S->>R: Packet 5 - R->>S: ACK 2 (duplicate) - Note over S: 3 duplicate ACKs!
Retransmit immediately - S->>R: Packet 2 (retransmit) - R->>S: ACK 6 (caught up!) -``` - ---- - -## Chapter 11: TCP Persist Timer - -### 11.1 The Silly Window Syndrome - -Imagine a receiver's buffer is full. It advertises a zero window: "Stop sending!" The sender waits. Eventually the receiver reads one byte and advertises a window of 1. The sender sends 1 byte. Now the buffer is full again... - -This is **Silly Window Syndrome**—sending tiny packets because of constantly-full buffers. Horrendously inefficient. - -Solutions: -- **Receiver**: Don't advertise tiny windows. Wait until at least half the buffer is free or a full MSS is available. -- **Sender**: Don't send tiny segments. Wait until enough data accumulates (Nagle algorithm). - -The **persist timer** handles the case where the receiver's window is zero. The sender periodically sends tiny "probe" segments to check if the window has opened. This prevents deadlock where the sender waits forever and the receiver's "window open" message was lost. - ---- - -## Chapter 12: TCP Keepalive Timer - -### 12.1 Why Keepalive? - -TCP connections can sit idle indefinitely. Neither side sends data, but the connection remains "open." - -What if the other side crashes without closing properly? Or the network path fails? You'd never know—you'd wait forever for data that will never come. - -**Keepalive** solves this by periodically probing idle connections. - -### 12.2 How Keepalive Works - -After a connection has been idle for a while (typically 2 hours): -1. Send an empty probe segment -2. If ACK received: connection is alive -3. If no response: retry several times -4. If still no response: declare the connection dead - -This catches: -- Crashed peers -- Failed network paths -- Unplugged cables - -Keepalive is optional and often disabled by default. Many applications implement their own heartbeat mechanisms instead. 

---

## Chapter 13: TCP Performance and Modern Extensions

### 13.1 Path MTU Discovery

Remember Path MTU from the link layer? TCP uses it too.

TCP discovers the Path MTU by:
1. Sending segments with the "Don't Fragment" flag
2. Listening for error messages from routers that can't forward the packet
3. Reducing its segment size and retrying

This avoids fragmentation, which improves performance and reliability.

### 13.2 Long Fat Pipes

A **Long Fat Pipe** is a high-bandwidth, high-latency link. Think of a satellite connection: huge capacity, but 500ms round-trip time.

The problem: TCP's window is limited to 65,535 bytes (16-bit field). On a 10 Gbps link with 100ms RTT, you could have 125MB in flight—but the window only allows 64KB!

`Throughput ≤ 65535 / 0.1 = 655 KB/s`

That's about 0.05% of the available bandwidth. Unacceptable.

### 13.3 Window Scale Option

The **Window Scale** option multiplies the window size. A scale factor of 7 shifts the window field left 7 bits, multiplying it by 128; the maximum factor of 14 allows windows up to about 1 GB.

Negotiated during the handshake, window scaling enables TCP to fully utilize high-bandwidth, high-latency links.

### 13.4 Timestamp Option

**Timestamps** improve RTT measurement and enable a protection mechanism.

Each segment includes a timestamp. The receiver echoes it back. This gives precise RTT measurements even when multiple segments are in flight.

### 13.5 PAWS: Protection Against Wrapped Sequence Numbers

TCP sequence numbers are 32-bit, giving about 4 billion unique values. At 10 Gbps, you burn through all sequence numbers in about 3 seconds!

What if old, delayed packets arrive with sequence numbers that have "wrapped around" and now look valid?

**PAWS (Protection Against Wrapped Sequence Numbers)** uses timestamps to detect this. A segment with an old timestamp is rejected, even if the sequence number looks valid.

### 13.6 TCP Performance Summary

For best TCP performance:

1. 
**Enable window scaling** for high-bandwidth links -2. **Enable timestamps** for accurate RTT and PAWS protection -3. **Tune buffer sizes** appropriately -4. **Minimize latency** where possible (it directly limits throughput) -5. **Avoid packet loss** (it triggers slowdowns) -6. **Use appropriate MSS** to avoid fragmentation - -Modern operating systems configure most of this automatically, but understanding these mechanisms helps you diagnose problems and optimize critical applications. - ---- - -## Conclusion - -You now understand the fundamental concepts of TCP/IP networking: - -- **Layering** divides the complex problem into manageable pieces -- **IP** routes packets across the internet -- **UDP** provides fast, simple, unreliable delivery -- **TCP** provides reliable, ordered delivery with flow and congestion control -- **DNS** translates names to addresses - -When your code sends data across a network, all of this machinery springs into action—handshakes negotiating, windows sliding, timers ticking, packets routing through a maze of interconnected networks. - -Understanding this foundation will help you write better networked software, debug mysterious failures, and appreciate the engineering marvel that makes the modern internet possible. 
diff --git a/doc/scheduler.md b/doc/scheduler.md index d08d39fe..490e7336 100644 --- a/doc/scheduler.md +++ b/doc/scheduler.md @@ -62,11 +62,10 @@ Corosio's `task` returns `coroutine_handle` from `await_suspend`, enabling co ```cpp // task::await_suspend - returns coroutine_handle -coro await_suspend(coro cont, executor_ref caller_ex, std::stop_token token) +std::coroutine_handle<> await_suspend(std::coroutine_handle<> cont, io_env const& env) { - h_.promise().set_continuation(cont, caller_ex); - h_.promise().set_executor(caller_ex); - h_.promise().set_stop_token(token); + h_.promise().set_continuation(cont, env.executor); + h_.promise().set_environment(env); return h_; // compiler tail-calls this handle } ``` @@ -78,7 +77,7 @@ auto final_suspend() noexcept { struct awaiter { - coro await_suspend(coro) const noexcept + std::coroutine_handle<> await_suspend(std::coroutine_handle<>) const noexcept { return p_->complete(); // returns continuation } @@ -121,10 +120,10 @@ auto transform_awaitable(Awaitable&& a) } ``` -The `await_suspend` signature accepts additional context parameters: +The `await_suspend` signature accepts the execution environment: ```cpp -coro await_suspend(coro cont, executor_ref caller_ex, std::stop_token token) +std::coroutine_handle<> await_suspend(std::coroutine_handle<> cont, io_env const& env) ``` This design allows third-party awaitable types to integrate with Corosio's I/O system by satisfying the `IoAwaitable` concept. diff --git a/include/boost/corosio/basic_io_context.hpp b/include/boost/corosio/basic_io_context.hpp index 5046e5b2..56431f8b 100644 --- a/include/boost/corosio/basic_io_context.hpp +++ b/include/boost/corosio/basic_io_context.hpp @@ -12,7 +12,7 @@ #include #include -#include +#include #include #include @@ -341,29 +341,21 @@ class basic_io_context::executor_type /** Dispatch a coroutine handle. - If called from within `run()`, resumes the coroutine inline - by calling `h.resume()`. 
The call returns when the coroutine - suspends or completes. Otherwise posts the coroutine for - later execution. - - After this function returns, the state of `h` is unspecified. - The coroutine may have completed, been destroyed, or suspended - at a different suspension point. Callers must not assume `h` - remains valid after calling `dispatch`. - - @note Because this function may call `h.resume()` before - returning, it cannot be used to implement symmetric transfer - from `await_suspend`. + Returns a handle for symmetric transfer. If called from + within `run()`, returns `h`. Otherwise posts the coroutine + for later execution and returns `std::noop_coroutine()`. @param h The coroutine handle to dispatch. + + @return A handle for symmetric transfer or `std::noop_coroutine()`. */ - void - dispatch(capy::coro h) const + std::coroutine_handle<> + dispatch(std::coroutine_handle<> h) const { if (running_in_this_thread()) - h.resume(); - else - ctx_->sched_->post(h); + return h; + ctx_->sched_->post(h); + return std::noop_coroutine(); } /** Post a coroutine for deferred execution. @@ -374,7 +366,7 @@ class basic_io_context::executor_type @param h The coroutine handle to post. */ void - post(capy::coro h) const + post(std::coroutine_handle<> h) const { ctx_->sched_->post(h); } diff --git a/include/boost/corosio/detail/scheduler.hpp b/include/boost/corosio/detail/scheduler.hpp index b3b5aea8..fc9635cd 100644 --- a/include/boost/corosio/detail/scheduler.hpp +++ b/include/boost/corosio/detail/scheduler.hpp @@ -11,7 +11,7 @@ #define BOOST_COROSIO_DETAIL_SCHEDULER_HPP #include -#include +#include #include @@ -22,7 +22,7 @@ class scheduler_op; struct scheduler { virtual ~scheduler() = default; - virtual void post(capy::coro) const = 0; + virtual void post(std::coroutine_handle<>) const = 0; virtual void post(scheduler_op*) const = 0; /** Notify scheduler of pending work (for executor use). 
diff --git a/include/boost/corosio/io_stream.hpp b/include/boost/corosio/io_stream.hpp index 1326997d..67fb5511 100644 --- a/include/boost/corosio/io_stream.hpp +++ b/include/boost/corosio/io_stream.hpp @@ -15,6 +15,7 @@ #include #include #include +#include #include #include @@ -228,11 +229,10 @@ class BOOST_COROSIO_DECL io_stream : public io_object auto await_suspend( std::coroutine_handle<> h, - capy::executor_ref ex, - std::stop_token token) -> std::coroutine_handle<> + capy::io_env const& env) -> std::coroutine_handle<> { - token_ = std::move(token); - return ios_.get().read_some(h, ex, buffers_, token_, &ec_, &bytes_transferred_); + token_ = env.stop_token; + return ios_.get().read_some(h, env.executor, buffers_, token_, &ec_, &bytes_transferred_); } }; @@ -268,11 +268,10 @@ class BOOST_COROSIO_DECL io_stream : public io_object auto await_suspend( std::coroutine_handle<> h, - capy::executor_ref ex, - std::stop_token token) -> std::coroutine_handle<> + capy::io_env const& env) -> std::coroutine_handle<> { - token_ = std::move(token); - return ios_.get().write_some(h, ex, buffers_, token_, &ec_, &bytes_transferred_); + token_ = env.stop_token; + return ios_.get().write_some(h, env.executor, buffers_, token_, &ec_, &bytes_transferred_); } }; diff --git a/include/boost/corosio/resolver.hpp b/include/boost/corosio/resolver.hpp index 90b9975d..4127d166 100644 --- a/include/boost/corosio/resolver.hpp +++ b/include/boost/corosio/resolver.hpp @@ -18,6 +18,7 @@ #include #include #include +#include #include #include @@ -238,22 +239,12 @@ class BOOST_COROSIO_DECL resolver : public io_object return {ec_, std::move(results_)}; } - template auto await_suspend( std::coroutine_handle<> h, - Ex const& ex) -> std::coroutine_handle<> + capy::io_env const& env) -> std::coroutine_handle<> { - return r_.get().resolve(h, ex, host_, service_, flags_, token_, &ec_, &results_); - } - - template - auto await_suspend( - std::coroutine_handle<> h, - Ex const& ex, - std::stop_token token) 
-> std::coroutine_handle<> - { - token_ = std::move(token); - return r_.get().resolve(h, ex, host_, service_, flags_, token_, &ec_, &results_); + token_ = env.stop_token; + return r_.get().resolve(h, env.executor, host_, service_, flags_, token_, &ec_, &results_); } }; @@ -288,22 +279,12 @@ class BOOST_COROSIO_DECL resolver : public io_object return {ec_, std::move(result_)}; } - template - auto await_suspend( - std::coroutine_handle<> h, - Ex const& ex) -> std::coroutine_handle<> - { - return r_.get().reverse_resolve(h, ex, ep_, flags_, token_, &ec_, &result_); - } - - template auto await_suspend( std::coroutine_handle<> h, - Ex const& ex, - std::stop_token token) -> std::coroutine_handle<> + capy::io_env const& env) -> std::coroutine_handle<> { - token_ = std::move(token); - return r_.get().reverse_resolve(h, ex, ep_, flags_, token_, &ec_, &result_); + token_ = env.stop_token; + return r_.get().reverse_resolve(h, env.executor, ep_, flags_, token_, &ec_, &result_); } }; diff --git a/include/boost/corosio/signal_set.hpp b/include/boost/corosio/signal_set.hpp index 33839af6..99c05f1f 100644 --- a/include/boost/corosio/signal_set.hpp +++ b/include/boost/corosio/signal_set.hpp @@ -17,6 +17,7 @@ #include #include #include +#include #include #include @@ -188,14 +189,12 @@ class BOOST_COROSIO_DECL signal_set : public io_object return {ec_, signal_number_}; } - template auto await_suspend( std::coroutine_handle<> h, - Ex const& ex, - std::stop_token token) -> std::coroutine_handle<> + capy::io_env const& env) -> std::coroutine_handle<> { - token_ = std::move(token); - return s_.get().wait(h, ex, token_, &ec_, &signal_number_); + token_ = env.stop_token; + return s_.get().wait(h, env.executor, token_, &ec_, &signal_number_); } }; diff --git a/include/boost/corosio/tcp_acceptor.hpp b/include/boost/corosio/tcp_acceptor.hpp index b9acc155..3071980b 100644 --- a/include/boost/corosio/tcp_acceptor.hpp +++ b/include/boost/corosio/tcp_acceptor.hpp @@ -18,6 +18,7 @@ #include 
#include #include +#include #include #include @@ -100,22 +101,12 @@ class BOOST_COROSIO_DECL tcp_acceptor : public io_object return {ec_}; } - template auto await_suspend( std::coroutine_handle<> h, - Ex const& ex) -> std::coroutine_handle<> + capy::io_env const& env) -> std::coroutine_handle<> { - return acc_.get().accept(h, ex, token_, &ec_, &peer_impl_); - } - - template - auto await_suspend( - std::coroutine_handle<> h, - Ex const& ex, - std::stop_token token) -> std::coroutine_handle<> - { - token_ = std::move(token); - return acc_.get().accept(h, ex, token_, &ec_, &peer_impl_); + token_ = env.stop_token; + return acc_.get().accept(h, env.executor, token_, &ec_, &peer_impl_); } }; diff --git a/include/boost/corosio/tcp_server.hpp b/include/boost/corosio/tcp_server.hpp index e5541233..ebb55335 100644 --- a/include/boost/corosio/tcp_server.hpp +++ b/include/boost/corosio/tcp_server.hpp @@ -21,6 +21,8 @@ #include #include #include +#include +#include #include #include @@ -224,15 +226,16 @@ class BOOST_COROSIO_DECL { struct promise_type { - Ex ex; // Stored directly in frame, no allocation - std::stop_token st; + Ex ex; // Executor stored directly in frame (outlives child tasks) + capy::io_env env_; // For regular coroutines: first arg is executor, second is stop token template requires capy::Executor> promise_type(E e, S s, Args&&...) : ex(std::move(e)) - , st(std::move(s)) + , env_{capy::executor_ref(ex), std::move(s), + capy::current_frame_allocator()} { } @@ -242,7 +245,8 @@ class BOOST_COROSIO_DECL capy::Executor>) promise_type(Closure&&, E e, S s, Args&&...) 
: ex(std::move(e)) - , st(std::move(s)) + , env_{capy::executor_ref(ex), std::move(s), + capy::current_frame_allocator()} { } @@ -254,37 +258,25 @@ class BOOST_COROSIO_DECL void return_void() noexcept {} void unhandled_exception() { std::terminate(); } - // Pass through simple awaitables, inject executor/stop_token for IoAwaitable - template + // Inject io_env for IoAwaitable + template auto await_transform(Awaitable&& a) { using AwaitableT = std::decay_t; - // Simple awaitable: has await_suspend(coroutine_handle<>) but not IoAwaitable - if constexpr ( - requires { a.await_suspend(std::declval>()); } && - !capy::IoAwaitable) + struct adapter { - return std::forward(a); - } - else - { - struct adapter + AwaitableT aw; + capy::io_env const* env; + + bool await_ready() { return aw.await_ready(); } + decltype(auto) await_resume() { return aw.await_resume(); } + + auto await_suspend(std::coroutine_handle h) { - AwaitableT aw; - Ex* ex_ptr; - std::stop_token* st_ptr; - - bool await_ready() { return aw.await_ready(); } - decltype(auto) await_resume() { return aw.await_resume(); } - - auto await_suspend(std::coroutine_handle h) - { - static_assert(capy::IoAwaitable); - return aw.await_suspend(h, *ex_ptr, *st_ptr); - } - }; - return adapter{std::forward(a), &ex, &st}; - } + return aw.await_suspend(h, *env); + } + }; + return adapter{std::forward(a), &env_}; } }; @@ -324,7 +316,7 @@ class BOOST_COROSIO_DECL { // Executor and stop token stored in promise via constructor co_await std::move(t); - co_await self->push(*wp); + co_await self->push(*wp); // worker goes back to idle list } }; @@ -347,15 +339,13 @@ class BOOST_COROSIO_DECL return false; } - template std::coroutine_handle<> await_suspend( std::coroutine_handle<> h, - Ex const&, std::stop_token) noexcept + capy::io_env const&) noexcept { - // Dispatch to server's executor before touching shared state - self_.ex_.dispatch(h); - return std::noop_coroutine(); + // Symmetric transfer to server's executor + return 
self_.ex_.dispatch(h); } void await_resume() noexcept @@ -394,11 +384,10 @@ class BOOST_COROSIO_DECL return !self_.idle_empty(); } - template bool await_suspend( std::coroutine_handle<> h, - Ex const&, std::stop_token) noexcept + capy::io_env const&) noexcept { // Running on server executor (do_accept runs there) wait_.h = h; diff --git a/include/boost/corosio/tcp_socket.hpp b/include/boost/corosio/tcp_socket.hpp index abd47cb5..55d347d5 100644 --- a/include/boost/corosio/tcp_socket.hpp +++ b/include/boost/corosio/tcp_socket.hpp @@ -19,6 +19,7 @@ #include #include #include +#include #include #include @@ -162,22 +163,12 @@ class BOOST_COROSIO_DECL tcp_socket : public io_stream return {ec_}; } - template auto await_suspend( std::coroutine_handle<> h, - Ex const& ex) -> std::coroutine_handle<> + capy::io_env const& env) -> std::coroutine_handle<> { - return s_.get().connect(h, ex, endpoint_, token_, &ec_); - } - - template - auto await_suspend( - std::coroutine_handle<> h, - Ex const& ex, - std::stop_token token) -> std::coroutine_handle<> - { - token_ = std::move(token); - return s_.get().connect(h, ex, endpoint_, token_, &ec_); + token_ = env.stop_token; + return s_.get().connect(h, env.executor, endpoint_, token_, &ec_); } }; diff --git a/include/boost/corosio/timer.hpp b/include/boost/corosio/timer.hpp index 3250d1ab..b092a11f 100644 --- a/include/boost/corosio/timer.hpp +++ b/include/boost/corosio/timer.hpp @@ -17,6 +17,7 @@ #include #include #include +#include #include #include @@ -69,22 +70,12 @@ class BOOST_COROSIO_DECL timer : public io_object return {ec_}; } - template auto await_suspend( std::coroutine_handle<> h, - Ex const& ex) -> std::coroutine_handle<> + capy::io_env const& env) -> std::coroutine_handle<> { - return t_.get().wait(h, ex, token_, &ec_); - } - - template - auto await_suspend( - std::coroutine_handle<> h, - Ex const& ex, - std::stop_token token) -> std::coroutine_handle<> - { - token_ = std::move(token); - return t_.get().wait(h, ex, 
token_, &ec_);
+        token_ = env.stop_token;
+        return t_.get().wait(h, env.executor, token_, &ec_);
     }
 };
diff --git a/src/corosio/src/detail/dispatch_coro.hpp b/src/corosio/src/detail/dispatch_coro.hpp
new file mode 100644
index 00000000..84140327
--- /dev/null
+++ b/src/corosio/src/detail/dispatch_coro.hpp
@@ -0,0 +1,47 @@
+//
+// Copyright (c) 2026 Vinnie Falco (vinnie.falco@gmail.com)
+//
+// Distributed under the Boost Software License, Version 1.0. (See accompanying
+// file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+//
+// Official repository: https://github.com/cppalliance/corosio
+//
+
+#ifndef BOOST_COROSIO_DETAIL_DISPATCH_CORO_HPP
+#define BOOST_COROSIO_DETAIL_DISPATCH_CORO_HPP
+
+#include
+#include
+#include
+#include
+
+namespace boost::corosio::detail {
+
+/** Returns a handle for symmetric transfer on I/O completion.
+
+    If the executor is io_context::executor_type, returns `h`
+    directly (fast path). Otherwise dispatches through the
+    executor, which returns `h` or `noop_coroutine()`.
+
+    Callers in coroutine machinery should return the result
+    for symmetric transfer. Callers at the scheduler pump
+    level should call `.resume()` on the result.
+
+    @param ex The executor to dispatch through.
+
+    @param h The coroutine handle to resume.
+
+    @return A handle for symmetric transfer or `std::noop_coroutine()`.
+*/
+inline std::coroutine_handle<>
+dispatch_coro(
+    capy::executor_ref ex,
+    std::coroutine_handle<> h)
+{
+    if (&ex.type_id() == &capy::detail::type_id())
+        return h;
+    return ex.dispatch(h);
+}
+
+} // namespace boost::corosio::detail
+
+#endif
diff --git a/src/corosio/src/detail/epoll/acceptors.cpp b/src/corosio/src/detail/epoll/acceptors.cpp
index f3737e00..bd42ecb0 100644
--- a/src/corosio/src/detail/epoll/acceptors.cpp
+++ b/src/corosio/src/detail/epoll/acceptors.cpp
@@ -14,6 +14,7 @@
 #include "src/detail/epoll/acceptors.hpp"
 #include "src/detail/epoll/sockets.hpp"
 #include "src/detail/endpoint_convert.hpp"
+#include "src/detail/dispatch_coro.hpp"
 #include "src/detail/make_err.hpp"
 #include
@@ -131,9 +132,9 @@ operator()()
     // Move to stack before resuming. See epoll_op::operator()() for rationale.
     capy::executor_ref saved_ex( std::move( ex ) );
-    capy::coro saved_h( std::move( h ) );
+    std::coroutine_handle<> saved_h( std::move( h ) );
     auto prevent_premature_destruction = std::move(impl_ptr);
-    saved_ex.dispatch( saved_h );
+    dispatch_coro(saved_ex, saved_h).resume();
 }
 epoll_acceptor_impl::
diff --git a/src/corosio/src/detail/epoll/op.hpp b/src/corosio/src/detail/epoll/op.hpp
index f25f90bb..f28f1be2 100644
--- a/src/corosio/src/detail/epoll/op.hpp
+++ b/src/corosio/src/detail/epoll/op.hpp
@@ -18,12 +18,12 @@
 #include
 #include
 #include
-#include
+#include
 #include
 #include
 #include "src/detail/make_err.hpp"
-#include "src/detail/resume_coro.hpp"
+#include "src/detail/dispatch_coro.hpp"
 #include "src/detail/scheduler_op.hpp"
 #include "src/detail/endpoint_convert.hpp"
@@ -155,7 +155,7 @@ struct epoll_op : scheduler_op
     void operator()() const noexcept;
 };
-    capy::coro h;
+    std::coroutine_handle<> h;
     capy::executor_ref ex;
     std::error_code* ec_out = nullptr;
     std::size_t* bytes_out = nullptr;
@@ -214,9 +214,9 @@
     // use-after-free.
Moving to local ensures destruction happens at // function exit, after all member accesses are complete. capy::executor_ref saved_ex( std::move( ex ) ); - capy::coro saved_h( std::move( h ) ); + std::coroutine_handle<> saved_h( std::move( h ) ); auto prevent_premature_destruction = std::move(impl_ptr); - resume_coro(saved_ex, saved_h); + dispatch_coro(saved_ex, saved_h).resume(); } virtual bool is_read_operation() const noexcept { return false; } diff --git a/src/corosio/src/detail/epoll/scheduler.cpp b/src/corosio/src/detail/epoll/scheduler.cpp index 42cb7d87..57d824a6 100644 --- a/src/corosio/src/detail/epoll/scheduler.cpp +++ b/src/corosio/src/detail/epoll/scheduler.cpp @@ -388,15 +388,15 @@ shutdown() void epoll_scheduler:: -post(capy::coro h) const +post(std::coroutine_handle<> h) const { struct post_handler final : scheduler_op { - capy::coro h_; + std::coroutine_handle<> h_; explicit - post_handler(capy::coro h) + post_handler(std::coroutine_handle<> h) : h_(h) { } diff --git a/src/corosio/src/detail/epoll/scheduler.hpp b/src/corosio/src/detail/epoll/scheduler.hpp index ecdf7f0d..71e09777 100644 --- a/src/corosio/src/detail/epoll/scheduler.hpp +++ b/src/corosio/src/detail/epoll/scheduler.hpp @@ -78,7 +78,7 @@ class epoll_scheduler epoll_scheduler& operator=(epoll_scheduler const&) = delete; void shutdown() override; - void post(capy::coro h) const override; + void post(std::coroutine_handle<> h) const override; void post(scheduler_op* h) const override; void on_work_started() noexcept override; void on_work_finished() noexcept override; diff --git a/src/corosio/src/detail/epoll/sockets.cpp b/src/corosio/src/detail/epoll/sockets.cpp index 1b696ee6..94de6cc7 100644 --- a/src/corosio/src/detail/epoll/sockets.cpp +++ b/src/corosio/src/detail/epoll/sockets.cpp @@ -14,7 +14,7 @@ #include "src/detail/epoll/sockets.hpp" #include "src/detail/endpoint_convert.hpp" #include "src/detail/make_err.hpp" -#include "src/detail/resume_coro.hpp" +#include 
"src/detail/dispatch_coro.hpp" #include #include @@ -103,9 +103,9 @@ operator()() // Move to stack before resuming. See epoll_op::operator()() for rationale. capy::executor_ref saved_ex( std::move( ex ) ); - capy::coro saved_h( std::move( h ) ); + std::coroutine_handle<> saved_h( std::move( h ) ); auto prevent_premature_destruction = std::move(impl_ptr); - resume_coro(saved_ex, saved_h); + dispatch_coro(saved_ex, saved_h).resume(); } epoll_socket_impl:: diff --git a/src/corosio/src/detail/iocp/overlapped_op.hpp b/src/corosio/src/detail/iocp/overlapped_op.hpp index 37c18ffe..ac67f5c6 100644 --- a/src/corosio/src/detail/iocp/overlapped_op.hpp +++ b/src/corosio/src/detail/iocp/overlapped_op.hpp @@ -16,12 +16,11 @@ #include #include -#include #include #include #include "src/detail/make_err.hpp" -#include "src/detail/resume_coro.hpp" +#include "src/detail/dispatch_coro.hpp" #include "src/detail/scheduler_op.hpp" #include @@ -60,7 +59,7 @@ struct overlapped_op /** Function pointer type for cancellation hook. */ using cancel_func_type = void(*)(overlapped_op*) noexcept; - capy::coro h; + std::coroutine_handle<> h; capy::executor_ref ex; std::error_code* ec_out = nullptr; std::size_t* bytes_out = nullptr; @@ -143,13 +142,20 @@ struct overlapped_op if (bytes_out) *bytes_out = static_cast(bytes_transferred); - resume_coro(ex, h); + dispatch_coro(ex, h).resume(); } - /** Cleanup without invoking handler (for destroy path). */ + /** Cleanup without invoking handler (for destroy/shutdown path). + Destroys the waiting coroutine frame to prevent leaks. 
+ */ void cleanup_only() { stop_cb.reset(); + if(h) + { + h.destroy(); + h = {}; + } } }; diff --git a/src/corosio/src/detail/iocp/resolver_service.cpp b/src/corosio/src/detail/iocp/resolver_service.cpp index ea90ef41..fd9e21b6 100644 --- a/src/corosio/src/detail/iocp/resolver_service.cpp +++ b/src/corosio/src/detail/iocp/resolver_service.cpp @@ -15,7 +15,7 @@ #include "src/detail/iocp/scheduler.hpp" #include "src/detail/endpoint_convert.hpp" #include "src/detail/make_err.hpp" -#include "src/detail/resume_coro.hpp" +#include "src/detail/dispatch_coro.hpp" #include #include @@ -260,7 +260,7 @@ resolve_op::do_complete( op->cancel_handle = nullptr; - resume_coro(op->ex, op->h); + dispatch_coro(op->ex, op->h).resume(); } //------------------------------------------------------------------------------ @@ -305,7 +305,7 @@ reverse_resolve_op::do_complete( op->ep, std::move(op->stored_host), std::move(op->stored_service)); } - resume_coro(op->ex, op->h); + dispatch_coro(op->ex, op->h).resume(); } //------------------------------------------------------------------------------ @@ -329,7 +329,7 @@ release() std::coroutine_handle<> win_resolver_impl:: resolve( - capy::coro h, + std::coroutine_handle<> h, capy::executor_ref d, std::string_view host, std::string_view service, @@ -395,7 +395,7 @@ resolve( std::coroutine_handle<> win_resolver_impl:: reverse_resolve( - capy::coro h, + std::coroutine_handle<> h, capy::executor_ref d, endpoint const& ep, reverse_flags flags, diff --git a/src/corosio/src/detail/iocp/scheduler.cpp b/src/corosio/src/detail/iocp/scheduler.cpp index 11dfb8df..84c6d675 100644 --- a/src/corosio/src/detail/iocp/scheduler.cpp +++ b/src/corosio/src/detail/iocp/scheduler.cpp @@ -164,11 +164,11 @@ shutdown() void win_scheduler:: -post(capy::coro h) const +post(std::coroutine_handle<> h) const { struct post_handler final : scheduler_op { - capy::coro h_; + std::coroutine_handle<> h_; static void do_complete( void* owner, @@ -179,7 +179,9 @@ post(capy::coro h) 
const auto* self = static_cast(base); if (!owner) { - // Destroy path + // Destroy path: destroy the coroutine frame, then self + if (self->h_) + self->h_.destroy(); delete self; return; } @@ -189,7 +191,7 @@ post(capy::coro h) const coro.resume(); } - explicit post_handler(capy::coro coro) + explicit post_handler(std::coroutine_handle<> coro) : scheduler_op(&do_complete) , h_(coro) { diff --git a/src/corosio/src/detail/iocp/scheduler.hpp b/src/corosio/src/detail/iocp/scheduler.hpp index 5a3b85cc..ae44320f 100644 --- a/src/corosio/src/detail/iocp/scheduler.hpp +++ b/src/corosio/src/detail/iocp/scheduler.hpp @@ -51,7 +51,7 @@ class win_scheduler win_scheduler& operator=(win_scheduler const&) = delete; void shutdown() override; - void post(capy::coro h) const override; + void post(std::coroutine_handle<> h) const override; void post(scheduler_op* h) const override; void on_work_started() noexcept override; void on_work_finished() noexcept override; diff --git a/src/corosio/src/detail/iocp/signals.cpp b/src/corosio/src/detail/iocp/signals.cpp index 9b510fa4..36fb4935 100644 --- a/src/corosio/src/detail/iocp/signals.cpp +++ b/src/corosio/src/detail/iocp/signals.cpp @@ -13,7 +13,7 @@ #include "src/detail/iocp/signals.hpp" #include "src/detail/iocp/scheduler.hpp" -#include "src/detail/resume_coro.hpp" +#include "src/detail/dispatch_coro.hpp" #include #include @@ -170,7 +170,7 @@ signal_op::do_complete( auto* service = op->svc; op->svc = nullptr; - resume_coro(op->d, op->h); + dispatch_coro(op->d, op->h).resume(); if (service) service->work_finished(); @@ -220,7 +220,7 @@ wait( *ec = make_error_code(capy::error::canceled); if (signal_out) *signal_out = 0; - resume_coro(d, h); + dispatch_coro(d, h).resume(); // completion is always posted to scheduler queue, never inline. 
return std::noop_coroutine(); } @@ -502,7 +502,7 @@ cancel_wait(win_signal_impl& impl) *op->ec_out = make_error_code(capy::error::canceled); if (op->signal_out) *op->signal_out = 0; - resume_coro(op->d, op->h); + dispatch_coro(op->d, op->h).resume(); sched_.on_work_finished(); } } diff --git a/src/corosio/src/detail/iocp/signals.hpp b/src/corosio/src/detail/iocp/signals.hpp index b87ec21f..c598c073 100644 --- a/src/corosio/src/detail/iocp/signals.hpp +++ b/src/corosio/src/detail/iocp/signals.hpp @@ -69,7 +69,7 @@ enum { max_signal_number = 32 }; /** Signal wait operation state. */ struct signal_op : scheduler_op { - capy::coro h; + std::coroutine_handle<> h; capy::executor_ref d; std::error_code* ec_out = nullptr; int* signal_out = nullptr; diff --git a/src/corosio/src/detail/iocp/sockets.cpp b/src/corosio/src/detail/iocp/sockets.cpp index 11fde301..4ade08b3 100644 --- a/src/corosio/src/detail/iocp/sockets.cpp +++ b/src/corosio/src/detail/iocp/sockets.cpp @@ -15,7 +15,7 @@ #include "src/detail/iocp/scheduler.hpp" #include "src/detail/endpoint_convert.hpp" #include "src/detail/make_err.hpp" -#include "src/detail/resume_coro.hpp" +#include "src/detail/dispatch_coro.hpp" /* Windows IOCP Socket Implementation @@ -75,6 +75,7 @@ void connect_op::do_cancel_impl(overlapped_op* base) noexcept void read_op::do_cancel_impl(overlapped_op* base) noexcept { auto* op = static_cast(base); + op->cancelled.store(true, std::memory_order_release); if (op->internal.is_open()) { ::CancelIoEx( @@ -86,6 +87,7 @@ void read_op::do_cancel_impl(overlapped_op* base) noexcept void write_op::do_cancel_impl(overlapped_op* base) noexcept { auto* op = static_cast(base); + op->cancelled.store(true, std::memory_order_release); if (op->internal.is_open()) { ::CancelIoEx( @@ -190,7 +192,7 @@ accept_op::do_complete( auto saved_ex = op->ex; auto prevent_premature_destruction = std::move(op->acceptor_ptr); - resume_coro(saved_ex, saved_h); + dispatch_coro(saved_ex, saved_h).resume(); } 
//------------------------------------------------------------------------------ @@ -307,7 +309,7 @@ release_internal() std::coroutine_handle<> win_socket_impl_internal:: connect( - capy::coro h, + std::coroutine_handle<> h, capy::executor_ref d, endpoint ep, std::stop_token token, @@ -412,7 +414,11 @@ do_read_io() return; } } - // Synchronous completion: IOCP will deliver the completion packet + // I/O is now pending. If stop was requested before WSARecv + // started, the CancelIoEx in the stop callback had nothing + // to cancel. Re-check and cancel the now-pending operation. + if (op.cancelled.load(std::memory_order_acquire)) + ::CancelIoEx(reinterpret_cast(socket_), &op); } void @@ -437,14 +443,16 @@ do_write_io() DWORD err = ::WSAGetLastError(); if (err != WSA_IO_PENDING) { - // Immediate error - must use post(). See do_read_io for explanation. + // Immediate error - must use post(). svc_.work_finished(); op.dwError = err; svc_.post(&op); return; } } - // Synchronous completion: IOCP will deliver the completion packet + // Re-check cancellation after I/O is pending + if (op.cancelled.load(std::memory_order_acquire)) + ::CancelIoEx(reinterpret_cast(socket_), &op); } //------------------------------------------------------------------------------ @@ -452,7 +460,7 @@ do_write_io() std::coroutine_handle<> win_socket_impl_internal:: read_some( - capy::coro h, + std::coroutine_handle<> h, capy::executor_ref d, io_buffer_param param, std::stop_token token, @@ -499,7 +507,7 @@ read_some( std::coroutine_handle<> win_socket_impl_internal:: write_some( - capy::coro h, + std::coroutine_handle<> h, capy::executor_ref d, io_buffer_param param, std::stop_token token, @@ -912,7 +920,7 @@ release_internal() std::coroutine_handle<> win_acceptor_impl_internal:: accept( - capy::coro h, + std::coroutine_handle<> h, capy::executor_ref d, std::stop_token token, std::error_code* ec, diff --git a/src/corosio/src/detail/iocp/sockets.hpp b/src/corosio/src/detail/iocp/sockets.hpp index 
28900078..46cc7fad 100644 --- a/src/corosio/src/detail/iocp/sockets.hpp +++ b/src/corosio/src/detail/iocp/sockets.hpp @@ -145,14 +145,14 @@ class win_socket_impl_internal void release_internal(); std::coroutine_handle<> connect( - capy::coro, + std::coroutine_handle<>, capy::executor_ref, endpoint, std::stop_token, std::error_code*); std::coroutine_handle<> read_some( - capy::coro, + std::coroutine_handle<>, capy::executor_ref, io_buffer_param, std::stop_token, @@ -160,7 +160,7 @@ class win_socket_impl_internal std::size_t*); std::coroutine_handle<> write_some( - capy::coro, + std::coroutine_handle<>, capy::executor_ref, io_buffer_param, std::stop_token, @@ -426,7 +426,7 @@ class win_acceptor_impl_internal void release_internal(); std::coroutine_handle<> accept( - capy::coro, + std::coroutine_handle<>, capy::executor_ref, std::stop_token, std::error_code*, diff --git a/src/corosio/src/detail/kqueue/acceptors.cpp b/src/corosio/src/detail/kqueue/acceptors.cpp index 5e3409e2..7d6152c8 100644 --- a/src/corosio/src/detail/kqueue/acceptors.cpp +++ b/src/corosio/src/detail/kqueue/acceptors.cpp @@ -15,7 +15,7 @@ #include "src/detail/kqueue/sockets.hpp" #include "src/detail/endpoint_convert.hpp" #include "src/detail/make_err.hpp" -#include "src/detail/resume_coro.hpp" +#include "src/detail/dispatch_coro.hpp" #include @@ -191,9 +191,9 @@ operator()() // Move to stack before resuming. See kqueue_op::operator()() for rationale. 
capy::executor_ref saved_ex( std::move( ex ) ); - capy::coro saved_h( std::move( h ) ); + std::coroutine_handle<> saved_h( std::move( h ) ); auto prevent_premature_destruction = std::move(impl_ptr); - resume_coro(saved_ex, saved_h); + dispatch_coro(saved_ex, saved_h).resume(); } kqueue_acceptor_impl:: diff --git a/src/corosio/src/detail/kqueue/op.hpp b/src/corosio/src/detail/kqueue/op.hpp index ebac0e8b..3fe293ad 100644 --- a/src/corosio/src/detail/kqueue/op.hpp +++ b/src/corosio/src/detail/kqueue/op.hpp @@ -18,12 +18,12 @@ #include #include #include -#include +#include #include #include #include "src/detail/make_err.hpp" -#include "src/detail/resume_coro.hpp" +#include "src/detail/dispatch_coro.hpp" #include "src/detail/scheduler_op.hpp" #include "src/detail/endpoint_convert.hpp" @@ -169,7 +169,7 @@ struct kqueue_op : scheduler_op void operator()() const noexcept; }; - capy::coro h; + std::coroutine_handle<> h; capy::executor_ref ex; std::error_code* ec_out = nullptr; std::size_t* bytes_out = nullptr; @@ -228,9 +228,9 @@ struct kqueue_op : scheduler_op // use-after-free. Moving to local ensures destruction happens at // function exit, after all member accesses are complete. 
capy::executor_ref saved_ex( std::move( ex ) ); - capy::coro saved_h( std::move( h ) ); + std::coroutine_handle<> saved_h( std::move( h ) ); auto prevent_premature_destruction = std::move(impl_ptr); - resume_coro(saved_ex, saved_h); + dispatch_coro(saved_ex, saved_h).resume(); } virtual bool is_read_operation() const noexcept { return false; } diff --git a/src/corosio/src/detail/kqueue/scheduler.cpp b/src/corosio/src/detail/kqueue/scheduler.cpp index 5470c0db..6be398dc 100644 --- a/src/corosio/src/detail/kqueue/scheduler.cpp +++ b/src/corosio/src/detail/kqueue/scheduler.cpp @@ -422,15 +422,15 @@ shutdown() void kqueue_scheduler:: -post(capy::coro h) const +post(std::coroutine_handle<> h) const { struct post_handler final : scheduler_op { - capy::coro h_; + std::coroutine_handle<> h_; explicit - post_handler(capy::coro h) + post_handler(std::coroutine_handle<> h) : h_(h) { } diff --git a/src/corosio/src/detail/kqueue/scheduler.hpp b/src/corosio/src/detail/kqueue/scheduler.hpp index af8e0406..67cd6344 100644 --- a/src/corosio/src/detail/kqueue/scheduler.hpp +++ b/src/corosio/src/detail/kqueue/scheduler.hpp @@ -92,7 +92,7 @@ class kqueue_scheduler kqueue_scheduler& operator=(kqueue_scheduler const&) = delete; void shutdown() override; - void post(capy::coro h) const override; + void post(std::coroutine_handle<> h) const override; void post(scheduler_op* h) const override; // scheduler::on_work_started / on_work_finished — non-const, for executors. 
// Tracks work that keeps run() alive; the scheduler stops when the diff --git a/src/corosio/src/detail/kqueue/sockets.cpp b/src/corosio/src/detail/kqueue/sockets.cpp index fc0a7f82..21d04f33 100644 --- a/src/corosio/src/detail/kqueue/sockets.cpp +++ b/src/corosio/src/detail/kqueue/sockets.cpp @@ -14,7 +14,7 @@ #include "src/detail/kqueue/sockets.hpp" #include "src/detail/endpoint_convert.hpp" #include "src/detail/make_err.hpp" -#include "src/detail/resume_coro.hpp" +#include "src/detail/dispatch_coro.hpp" #include #include @@ -141,9 +141,9 @@ operator()() // Move to stack before resuming. See kqueue_op::operator()() for rationale. capy::executor_ref saved_ex( std::move( ex ) ); - capy::coro saved_h( std::move( h ) ); + std::coroutine_handle<> saved_h( std::move( h ) ); auto prevent_premature_destruction = std::move(impl_ptr); - resume_coro(saved_ex, saved_h); + dispatch_coro(saved_ex, saved_h).resume(); } kqueue_socket_impl:: diff --git a/src/corosio/src/detail/posix/resolver_service.cpp b/src/corosio/src/detail/posix/resolver_service.cpp index bed871e5..455f61f7 100644 --- a/src/corosio/src/detail/posix/resolver_service.cpp +++ b/src/corosio/src/detail/posix/resolver_service.cpp @@ -14,13 +14,13 @@ #include "src/detail/posix/resolver_service.hpp" #include "src/detail/endpoint_convert.hpp" #include "src/detail/intrusive.hpp" -#include "src/detail/resume_coro.hpp" +#include "src/detail/dispatch_coro.hpp" #include "src/detail/scheduler_op.hpp" #include #include #include -#include +#include #include #include @@ -303,7 +303,7 @@ class posix_resolver_impl }; // Coroutine state - capy::coro h; + std::coroutine_handle<> h; capy::executor_ref ex; posix_resolver_impl* impl = nullptr; @@ -346,7 +346,7 @@ class posix_resolver_impl }; // Coroutine state - capy::coro h; + std::coroutine_handle<> h; capy::executor_ref ex; posix_resolver_impl* impl = nullptr; @@ -499,7 +499,7 @@ operator()() *out = std::move(stored_results); impl->svc_.work_finished(); - resume_coro(ex, h); + 
dispatch_coro(ex, h).resume();
 }
 void
@@ -571,7 +571,7 @@ operator()()
 }
     impl->svc_.work_finished();
-    resume_coro(ex, h);
+    dispatch_coro(ex, h).resume();
 }
 void
diff --git a/src/corosio/src/detail/posix/signals.cpp b/src/corosio/src/detail/posix/signals.cpp
index f9b4f765..7f42e154 100644
--- a/src/corosio/src/detail/posix/signals.cpp
+++ b/src/corosio/src/detail/posix/signals.cpp
@@ -15,7 +15,6 @@
 #include
 #include
-#include
 #include
 #include
 #include
@@ -143,7 +142,7 @@ enum { max_signal_number = 64 };
 struct signal_op : scheduler_op
 {
-    capy::coro h;
+    std::coroutine_handle<> h;
     capy::executor_ref d;
     std::error_code* ec_out = nullptr;
     int* signal_out = nullptr;
diff --git a/src/corosio/src/detail/resume_coro.hpp b/src/corosio/src/detail/resume_coro.hpp
index 0b138db8..a5ab7d3d 100644
--- a/src/corosio/src/detail/resume_coro.hpp
+++ b/src/corosio/src/detail/resume_coro.hpp
@@ -13,7 +13,7 @@
 #include
 #include
 #include
-#include
+#include
 namespace boost::corosio::detail {
@@ -28,7 +28,7 @@ namespace boost::corosio::detail {
    @param h The coroutine handle to resume.
 */
 inline void
-resume_coro(capy::executor_ref d, capy::coro h)
+resume_coro(capy::executor_ref d, std::coroutine_handle<> h)
 {
     // Fast path: resume directly for io_context executor
     if (&d.type_id() == &capy::detail::type_id())
diff --git a/src/corosio/src/detail/select/acceptors.cpp b/src/corosio/src/detail/select/acceptors.cpp
index fb3634b8..8aa28a01 100644
--- a/src/corosio/src/detail/select/acceptors.cpp
+++ b/src/corosio/src/detail/select/acceptors.cpp
@@ -14,6 +14,7 @@
 #include "src/detail/select/acceptors.hpp"
 #include "src/detail/select/sockets.hpp"
 #include "src/detail/endpoint_convert.hpp"
+#include "src/detail/dispatch_coro.hpp"
 #include "src/detail/make_err.hpp"
 #include
@@ -119,9 +120,9 @@ operator()()
     // Move to stack before destroying the frame
     capy::executor_ref saved_ex( std::move( ex ) );
-    capy::coro saved_h( std::move( h ) );
+    std::coroutine_handle<> saved_h( std::move( h ) );
     impl_ptr.reset();
-    saved_ex.dispatch( saved_h );
+    dispatch_coro(saved_ex, saved_h).resume();
 }
 select_acceptor_impl::
diff --git a/src/corosio/src/detail/select/op.hpp b/src/corosio/src/detail/select/op.hpp
index 3ca0388e..ba716857 100644
--- a/src/corosio/src/detail/select/op.hpp
+++ b/src/corosio/src/detail/select/op.hpp
@@ -18,11 +18,12 @@
 #include
 #include
 #include
-#include
+#include
 #include
 #include
 #include "src/detail/make_err.hpp"
+#include "src/detail/dispatch_coro.hpp"
 #include "src/detail/scheduler_op.hpp"
 #include "src/detail/endpoint_convert.hpp"
@@ -114,7 +115,7 @@ struct select_op : scheduler_op
     void operator()() const noexcept;
 };
-    capy::coro h;
+    std::coroutine_handle<> h;
     capy::executor_ref ex;
     std::error_code* ec_out = nullptr;
     std::size_t* bytes_out = nullptr;
@@ -169,9 +170,9 @@
     // Move to stack before destroying the frame
     capy::executor_ref saved_ex( std::move( ex ) );
-    capy::coro saved_h( std::move( h ) );
+    std::coroutine_handle<> saved_h( std::move( h ) );
     impl_ptr.reset();
-    saved_ex.dispatch( saved_h );
+
dispatch_coro(saved_ex, saved_h).resume(); } virtual bool is_read_operation() const noexcept { return false; } diff --git a/src/corosio/src/detail/select/scheduler.cpp b/src/corosio/src/detail/select/scheduler.cpp index 3be76a75..96f7ccaf 100644 --- a/src/corosio/src/detail/select/scheduler.cpp +++ b/src/corosio/src/detail/select/scheduler.cpp @@ -192,15 +192,15 @@ shutdown() void select_scheduler:: -post(capy::coro h) const +post(std::coroutine_handle<> h) const { struct post_handler final : scheduler_op { - capy::coro h_; + std::coroutine_handle<> h_; explicit - post_handler(capy::coro h) + post_handler(std::coroutine_handle<> h) : h_(h) { } diff --git a/src/corosio/src/detail/select/scheduler.hpp b/src/corosio/src/detail/select/scheduler.hpp index 06794e95..0c003daf 100644 --- a/src/corosio/src/detail/select/scheduler.hpp +++ b/src/corosio/src/detail/select/scheduler.hpp @@ -81,7 +81,7 @@ class select_scheduler select_scheduler& operator=(select_scheduler const&) = delete; void shutdown() override; - void post(capy::coro h) const override; + void post(std::coroutine_handle<> h) const override; void post(scheduler_op* h) const override; void on_work_started() noexcept override; void on_work_finished() noexcept override; diff --git a/src/corosio/src/detail/select/sockets.cpp b/src/corosio/src/detail/select/sockets.cpp index 63b506a6..c22ee209 100644 --- a/src/corosio/src/detail/select/sockets.cpp +++ b/src/corosio/src/detail/select/sockets.cpp @@ -13,6 +13,7 @@ #include "src/detail/select/sockets.hpp" #include "src/detail/endpoint_convert.hpp" +#include "src/detail/dispatch_coro.hpp" #include "src/detail/make_err.hpp" #include @@ -99,9 +100,9 @@ operator()() // Move to stack before destroying the frame capy::executor_ref saved_ex( std::move( ex ) ); - capy::coro saved_h( std::move( h ) ); + std::coroutine_handle<> saved_h( std::move( h ) ); impl_ptr.reset(); - saved_ex.dispatch( saved_h ); + dispatch_coro(saved_ex, saved_h).resume(); } select_socket_impl:: diff 
--git a/src/corosio/src/tcp_server.cpp b/src/corosio/src/tcp_server.cpp index 6b5b1158..c00649f1 100644 --- a/src/corosio/src/tcp_server.cpp +++ b/src/corosio/src/tcp_server.cpp @@ -74,8 +74,8 @@ tcp_server::operator=(tcp_server&& o) noexcept capy::task tcp_server::do_accept(tcp_acceptor& acc) { - auto st = co_await capy::this_coro::stop_token; - while(! st.stop_requested()) + auto const& env = co_await capy::this_coro::environment; + while(! env.stop_token.stop_requested()) { // Wait for an idle worker before blocking on accept auto& w = co_await pop(); diff --git a/test/unit/io_context.cpp b/test/unit/io_context.cpp index 2ceeffe7..00368b1b 100644 --- a/test/unit/io_context.cpp +++ b/test/unit/io_context.cpp @@ -48,7 +48,7 @@ struct counter_coro std::coroutine_handle h; - operator capy::coro() const { return h; } + operator std::coroutine_handle<>() const { return h; } }; inline counter_coro make_coro(int& counter) @@ -84,7 +84,7 @@ struct atomic_counter_coro std::coroutine_handle h; - operator capy::coro() const { return h; } + operator std::coroutine_handle<>() const { return h; } }; inline atomic_counter_coro make_atomic_coro(std::atomic& counter) @@ -121,7 +121,7 @@ struct check_coro std::coroutine_handle h; - operator capy::coro() const { return h; } + operator std::coroutine_handle<>() const { return h; } }; inline check_coro make_check_coro(bool& result, io_context::executor_type& ex) diff --git a/test/unit/socket_stress.cpp b/test/unit/socket_stress.cpp index 6fcafc60..01cb7eda 100644 --- a/test/unit/socket_stress.cpp +++ b/test/unit/socket_stress.cpp @@ -201,6 +201,16 @@ struct stop_token_stress_test (void)co_await t.wait(); } + if (!read_done.load(std::memory_order_acquire)) + { + std::fprintf(stderr, " stop_token_stress: read hung on case %d, iter %d\n", i % 3, i); + BOOST_TEST(read_done.load(std::memory_order_acquire)); + stop_src.request_stop(); + timer t(ioc); + t.expires_after(std::chrono::milliseconds(100)); + (void)co_await t.wait(); + } + 
++iterations; if (read_ec == capy::cond::canceled) ++cancellations; @@ -381,10 +391,16 @@ struct cancel_close_stress_test switch (i % 3) { case 0: + { + // Yield to let the posted read_coro start + timer yield_t(ioc); + yield_t.expires_after(std::chrono::microseconds(1)); + (void)co_await yield_t.wait(); // Cancel via tcp_socket.cancel() s2.cancel(); ++cancels; break; + } case 1: // Write data to complete the read normally { @@ -421,6 +437,7 @@ struct cancel_close_stress_test if (!read_done.load(std::memory_order_acquire)) { std::fprintf(stderr, " cancel_close_stress: read hung on case %d, iter %d\n", i % 3, i); + BOOST_TEST(read_done.load(std::memory_order_acquire)); // Force cancel s2.cancel(); timer t(ioc);