Closed

work #108

52 changes: 25 additions & 27 deletions doc/modules/ROOT/nav.adoc
@@ -1,30 +1,28 @@
 * xref:index.adoc[Introduction]
 * xref:quick-start.adoc[Quick Start]
-* Tutorials
-** xref:tutorials/echo-server.adoc[Echo Server]
-** xref:tutorials/http-client.adoc[HTTP Client]
-** xref:tutorials/dns-lookup.adoc[DNS Lookup]
-** xref:tutorials/tls-context.adoc[TLS Context Configuration]
-* Guide
-** xref:guide/tcp-networking.adoc[TCP/IP Networking]
-** xref:guide/concurrent-programming.adoc[Concurrent Programming]
-** xref:guide/io-context.adoc[I/O Context]
-** xref:guide/sockets.adoc[Sockets]
-** xref:guide/tcp_acceptor.adoc[Acceptors]
-** xref:guide/endpoints.adoc[Endpoints]
-** xref:guide/composed-operations.adoc[Composed Operations]
-** xref:guide/timers.adoc[Timers]
-** xref:guide/signals.adoc[Signal Handling]
-** xref:guide/resolver.adoc[Name Resolution]
-** xref:guide/tcp-server.adoc[TCP Server]
-** xref:guide/tls.adoc[TLS Encryption]
-** xref:guide/error-handling.adoc[Error Handling]
-** xref:guide/buffers.adoc[Buffer Sequences]
-* Concepts
-** xref:reference/design-rationale.adoc[Design Rationale]
-** xref:concepts/affine-awaitables.adoc[Affine Awaitables]
-* Testing
-** xref:testing/mocket.adoc[Mock Sockets]
+* xref:2.networking-tutorial/2.intro.adoc[Networking Tutorial]
+* xref:3.tutorials/3.intro.adoc[Tutorials]
+** xref:3.tutorials/3a.echo-server.adoc[Echo Server]
+** xref:3.tutorials/3b.http-client.adoc[HTTP Client]
+** xref:3.tutorials/3c.dns-lookup.adoc[DNS Lookup]
+** xref:3.tutorials/3d.tls-context.adoc[TLS Context Configuration]
+* xref:4.guide/4.intro.adoc[Guide]
+** xref:4.guide/4a.tcp-networking.adoc[TCP/IP Networking]
+** xref:4.guide/4b.concurrent-programming.adoc[Concurrent Programming]
+** xref:4.guide/4c.io-context.adoc[I/O Context]
+** xref:4.guide/4d.sockets.adoc[Sockets]
+** xref:4.guide/4e.tcp-acceptor.adoc[Acceptors]
+** xref:4.guide/4f.endpoints.adoc[Endpoints]
+** xref:4.guide/4g.composed-operations.adoc[Composed Operations]
+** xref:4.guide/4h.timers.adoc[Timers]
+** xref:4.guide/4i.signals.adoc[Signal Handling]
+** xref:4.guide/4j.resolver.adoc[Name Resolution]
+** xref:4.guide/4k.tcp-server.adoc[TCP Server]
+** xref:4.guide/4l.tls.adoc[TLS Encryption]
+** xref:4.guide/4m.error-handling.adoc[Error Handling]
+** xref:4.guide/4n.buffers.adoc[Buffer Sequences]
+* xref:5.testing/5.intro.adoc[Testing]
+** xref:5.testing/5a.mocket.adoc[Mock Sockets]
 * xref:reference:boost/corosio.adoc[Reference]
-* xref:reference/glossary.adoc[Glossary]
-* xref:reference/benchmark-report.adoc[Benchmarks]
+* xref:glossary.adoc[Glossary]
+* xref:benchmark-report.adoc[Benchmarks]
18 changes: 18 additions & 0 deletions doc/modules/ROOT/pages/2.networking-tutorial/2.intro.adoc
@@ -0,0 +1,18 @@
//
// Copyright (c) 2025 Vinnie Falco (vinnie.falco@gmail.com)
//
// Distributed under the Boost Software License, Version 1.0. (See accompanying
// file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
//
// Official repository: https://github.com/cppalliance/corosio
//

= Networking Tutorial

Every network application starts with a conversation between two programs. One program asks a question, the other answers, and useful work happens across the wire. This section walks you through that conversation from the ground up, building your understanding of networked programming with Corosio one concept at a time.

You will begin with the fundamentals: what happens when your program opens a connection, how bytes travel between machines, and how the operating system manages all of it on your behalf. From there you will progress to writing real I/O code -- sending data, receiving responses, and handling the inevitable errors that arise when communicating over an unreliable medium.

As your confidence grows, the material advances into the patterns that distinguish production-quality network code from toy examples. You will see how asynchronous I/O lets a single thread juggle thousands of connections without blocking, how coroutines make that concurrency feel sequential and natural, and how the event loop ties it all together.

By the end, you will have a practical understanding of TCP networking sufficient to build clients, servers, and everything in between -- using modern C++ and the coroutine-first abstractions that Corosio provides.
@@ -0,0 +1,78 @@
//
// Copyright (c) 2025 Vinnie Falco (vinnie.falco@gmail.com)
//
// Distributed under the Boost Software License, Version 1.0. (See accompanying
// file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
//
// Official repository: https://github.com/cppalliance/corosio
//

= What Happens When You Connect

Your program calls `connect`, and a moment later bytes arrive on a machine halfway around the world. Between those two events, a handful of protocols cooperate to make it happen. Understanding what each one does -- and where your code fits in the picture -- is the foundation for everything that follows.

== The Journey of a Message

Suppose your program wants to send the string `"hello"` to a server at address `192.0.2.50` on port `8080`. Here is what happens, roughly, from the moment you call `send` to the moment the server's application reads the data:

. Your application hands `"hello"` to TCP through the operating system's socket interface.
. TCP prepends its own header -- containing the destination port (`8080`), a sequence number to track byte position, and a checksum. The five bytes of `"hello"` are now wrapped inside a TCP *segment*.
. The TCP segment is passed down to IP, which prepends another header -- containing the destination address (`192.0.2.50`), the source address of your machine, and a time-to-live field that prevents the packet from circling the network forever. The whole thing is now an IP *packet*.
. The IP packet is handed to the network hardware, which frames it for whatever physical medium connects your machine to the next hop -- Ethernet, Wi-Fi, or something else. This frame goes out on the wire.

At each stage, the original data gets wrapped in another header, like putting a letter in an envelope, then putting that envelope in a shipping box. This is called *encapsulation*: each protocol adds its own metadata around the payload it received from above.
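
The wrapping steps above can be sketched in a few lines of C++. This is a toy model, assuming made-up headers (a 2-byte port for "TCP" and a 4-byte address for "IP") rather than the real wire formats; the names `wrap`, `toy_tcp_segment`, and `toy_ip_packet` are hypothetical and exist only for illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

using bytes = std::vector<std::uint8_t>;

// Prepend a header to a payload: the essence of encapsulation.
bytes wrap(const bytes& header, const bytes& payload)
{
    bytes out(header);
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

// Toy "TCP" header: only a 2-byte destination port, big-endian.
// A real TCP header is at least 20 bytes.
bytes toy_tcp_segment(std::uint16_t dst_port, const std::string& data)
{
    bytes header{ static_cast<std::uint8_t>(dst_port >> 8),
                  static_cast<std::uint8_t>(dst_port & 0xFF) };
    return wrap(header, bytes(data.begin(), data.end()));
}

// Toy "IP" header: only a 4-byte destination address, big-endian.
// A real IPv4 header is at least 20 bytes.
bytes toy_ip_packet(std::uint32_t dst_addr, const bytes& segment)
{
    bytes header{ static_cast<std::uint8_t>(dst_addr >> 24),
                  static_cast<std::uint8_t>(dst_addr >> 16),
                  static_cast<std::uint8_t>(dst_addr >> 8),
                  static_cast<std::uint8_t>(dst_addr) };
    return wrap(header, segment);
}
```

Calling `toy_ip_packet(0xC0000232, toy_tcp_segment(8080, "hello"))` yields the address bytes for `192.0.2.50`, then the port bytes, then the five payload bytes. Each layer treats everything it was handed as opaque payload.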

== Arriving at the Other End

When the packet reaches the destination machine, the process runs in reverse:

. The network hardware strips the link-level framing and hands the IP packet to the operating system.
. IP examines its header, confirms the packet is addressed to this machine, and looks at the *protocol* field to determine whether the payload belongs to TCP or UDP. It strips the IP header and passes the segment up.
. TCP examines the destination port number in its header. Port `8080` identifies which application should receive the data. TCP strips its own header and delivers `"hello"` to the server's socket.

This reverse process is called *demultiplexing*. The destination machine receives a stream of raw packets from the network and uses the headers to sort each one to the correct protocol, then to the correct application. Without demultiplexing, every program on the machine would see every packet, which would be chaos.
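
As a rough illustration of that sorting step, the sketch below looks up the receiving application from the protocol number and destination port. The types `toy_packet` and `binding_table` and the function `demultiplex` are hypothetical; a real kernel also matches on addresses and tracks per-connection state, which is omitted here.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Toy packet: only the fields that drive demultiplexing.
struct toy_packet
{
    std::uint8_t  protocol;  // 6 = TCP, 17 = UDP (the real IP protocol numbers)
    std::uint16_t dst_port;
    std::string   payload;
};

// Maps (protocol, port) to the name of the application bound there.
using binding_table =
    std::map<std::pair<std::uint8_t, std::uint16_t>, std::string>;

// Returns the receiving application, or nothing if no socket is bound.
std::optional<std::string> demultiplex(
    const binding_table& bindings, const toy_packet& p)
{
    auto it = bindings.find({p.protocol, p.dst_port});
    if (it == bindings.end())
        return std::nullopt;
    return it->second;
}
```

Note that the same port number under TCP and under UDP identifies two different bindings, which is why the protocol is part of the key.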

== Only a Few Protocols Matter

The full roster of internet protocols is enormous, but for application development you only need to know a handful:

*IP* (Internet Protocol):: Handles addressing and routing. Every packet carries a source and destination IP address, and routers use these addresses to forward the packet toward its destination one hop at a time. IP makes no guarantees about delivery -- packets can arrive out of order, arrive twice, or not arrive at all.

*TCP* (Transmission Control Protocol):: Builds a reliable, ordered byte stream on top of IP. Your program writes bytes into one end and they come out in the same order at the other end, even if the underlying packets took different routes or needed to be retransmitted. TCP also handles flow control so a fast sender does not overwhelm a slow receiver.

*UDP* (User Datagram Protocol):: A thinner alternative to TCP. Each send produces exactly one datagram, and each receive consumes exactly one. There are no guarantees about delivery or ordering. UDP is the right choice when low latency matters more than guaranteed delivery, as in real-time audio and video, or when an exchange is a single small request and reply, as in most DNS lookups.

*DNS* (Domain Name System):: Translates human-readable names like `www.example.com` into IP addresses like `93.184.215.14`. Almost every connection your program makes begins with a DNS query, even if you never see it explicitly.

These four, plus the hardware underneath, account for nearly everything that happens when your program communicates over the internet. Later sections examine each one in detail.

== Encapsulation in Practice

To make this concrete, consider the bytes on the wire when your program sends `"hello"` over TCP. The final Ethernet frame contains, from outermost to innermost:

[cols="1,3"]
|===
| Component | Contents

| Ethernet header
| Source and destination MAC addresses, EtherType field indicating IP

| IP header
| Version, header length, total length, TTL, protocol (TCP), source IP, destination IP

| TCP header
| Source port, destination port, sequence number, acknowledgment number, flags, window size, checksum

| Application data
| The five bytes: `h`, `e`, `l`, `l`, `o`
|===

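Each protocol reads and removes only its own header, and for the most part ignores the others. One exception is worth knowing: the TCP checksum also covers the source and destination IP addresses through a so-called pseudo-header, so TCP is not entirely blind to the layer below. In practice the separation still holds well enough that protocols can be mixed and matched independently. You can run TCP over Wi-Fi or over a fiber-optic link without changing a single line of your application code.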

== Where Your Code Lives

As an application developer, you operate at the top of this stack. You open a socket, connect to an address and port, and read or write data. The operating system handles TCP segmentation, IP routing, and hardware framing on your behalf. You never construct an IP header or compute a TCP checksum by hand.

This is a good thing. The protocols below your application are mature, heavily optimized, and implemented in the kernel. Your job is to use them correctly -- to understand what TCP promises and what it does not, to know when UDP is a better fit, and to handle the errors that arise when the network does not cooperate.
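
To show how little of the stack the application touches, here is a minimal sketch that sends a short message to a listener on the loopback address within a single process. It uses plain POSIX sockets, not Corosio's API, and the function name `loopback_roundtrip` is hypothetical. It assumes a POSIX system with a working loopback interface and a message small enough to fit in the socket buffer.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <optional>
#include <string>

// Sends `msg` to a listener on 127.0.0.1 inside this process and returns
// what the accepting side read. Returns nothing if any system call fails.
// Intended for short messages only: nothing reads while send() runs.
std::optional<std::string> loopback_roundtrip(const std::string& msg)
{
    std::optional<std::string> result;
    int listener = ::socket(AF_INET, SOCK_STREAM, 0);
    int client   = ::socket(AF_INET, SOCK_STREAM, 0);
    int peer     = -1;

    sockaddr_in addr{};
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port        = 0;  // let the OS pick a free port
    socklen_t len = sizeof(addr);

    if (listener >= 0 && client >= 0 &&
        ::bind(listener, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) == 0 &&
        ::listen(listener, 1) == 0 &&
        // Read back the port the OS actually assigned.
        ::getsockname(listener, reinterpret_cast<sockaddr*>(&addr), &len) == 0 &&
        ::connect(client, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) == 0 &&
        (peer = ::accept(listener, nullptr, nullptr)) >= 0 &&
        ::send(client, msg.data(), msg.size(), 0) ==
            static_cast<ssize_t>(msg.size()))
    {
        // TCP is a byte stream: one recv may return fewer bytes than
        // were sent, so keep reading until the whole message arrived.
        std::string received;
        char buf[256];
        while (received.size() < msg.size())
        {
            ssize_t n = ::recv(peer, buf, sizeof(buf), 0);
            if (n <= 0)
                break;
            received.append(buf, static_cast<std::size_t>(n));
        }
        if (received.size() == msg.size())
            result = received;
    }
    if (peer >= 0)     ::close(peer);
    if (client >= 0)   ::close(client);
    if (listener >= 0) ::close(listener);
    return result;
}
```

Nothing in this code builds a header or computes a checksum. The kernel does the segmentation, addressing, and framing described above; the application only names an address and port and moves bytes.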

That understanding begins with the next section: how machines are identified on the internet.
@@ -0,0 +1,90 @@
//
// Copyright (c) 2025 Vinnie Falco (vinnie.falco@gmail.com)
//
// Distributed under the Boost Software License, Version 1.0. (See accompanying
// file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
//
// Official repository: https://github.com/cppalliance/corosio
//

= Addressing Machines on the Internet

Every packet traveling the internet carries two addresses: where it came from and where it is going. These addresses are not arbitrary labels. They have internal structure that routers use to make forwarding decisions, and understanding that structure explains why networks behave the way they do.

== IPv4 Addresses

An IPv4 address is a 32-bit number, written as four decimal values separated by dots. Each value represents one byte, so it ranges from 0 to 255:

----
192.168.1.42
----

That is the human-readable form. Under the hood it is just 32 bits: `11000000.10101000.00000001.00101010`. Roughly 4.3 billion (2^32^) possible addresses sounds like a lot until you consider that every phone, laptop, thermostat, and security camera on the planet needs one. The central pool of unallocated IPv4 addresses was exhausted in 2011, and workarounds like network address translation (NAT) keep things functioning, but the shortage is real.
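
The relationship between the dotted form and the underlying 32 bits can be checked with a short sketch. It relies on the POSIX function `inet_pton` for parsing; the helper names `parse_ipv4` and `to_dotted_binary` are hypothetical.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>

#include <bitset>
#include <cstdint>
#include <optional>
#include <string>

// Parses dotted-decimal text into a 32-bit value in host byte order.
// Returns nothing if the text is not a valid IPv4 address.
std::optional<std::uint32_t> parse_ipv4(const std::string& text)
{
    in_addr a{};
    if (::inet_pton(AF_INET, text.c_str(), &a) != 1)
        return std::nullopt;
    return ntohl(a.s_addr);  // inet_pton stores network byte order
}

// Renders the four bytes as binary octets separated by dots.
std::string to_dotted_binary(std::uint32_t addr)
{
    std::string out;
    for (int shift = 24; shift >= 0; shift -= 8)
    {
        out += std::bitset<8>((addr >> shift) & 0xFF).to_string();
        if (shift != 0)
            out += '.';
    }
    return out;
}
```

For `192.168.1.42`, `parse_ipv4` returns `0xC0A8012A`, and `to_dotted_binary` turns that value back into the bit pattern shown above.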

== Structure Inside the Address

An IP address is not a flat identifier like a serial number. It is split into two parts: a *network* portion and a *host* portion. The network portion identifies which network the machine belongs to. The host portion identifies the specific machine within that network.

Consider a company with the address range `10.4.0.0` through `10.4.255.255`. The first two bytes (`10.4`) identify the company's network. The last two bytes identify individual machines. A router does not need to know about every machine -- it only needs to know how to reach the `10.4` network, and then the local network handles the rest.

This division is what makes routing scalable. The internet has billions of devices, but a router carrying the full internet routing table needs on the order of a million entries, not billions, because it routes to *networks*, not to individual machines.

== Subnet Masks

The *subnet mask* tells you where the network portion ends and the host portion begins. It is a 32-bit value where all the network bits are set to 1 and all the host bits are set to 0:

----
Address:     192.168.1.42
Subnet mask: 255.255.255.0
----

In this example, the first 24 bits are the network (`192.168.1`) and the last 8 bits are the host (`42`). This is often written in shorthand as `192.168.1.42/24`, where `/24` means "the first 24 bits are the network."

A `/16` mask (`255.255.0.0`) gives you a larger network with up to 65,534 usable host addresses. A `/28` mask (`255.255.255.240`) gives you a tiny network with just 14 usable hosts. Network administrators choose the mask based on how many machines they need to support.

Subnet masks also let a single organization divide its address space into smaller pieces. A company allocated `172.20.0.0/16` might carve it into subnets: `172.20.1.0/24` for the engineering floor, `172.20.2.0/24` for the sales office, and so on. Each subnet functions as its own small network with its own router, reducing broadcast traffic and improving security.
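
The arithmetic behind these figures is a handful of bit operations. The following sketch, with hypothetical helper names, derives the mask from a prefix length and computes the network address, broadcast address, and usable host count. It assumes addresses are held as 32-bit values in host byte order.

```cpp
#include <cstdint>

// Mask with the top `prefix` bits set. `prefix` must be in [0, 32].
constexpr std::uint32_t mask_from_prefix(unsigned prefix)
{
    // Shifting a 32-bit value by 32 is undefined, so handle /0 separately.
    return prefix == 0 ? 0u : ~std::uint32_t{0} << (32 - prefix);
}

// Clears the host bits, leaving the network address.
constexpr std::uint32_t network_of(std::uint32_t addr, unsigned prefix)
{
    return addr & mask_from_prefix(prefix);
}

// Sets every host bit, giving the subnet's broadcast address.
constexpr std::uint32_t broadcast_of(std::uint32_t addr, unsigned prefix)
{
    return addr | ~mask_from_prefix(prefix);
}

// All host-bit patterns minus the network and broadcast addresses.
// Meaningful for conventional subnets, that is, prefix <= 30.
constexpr std::uint64_t usable_hosts(unsigned prefix)
{
    return (std::uint64_t{1} << (32 - prefix)) - 2;
}
```

With these definitions, `usable_hosts(16)` is 65,534 and `usable_hosts(28)` is 14, matching the numbers above, and `network_of(0xC0A8012A, 24)` is `0xC0A80100`, which is `192.168.1.0`.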

== Special Addresses

Several address ranges have reserved meanings:

`127.0.0.1` (loopback):: Traffic sent here never leaves the machine. It goes down through the protocol stack, turns around, and comes back up. This is how your program talks to a server running on the same computer. The entire `127.0.0.0/8` range is reserved for loopback, but `127.0.0.1` is the one you will see in practice.

`0.0.0.0`:: When used as a source address, it means "this machine, but I don't know my address yet." When used to bind a socket, it means "listen on every available network interface."

Private address ranges:: Three ranges are set aside for internal networks that do not route on the public internet:
* `10.0.0.0/8` -- roughly 16 million addresses
* `172.16.0.0/12` -- roughly 1 million addresses
* `192.168.0.0/16` -- roughly 65,000 addresses

If you have ever connected to a home Wi-Fi network and received an address like `192.168.0.105`, you were using a private address. Your router translates it to a public address using NAT before forwarding your traffic to the internet.

Broadcast:: The address `255.255.255.255` sends a packet to every machine on the local network. Subnet-specific broadcasts exist too -- on the `192.168.1.0/24` network, the broadcast address is `192.168.1.255`.
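
The reserved ranges listed above can be recognized with the same prefix comparison used for subnets. This is a minimal sketch with hypothetical names (`in_range`, `classify`); it takes addresses in host byte order and does not detect subnet-specific broadcast addresses, which depend on the local prefix length.

```cpp
#include <cstdint>
#include <string_view>

// True if `addr` falls inside network/prefix. `prefix` must be in [0, 32].
constexpr bool in_range(
    std::uint32_t addr, std::uint32_t network, unsigned prefix)
{
    const std::uint32_t mask =
        prefix == 0 ? 0u : ~std::uint32_t{0} << (32 - prefix);
    return (addr & mask) == network;
}

constexpr std::string_view classify(std::uint32_t addr)
{
    if (addr == 0x00000000u)              return "unspecified";        // 0.0.0.0
    if (addr == 0xFFFFFFFFu)              return "limited broadcast";  // 255.255.255.255
    if (in_range(addr, 0x7F000000u, 8))   return "loopback";           // 127.0.0.0/8
    if (in_range(addr, 0x0A000000u, 8) ||                              // 10.0.0.0/8
        in_range(addr, 0xAC100000u, 12) ||                             // 172.16.0.0/12
        in_range(addr, 0xC0A80000u, 16))                               // 192.168.0.0/16
        return "private";
    return "other";
}
```

For example, `classify(0xC0A80069)` (the address `192.168.0.105`) returns `"private"`, while `172.32.0.1` falls just outside the `/12` range and is classified as `"other"`.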

== IPv6

IPv6 replaces the 32-bit address with a 128-bit one, written as eight groups of four hexadecimal digits separated by colons:

----
2001:0db8:85a3:0000:0000:8a2e:0370:7334
----

Leading zeros within a group can be dropped, and a single run of consecutive all-zero groups can be replaced with `::`:

----
2001:db8:85a3::8a2e:370:7334
----
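
You rarely need to apply these shortening rules by hand. As a sketch, the POSIX functions `inet_pton` and `inet_ntop` can parse either form and print the address back; the wrapper name `compress_ipv6` is hypothetical. The exact output format is up to the platform's C library, though common implementations such as glibc are expected to produce the compressed form.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>

#include <optional>
#include <string>

// Parses an IPv6 address in any accepted textual form and returns the
// platform's canonical text for it, or nothing if the input is invalid.
std::optional<std::string> compress_ipv6(const std::string& text)
{
    in6_addr addr{};
    if (::inet_pton(AF_INET6, text.c_str(), &addr) != 1)
        return std::nullopt;

    char buf[INET6_ADDRSTRLEN];
    if (::inet_ntop(AF_INET6, &addr, buf, sizeof(buf)) == nullptr)
        return std::nullopt;
    return std::string(buf);
}
```

Because both spellings parse to the same 128 bits, passing the long form and the short form to `compress_ipv6` gives identical results.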

128 bits gives roughly 3.4 x 10^38^ addresses -- enough to assign one to every atom on the surface of the earth and still have room left over. The exhaustion problem goes away entirely.

The concepts carry over directly. IPv6 addresses still have network and host portions, determined by a prefix length (typically `/64` for a single subnet). Routers still forward based on the network prefix. Loopback is `::1`. The mechanics are largely the same; the main differences are the size of the address and that IPv6 has no broadcast, using multicast in its place.

For the rest of this tutorial, examples use IPv4 notation for brevity. Everything discussed applies equally to IPv6 unless noted otherwise.

== Why This Matters to You

When your program connects to `192.168.1.42`, the operating system does not search the entire internet for that address. It examines the destination, compares it against the subnet masks of the local interfaces, and determines whether the target is on the local network or reachable through a gateway. That decision -- local or remote -- is the first routing choice, and it happens because addresses carry structural information.
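
A simplified version of that local-or-remote test is shown below. The names `interface_config` and `is_on_link` are hypothetical, and a real operating system consults a full routing table with longest-prefix matching rather than a single interface; this sketch covers only the comparison at the heart of the decision.

```cpp
#include <cstdint>

// One local interface: its address and prefix length (host byte order).
struct interface_config
{
    std::uint32_t address;
    unsigned      prefix;  // must be in [0, 32]
};

// True if `destination` shares the interface's network portion, meaning
// it can be reached directly without going through a gateway.
constexpr bool is_on_link(
    const interface_config& iface, std::uint32_t destination)
{
    const std::uint32_t mask =
        iface.prefix == 0 ? 0u : ~std::uint32_t{0} << (32 - iface.prefix);
    return (iface.address & mask) == (destination & mask);
}
```

For an interface at `192.168.1.10/24`, the destination `192.168.1.42` is on-link, while `192.168.2.1` and `8.8.8.8` are not and would be sent to the gateway.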

Understanding addresses also explains common errors. Connecting to `127.0.0.1` always reaches your own machine. Connecting to a `192.168.x.x` address from outside the local network fails because private addresses do not route publicly. Binding a server to `0.0.0.0` makes it reachable on all interfaces; binding to `127.0.0.1` restricts it to local connections only.
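
At the socket level, the difference between the two binding choices comes down to one field. The sketch below fills in a POSIX `sockaddr_in` using the standard constants `INADDR_ANY` (`0.0.0.0`) and `INADDR_LOOPBACK` (`127.0.0.1`); the function name `make_listen_address` is hypothetical, and this is the plain sockets API rather than Corosio's endpoint types.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>

#include <cstdint>

// Builds the address a server would pass to bind().
// local_only = true  -> 127.0.0.1, reachable only from this machine
// local_only = false -> 0.0.0.0, reachable on every interface
sockaddr_in make_listen_address(bool local_only, std::uint16_t port)
{
    sockaddr_in addr{};
    addr.sin_family      = AF_INET;
    addr.sin_port        = htons(port);  // network byte order
    addr.sin_addr.s_addr = htonl(local_only ? INADDR_LOOPBACK : INADDR_ANY);
    return addr;
}
```

Defaulting a development server to the loopback address is a common precaution, since it keeps the service invisible to other machines until you deliberately open it up.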

The next section covers how you avoid typing IP addresses altogether, using the Domain Name System to turn human-readable names into the numbers that machines actually use.