| 1 | +--- |
| 2 | +layout: post |
| 3 | +nav-class: dark |
| 4 | +categories: ruben |
| 5 | +title: "Levelling up Boost.Redis" |
| 6 | +author-id: ruben |
| 7 | +author-name: Rubén Pérez Hidalgo |
| 8 | +--- |
| 9 | + |
| 10 | + |
| 11 | +I've really come to appreciate the design of Boost.Redis. With only |
| 12 | +three asynchronous primitives it exposes all the power of Redis, |
| 13 | +with features like automatic pipelining that make it pretty unique. |
| 14 | +Boost.Redis 1.90 will ship with some exciting new features that I'll |
| 15 | +cover in this post. |
| 16 | + |
| 17 | +## Cancelling requests with asio::cancel_after |
| 18 | + |
| 19 | +Boost.Redis implements a number of reliability measures, including reconnection. |
| 20 | +Suppose that you attempt to execute a request using `async_exec`, |
| 21 | +but the Redis server can't be contacted (for example, because of a temporary network error). |
| 22 | +Boost.Redis will try to re-establish the connection to the failed server, |
| 23 | +and `async_exec` will suspend until the server is healthy again. |
| 24 | + |
| 25 | +This is a great feature if the outage is transitory. But what would happen if |
| 26 | +the Redis server is permanently down - for example, because of a deployment issue that |
| 27 | +must be manually solved? The user will see that `async_exec` never completes. |
| 28 | +If new requests continue to be issued, the program will end up consuming an |
| 29 | +unbounded amount of resources. |
| 30 | + |
| 31 | +Starting with Boost 1.90, you can use `asio::cancel_after` to set |
| 32 | +a timeout on your requests, preventing this from happening: |
| 33 | + |
| 34 | +```cpp |
| 35 | +// Compose your request |
| 36 | +redis::request req; |
| 37 | +req.push("SET", "my_key", 42); |
| 38 | + |
| 39 | +// If the request doesn't complete within 30s, consider it as failed |
| 40 | +co_await conn.async_exec(req, redis::ignore, asio::cancel_after(30s)); |
| 41 | +``` |
| 42 | + |
| 43 | +For this to work, `async_exec` must properly support |
| 44 | +[per-operation cancellation](https://www.boost.org/doc/libs/latest/doc/html/boost_asio/overview/core/cancellation.html). |
| 45 | +This is tricky because Boost.Redis allows executing several requests concurrently, |
| 46 | +which are merged into a single pipeline before being sent. |
| 47 | +For the above to be useful, cancelling one request shouldn't affect other requests. |
| 48 | +In Asio parlance, `async_exec` should support partial cancellation, at least. |
| 49 | + |
| 50 | +Cancelling a request that hasn't been sent yet is trivial - you just remove it from |
| 51 | +the queue and call it a day. Cancelling requests that are in progress is more involved. |
| 52 | +We've solved this by using "tombstones". If a response encounters a tombstone, |
| 53 | +it gets ignored. This way, cancelling `async_exec` always has an immediate |
| 54 | +effect, but the connection is kept in a well-defined state. |
| 55 | + |
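The tombstone idea can be illustrated with a small model that uses only the standard library. This is a sketch under assumptions, not the Boost.Redis implementation: `pipeline_model`, `enqueue`, `flush`, `cancel` and `on_response` are hypothetical names, and the model assumes responses arrive in the same order the requests were sent.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <optional>
#include <string>
#include <utility>

// Hypothetical, simplified model of a pipelined connection (not Boost.Redis code).
class pipeline_model
{
public:
    // Called with the response, or with std::nullopt if the request was cancelled
    using handler = std::function<void(std::optional<std::string>)>;
    using request_id = std::size_t;

    // Queue a request that has not been written to the server yet
    request_id enqueue(handler h)
    {
        pending_.push_back({next_id_, std::move(h)});
        return next_id_++;
    }

    // Simulate writing every queued request as a single pipeline
    void flush()
    {
        for (auto& e : pending_)
            in_flight_.push_back(std::move(e));
        pending_.clear();
    }

    // Cancel a request. Its handler runs immediately in both cases.
    bool cancel(request_id id)
    {
        // Not sent yet: remove it from the queue
        for (auto it = pending_.begin(); it != pending_.end(); ++it) {
            if (it->id == id) {
                handler h = std::move(it->h);
                pending_.erase(it);
                h(std::nullopt);
                return true;
            }
        }
        // Already sent: the server will still reply, so keep the slot as a tombstone
        for (auto& e : in_flight_) {
            if (e.id == id && e.h) {
                handler h = std::move(e.h);
                e.h = nullptr;  // an entry without a handler is a tombstone
                h(std::nullopt);
                return true;
            }
        }
        return false;  // unknown or already cancelled
    }

    // A response arrived; it belongs to the oldest in-flight entry
    void on_response(std::string payload)
    {
        assert(!in_flight_.empty());
        entry e = std::move(in_flight_.front());
        in_flight_.pop_front();
        if (e.h)
            e.h(std::move(payload));  // tombstones silently discard the response
    }

private:
    struct entry
    {
        request_id id;
        handler h;
    };

    std::deque<entry> pending_;    // queued, not written yet
    std::deque<entry> in_flight_;  // written, waiting for a response
    request_id next_id_ = 0;
};
```

Because the tombstone keeps its position in the in-flight queue, later responses are still matched to the right requests, which is what keeps the connection in a well-defined state in this model.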
| 56 | + |
| 57 | +## Custom setup requests |
| 58 | + |
| 59 | +Redis speaks the RESP3 protocol. But it's not the only database system that speaks it. |
| 60 | +We've recently learnt that other systems, like [Tarantool DB](https://www.tarantool.io/en/tarantooldb/), |
| 61 | +are also capable of speaking RESP3. This means that Boost.Redis can be used to |
| 62 | +interact with these systems. |
| 63 | + |
| 64 | +At least in theory. In Boost 1.89, the library uses the [`HELLO`](https://redis.io/docs/latest/commands/hello/) |
| 65 | +command to upgrade to RESP3 (Redis' default is using the less powerful RESP2). |
| 66 | +The command is issued as part of the reconnection loop, without user intervention. |
| 67 | +It happens that systems like Tarantool DB don't support `HELLO` because they |
| 68 | +don't speak RESP2 at all, so there is nothing to upgrade. |
| 69 | + |
| 70 | +This is part of a larger problem: users might want to run arbitrary commands |
| 71 | +when the connection is established, to perform setup tasks. |
| 72 | +This might include [`AUTH`](https://redis.io/docs/latest/commands/auth/) to provide |
| 73 | +credentials or [`SELECT`](https://redis.io/docs/latest/commands/select/) to choose |
| 74 | +a database index. |
| 75 | + |
| 76 | +Until now, all you could do was configure the parameters used by the `HELLO` command. |
| 77 | +Starting with Boost 1.90, you can run arbitrary commands at connection startup: |
| 78 | + |
| 79 | +```cpp |
| 80 | +// At startup, don't send any HELLO, but set up authentication and select a database |
| 81 | +redis::request setup_request; |
| 82 | +setup_request.push("AUTH", "my_user", "my_password"); |
| 83 | +setup_request.push("SELECT", 2); |
| 84 | + |
| 85 | +redis::config cfg { |
| 86 | +    .use_setup = true, // use the custom setup request, rather than the default HELLO command |
| 87 | +    .setup = std::move(setup_request), // will be run every time a connection is established |
| 88 | +}; |
| 89 | + |
| 90 | +conn.async_run(cfg, asio::detached); |
| 91 | +``` |
| 92 | + |
| 93 | +This opens the door to simplifying code using PubSub. At the moment, such code needs |
| 94 | +to issue a `SUBSCRIBE` command every time a reconnection happens, which implies |
| 95 | +some tricks around `async_receive`. With this feature, you can just add a `SUBSCRIBE` |
| 96 | +command to your setup request and forget. |
| 97 | + |
| 98 | +This will be further explored in the coming months, since `async_receive` is currently |
| 99 | +aware of reconnections, so it might need some extra changes to see real benefits. |
| 100 | + |
| 101 | + |
| 102 | +## Valkey support |
| 103 | + |
| 104 | +[Valkey](https://valkey.io/) is a fork of Redis v7.2. At the time of writing, |
| 105 | +both databases are mostly interoperable in terms of protocol features, but |
| 106 | +they are being developed separately (as happened with MySQL and MariaDB). |
| 107 | + |
| 108 | +In Boost.Redis we've committed to supporting both long-term |
| 109 | +(at the moment, by deploying CI builds to test both). |
| 110 | + |
| 111 | + |
| 112 | +## Race-free cancellation |
| 113 | + |
| 114 | +It is very easy to introduce race conditions in cancellation with Asio. |
| 115 | +Consider the following code, which is typical in libraries that |
| 116 | +predate per-operation cancellation: |
| 117 | + |
| 118 | +```cpp |
| 119 | +struct connection |
| 120 | +{ |
| 121 | + asio::ip::tcp::socket sock; |
| 122 | + std::string buffer; |
| 123 | + |
| 124 | + struct echo_op |
| 125 | + { |
| 126 | + connection* obj; |
| 127 | + asio::coroutine coro{}; |
| 128 | + |
| 129 | + template <class Self> |
| 130 | + void operator()(Self& self, error_code ec = {}, std::size_t = {}) |
| 131 | + { |
| 132 | + BOOST_ASIO_CORO_REENTER(coro) |
| 133 | + { |
| 134 | + while (true) |
| 135 | + { |
| 136 | + // Read from the socket |
| 137 | + BOOST_ASIO_CORO_YIELD |
| 138 | + asio::async_read_until(obj->sock, asio::dynamic_buffer(obj->buffer), "\n", std::move(self)); |
| 139 | + |
| 140 | + // Check for errors |
| 141 | + if (ec) |
| 142 | + self.complete(ec); |
| 143 | + |
| 144 | + // Write back |
| 145 | + BOOST_ASIO_CORO_YIELD |
| 146 | + asio::async_write(obj->sock, asio::buffer(obj->buffer), std::move(self)); |
| 147 | + |
| 148 | + // Done |
| 149 | + self.complete(ec); |
| 150 | + } |
| 151 | + } |
| 152 | + } |
| 153 | + }; |
| 154 | + |
| 155 | + template <class CompletionToken> |
| 156 | + auto async_echo(CompletionToken&& token) |
| 157 | + { |
| 158 | + return asio::async_compose<CompletionToken, void(error_code)>(echo_op{this}, token, sock); |
| 159 | + } |
| 160 | + |
| 161 | + void cancel() { sock.cancel(); } |
| 162 | +}; |
| 163 | +``` |
| 164 | + |
| 165 | +There is a race condition here. `cancel()` may actually not cancel a running `async_echo`. |
| 166 | +After a read or write completes, the respective handler may not be called immediately, |
| 167 | +but queued for execution. If `cancel()` is called within that time frame, the cancellation |
| 168 | +will be ignored. |
| 169 | + |
| 170 | +The proper way to handle this is using per-operation cancellation, rather than a `cancel()` method. |
| 171 | +`async_compose` knows about this problem and keeps state about received cancellations, so you can write: |
| 172 | + |
| 173 | +```cpp |
| 174 | +// Read from the socket |
| 175 | +BOOST_ASIO_CORO_YIELD |
| 176 | +asio::async_read_until(obj->sock, asio::dynamic_buffer(obj->buffer), "\n", std::move(self)); |
| 177 | + |
| 178 | +// Check for errors |
| 179 | +if (ec) |
| 180 | +    return self.complete(ec); |
| 181 | + |
| 182 | +// Check for cancellations |
| 183 | +if (!!(self.get_cancellation_state().cancelled() & asio::cancellation_type_t::terminal)) |
| 184 | +    return self.complete(asio::error::operation_aborted); |
| 185 | +``` |
| 186 | + |
| 187 | +In 1.90, the library uses this approach everywhere, so cancellation is reliable. |
| 188 | +Keeping the `cancel()` method is a challenge, as it involves re-wiring cancellation |
| 189 | +slots, so I won't show it here - but we've managed to do it. |
| 190 | + |
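To make the race window concrete, here is a minimal single-threaded model written with the standard library only. It is a sketch under assumptions and not Asio code: `event_loop`, `echo_model` and `run_scenario` are hypothetical names. The `sticky` flag stands in for the cancellation state that `async_compose` keeps, while `sticky == false` mimics a plain `cancel()` that only affects I/O that is outstanding at the time of the call.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for an executor: handlers are queued, then run later
struct event_loop
{
    std::deque<std::function<void()>> ready;

    void post(std::function<void()> f) { ready.push_back(std::move(f)); }

    void run()
    {
        while (!ready.empty()) {
            auto f = std::move(ready.front());
            ready.pop_front();
            f();
        }
    }
};

// Hypothetical model of a composed "read, then write" operation
struct echo_model
{
    event_loop& loop;
    bool sticky;                    // true: remember cancellations, even with no I/O outstanding
    bool io_outstanding = false;    // is a read currently waiting on the "socket"?
    bool cancel_requested = false;  // the remembered cancellation
    std::string result;             // empty while the operation is still running

    void start() { io_outstanding = true; }

    // The read finished: its handler is queued for execution, not run inline
    void read_finishes()
    {
        io_outstanding = false;
        loop.post([this] { on_read(); });
    }

    void cancel()
    {
        cancel_requested = true;
        if (io_outstanding) {
            io_outstanding = false;
            result = "aborted";
        }
        // Otherwise there is no outstanding I/O to cancel right now
    }

    void on_read()
    {
        // The fix: check for a remembered cancellation before continuing
        if (sticky && cancel_requested) {
            result = "aborted";
            return;
        }
        result = "echoed";  // the write goes ahead as if nothing happened
    }
};

// cancel() arrives after the read finished but before its handler runs
std::string run_scenario(bool sticky)
{
    event_loop loop;
    echo_model op{loop, sticky};
    op.start();
    op.read_finishes();
    op.cancel();
    loop.run();
    return op.result;
}
```

In this model, `run_scenario(false)` ends with the operation having echoed despite the cancellation, while `run_scenario(true)` ends with it aborted.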
| 191 | + |
| 192 | +## Next steps |
| 193 | + |
| 194 | +I've got plans to keep working on Boost.Redis for a while. You can expect |
| 195 | +more features in 1.91, like [Sentinel](https://redis.io/docs/latest/operate/oss_and_stack/management/sentinel/) |
| 196 | +support and [more reliable health checks](https://github.com/boostorg/redis/issues/104). |