diff --git a/modules/sql/pages/get-started/deploy-sql-cluster.adoc b/modules/sql/pages/get-started/deploy-sql-cluster.adoc
new file mode 100644
index 000000000..630c413eb
--- /dev/null
+++ b/modules/sql/pages/get-started/deploy-sql-cluster.adoc
@@ -0,0 +1,176 @@
+= Enable Redpanda SQL on a BYOC cluster
+:description: Enable the Redpanda SQL engine on a BYOC cluster so that users can query streaming data with standard PostgreSQL syntax.
+:page-topic-type: how-to
+
+Enable Redpanda SQL on a BYOC cluster to give your team the ability to query streaming data in Redpanda topics using standard PostgreSQL syntax.
+
+== Prerequisites
+
+To enable the Redpanda SQL engine, you need:
+
+* Admin permissions in your Redpanda Cloud organization.
+* If using the link:/api/doc/cloud-controlplane/topic/topic-cloud-api-overview[Cloud API] to enable SQL, a valid bearer token for the API. See link:/api/doc/cloud-controlplane/authentication[Authenticate to the Cloud API].
+
+== Enable Redpanda SQL
+
+You can enable Redpanda SQL when you create a new BYOC cluster or on an existing cluster.
+
+=== On a new cluster
+
+[tabs]
+=====
+Cloud Console::
++
+--
+. Log in to https://cloud.redpanda.com[Redpanda Cloud^].
+. Start creating a new BYOC cluster on AWS. For details and prerequisites, see xref:get-started:cluster-types/byoc/aws/create-byoc-cluster-aws.adoc[].
+. In the cluster creation form, select the option to enable SQL.
+// TODO: Confirm guidance to provide on selecting number of nodes
+. Choose the number of SQL nodes to deploy.
++
+Redpanda SQL requires at least one node to run the engine. You can scale up (to a maximum of nine nodes) or down later as needed.
+. Complete the remaining cluster configuration and deploy.
+--
+
+Cloud API::
++
+--
+. Authenticate to the link:/api/doc/cloud-controlplane/topic/topic-cloud-api-overview[Cloud API]. For details, see link:/api/doc/cloud-controlplane/authentication[Authenticate to the API].
+// TODO: confirm field name change to rpsql
+// Is selecting the number of nodes available with this endpoint?
+. Make a link:/api/doc/cloud-controlplane/operation/operation-clusterservice_createcluster[`POST /v1/clusters`] request with `oxla.enabled` set to `true` in the cluster spec:
++
+[,bash]
+----
+curl -X POST "https://api.redpanda.com/v1/clusters" \
+  -H "Authorization: Bearer $AUTH_TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "cluster": {
+      "name": "",
+      "cloud_provider": "CLOUD_PROVIDER_AWS",
+      "type": "TYPE_BYOC",
+      "region": "",
+      "zones": [ ],
+      "throughput_tier": "",
+      "resource_group_id": "",
+      "oxla": {
+        "enabled": true
+      }
+    }
+  }'
+----
++
+For the full request body and field reference, see the link:/api/doc/cloud-controlplane/operation/operation-clusterservice_createcluster[Create Cluster API].
+. The request returns the ID of a long-running operation. Poll the link:/api/doc/cloud-controlplane/operation/operation-operationservice_getoperation[`GET /v1/operations/{operation.id}`] endpoint until the operation completes.
+--
+=====
+
+=== On an existing cluster
+
+To enable, scale, or disable SQL on an existing cluster, you also need the cluster ID, which you can find in the *Details* section of the cluster overview in the Cloud Console.
+
+// TODO: Confirm UI functionality
+
+. Authenticate to the link:/api/doc/cloud-controlplane/topic/topic-cloud-api-overview[Cloud API]. For details, see link:/api/doc/cloud-controlplane/authentication[Authenticate to the API].
+. Make a link:/api/doc/cloud-controlplane/operation/operation-clusterservice_updatecluster[`PATCH /v1/clusters/{cluster.id}`] request, replacing `{cluster.id}` with your cluster ID:
++
+[,bash]
+----
+curl -X PATCH "https://api.redpanda.com/v1/clusters/{cluster.id}" \
+  -H "Authorization: Bearer $AUTH_TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{"oxla":{"enabled":true}}'
+----
++
+The request returns the ID of a long-running operation.
Poll the link:/api/doc/cloud-controlplane/operation/operation-operationservice_getoperation[`GET /v1/operations/{operation.id}`] endpoint until the operation completes:
++
+[,bash]
+----
+curl -X GET "https://api.redpanda.com/v1/operations/{operation.id}" \
+  -H "Authorization: Bearer $AUTH_TOKEN" \
+  -H "Content-Type: application/json"
+----
++
+When the operation is complete, the response shows `"state": "STATE_COMPLETED"`.
+
+== Scale Redpanda SQL
+
+Redpanda SQL supports horizontal scaling from 1 to 9 nodes per cluster. Scaling to 0 is not supported. To remove Redpanda SQL from a cluster, disable the SQL engine instead.
+
+// TODO: Confirm UI functionality
+
+Make a link:/api/doc/cloud-controlplane/operation/operation-clusterservice_updatecluster[`PATCH /v1/clusters/{cluster.id}`] request with the new replica count. Replace `{cluster.id}` with your cluster ID and set `oxla.replicas` to a value between 1 and 9:
+
+[,bash]
+----
+curl -X PATCH "https://api.redpanda.com/v1/clusters/{cluster.id}" \
+  -H "Authorization: Bearer $AUTH_TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{"oxla":{"replicas":}}'
+----
+
+The request returns the ID of a long-running operation. Poll link:/api/doc/cloud-controlplane/operation/operation-operationservice_getoperation[`GET /v1/operations/{operation.id}`] until the operation completes.
+
+== Verify the SQL engine is running
+
+After you enable Redpanda SQL, the cluster overview page in the Cloud Console shows the *SQL* tab. The *Details* pane also displays the number of SQL nodes deployed with the cluster. Provisioning may take several minutes.
+
+To verify the SQL engine is running, use the connection details on the *SQL* tab to connect with a PostgreSQL client, such as `psql`.
+
+The following example shows how to connect using a bearer token. The `rpk cloud auth token` command (xref:manage:rpk/rpk-install.adoc[`rpk` v26.1.6+] required) retrieves a temporary authentication token for the SQL engine.
If your `rpk` session is not authenticated to Redpanda Cloud, the command first prompts you to log in through your browser:
+
+// TODO: Confirm behavior when SQL is enabled but not yet fully provisioned.
+// Does the `psql` connection attempt time out, or return an error message
+// indicating the engine is not ready?
+[,bash]
+----
+rpsql_token=$(rpk cloud auth token)
+
+psql "host= port=5432 dbname=oxla user=ignored password=$rpsql_token options='-c auth_method=bearer' sslmode=require"
+----
+
+== Inspect your SQL cluster
+
+Redpanda SQL provides built-in commands to inspect the state of your SQL cluster:
+
+[,sql]
+----
+SHOW NODES; -- List SQL compute nodes and their status
+SHOW REDPANDA TABLES; -- List SQL tables mapped to Redpanda topics
+SHOW QUERIES; -- List currently running queries
+----
+
+== Disable Redpanda SQL
+
+// TODO: Confirm with engineering exactly what happens when Redpanda SQL is
+// disabled and document it precisely. Specifically:
+// - What state is purged (catalog metadata, table mappings, role/grant
+//   data, query history, cached query results)?
+// - What is deleted from object storage (Iceberg-translated data for
+//   Iceberg-enabled topics, internal SQL engine state)?
+// - Are Redpanda topic data and Schema Registry subjects affected?
+// - What error / status do clients see for in-flight queries?
+// - If SQL is re-enabled later, is any state restored, or is the engine
+//   provisioned fresh?
+
+[WARNING]
+====
+Disabling Redpanda SQL purges the stored catalog state for the SQL engine and deletes its data from object storage. In-flight queries fail when SQL is disabled.
+====
+
+Make a link:/api/doc/cloud-controlplane/operation/operation-clusterservice_updatecluster[`PATCH /v1/clusters/{cluster.id}`] request with `oxla.enabled` set to `false`.
Replace `{cluster.id}` with your cluster ID:
+
+[,bash]
+----
+curl -X PATCH "https://api.redpanda.com/v1/clusters/{cluster.id}" \
+  -H "Authorization: Bearer $AUTH_TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{"oxla":{"enabled":false}}'
+----
+
+The request returns the ID of a long-running operation. Poll link:/api/doc/cloud-controlplane/operation/operation-operationservice_getoperation[`GET /v1/operations/{operation.id}`] until the operation completes.
+
+== Next steps
+
+* xref:sql:get-started/sql-quickstart.adoc[Quickstart]: Connect to Redpanda SQL with `psql` and run your first query.
diff --git a/modules/sql/pages/get-started/index.adoc b/modules/sql/pages/get-started/index.adoc
new file mode 100644
index 000000000..1a7ab8979
--- /dev/null
+++ b/modules/sql/pages/get-started/index.adoc
@@ -0,0 +1,3 @@
+= Get Started with Redpanda SQL
+:description: Get started with Redpanda SQL, a column-oriented OLAP query engine built into Redpanda Cloud that lets you query streaming topics using standard SQL.
+:page-layout: index
diff --git a/modules/sql/pages/get-started/sql-quickstart.adoc b/modules/sql/pages/get-started/sql-quickstart.adoc
new file mode 100644
index 000000000..075f01ffa
--- /dev/null
+++ b/modules/sql/pages/get-started/sql-quickstart.adoc
@@ -0,0 +1,194 @@
+= Redpanda SQL quickstart
+:description: Connect to Redpanda SQL on a BYOC cluster and run your first query on streaming data.
+:page-topic-type: guide
+
+Redpanda SQL is a PostgreSQL-compatible SQL engine built into Redpanda BYOC. It lets you query streaming data in your Redpanda topics with standard SQL, without building ETL pipelines or deploying a separate analytics system. In this quickstart, you connect with `psql` and run your first query against a Redpanda topic.
+
+This quickstart is written for an admin who can view the SQL connection details in the Cloud Console.
+
+== Prerequisites
+
+* A Redpanda BYOC cluster on AWS with Redpanda SQL enabled.
See xref:sql:get-started/deploy-sql-cluster.adoc[].
+* Admin access to the cluster in the Redpanda Cloud Console (required to view SQL connection details).
+* A Redpanda topic with a schema registered in Schema Registry. If you don't have one, follow the optional <<optional-produce-sample-data>> section below to create a sample `orders` topic.
+* xref:manage:rpk/rpk-install.adoc[`rpk` v26.1.6] or later installed on your local machine to generate an authentication token.
+* https://www.postgresql.org/download/[`psql`^] (PostgreSQL client) installed on your local machine.
+
+// TODO: Verify the exact connection string format and where users get credentials.
+// From PRD: SCRAM auth preserved, connection string available in Cloud Console and API response.
+// Confirm with engineering which SCRAM credentials the user uses - superuser auto-created by Control Plane?
+
+[#optional-produce-sample-data]
+== (Optional) Produce sample data
+
+[TIP]
+====
+Skip this section if you already have a Redpanda topic with a schema registered in Schema Registry that you want to query.
+====
+
+If you don't have a schema-registered topic to query yet, follow these steps to create an `orders` topic with a small set of sample records. Redpanda SQL reads the topic's schema from Schema Registry to map fields to SQL columns, so the topic must have a registered schema before you can query it.
+
+You also need permissions to create topics, register schemas, and produce records.
+
+. https://cloud.redpanda.com/[Log in to Redpanda Cloud^] and select your cluster.
+. On the *Topics* page, click *Create Topic*. Name the topic `orders` and create it with default settings.
+. On the *Schema Registry* page, click *Create new schema*.
+. 
Create a new schema with the following:
++
+* *Strategy*: Topic
+* *Topic name*: orders
+* *Schema applies to*: Value
+* *Schema definition*: Select Protobuf and paste the following schema definition:
++
+[,proto]
+----
+syntax = "proto3";
+
+message Order {
+  int64 order_id = 1;
+  string customer = 2;
+  string product = 3;
+  int64 amount = 4; // amount in cents
+  string status = 5; // "pending", "shipped", "completed"
+}
+----
+
+// TODO: Verify exact steps to produce records in UI
+. Return to the *Topics* page and select the `orders` topic. Produce a few sample records:
++
+[,json]
+----
+{"order_id": 1, "customer": "alice", "product": "keyboard", "amount": 7500, "status": "completed"}
+----
++
+[,json]
+----
+{"order_id": 2, "customer": "bob", "product": "monitor", "amount": 32000, "status": "shipped"}
+----
++
+[,json]
+----
+{"order_id": 3, "customer": "carol", "product": "mouse", "amount": 4500, "status": "pending"}
+----
++
+[,json]
+----
+{"order_id": 4, "customer": "alice", "product": "monitor", "amount": 32000, "status": "completed"}
+----
++
+[,json]
+----
+{"order_id": 5, "customer": "dave", "product": "keyboard", "amount": 7500, "status": "pending"}
+----
+
+When you define the SQL table later in this quickstart, use `orders` as the topic name.
+
+== Connect to Redpanda SQL
+
+SQL connection details are available on your cluster's *SQL* tab in the https://cloud.redpanda.com/[Cloud Console]. To connect using `psql`:
+
+. Get an authentication token using `rpk`. If prompted, log in to Redpanda Cloud in your browser:
++
+[,bash]
+----
+rpsql_token=$(rpk cloud auth token)
+----
+
+. Copy and run the `psql` connection string from the *SQL* tab:
++
+[,bash]
+----
+psql "host= port=5432 dbname=oxla user=ignored password=$rpsql_token options='-c auth_method=bearer' sslmode=require"
+----
+
+On a successful connection, you should see output similar to:
+
+// TODO: Verify current psql banner text.
+[.no-copy]
+----
+psql (17.8 (Homebrew), server 16.0 (oxla version: 1.0.0, build: af2dffb-Release-x86_64-GNU, asio))
+SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, compression: off, ALPN: none)
+Type "help" for help.
+
+=>
+----
+
+== Query a Redpanda topic
+
+When you enable Redpanda SQL, the engine automatically creates a Redpanda catalog named `default_redpanda_catalog` that connects to your cluster.
+
+To query a Redpanda topic as a SQL table, define a table against the topic with `CREATE TABLE`. The following example uses the `orders` topic from the optional sample data section. Replace `orders` with the name of your topic, and `orders-value` with the Schema Registry subject that holds the topic's value schema:
+
+[,sql]
+----
+CREATE TABLE default_redpanda_catalog=>orders WITH (
+  topic = 'orders',
+  schema_subject = 'orders-value'
+);
+----
+
+Redpanda SQL reads the registered Protobuf schema from Schema Registry and maps each top-level field to a SQL column.
+
+== Run queries
+
+After you create the table, query your topic data with standard SQL. The following examples use the `orders` schema from the optional sample data section. If you're using your own topic, substitute the table name and column names.
+
+View a sample of records:
+
+[,sql]
+----
+SELECT * FROM default_redpanda_catalog=>orders LIMIT 10;
+----
+
+Count orders by status:
+
+[,sql]
+----
+SELECT status, COUNT(*) AS total_orders
+FROM default_redpanda_catalog=>orders
+GROUP BY status;
+----
+
+Find the largest orders:
+
+[,sql]
+----
+SELECT order_id, customer, product, amount
+FROM default_redpanda_catalog=>orders
+WHERE amount > 10000
+ORDER BY amount DESC
+LIMIT 20;
+----
+
+== (Optional) Grant access to a non-admin user
+
+Redpanda SQL is deny-all by default. All queries to Redpanda SQL run under a single superuser SASL credential associated with `default_redpanda_catalog`, but the engine enforces per-user table access through Redpanda SQL `GRANT` statements.
To let a non-admin user query a table, the admin creates a role and grants access:
+
+// TODO: Confirm the exact GRANT mechanism with engineering. Open questions:
+// - Does access to the SQL engine also require an account in Redpanda Cloud (user
+//   or service account), with the new SQL roles?
+// - Does the SQL `GRANT` statement alone enforce per-user table access, or is
+//   an additional ACL / Kafka-side step required?
+// Update the steps and example below once confirmed.
+
+. Create a role for the user:
++
+[,sql]
+----
+CREATE ROLE analyst LOGIN PASSWORD '';
+----
+
+. Grant `SELECT` on the table to the role:
++
+[,sql]
+----
+GRANT SELECT ON TABLE default_redpanda_catalog=>orders TO analyst;
+----
+
+The non-admin user can now connect to Redpanda SQL with their credentials and run `SELECT` against the `orders` table.
+
+== Next steps
+
+* xref:reference:sql/index.adoc[Redpanda SQL reference]: Explore the full SQL syntax, data types, functions, and clauses.
+* xref:sql:connect-to-sql/language-clients/psycopg2.adoc[Connect with Python (psycopg2)]: Query Redpanda SQL programmatically.
+* xref:sql:connect-to-sql/language-clients/java-jdbc.adoc[Connect with Java (JDBC)]: Integrate with Java applications.