diff --git a/modules/reference/pages/sql/sql-statements/create-table.adoc b/modules/reference/pages/sql/sql-statements/create-table.adoc
index 027ea6681..291b2efa4 100644
--- a/modules/reference/pages/sql/sql-statements/create-table.adoc
+++ b/modules/reference/pages/sql/sql-statements/create-table.adoc
@@ -51,17 +51,25 @@ a|How to handle records that fail deserialization.
 |`struct_mapping_policy`
 |STRING
 |No
-a|How to map nested structures to SQL columns.
+a|How to map nested structures from the topic schema to SQL columns.
 
-* `JSON` (default): Stores nested data as JSON.
-* `FLATTEN`: Expands nested fields into top-level columns.
-* `COMPOUND`: Maps to ROW types.
-* `VARIANT`: Stores as a variant type.
+* `COMPOUND` (default): Maps each nested structure to a SQL xref:reference:sql/sql-data-types/row.adoc[ROW] value with named fields, queryable using `(column).field_name` syntax. Cyclic types are not supported in `COMPOUND` mode; use `JSON` for recursive schemas.
+* `JSON`: Stores each nested structure as a JSON value. Required for recursive (cyclic) types.
 
 |`output_schema_message_full_name`
 |STRING
 |No
 |Full Protobuf message name. Required when the schema contains multiple message definitions.
+
+|`confluent_wire_protocol`
+|STRING
+|No
+a|Whether records on the topic are encoded with the https://docs.confluent.io/platform/current/schema-registry/fundamentals/serdes-develop/index.html#wire-format[Confluent Schema Registry wire format^] (a magic byte followed by a 4-byte schema ID before the payload).
+
+* `'true'` (default): Records carry the Confluent wire-format prefix. Use this for topics whose values were produced by a Schema-Registry-aware client.
+* `'false'`: Records are raw Protobuf or Avro without the wire-format prefix.
+
+Only valid when `schema_lookup_policy = 'LATEST'`.
 |===
 
 == Examples
diff --git a/modules/sql/pages/query-data/index.adoc b/modules/sql/pages/query-data/index.adoc
new file mode 100644
index 000000000..c9e39d9eb
--- /dev/null
+++ b/modules/sql/pages/query-data/index.adoc
@@ -0,0 +1,3 @@
+= Query data
+:description: Query live and historical data in your Redpanda topics using standard PostgreSQL syntax.
+:page-layout: index
diff --git a/modules/sql/pages/query-data/query-streaming-topics.adoc b/modules/sql/pages/query-data/query-streaming-topics.adoc
new file mode 100644
index 000000000..92df65119
--- /dev/null
+++ b/modules/sql/pages/query-data/query-streaming-topics.adoc
@@ -0,0 +1,74 @@
+= Query streaming topics
+:description: Map a Redpanda topic to a SQL table and run analytical queries directly against live streaming data.
+:page-topic-type: how-to
+:personas: app_developer, data_engineer
+:learning-objective-1: Map a streaming Redpanda topic to a SQL table using the default Redpanda catalog
+:learning-objective-2: Run analytical SQL queries against live topic data
+
+Map a Redpanda topic to a SQL table to run analytical queries directly against live streaming data without building ETL pipelines. Redpanda SQL reads each record's fields from the topic's schema in Schema Registry.
+
+To query the Iceberg-translated history of a Redpanda topic, see xref:sql:query-data/query-iceberg-topics.adoc[].
+
+After completing these steps, you will be able to:
+
+* [ ] {learning-objective-1}
+* [ ] {learning-objective-2}
+
+== Prerequisites
+
+Before you query a topic with SQL:
+
+* Enable the Redpanda SQL engine on your Redpanda Bring Your Own Cloud (BYOC) cluster. See xref:sql:get-started/deploy-sql-cluster.adoc[Enable Redpanda SQL].
+* Connect to Redpanda SQL with `psql` or another PostgreSQL client. See xref:sql:connect-to-sql/index.adoc[Connect to Redpanda SQL].
+* Confirm that the Redpanda topic you want to query has a schema registered in Schema Registry.
+
+// TODO: Confirm permissions/roles/ACLs required
+// Is it possible to use a topic without a registered schema?
+// Any specific limitations on Protobuf vs JSON vs Avro formats?
+// Any requirements related to wire format and subject naming strategy?
+
+== Map the topic to a SQL table
+
+Each Redpanda topic appears as a SQL table inside a Redpanda catalog. When you enable the SQL engine, Redpanda SQL automatically creates a catalog named `default_redpanda_catalog` that points at your cluster.
+
+Define a table against the topic with `CREATE TABLE`:
+
+[source,sql]
+----
+CREATE TABLE default_redpanda_catalog=>orders WITH (
+  topic = 'orders',
+  schema_subject = 'orders-value'
+);
+----
+
+Replace `orders` with your topic name and `orders-value` with the Schema Registry subject that holds the topic's value schema.
+
+// TODO: Nested fields?
+The table inherits its column definitions from the registered schema. For Protobuf schemas, Redpanda SQL maps each top-level field to a SQL column.
+
+== Run queries
+
+Query the table with standard `SELECT` syntax. The following query returns the first 10 records:
+
+[source,sql]
+----
+SELECT * FROM default_redpanda_catalog=>orders LIMIT 10;
+----
+
+Aggregate and filter records using familiar PostgreSQL constructs:
+
+[source,sql]
+----
+SELECT customer_id, SUM(amount) AS total
+FROM default_redpanda_catalog=>orders
+WHERE status = 'completed'
+GROUP BY customer_id
+ORDER BY total DESC
+LIMIT 10;
+----
+
+== Next steps
+
+* xref:sql:query-data/query-iceberg-topics.adoc[Query Iceberg topics]: query the Iceberg-translated history of an Iceberg-enabled Redpanda topic, and run a single query that spans live and historical records.
+* xref:reference:sql/sql-statements/create-table.adoc[CREATE TABLE]: full reference for the table-against-topic syntax, including all options.
+* xref:reference:sql/index.adoc[Redpanda SQL Reference]: supported SQL statements, clauses, data types, and functions.
diff --git a/modules/sql/pages/query-data/redpanda-catalogs.adoc b/modules/sql/pages/query-data/redpanda-catalogs.adoc
index bffbd2479..c54df0b26 100644
--- a/modules/sql/pages/query-data/redpanda-catalogs.adoc
+++ b/modules/sql/pages/query-data/redpanda-catalogs.adoc
@@ -1,81 +1,2 @@
 = Redpanda Catalogs
-:description: Redpanda catalogs are named connections that map Redpanda topics to queryable SQL tables.
-:page-topic-type: reference
-
-Redpanda catalogs are named connections that let you query Redpanda topics using standard SQL. The catalog model consists of three core concepts:
-
-* Catalogs: Named connections to a Redpanda cluster, created with xref:reference:sql/sql-statements/create-redpanda-catalog.adoc[CREATE REDPANDA CATALOG].
-* Tables: Redpanda topics mapped as queryable SQL tables using the `catalog_name\=>table_name` syntax, created with xref:reference:sql/sql-statements/create-table.adoc[CREATE TABLE].
-* Storage connections: Named connections to external object storage such as Amazon S3, created with xref:reference:sql/sql-statements/create-storage.adoc[CREATE STORAGE].
-
-NOTE: Redpanda SQL operates in read-only mode. Data mutation operations such as `INSERT`, `UPDATE`, and `DELETE` are not available. Data is ingested into Redpanda topics and made queryable through catalog mappings.
-
-== Typical workflow
-
-To query Redpanda topic data with SQL:
-
-. Create a catalog connection:
-+
-[source,sql]
-----
-CREATE REDPANDA CATALOG production_redpanda
-WITH (
-  initial_brokers = 'broker1:9092',
-  schema_registry_url = 'http://schema-registry:8081'
-);
-----
-
-. Map a topic as a table:
-+
-[source,sql]
-----
-CREATE TABLE production_redpanda=>user_events
-WITH (topic = 'user-events');
-----
-
-. Query the data:
-+
-[source,sql]
-----
-SELECT * FROM production_redpanda=>user_events LIMIT 10;
-----
-
-== Related statements
-
-[cols="<40%,<60%",options="header"]
-|===
-|Statement |Description
-
-|xref:reference:sql/sql-statements/create-redpanda-catalog.adoc[CREATE REDPANDA CATALOG]
-|Create a catalog connection to a Redpanda cluster.
-
-|xref:reference:sql/sql-statements/alter-redpanda-catalog.adoc[ALTER REDPANDA CATALOG]
-|Modify connection properties of an existing catalog.
-
-|xref:reference:sql/sql-statements/create-table.adoc[CREATE TABLE]
-|Map a Redpanda topic to a SQL table through a catalog.
-
-|xref:reference:sql/sql-statements/alter-table.adoc[ALTER TABLE]
-|Modify options of an existing catalog table.
-
-|xref:reference:sql/sql-statements/drop-table.adoc[DROP TABLE]
-|Remove a catalog table mapping.
-
-|xref:reference:sql/sql-statements/drop-redpanda-catalog.adoc[DROP REDPANDA CATALOG]
-|Remove a Redpanda catalog connection.
-
-|xref:reference:sql/sql-statements/drop-storage.adoc[DROP STORAGE]
-|Remove a named storage definition.
-
-|xref:reference:sql/sql-statements/show-tables.adoc[SHOW TABLES]
-|List tables within a catalog.
-
-|xref:reference:sql/sql-statements/describe.adoc[DESCRIBE]
-|Show details about a catalog or catalog table.
-
-|xref:reference:sql/sql-statements/create-storage.adoc[CREATE STORAGE]
-|Create a connection to external object storage.
-
-|xref:reference:sql/sql-statements/alter-storage.adoc[ALTER STORAGE]
-|Modify an existing storage connection.
-|===
+// stub
\ No newline at end of file