Merged
256 changes: 151 additions & 105 deletions README.md
@@ -1,51 +1,37 @@
# olap-sql

[![Go](https://github.com/AWaterColorPen/olap-sql/actions/workflows/go.yml/badge.svg)](https://github.com/AWaterColorPen/olap-sql/actions/workflows/go.yml)
[![Go Reference](https://pkg.go.dev/badge/github.com/awatercolorpen/olap-sql.svg)](https://pkg.go.dev/github.com/awatercolorpen/olap-sql)

## Introduction

olap-sql is a Go library that generates **adapted SQL** from an **OLAP query** made up of metrics, dimensions, and filters.
It then returns a **formatted SQL result** for the queried metrics and dimensions.
**olap-sql** is a Go library that turns high-level OLAP query definitions into adapted SQL for multiple database backends (ClickHouse, MySQL, PostgreSQL, SQLite). You describe *what* you want — metrics, dimensions, filters — and olap-sql figures out *how* to query it.

### Example
### How it works

Suppose there is unprocessed OLAP data in a table named `wikistat`.

| date | time | hits |
|------------|---------------------|------|
| 2021-05-07 | 2021-05-07 09:28:27 | 4783 |
| 2021-05-07 | 2021-05-07 09:33:59 | 1842 |
| 2021-05-07 | 2021-05-07 10:34:12 | 0 |
| 2021-05-06 | 2021-05-06 20:32:41 | 5 |
| 2021-05-06 | 2021-05-06 21:16:39 | 139 |

To query the data with `metrics: sum(hits) / count(*)` and `dimension: date`, the SQL is:

```sql
SELECT wikistat.date AS date, ( ( 1.0 * SUM(wikistat.hits) ) / NULLIF(( COUNT(*) ), 0) ) AS hits_avg FROM wikistat AS wikistat GROUP BY wikistat.date
```

To query the data with `metrics: sum(hits)` and `filter: date <= '2021-05-06'`, the SQL is:

```sql
SELECT SUM(wikistat.hits) AS hits FROM wikistat AS wikistat WHERE wikistat.date <= '2021-05-06'
Query (metrics + dimensions + filters)
Dictionary (schema/config)
Clause (backend-specific IR)
SQL string ──► Database ──► Result
```

## Documentation
---

## Quick Start

1. [Configuration](./docs/configuration.md) to configure olap-sql instance and OLAP dictionary.
2. [Query](./docs/query.md) to define olap query.
3. [Result](./docs/result.md) to parse olap result.
### 1. Install

## Getting Started
```bash
go get github.com/awatercolorpen/olap-sql
```

### Define the OLAP dictionary configuration file
### 2. Define the schema (TOML)

Create a new file, for example `olap-sql.toml`, to define
[sets](./docs/configuration.md#sets),
[sources](./docs/configuration.md#sources),
[metrics](./docs/configuration.md#metrics),
[dimensions](./docs/configuration.md#dimensions).
Create `olap-sql.toml` describing your data model:

```toml
sets = [
@@ -57,8 +43,8 @@ sources = [
]

metrics = [
{data_source = "wikistat", type = "METRIC_SUM", name = "hits", field_name = "hits", value_type = "VALUE_INTEGER"},
{data_source = "wikistat", type = "METRIC_COUNT", name = "count", field_name = "*", value_type = "VALUE_INTEGER"},
{data_source = "wikistat", type = "METRIC_SUM", name = "hits", field_name = "hits", value_type = "VALUE_INTEGER"},
{data_source = "wikistat", type = "METRIC_COUNT", name = "count", field_name = "*", value_type = "VALUE_INTEGER"},
{data_source = "wikistat", type = "METRIC_DIVIDE", name = "hits_avg", value_type = "VALUE_FLOAT", dependency = ["wikistat.hits", "wikistat.count"]},
]

@@ -67,99 +53,159 @@ dimensions = [
]
```
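The `dependency` list on `hits_avg` is what lets a derived metric reuse `hits` and `count`. As a rough illustration only, the sketch below resolves such a dictionary into a SELECT list. The `metric` struct, `sampleDict`, `expr`, and `selectList` are hypothetical names made up for this example; they are not olap-sql's types or its actual resolution logic.

```go
package main

import (
	"fmt"
	"strings"
)

// metric is a hypothetical, simplified stand-in for one entry of the
// `metrics` array in olap-sql.toml. It is not the library's own type.
type metric struct {
	Source     string   // data_source
	Type       string   // METRIC_SUM, METRIC_COUNT, or METRIC_DIVIDE
	Field      string   // field_name
	Dependency []string // "<data_source>.<name>" keys, used by METRIC_DIVIDE
}

// sampleDict mirrors the three metrics declared in the TOML above,
// keyed by "<data_source>.<name>".
func sampleDict() map[string]metric {
	return map[string]metric{
		"wikistat.hits":     {Source: "wikistat", Type: "METRIC_SUM", Field: "hits"},
		"wikistat.count":    {Source: "wikistat", Type: "METRIC_COUNT", Field: "*"},
		"wikistat.hits_avg": {Source: "wikistat", Type: "METRIC_DIVIDE", Dependency: []string{"wikistat.hits", "wikistat.count"}},
	}
}

// expr renders the SQL expression for the metric stored under key,
// resolving METRIC_DIVIDE dependencies recursively.
// The sketch does not detect circular dependencies.
func expr(dict map[string]metric, key string) (string, error) {
	m, ok := dict[key]
	if !ok {
		return "", fmt.Errorf("unknown metric %q", key)
	}
	switch m.Type {
	case "METRIC_SUM":
		return fmt.Sprintf("SUM(%s.%s)", m.Source, m.Field), nil
	case "METRIC_COUNT":
		if m.Field == "*" {
			return "COUNT(*)", nil
		}
		return fmt.Sprintf("COUNT(%s.%s)", m.Source, m.Field), nil
	case "METRIC_DIVIDE":
		if len(m.Dependency) != 2 {
			return "", fmt.Errorf("metric %q needs exactly two dependencies", key)
		}
		num, err := expr(dict, m.Dependency[0])
		if err != nil {
			return "", err
		}
		den, err := expr(dict, m.Dependency[1])
		if err != nil {
			return "", err
		}
		return fmt.Sprintf("(1.0 * %s) / NULLIF(%s, 0)", num, den), nil
	}
	return "", fmt.Errorf("unsupported metric type %q", m.Type)
}

// selectList builds "expr AS name" columns for the requested metric names.
func selectList(dict map[string]metric, source string, names []string) (string, error) {
	cols := make([]string, 0, len(names))
	for _, n := range names {
		e, err := expr(dict, source+"."+n)
		if err != nil {
			return "", err
		}
		cols = append(cols, e+" AS "+n)
	}
	return strings.Join(cols, ", "), nil
}

func main() {
	cols, err := selectList(sampleDict(), "wikistat", []string{"hits", "hits_avg"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("SELECT " + cols + " FROM wikistat AS wikistat")
}
```

Keying the dictionary by `<data_source>.<name>` matches the form used in the `dependency` entries, so a derived metric can be looked up with the same string it is declared with.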

### To make use of olap-sql in golang
### 3. Create a Manager

```go
package main

import (
"encoding/json"
"fmt"
"log"

olapsql "github.com/awatercolorpen/olap-sql"
"github.com/awatercolorpen/olap-sql/api/types"
)

func main() {
cfg := &olapsql.Configuration{
// Map each DB type to a connection option.
ClientsOption: olapsql.ClientsOption{
"clickhouse": {
DSN: "clickhouse://localhost:9000/default",
Type: types.DBTypeClickHouse,
},
},
// Point to your TOML schema file.
DictionaryOption: &olapsql.Option{
AdapterOption: olapsql.AdapterOption{Dsn: "olap-sql.toml"},
},
}

Create a new [manager instance](./docs/configuration.md#manager-configuration).
manager, err := olapsql.NewManager(cfg)
if err != nil {
log.Fatal(err)
}

```golang
import "github.com/awatercolorpen/olap-sql"
// --- Build the query ---
queryJSON := `{
"data_set_name": "wikistat",
"time_interval": {"name": "date", "start": "2021-05-06", "end": "2021-05-08"},
"metrics": ["hits", "hits_avg"],
"dimensions": ["date"]
}`

query := &types.Query{}
if err := json.Unmarshal([]byte(queryJSON), query); err != nil {
log.Fatal(err)
}

// set clients option
clientsOption := map[string]*olapsql.DBOption{
"clickhouse": &olapsql.DBOption{
DSN: "clickhouse://localhost:9000/default",
Type: "clickhouse",
},
}
// --- (Optional) Inspect the generated SQL ---
sql, err := manager.BuildSQL(query)
if err != nil {
log.Fatal(err)
}
fmt.Println("Generated SQL:", sql)

// set dictionary option
dictionaryOption := olapsql.AdapterOption{
Dsn: "olap_sql.toml",
}
// --- Run the query ---
result, err := manager.RunSync(query)
if err != nil {
log.Fatal(err)
}

// build manager configuration
configuration := &olapsql.Configuration{
ClientsOption: clientsOption,
DictionaryOption: dictionaryOption,
out, _ := json.MarshalIndent(result, "", " ")
fmt.Println(string(out))
}

// create a new manager instance
manager, err := olapsql.NewManager(configuration)
```

Build olap-sql [query](./docs/query.md).
**Generated SQL** (ClickHouse):

```sql
SELECT
wikistat.date AS date,
SUM(wikistat.hits) AS hits,
(1.0 * SUM(wikistat.hits)) / NULLIF(COUNT(*), 0) AS hits_avg
FROM wikistat AS wikistat
WHERE wikistat.date >= '2021-05-06'
AND wikistat.date < '2021-05-08'
GROUP BY wikistat.date
```

```golang
import "github.com/awatercolorpen/olap-sql/api/types"
**Result JSON**:

queryJson := `
```json
{
"data_set_name": "wikistat",
"time_interval": {
"name": "date",
"start": "2021-05-06",
"end": "2021-05-08"
},
"metrics": [
"hits",
"hits_avg"
],
"dimensions": [
"date"
"dimensions": ["date", "hits", "hits_avg"],
"source": [
{"date": "2021-05-06T00:00:00Z", "hits": 147, "hits_avg": 49},
{"date": "2021-05-07T00:00:00Z", "hits": 7178, "hits_avg": 897.25}
]
}`
}
```
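The `hits_avg` column in the generated SQL is computed as `(1.0 * SUM(...)) / NULLIF(COUNT(*), 0)`: `NULLIF` turns a zero count into `NULL` so the division yields `NULL` instead of an error, and `1.0 *` forces floating-point division on backends where integer division truncates. The Go sketch below mirrors that behavior. `safeAvg` is a hypothetical helper, and the row counts 3 and 8 are not stated in the source; they are inferred from the result values (147 / 49 and 7178 / 897.25).

```go
package main

import "fmt"

// safeAvg mirrors (1.0 * SUM(hits)) / NULLIF(COUNT(*), 0):
// it returns nil (standing in for SQL NULL) when count is zero
// instead of dividing by zero.
func safeAvg(sum, count int64) *float64 {
	if count == 0 {
		return nil
	}
	avg := float64(sum) / float64(count)
	return &avg
}

func main() {
	// Row counts are assumptions inferred from the result JSON.
	rows := []struct{ sum, count int64 }{{147, 3}, {7178, 8}, {0, 0}}
	for _, row := range rows {
		if avg := safeAvg(row.sum, row.count); avg != nil {
			fmt.Printf("sum=%d count=%d avg=%v\n", row.sum, row.count, *avg)
		} else {
			fmt.Printf("sum=%d count=%d avg=NULL\n", row.sum, row.count)
		}
	}
}
```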

---

query := &types.Query{}
err := json.Unmarshal([]byte(queryJson), query)
## Common Patterns

### Add filters

```go
query := &types.Query{
DataSetName: "wikistat",
Metrics: []string{"hits"},
Filters: []*types.Filter{
{
OperatorType: types.FilterOperatorTypeLessEquals,
Name: "date",
Value: []any{"2021-05-06"},
},
},
}
```

Generated SQL:

```sql
SELECT SUM(wikistat.hits) AS hits
FROM wikistat AS wikistat
WHERE wikistat.date <= '2021-05-06'
```

Run the query to get the result from the manager.
### Stream large result sets

For large queries, use `RunChan` to receive rows one at a time instead of buffering everything in memory:

```golang
// run query with parallel chan
```go
result, err := manager.RunChan(query)
```

### Inspect SQL without executing

Use `BuildSQL` to preview the generated query (useful for debugging):

// run query with sync
result, err := manager.RunSync(query)
```go
sql, err := manager.BuildSQL(query)
fmt.Println(sql)
```

### Generate SQL then format result
---

First, olap-sql generates the SQL automatically. [See details](./docs/query.md#generate-sql-from-query).
## Documentation

Then it returns the [result](./docs/result.md) as JSON with a `dimensions` property and a `source` property.
| Document | Description |
|----------|-------------|
| [Configuration](./docs/configuration.md) | Configure Manager, clients, and the OLAP dictionary |
| [Query](./docs/query.md) | Define metrics, dimensions, filters, orders, and limits |
| [Result](./docs/result.md) | Parse and work with query results |

```json
{
"dimensions": [
"date",
"hits",
"hits_avg"
],
"source": [
{
"date": "2021-05-06T00:00:00Z",
"hits": 147,
"hits_avg": 49
},
{
"date": "2021-05-07T00:00:00Z",
"hits": 7178,
"hits_avg": 897.25
}
]
}
```
---

## Requirements

- **Go 1.22+** (uses range-over-integer syntax)
- Supported databases: ClickHouse, MySQL, PostgreSQL, SQLite

---

## License

22 changes: 22 additions & 0 deletions client.go
@@ -8,15 +8,24 @@ import (
"gorm.io/gorm/logger"
)

// ClientsOption is a map from connection-key to DBOption.
// The key is used to look up the correct database connection when running a query.
// Typical keys follow the pattern "<dbtype>" (e.g. "clickhouse") or
// "<dbtype>/<dataset>" for dataset-scoped connections.
type ClientsOption = map[string]*DBOption

// Clients is a registry of open *gorm.DB connections, keyed by "<dbtype>" or "<dbtype>/<dataset>".
type Clients map[string]*gorm.DB

// RegisterByKV registers a *gorm.DB connection under the composite key derived from dbType and dataset.
func (c Clients) RegisterByKV(dbType types.DBType, dataset string, db *gorm.DB) {
key := c.key(dbType, dataset)
c[key] = db
}

// RegisterByOption opens database connections for each entry in option
// and registers them in the Clients map.
// Returns an error if any connection cannot be established.
func (c Clients) RegisterByOption(option ClientsOption) error {
for k, v := range option {
db, err := v.NewDB()
@@ -28,12 +37,18 @@ func (c Clients) RegisterByOption(option ClientsOption) error {
return nil
}

// SetLogger replaces the GORM logger on every registered connection.
// Call this to enable SQL statement logging or to plug in a custom logger.
func (c Clients) SetLogger(log logger.Interface) {
for _, v := range c {
v.Config.Logger = log
}
}

// Get returns the *gorm.DB for the given dbType and dataset.
// If no dataset-specific connection is registered, it falls back to the
// type-level connection (dataset == "").
// Returns an error if neither key is found.
func (c Clients) Get(dbType types.DBType, dataset string) (*gorm.DB, error) {
key1 := c.key(dbType, dataset)
if v, ok := c[key1]; ok {
@@ -46,13 +61,16 @@ func (c Clients) Get(dbType types.DBType, dataset string) (*gorm.DB, error) {
return nil, fmt.Errorf("not found client %v %v", dbType, dataset)
}

// key builds the internal lookup key for a (dbType, dataset) pair.
func (c Clients) key(dbType types.DBType, dataset string) string {
if dataset == "" {
return fmt.Sprintf("%v", dbType)
}
return fmt.Sprintf("%v/%v", dbType, dataset)
}

// BuildDB selects the correct client for the clause and constructs
// a *gorm.DB with the translated query applied.
func (c Clients) BuildDB(clause types.Clause) (*gorm.DB, error) {
client, err := c.Get(clause.GetDBType(), clause.GetDataset())
if err != nil {
@@ -61,6 +79,8 @@ func (c Clients) BuildDB(clause types.Clause) (*gorm.DB, error) {
return clause.BuildDB(client)
}

// BuildSQL selects the correct client for the clause and returns
// the SQL string that would be executed, without actually running it.
func (c Clients) BuildSQL(clause types.Clause) (string, error) {
client, err := c.Get(clause.GetDBType(), clause.GetDataset())
if err != nil {
@@ -69,6 +89,8 @@ func (c Clients) BuildSQL(clause types.Clause) (string, error) {
return clause.BuildSQL(client)
}

// NewClients creates a Clients registry by opening connections for each DBOption in option.
// Returns an error if any connection fails to open.
func NewClients(option ClientsOption) (Clients, error) {
c := Clients{}
if err := c.RegisterByOption(option); err != nil {
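
The lookup described in the `Get` and `key` comments in this diff (dataset-scoped key first, then the type-level key, then an error) can be sketched without a database. This is a simplified stand-in, not the code from client.go: `registry` maps keys to plain strings instead of `*gorm.DB`, the DB type is a plain string, and the fallback branch is written from the doc comment because the diff elides those lines.

```go
package main

import "fmt"

// registry is a simplified stand-in for Clients: it maps lookup keys to a
// connection label instead of a *gorm.DB so the sketch runs on its own.
type registry map[string]string

// key builds "<dbtype>" when dataset is empty, else "<dbtype>/<dataset>".
func key(dbType, dataset string) string {
	if dataset == "" {
		return dbType
	}
	return dbType + "/" + dataset
}

// get tries the dataset-scoped key first, then falls back to the
// type-level key, and returns an error if neither is registered.
func (r registry) get(dbType, dataset string) (string, error) {
	if v, ok := r[key(dbType, dataset)]; ok {
		return v, nil
	}
	if v, ok := r[key(dbType, "")]; ok {
		return v, nil
	}
	return "", fmt.Errorf("not found client %v %v", dbType, dataset)
}

func main() {
	r := registry{
		"clickhouse":          "shared clickhouse connection",
		"clickhouse/wikistat": "wikistat-specific connection",
	}
	for _, dataset := range []string{"wikistat", "other"} {
		conn, err := r.get("clickhouse", dataset)
		fmt.Println(dataset, "->", conn, err)
	}
	_, err := r.get("mysql", "wikistat")
	fmt.Println(err)
}
```

With this scheme a single `"clickhouse"` entry serves every dataset, and a `"clickhouse/<dataset>"` entry overrides it for one dataset only.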