4 changes: 2 additions & 2 deletions CLAUDE.md
@@ -334,7 +334,7 @@ pub enum Geom {
// Statistical geoms
Histogram, Density, Smooth, Boxplot, Violin,
// Annotation geoms
Text, Label, Segment, Arrow, HLine, VLine, AbLine, ErrorBar,
Text, Label, Segment, Arrow, Rule, Linear, ErrorBar,
}

pub enum AestheticValue {
Expand Down Expand Up @@ -1202,7 +1202,7 @@ All clauses (MAPPING, SETTING, PARTITION BY, FILTER) are optional.

- **Basic**: `point`, `line`, `path`, `bar`, `col`, `area`, `tile`, `polygon`, `ribbon`
- **Statistical**: `histogram`, `density`, `smooth`, `boxplot`, `violin`
- **Annotation**: `text`, `label`, `segment`, `arrow`, `hline`, `vline`, `abline`, `errorbar`
- **Annotation**: `text`, `label`, `segment`, `arrow`, `rule`, `linear`, `errorbar`

**MAPPING Clause** (Aesthetic Mappings):

5 changes: 2 additions & 3 deletions doc/ggsql.xml
@@ -141,9 +141,8 @@
<item>label</item>
<item>segment</item>
<item>arrow</item>
<item>hline</item>
<item>vline</item>
<item>abline</item>
<item>rule</item>
<item>linear</item>
<item>errorbar</item>
</list>

20 changes: 12 additions & 8 deletions doc/syntax/index.qmd
@@ -15,17 +15,21 @@ ggsql augments the standard SQL syntax with a number of new clauses to describe
## Layers
There are many different layers to choose from when visualising your data. Some are straightforward translations of your data into visual marks, such as the point layer, while others perform more or less complicated calculations, such as the histogram layer. A layer is selected by providing the layer name after the `DRAW` clause.

- [`point`](layer/point.qmd) is used to create a scatterplot layer
- [`line`](layer/line.qmd) is used to produce lineplots with the data sorted along the x axis
- [`path`](layer/path.qmd) is like `line` above but does not sort the data but plot it according to its own order
- [`point`](layer/point.qmd) is used to create a scatterplot layer.
- [`line`](layer/line.qmd) is used to produce lineplots with the data sorted along the x axis.
- [`path`](layer/path.qmd) is like `line` above but does not sort the data, plotting it in its own order instead.
- [`segment`](layer/segment.qmd) connects two points with a line segment.
- [`linear`](layer/linear.qmd) draws a long line parameterised by a coefficient and intercept.
- [`rule`](layer/rule.qmd) draws horizontal and vertical reference lines.
- [`area`](layer/area.qmd) is used to display series as an area chart.
- [`ribbon`](layer/ribbon.qmd) is used to display series extrema.
- [`polygon`](layer/polygon.qmd) is used to display arbitrary shapes as polygons.
- [`bar`](layer/bar.qmd) creates a bar chart, optionally calculating y from the number of records in each bar
- [`density`](layer/density.qmd) creates univariate kernel density estimates, showing the distribution of a variable
- [`violin`](layer/violin.qmd) displays a rotated kernel density estimate
- [`histogram`](layer/histogram.qmd) bins the data along the x axis and produces a bar for each bin showing the number of records in it
- [`boxplot`](layer/boxplot.qmd) displays continuous variables as 5-number summaries
- [`bar`](layer/bar.qmd) creates a bar chart, optionally calculating y from the number of records in each bar.
- [`density`](layer/density.qmd) creates univariate kernel density estimates, showing the distribution of a variable.
- [`violin`](layer/violin.qmd) displays a rotated kernel density estimate.
- [`histogram`](layer/histogram.qmd) bins the data along the x axis and produces a bar for each bin showing the number of records in it.
- [`boxplot`](layer/boxplot.qmd) displays continuous variables as 5-number summaries.
- [`errorbar`](layer/errorbar.qmd) draws a line segment with hinges at the endpoints.

## Scales
A scale is responsible for translating a data value to an aesthetic literal, e.g. a specific color for the fill aesthetic, or a radius in points for the size aesthetic. A scale is a combination of a specific aesthetic and a scale type
71 changes: 71 additions & 0 deletions doc/syntax/layer/errorbar.qmd
@@ -0,0 +1,71 @@
---
title: "Errorbar"
---

> Layers are declared with the [`DRAW` clause](../clause/draw.qmd). Read the documentation for this clause for a thorough description of how to use it.

Errorbars are used to display paired metrics, typically the two ends of an interval, for a variable. An errorbar is drawn as a line between the two values, often with hinges at the ends.

## Aesthetics
The following aesthetics are recognised by the errorbar layer.

### Required
* `x` or `y`: Position on the x- or y-axis. These are mutually exclusive.
* `xmin` or `ymin`: Position of one of the interval ends orthogonal to the main position. These are also mutually exclusive.
* `xmax` or `ymax`: Position of the other interval end orthogonal to the main position. These are also mutually exclusive.

Note that the required aesthetics are either the set {`x`, `ymin`, `ymax`} or the set {`y`, `xmin`, `xmax`}, and *not* a combination of the two.

### Optional
* `stroke`/`colour`: The colour of the lines in the errorbar.
* `opacity`: The opacity of the colour.
* `linewidth`: The width of the lines in the errorbar.
* `linetype`: The dash pattern of the lines in the errorbar.

## Settings
* `width`: The width of the hinges in points. Can be set to `null` to not display hinges.

## Data transformation
The errorbar layer does not transform its data but passes it through unchanged.

## Examples

```{ggsql}
#| code-fold: true
#| code-summary: "Create example data"
CREATE TABLE penguin_summary AS
SELECT
species,
MEAN(bill_dep) - STDDEV(bill_dep) AS low,
MEAN(bill_dep) AS mean,
MEAN(bill_dep) + STDDEV(bill_dep) AS high
FROM ggsql:penguins
GROUP BY species
```

Classic errorbar with point at centre.

```{ggsql}
VISUALISE species AS x FROM penguin_summary
DRAW errorbar MAPPING low AS ymin, high AS ymax
DRAW point MAPPING mean AS y
```

Dynamite plot using bars instead of points, using extra wide hinges.

```{ggsql}
VISUALISE species AS x FROM penguin_summary
DRAW errorbar
MAPPING low AS ymin, high AS ymax
SETTING width => 40
DRAW bar MAPPING mean AS y
```

The hinges can be omitted by setting `null` as width.

```{ggsql}
VISUALISE species AS x FROM penguin_summary
DRAW errorbar
MAPPING low AS ymin, high AS ymax
SETTING width => null
```
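
The alternative aesthetic set {`y`, `xmin`, `xmax`} listed under the required aesthetics gives horizontal errorbars. The following is a minimal sketch of that orientation, assuming the `penguin_summary` table created earlier on this page; it has not been checked against a running ggsql build.

```{ggsql}
VISUALISE species AS y FROM penguin_summary
DRAW errorbar MAPPING low AS xmin, high AS xmax
DRAW point MAPPING mean AS x
```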
63 changes: 63 additions & 0 deletions doc/syntax/layer/linear.qmd
@@ -0,0 +1,63 @@
---
title: "Linear line"
---

> Layers are declared with the [`DRAW` clause](../clause/draw.qmd). Read the documentation for this clause for a thorough description of how to use it.

The linear layer is used to draw diagonal reference lines based on a coefficient and intercept. This is useful for adding regression lines, diagonal guides, or mathematical functions to plots. The lines extend across the full extent of the x-axis, regardless of the data range.
The layer is named for the following formula:

$$
y = a + \beta x
$$

Where $a$ is the `intercept` and $\beta$ is the `coef`.
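
As a worked instance, with values chosen purely for illustration, a `coef` of 0.4 and an `intercept` of -1 give a line that crosses the y-axis at $y = -1$ and, at $x = 45$, reaches:

$$
y = -1 + 0.4 \times 45 = 17
$$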

## Aesthetics
The following aesthetics are recognised by the linear layer.

### Required
* `coef`: The coefficient/slope of the line i.e. the amount $y$ increases for every unit of $x$.
* `intercept`: The intercept where the line crosses the y-axis at $x = 0$.

### Optional
* `colour`/`stroke`: The colour of the line.
* `opacity`: The opacity of the line.
* `linewidth`: The width of the line.
* `linetype`: The type of the line, i.e. the dashing pattern.

## Settings
The linear layer has no additional settings.

## Data transformation
The linear layer does not transform its data but passes it through unchanged.

## Examples

Add a simple reference line to a scatterplot:

```{ggsql}
VISUALISE FROM ggsql:penguins
DRAW point MAPPING bill_len AS x, bill_dep AS y
DRAW linear MAPPING 0.4 AS coef, -1 AS intercept
```

Add multiple reference lines with different colors from a separate dataset:

```{ggsql}
WITH lines AS (
SELECT * FROM (VALUES
(0.4, -1, 'Line A'),
(0.2, 8, 'Line B'),
(0.8, -19, 'Line C')
) AS t(coef, intercept, label)
)
VISUALISE FROM ggsql:penguins
DRAW point MAPPING bill_len AS x, bill_dep AS y
DRAW linear
MAPPING
coef AS coef,
intercept AS intercept,
label AS colour
FROM lines
```
69 changes: 69 additions & 0 deletions doc/syntax/layer/rule.qmd
@@ -0,0 +1,69 @@
---
title: "Rule"
---

> Layers are declared with the [`DRAW` clause](../clause/draw.qmd). Read the documentation for this clause for a thorough description of how to use it.

The rule layer is used to draw horizontal or vertical reference lines at specified values. This is useful for adding thresholds, means, medians, event markers, cutoff dates or other guides to the plot. The lines span the full width or height of the panels.

## Aesthetics
The following aesthetics are recognised by the rule layer.

### Required
* `x`\*: The x-coordinate for the vertical line.
* `y`\*: The y-coordinate for the horizontal line.

\* Exactly one of `x` or `y` is required, not both.

### Optional
* `colour`/`stroke`: The colour of the line.
* `opacity`: The opacity of the line.
* `linewidth`: The width of the line.
* `linetype`: The type of the line, i.e. the dashing pattern.

## Settings
The rule layer has no additional settings.

## Data transformation
The rule layer does not transform its data but passes it through unchanged.

## Examples

Add a horizontal threshold line to a time series plot:

```{ggsql}
SELECT Date AS date, temp AS temperature
FROM ggsql:airquality
WHERE Month = 5

VISUALISE
DRAW line MAPPING date AS x, temperature AS y
DRAW rule MAPPING 70 AS y
```

Add a vertical line to mark a specific value:

```{ggsql}
VISUALISE FROM ggsql:penguins
DRAW point MAPPING bill_len AS x, bill_dep AS y
DRAW rule MAPPING 45 AS x
```

Add multiple threshold lines with different colors:

```{ggsql}
WITH thresholds AS (
SELECT * FROM (VALUES
(70, 'Target'),
(80, 'Warning'),
(90, 'Critical')
) AS t(value, label)
)
SELECT Date AS date, temp AS temperature
FROM ggsql:airquality
WHERE Month = 5

VISUALISE
DRAW line MAPPING date AS x, temperature AS y
DRAW rule MAPPING value AS y, label AS colour FROM thresholds
```
91 changes: 91 additions & 0 deletions doc/syntax/layer/segment.qmd
@@ -0,0 +1,91 @@
---
title: "Segment"
---

> Layers are declared with the [`DRAW` clause](../clause/draw.qmd). Read the documentation for this clause for a thorough description of how to use it.

The segment layer is used to create line segments between two endpoints. It differs from [lines](line.qmd) and [paths](path.qmd) in that it connects just two points rather than many. Data is expected to be in a different shape, with 4 coordinates for the start (x, y) and end (xend, yend) points on a single row.

## Aesthetics
The following aesthetics are recognised by the segment layer.

### Required
* `x`: Position along the x-axis of the start point.
* `y`: Position along the y-axis of the start point.
* `xend`\*: Position along the x-axis of the end point.
* `yend`\*: Position along the y-axis of the end point.

\* At least one of `xend` and `yend` is required.
If one is missing, it takes on the value of the corresponding start coordinate.

### Optional
* `colour`/`stroke`: The colour of the line.
* `opacity`: The opacity of the line.
* `linewidth`: The width of the line.
* `linetype`: The type of the line, i.e. the dashing pattern.

## Settings
The segment layer has no additional settings.

## Data transformation
The segment layer does not transform its data but passes it through unchanged.

## Examples

Segments are useful when the start and end points are known in the data, for example the edges of a graph:

```{ggsql}
WITH edges AS (
SELECT * FROM (VALUES
(0, 0, 1, 1, 'A'),
(1, 1, 2, 1, 'A'),
(2, 1, 3, 0, 'A'),
(0, 3, 1, 2, 'B'),
(1, 2, 2, 2, 'B'),
(2, 2, 3, 3, 'B'),
(1, 1, 1, 2, 'C'),
(2, 2, 2, 1, 'C')
) AS t(x, y, xend, yend, type)
)
VISUALISE x, y, xend, yend FROM edges
DRAW segment MAPPING type AS stroke
```

You can use segments as part of a lollipop chart by anchoring one of the ends to 0.
Note that `xend` is missing and has taken up the value of `x`.

```{ggsql}
SELECT ROUND(bill_dep) AS bill_dep, COUNT(*) AS n
FROM ggsql:penguins
GROUP BY ROUND(bill_dep)

VISUALISE bill_dep AS x, n AS y
DRAW segment MAPPING 0 AS yend
DRAW point
```

By overlaying a thick line on a thin line, you can create a candlestick chart.

```{ggsql}
SELECT
FIRST(Date) AS date,
FIRST(temp) AS open,
LAST(temp) AS close,
MAX(temp) AS high,
MIN(temp) AS low,
CASE
WHEN FIRST(temp) > LAST(temp) THEN 'colder'
ELSE 'warmer'
END AS trend
FROM ggsql:airquality
GROUP BY WEEKOFYEAR(Date)

VISUALISE date AS x, trend AS colour
DRAW segment
MAPPING open AS y, close AS yend
SETTING linewidth => 5
DRAW segment
MAPPING low AS y, high AS yend
```
2 changes: 1 addition & 1 deletion ggsql-vscode/syntaxes/ggsql.tmLanguage.json
@@ -294,7 +294,7 @@
"patterns": [
{
"name": "support.type.geom.ggsql",
"match": "\\b(point|line|path|bar|col|area|tile|polygon|ribbon|histogram|density|smooth|boxplot|violin|text|label|segment|arrow|hline|vline|abline|errorbar)\\b"
"match": "\\b(point|line|path|bar|col|area|tile|polygon|ribbon|histogram|density|smooth|boxplot|violin|text|label|segment|arrow|rule|linear|errorbar)\\b"
},
{ "include": "#common-clause-patterns" }
]
5 changes: 2 additions & 3 deletions src/parser/builder.rs
@@ -595,9 +595,8 @@ fn parse_geom_type(text: &str) -> Result<Geom> {
"label" => Ok(Geom::label()),
"segment" => Ok(Geom::segment()),
"arrow" => Ok(Geom::arrow()),
"hline" => Ok(Geom::hline()),
"vline" => Ok(Geom::vline()),
"abline" => Ok(Geom::abline()),
"rule" => Ok(Geom::rule()),
"linear" => Ok(Geom::linear()),
"errorbar" => Ok(Geom::errorbar()),
_ => Err(GgsqlError::ParseError(format!(
"Unknown geom type: {}",