Skip to content

Add aggregate functionality for base layers#384

Open
thomasp85 wants to merge 36 commits intomainfrom
issue160-aggregate
Open

Add aggregate functionality for base layers#384
thomasp85 wants to merge 36 commits intomainfrom
issue160-aggregate

Conversation

@thomasp85
Copy link
Copy Markdown
Collaborator

@thomasp85 thomasp85 commented Apr 27, 2026

Fix #160

  • basic infrastructure
  • ensure aggregation happens with numeric pos1
  • special case for range layers
  • Consider segment
  • update documentation
  • What about non-positional aggregation

Comment thread src/plot/layer/geom/stat_aggregate.rs Outdated
Comment thread src/plot/layer/geom/stat_aggregate.rs
@thomasp85 thomasp85 marked this pull request as ready for review May 4, 2026 19:00
Copy link
Copy Markdown
Collaborator

@teunbrand teunbrand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm excited for this! Don't let the nitpicky comments convince you otherwise

Comment thread doc/syntax/clause/draw.qmd Outdated
Comment thread doc/syntax/clause/draw.qmd Outdated
Comment thread doc/syntax/clause/draw.qmd Outdated
Comment thread doc/syntax/clause/draw.qmd
Comment thread doc/syntax/layer/type/area.qmd Outdated
Comment thread src/plot/layer/geom/stat_aggregate.rs Outdated
Comment thread src/plot/layer/geom/stat_aggregate.rs Outdated
Comment thread src/plot/layer/geom/stat_aggregate.rs Outdated
Comment thread src/plot/layer/geom/stat_aggregate.rs
Comment thread src/plot/layer/geom/stat_aggregate.rs
Comment thread src/plot/layer/geom/types.rs Outdated
Comment thread doc/syntax/clause/draw.qmd Outdated
Comment thread doc/syntax/clause/draw.qmd Outdated
thomasp85 and others added 6 commits May 6, 2026 13:36
Copy link
Copy Markdown
Collaborator

@teunbrand teunbrand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Thomas this looks to be in good shape to me now. There are a few things that we could pickup in a different PRs and one thing I'd like to have, but don't need to have.

Maybe for another PR:

  • Since we're wiring now aesthetic_ctx to every stat, we should see if we can use it to throw better error messages from the stats.
  • It might be interesting to investigate wether we could generalise the pattern of stat chaining you introduced here to arbitrary stat combinations in some way.

Like to have, not a blocker:

  • I'd really like the aggregation parameter to follow the paths we've carved out for all the other parameters as well. The less custom logic the better. I understand it has more complex validation requirements, so what I'd propose is the following:
    • It gets included in the default_params of the affected layers
    • The ParamDefinition gets the string_or_string_array contraint, which is purely for type-checking the parameter.
    • The extra validation can happen in the setup_layer() method(s). Ideally it is parsed right there but I don't know how easy it would be to transfer the parsed result to the stat. (The setup_layer() may also need to be called in the validate.rs path, but that was an oversight of mine).

Most of the comments below this are minor. Nice work!

Comment thread doc/syntax/clause/draw.qmd Outdated
Comment thread doc/syntax/clause/draw.qmd Outdated
Comment thread doc/syntax/clause/draw.qmd Outdated
Comment thread doc/syntax/clause/draw.qmd Outdated
Comment thread doc/syntax/clause/draw.qmd Outdated
Comment thread CHANGELOG.md
Comment thread src/plot/layer/geom/stat_aggregate.rs
Comment thread src/plot/layer/geom/stat_aggregate.rs
Comment thread src/plot/layer/geom/stat_aggregate.rs
Comment on lines +885 to +890
"WITH {raw_src} AS ({query}), {rn_src} AS (\
SELECT *, \
ROW_NUMBER() OVER ({partition}ORDER BY (SELECT 1)) AS \"__ggsql_rn__\", \
COUNT(*) OVER ({partition_no_order}) AS \"__ggsql_max_rn__\" \
FROM {raw_src}\
)",
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all tests we have for this FIRST/LAST thing are at the words of the query level. Maybe we should add a little test where this FIRST/LAST thing gets executed see if the resulting data is correct (just to watch for regressions).

Co-authored-by: Teun van den Brand <49372158+teunbrand@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Including summary metrics

2 participants