[BENCH] Real-world query performance comparison #652

@FrancescAlted

examples/ctable/real_world.py exists, but it is a fairly raw script. There is no self-contained example that clearly shows what performance gain SUMMARY indexes provide over a full scan, or how block size affects that gain. That is the most common question users will have after enabling auto-indexing.

Suggested work: Write bench/ctable/summary_index_perf.py that:

  • Generates a synthetic CTable with a few million rows and numeric columns
  • Runs the same where() query three ways: no index, SUMMARY at chunk granularity, SUMMARY at block granularity
  • Prints a clean results table (rows scanned, time, speedup)
  • Includes comments explaining the trade-offs

Ideally, it should work without any external dataset so it can be run immediately after install, but using an accessible dataset is also an option.
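
As a starting point for discussion, here is a rough, NumPy-only sketch of the shape such a benchmark could take. It does not use the CTable API at all: the SUMMARY index is emulated as per-granule min/max arrays, and every name in it (`build_summary`, `query_with_summary`, `main`), as well as the chunk and block sizes, is a hypothetical placeholder. The real script would swap these stand-ins for the actual CTable calls and index settings. The synthetic column is nearly sorted on purpose, since min/max summaries can only skip granules when values are clustered.

```python
import time

import numpy as np


def build_summary(col, granule):
    """Emulated SUMMARY index: per-granule (min, max); the last granule may be short."""
    starts = np.arange(0, len(col), granule)
    return starts, np.minimum.reduceat(col, starts), np.maximum.reduceat(col, starts)


def query_full_scan(col, lo, hi):
    """Baseline: test every row against lo <= value < hi."""
    return np.flatnonzero((col >= lo) & (col < hi)), len(col)


def query_with_summary(col, summary, granule, lo, hi):
    """Scan only the granules whose [min, max] range can overlap [lo, hi)."""
    starts, mins, maxs = summary
    candidates = starts[(maxs >= lo) & (mins < hi)]
    hits, scanned = [], 0
    for s in candidates:
        part = col[s:s + granule]
        scanned += len(part)
        hits.append(s + np.flatnonzero((part >= lo) & (part < hi)))
    idx = np.concatenate(hits) if hits else np.empty(0, dtype=np.int64)
    return idx, scanned


def bench(fn, repeat=5):
    """Return fn's result and the best wall-clock time over `repeat` runs."""
    best = float("inf")
    for _ in range(repeat):
        t0 = time.perf_counter()
        out = fn()
        best = min(best, time.perf_counter() - t0)
    return out, max(best, 1e-9)  # guard against a zero reading on coarse timers


def main(n=2_000_000, chunk=100_000, block=10_000):
    # Nearly sorted synthetic column (think timestamps with jitter).
    rng = np.random.default_rng(0)
    col = np.arange(n, dtype=np.int64) + rng.integers(0, 1_000, n)
    lo, hi = n // 2, n // 2 + 5_000  # narrow range query

    (ref, scanned), t_full = bench(lambda: query_full_scan(col, lo, hi))
    rows = [("no index", scanned, t_full)]
    # Trade-off: smaller granules skip more rows but need a larger summary
    # and more per-granule bookkeeping.
    for label, g in (("SUMMARY/chunk", chunk), ("SUMMARY/block", block)):
        summary = build_summary(col, g)
        (idx, scanned), t = bench(
            lambda: query_with_summary(col, summary, g, lo, hi)
        )
        assert np.array_equal(idx, ref)  # every mode must return the same rows
        rows.append((label, scanned, t))

    print(f"{'mode':<16}{'rows scanned':>14}{'time (ms)':>12}{'speedup':>10}")
    for label, scanned, t in rows:
        print(f"{label:<16}{scanned:>14,}{t * 1e3:>12.3f}{t_full / t:>9.1f}x")
    return rows


if __name__ == "__main__":
    main()
```

The index build is kept outside the timed region so that the table compares query cost only. Whether the real benchmark should also report build time and index size is an open question.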

Labels: documentation, help wanted
