Ctable 3 changes by Jacc4224 · Pull Request #606 · Blosc/python-blosc2

Jacc4224 · 2026-03-26T12:46:18Z

Pull request for local changes

Introduce CTable, a new columnar table class for efficient in-memory data storage using Blosc2 as the underlying compression engine. Each column is represented as a Column object wrapping a blosc2.NDArray with typed, compressed storage. Building on top of blosc2's existing infrastructure, CTable supports append, iteration and column-based queries. This is an early-stage (beta) implementation; the table is always fully loaded in memory.

New files:
- src/blosc2/ctable.py: CTable and Column class definitions
- tests/ctable/: unit tests covering construction, slicing, deletion, compaction and row logic
- bench/ctable/: benchmarks comparing CTable against pandas
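For readers unfamiliar with the layout, here is a rough sketch of the columnar idea described above. It is not the code in this PR: `ToyColumn` and `ToyTable` are hypothetical names, and the standard library's `zlib` and `array` stand in for Blosc2 and `blosc2.NDArray` so the example runs without extra dependencies.

```python
import zlib
from array import array


class ToyColumn:
    """Hypothetical stand-in for a typed, compressed column (not the PR's Column)."""

    def __init__(self, typecode, chunk_len=4):
        self.typecode = typecode      # array typecode, e.g. "q" (int64) or "d" (float64)
        self.chunk_len = chunk_len    # number of values per compressed chunk
        self._chunks = []             # full chunks, stored compressed
        self._tail = array(typecode)  # uncompressed values waiting to fill a chunk

    def append(self, value):
        self._tail.append(value)
        if len(self._tail) == self.chunk_len:
            # zlib is an assumption here; the PR uses Blosc2 for compression.
            self._chunks.append(zlib.compress(self._tail.tobytes()))
            self._tail = array(self.typecode)

    def __iter__(self):
        for blob in self._chunks:
            chunk = array(self.typecode)
            chunk.frombytes(zlib.decompress(blob))
            yield from chunk
        yield from self._tail


class ToyTable:
    """Hypothetical table: one ToyColumn per field, rows exist only logically."""

    def __init__(self, **typecodes):
        self.columns = {name: ToyColumn(tc) for name, tc in typecodes.items()}

    def append(self, **row):
        if row.keys() != self.columns.keys():
            raise ValueError("row must provide exactly one value per column")
        for name, value in row.items():
            self.columns[name].append(value)

    def __iter__(self):
        names = list(self.columns)
        for values in zip(*(self.columns[n] for n in names)):
            yield dict(zip(names, values))

    def where(self, name, predicate):
        # Column-based query: keep the rows whose value in `name` passes the predicate.
        return [row for row in self if predicate(row[name])]
```

Appending six rows to `ToyTable(id="q", score="d")` leaves one compressed chunk plus a two-value tail in each column, which is the kind of per-column storage the description refers to.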

Add CTable, a columnar in-memory table built on top of blosc2

- Add schema.py with spec primitives: int8/16/32/64, uint8/16/32/64, float32/64, bool, complex64/128, string, bytes, all sharing a _NumericSpec mixin to avoid boilerplate
- Add schema_compiler.py: compile_schema(), CompiledSchema/Column/Config, schema_to_dict() / schema_from_dict() for persistence groundwork
- Export all spec types and field() from blosc2 namespace

Validation:
- Add schema_validation.py: Pydantic-backed row validation for append(), cached per schema, re-raised as plain ValueError
- Add schema_vectorized.py: vectorized NumPy constraint checks for extend(), using np.char.str_len() for string/bytes columns
- validate= per-call override on extend() (None inherits table default)

CTable refactor:
- Constructor accepts dataclass schemas; legacy Pydantic adapter kept
- Schema introspection: table.schema, column_schema(), schema_dict()
- _last_pos cache eliminates backward chunk scan on every append/extend
- _grow() shared resize helper; delete() writes back in-place without creating a new array; _n_rows updated by subtraction not count_nonzero
- head() and tail() unified through _find_physical_index()

Tests and docs:
- 135 tests across 10 test files, all passing
- plans/ctable-implementation-log.md and ctable-user-guide.md added
- Benchmarks: bench_validation.py and bench_append_regression.py
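The bookkeeping in the CTable refactor notes above (cached last position, shared grow helper, in-place delete, row count updated by subtraction, logical-to-physical lookup) can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `ToyRowMask` and its attribute names are hypothetical, and a `bytearray` stands in for the validity array.

```python
class ToyRowMask:
    """Hypothetical validity mask with cached counters (not the PR's code)."""

    def __init__(self, capacity=4):
        self.valid = bytearray(capacity)  # 1 = live row, 0 = free or deleted slot
        self.last_pos = 0                 # cached next free physical slot
        self.n_rows = 0                   # cached number of live rows

    def _grow(self):
        # Shared resize helper: double the capacity, new slots start as 0.
        self.valid.extend(bytes(len(self.valid)))

    def append(self):
        # No backward scan for the last used slot: the cache already knows it.
        if self.last_pos == len(self.valid):
            self._grow()
        pos = self.last_pos
        self.valid[pos] = 1
        self.last_pos += 1
        self.n_rows += 1
        return pos

    def delete(self, physical_indices):
        # Flip flags in place and subtract, instead of recounting the whole mask.
        removed = 0
        for i in set(physical_indices):
            if self.valid[i]:
                self.valid[i] = 0
                removed += 1
        self.n_rows -= removed

    def find_physical_index(self, logical):
        # Map the logical row number (deleted rows skipped) to its physical slot.
        if not 0 <= logical < self.n_rows:
            raise IndexError("logical row out of range")
        seen = -1
        for pos in range(self.last_pos):
            if self.valid[pos]:
                seen += 1
                if seen == logical:
                    return pos
        raise AssertionError("cached n_rows is out of sync with the mask")
```

With a lookup like `find_physical_index`, a head-style read would resolve logical rows from the front and a tail-style read from the back, which is presumably why the two were unified in the refactor.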

…QoL)

Persistency:
- FileTableStorage backend: disk layout _meta.b2frame / _valid_rows.b2nd / _cols/<name>.b2nd
- CTable(Row, urlpath=..., mode="w"/"a"/"r"), CTable.open(), CTable.save(), CTable.load()
- Read-only mode blocks all writes; save() always writes compacted rows

Column aggregates: sum, min, max, mean, std, any, all (chunk-aware via iter_chunks)

Column utilities: unique(), value_counts(), assign(), boolean mask __getitem__/__setitem__

Schema mutations: add_column (fills default for existing rows), drop_column, rename_column
- All three update schema, handle disk files, and block on views

View mutability model fix:
- Views allow value writes (assign, __setitem__); only structural mutations are blocked
- _read_only=True reserved for mode="r" disk tables; base is not None guards structural ops

QoL: __str__ pandas-style, __repr__, cbytes/nbytes, sample(n), Column.iter_chunks(size)

Tests: 258 tests, ~5s; new test_persistency.py (33), test_schema_mutations.py (41), expanded test_column.py; optimized helpers to use to_numpy() instead of row[i]
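As an illustration of what "chunk-aware" aggregates can look like, the sketch below computes sum, min, max, mean and std one chunk at a time, merging per-chunk statistics with the pairwise update of Chan et al. so the full column is never materialized. It is only a sketch under assumptions: `iter_chunks` and `chunked_stats` here are hypothetical free functions over a plain list, not the PR's `Column.iter_chunks(size)`, and std is the population form (ddof=0).

```python
import math


def iter_chunks(values, size):
    """Yield consecutive slices of at most `size` items (hypothetical helper)."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def chunked_stats(values, size):
    """Aggregate chunk by chunk; only one chunk is inspected at a time."""
    n = 0
    total = 0.0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    lo, hi = math.inf, -math.inf
    for chunk in iter_chunks(values, size):
        c_n = len(chunk)
        c_sum = math.fsum(chunk)
        c_mean = c_sum / c_n
        c_m2 = math.fsum((x - c_mean) ** 2 for x in chunk)
        # Merge the chunk's statistics into the running ones.
        delta = c_mean - mean
        merged_n = n + c_n
        mean += delta * c_n / merged_n
        m2 += c_m2 + delta * delta * n * c_n / merged_n
        n = merged_n
        total += c_sum
        lo = min(lo, min(chunk))
        hi = max(hi, max(chunk))
    if n == 0:
        raise ValueError("cannot aggregate an empty column")
    return {"sum": total, "min": lo, "max": hi, "mean": mean, "std": math.sqrt(m2 / n)}
```

Merging per-chunk means and squared deviations avoids the cancellation problems of the naive sum-of-squares formula, which matters once columns hold many large values.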

Arrow compatibility Examples Tutorial

FrancescAlted · 2026-04-08T06:36:25Z

Overridden by PR #614

Jacc4224 and others added 22 commits March 26, 2026 11:05

Merge pull request Blosc#604 from Jacc4224/ctable-new

01e47f4

Add CTable, a columnar in-memory table built on top of blosc2

Add a plan for declaring a simple schema for CTable objects

c05c2ec

Add a pydantic as a new dependency

725c28b

Fix small formatting issues

0efd450

Simplify the plan for ctable schema

f504ad0

Disable wheel generation for each commit in this branch

46bf2e3

Add a new plan on CTable persistence

43bf562

_

e84f7ac

_

8de1870

Testing

a8db18d

Merge branch 'ctable3' of github.com:Blosc/python-blosc2 into my_ctable3

dd154b1

writen test

ce65607

Remove testing file

b623f0e

Merge branch 'ctable3' of github.com:Blosc/python-blosc2 into my_ctable3

b9e8c35

persistency half way done

ee1d0c4

CSV compatibility implementation

34f8219

Arrow compatibility Examples Tutorial

Persistent ctables.

6bf1ec8

Colision bug fixed 1

34c2eee

FrancescAlted closed this Apr 8, 2026

Jacc4224 wants to merge 22 commits into Blosc:main from Jacc4224:my_ctable3
