Skip to content

feat: add manifest namespace extended properties#6215

Open
wojiaodoubao wants to merge 1 commit intolance-format:mainfrom
wojiaodoubao:manifest-extended-properties
Open

feat: add manifest namespace extended properties#6215
wojiaodoubao wants to merge 1 commit intolance-format:mainfrom
wojiaodoubao:manifest-extended-properties

Conversation

@wojiaodoubao
Copy link
Copy Markdown
Contributor

This is a sub-task of #5741

@github-actions github-actions bot added enhancement New feature or request python java labels Mar 17, 2026
@github-actions
Copy link
Copy Markdown
Contributor

PR Review: feat: add manifest namespace extended properties

Performance: query_manifest reads all columns (P1)

The new unified query_manifest() method removes the .project() call that the old query_manifest_for_table and query_manifest_for_namespace had. Previously, table lookups only read object_id + location, and namespace lookups only read object_id + metadata. Now every query reads all columns including all extended property columns.

As the number of extended properties grows, this becomes increasingly wasteful for simple existence checks and lookups. Consider keeping a projection — at minimum project the basic columns plus the extended columns, or re-add targeted projections for the two call sites.

Performance: Row-at-a-time slicing in query_manifest (P1)

for row_idx in 0..batch.num_rows() {
    let sliced_columns: Vec<Arc<dyn Array>> = batch
        .columns()
        .iter()
        .map(|col| col.slice(row_idx, 1))
        .collect();
    let row = RecordBatch::try_new(batch.schema(), sliced_columns)?;
    objects.push(parse_manifest_object(&row)?);
}

This creates a new RecordBatch per row. For queries that match many rows (e.g., listing all tables), this is O(rows × columns) allocations. Consider indexing into the batch directly in parse_manifest_object using a row index parameter instead of slicing.

Bug: Unused f32, f64 imports (P0 — clippy will fail)

use std::{
    collections::HashMap,
    f32, f64,
    hash::{DefaultHasher, Hash, Hasher},

These module imports (std::f32, std::f64) appear unused. This will likely trigger a clippy warning/error with -D warnings.

Bug: is_multiple_of requires nightly (P0)

if !s.len().is_multiple_of(2) {

usize::is_multiple_of is a nightly-only API (int_roundings feature). This won't compile on stable Rust. Use s.len() % 2 != 0 instead.

Design: Duplicated merge logic (P1)

The pattern of matching (extended_record, &request.properties) with 4 arms is copy-pasted across create_namespace_extended, declare_table_extended, and create_table_extended. Consider extracting a helper like fn resolve_extended_batch(...) to reduce duplication and the risk of the branches diverging.

Minor

  • build_metadata_json silently drops properties that fail to serialize (.ok()? on serde_json::to_string). HashMap<String, String> serialization cannot fail in practice, but the silent swallow is worth noting.
  • Comment typo: "Collect basic columns to excluded" → "to exclude"

🤖 Generated with Claude Code

@wojiaodoubao wojiaodoubao mentioned this pull request Mar 17, 2026
7 tasks
@codecov
Copy link
Copy Markdown

codecov bot commented Mar 17, 2026

Codecov Report

❌ Patch coverage is 50.44053% with 225 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
rust/lance-namespace-impls/src/dir/manifest.rs 50.44% 199 Missing and 26 partials ⚠️

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown
Contributor

@wjones127 wjones127 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm feeling a little uncertain about the efficiency of this code, particularly around the handling of batches. Though I also think the bot is right about projecting many columns.

Comment thread rust/lance-namespace-impls/src/dir/manifest.rs Outdated
Comment thread rust/lance-namespace-impls/src/dir/manifest.rs Outdated
Comment thread rust/lance-namespace-impls/src/dir/manifest.rs
Comment thread rust/lance-namespace-impls/src/dir/manifest.rs Outdated
Comment thread rust/lance-namespace-impls/src/dir/manifest.rs Outdated
@wojiaodoubao wojiaodoubao force-pushed the manifest-extended-properties branch 2 times, most recently from 2afc019 to 935c6b2 Compare March 28, 2026 08:24
@wojiaodoubao wojiaodoubao force-pushed the manifest-extended-properties branch from 935c6b2 to 9574665 Compare March 28, 2026 08:58
@wojiaodoubao
Copy link
Copy Markdown
Contributor Author

I’m very sorry for the long delay. I’ve been tied up with something urgent recently. I’ve updated the PR—please take a look when you have time. Thanks very much! @wjones127

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request java python

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants