Skip to content

Closes #112, adds VectorOptionsAttribute, #125

Merged
sl-at-ibm merged 4 commits intodatastax:mainfrom
a-random-steve:mr/112
Apr 16, 2026
Merged

Closes #112, adds VectorOptionsAttribute, #125
sl-at-ibm merged 4 commits intodatastax:mainfrom
a-random-steve:mr/112

Conversation

@a-random-steve
Copy link
Copy Markdown
Contributor

@a-random-steve a-random-steve commented Apr 2, 2026

Also moves LexicalOptionsAttribute to class level (not property)

Fixes #112

Comment on lines +38 to +44
public string SourceModel { get; set; }

/// <summary>The name of the embedding service provider.</summary>
public string Provider { get; set; }

/// <summary>The model name to use for embedding generation.</summary>
public string ModelName { get; set; }
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are ModelName and SourceModel redundant? e.g. looking at python, I only see model_name: https://docs.datastax.com/en/astra-api-docs/_attachments/python-client/astrapy/info.html#astrapy.info.CollectionVectorOptions

Copy link
Copy Markdown
Collaborator

@toptobes toptobes Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Collaborator

@sl-at-ibm sl-at-ibm Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution, these two should not be at the same level -- actually, assuming the shared goal of isomorphism with the Data API payloads, modelName should be within a "service" subobject together with provider (and authentication and/or parameters when they apply).

Here is a real payload for a createCollection with all the relevant stuff, for reference:

{
  "createCollection": {
    "name": "c",
    "options": {
      "vector": {
        "dimension": 1024,
        "metric": "dot_product",
        "service": {
          "provider": "nvidia",
          "modelName": "nvidia/nv-embedqa-e5-v5"
        },
        "sourceModel": "bert"
      }
    }
  }
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sl-at-ibm Agreed, they are at different levels.

public class VectorizeOptionsAttribute : Attribute
{
/// <summary>The number of dimensions for the vector.</summary>
public int Dimension { get; set; } = -1;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for Dimension and Metric to be here? (as opposed to being only 1-level-higher, in VectorOptionsAttribute where they belong)
On first sight, these should be removed from this class (and some of the definition-filling work in CollectionDefinition slightly revised accordingly).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sl-at-ibm sine the VectorOptionsAttribute accepts SourceModel, which Vectorize doesn't, I don't see a benefit in having VectorizeOptionsAttribute subclass VectorOptionsAttribute. Also, please note that these are just possible helper attributes for consistency, the user can specify the collection creation details manually.

@toptobes toptobes marked this pull request as draft April 8, 2026 16:42

case ColumnVectorizeAttribute vectorize:
createColumn = false;
vectorize.Validate($"Column {columnName}");
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Validation should work for the auth pairs too, right? Maybe it should go in PairsToDict() directly?

@toptobes toptobes marked this pull request as ready for review April 15, 2026 17:06
@toptobes toptobes requested review from sl-at-ibm and toptobes April 15, 2026 17:17
Copy link
Copy Markdown
Collaborator

@sl-at-ibm sl-at-ibm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (great job!)

@sl-at-ibm sl-at-ibm added this pull request to the merge queue Apr 16, 2026
Merged via the queue into datastax:main with commit 4e84a11 Apr 16, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

For Collections, add VectorOptionsAttribute and VectorizeOptionsAttribute

5 participants