feat: Add Astra document store operations#2904

Open
Keyur-S-Patel wants to merge 5 commits into deepset-ai:main from
Keyur-S-Patel:fix/2637-azure-ai-search

Conversation

@Keyur-S-Patel
Contributor

Related Issues

Proposed Changes:

Added the new document store operations requested for AstraDocumentStore:

  • count_documents_by_filter(filters: dict[str, Any]) -> int
  • count_unique_metadata_by_filter(filters: dict[str, Any], metadata_fields: list[str]) -> dict[str, int]
  • get_metadata_fields_info() -> dict[str, dict]
  • get_metadata_field_min_max(metadata_field: str) -> dict[str, Any]
  • get_metadata_field_unique_values(metadata_field: str, search_term: str | None, from_: int, size: int) -> tuple[list[str], int]
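
To make the intended behaviour of these five operations concrete, here is a minimal in-memory sketch. It does not use the Astra client or the real AstraDocumentStore: the class `InMemoryMetadataStore`, the flat equality-only filter format, the `{"type": ...}` and `{"min": ..., "max": ...}` return shapes, and the default values for `from_` and `size` are all assumptions made for illustration. Only the method names and type signatures come from the list above.

```python
from __future__ import annotations

from typing import Any


class InMemoryMetadataStore:
    """Hypothetical stand-in for illustration; not the real AstraDocumentStore."""

    def __init__(self, docs: list[dict[str, Any]]) -> None:
        # Assumed layout: each document is a dict carrying a "meta" dict.
        self._docs = docs

    def _matching(self, filters: dict[str, Any]) -> list[dict[str, Any]]:
        # Simplified filter (assumption): every key must equal the given value.
        return [
            d for d in self._docs
            if all(d["meta"].get(k) == v for k, v in filters.items())
        ]

    def count_documents_by_filter(self, filters: dict[str, Any]) -> int:
        return len(self._matching(filters))

    def count_unique_metadata_by_filter(
        self, filters: dict[str, Any], metadata_fields: list[str]
    ) -> dict[str, int]:
        # Number of distinct values per field among the matching documents.
        docs = self._matching(filters)
        return {
            f: len({d["meta"][f] for d in docs if f in d["meta"]})
            for f in metadata_fields
        }

    def get_metadata_fields_info(self) -> dict[str, dict]:
        # Report the Python type name of the first value seen for each field.
        info: dict[str, dict] = {}
        for d in self._docs:
            for key, value in d["meta"].items():
                info.setdefault(key, {"type": type(value).__name__})
        return info

    def get_metadata_field_min_max(self, metadata_field: str) -> dict[str, Any]:
        values = [
            d["meta"][metadata_field]
            for d in self._docs
            if metadata_field in d["meta"]
        ]
        if not values:
            return {"min": None, "max": None}
        return {"min": min(values), "max": max(values)}

    def get_metadata_field_unique_values(
        self,
        metadata_field: str,
        search_term: str | None = None,
        from_: int = 0,
        size: int = 10,
    ) -> tuple[list[str], int]:
        # Sorted distinct values as strings, optionally narrowed by a
        # case-insensitive substring match, then paginated.
        values = sorted(
            {str(d["meta"][metadata_field])
             for d in self._docs if metadata_field in d["meta"]}
        )
        if search_term:
            values = [v for v in values if search_term.lower() in v.lower()]
        return values[from_:from_ + size], len(values)


store = InMemoryMetadataStore([
    {"content": "a", "meta": {"category": "news", "year": 2021}},
    {"content": "b", "meta": {"category": "news", "year": 2023}},
    {"content": "c", "meta": {"category": "blog", "year": 2023}},
])

print(store.count_documents_by_filter({"category": "news"}))
print(store.get_metadata_field_unique_values("category", None, 0, 1))
```

In this sketch the second element of the tuple returned by `get_metadata_field_unique_values` is the total number of matching values before pagination, which is one plausible reading of the `tuple[list[str], int]` signature.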

Implementation details:

  • Added support methods in the Astra client for filtered counting, distinct value retrieval, and projected document reads.
  • Implemented the new AstraDocumentStore APIs on top of Astra’s existing collection capabilities.
  • Kept existing document store methods unchanged and scoped the new logic to the added APIs only.
  • Updated docstrings for the new methods.
  • Added both unit and integration tests for the new operations.
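
The approach described above (distinct value retrieval and projected reads, with the reduction done client-side) can be sketched roughly as follows. The `CollectionLike` protocol, the helper names `field_min_max` and `unique_counts`, the flat document layout, and the `FakeCollection` are hypothetical and exist only to exercise the idea; the actual Astra client methods and their signatures in this PR may differ.

```python
from __future__ import annotations

from typing import Any, Iterable, Protocol


class CollectionLike(Protocol):
    """Hypothetical minimal interface; not the real Astra client API."""

    def distinct(self, key: str, filter: dict[str, Any] | None = None) -> list[Any]: ...

    def find(
        self,
        filter: dict[str, Any] | None = None,
        projection: dict[str, bool] | None = None,
    ) -> Iterable[dict[str, Any]]: ...


def field_min_max(collection: CollectionLike, field: str) -> dict[str, Any]:
    # Projected read: fetch only the one field, then reduce on the client.
    values = [
        doc[field]
        for doc in collection.find(projection={field: True})
        if field in doc
    ]
    if not values:
        return {"min": None, "max": None}
    return {"min": min(values), "max": max(values)}


def unique_counts(
    collection: CollectionLike, filters: dict[str, Any], fields: list[str]
) -> dict[str, int]:
    # One distinct-values query per field, restricted by the same filter.
    return {f: len(collection.distinct(f, filter=filters)) for f in fields}


class FakeCollection:
    """In-memory fake used only to run the sketch without a database."""

    def __init__(self, docs: list[dict[str, Any]]) -> None:
        self._docs = docs

    def _match(self, filter: dict[str, Any] | None) -> list[dict[str, Any]]:
        # Equality-only matching; a real backend supports richer filters.
        criteria = filter or {}
        return [
            d for d in self._docs
            if all(d.get(k) == v for k, v in criteria.items())
        ]

    def distinct(self, key: str, filter: dict[str, Any] | None = None) -> list[Any]:
        seen: list[Any] = []
        for d in self._match(filter):
            if key in d and d[key] not in seen:
                seen.append(d[key])
        return seen

    def find(self, filter=None, projection=None):
        for d in self._match(filter):
            if projection is None:
                yield dict(d)
            else:
                # Keep only the fields the projection marks as included.
                yield {k: v for k, v in d.items() if projection.get(k)}


fake = FakeCollection([
    {"category": "news", "year": 2021},
    {"category": "news", "year": 2023},
    {"category": "blog", "year": 2023},
])

print(field_min_max(fake, "year"))
print(unique_counts(fake, {"category": "news"}, ["year"]))
```

The trade-off in this style is that min/max requires reading every projected value, which keeps the change small but scales with collection size; a server-side aggregation would avoid that at the cost of a broader refactor.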

How did you test it?

  • Ran hatch run test:unit
  • Ran hatch run fmt
  • Ran hatch run test:types
  • Added unit tests for the new operations using mocked Astra client behavior
  • Added integration tests for the new operations in the Astra document store test suite
    Output for integration tests
plugins: anyio-4.12.1, asyncio-1.3.0, rerunfailures-16.1, cov-7.0.0
asyncio: mode=Mode.STRICT, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 84 items / 12 deselected / 72 selected                                                                                                   

test_document_store.py::TestDocumentStore::test_update_by_filter <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [  1%]
test_document_store.py::TestDocumentStore::test_update_by_filter_no_matches <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [  2%]
test_document_store.py::TestDocumentStore::test_update_by_filter_multiple_fields <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [  4%]
test_document_store.py::TestDocumentStore::test_update_by_filter_advanced_filters <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [  5%]
test_document_store.py::TestDocumentStore::test_delete_by_filter <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [  6%]
test_document_store.py::TestDocumentStore::test_delete_by_filter_no_matches <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [  8%]
test_document_store.py::TestDocumentStore::test_delete_by_filter_advanced_filters <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [  9%]
test_document_store.py::TestDocumentStore::test_delete_all_documents <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [ 11%]
test_document_store.py::TestDocumentStore::test_delete_all_documents_empty_store <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [ 12%]
test_document_store.py::TestDocumentStore::test_delete_all_documents_without_recreate_index <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py SKIPPED [ 13%]
test_document_store.py::TestDocumentStore::test_delete_all_documents_with_recreate_index <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py SKIPPED [ 15%]
test_document_store.py::TestDocumentStore::test_write_documents_duplicate_fail <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [ 16%]
test_document_store.py::TestDocumentStore::test_write_documents_duplicate_skip <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [ 18%]
test_document_store.py::TestDocumentStore::test_write_documents_duplicate_overwrite <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [ 19%]
test_document_store.py::TestDocumentStore::test_write_documents_invalid_input <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [ 20%]
test_document_store.py::TestDocumentStore::test_no_filters <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [ 22%]
test_document_store.py::TestDocumentStore::test_comparison_equal <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [ 23%]
test_document_store.py::TestDocumentStore::test_comparison_in <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [ 25%]
test_document_store.py::TestDocumentStore::test_comparison_in_with_with_non_list <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [ 26%]
test_document_store.py::TestDocumentStore::test_comparison_in_with_with_non_list_iterable <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [ 27%]
test_document_store.py::TestDocumentStore::test_and_operator <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [ 29%]
test_document_store.py::TestDocumentStore::test_or_operator <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [ 30%]
test_document_store.py::TestDocumentStore::test_missing_top_level_operator_key <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [ 31%]
test_document_store.py::TestDocumentStore::test_missing_top_level_conditions_key <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [ 33%]
test_document_store.py::TestDocumentStore::test_missing_condition_field_key <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [ 34%]
test_document_store.py::TestDocumentStore::test_missing_condition_operator_key <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [ 36%]
test_document_store.py::TestDocumentStore::test_missing_condition_value_key <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [ 37%]
test_document_store.py::TestDocumentStore::test_delete_documents <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [ 38%]
test_document_store.py::TestDocumentStore::test_delete_documents_empty_document_store <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [ 40%]
test_document_store.py::TestDocumentStore::test_count_empty <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [ 41%]
test_document_store.py::TestDocumentStore::test_count_not_empty <- ../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py PASSED [ 43%]
test_document_store.py::TestDocumentStore::test_comparison_equal_with_none PASSED                                                            [ 44%]
test_document_store.py::TestDocumentStore::test_write_documents PASSED                                                                       [ 45%]
test_document_store.py::TestDocumentStore::test_write_documents_skip_duplicates PASSED                                                       [ 47%]
test_document_store.py::TestDocumentStore::test_delete_documents_non_existing_document PASSED                                                [ 48%]
test_document_store.py::TestDocumentStore::test_delete_documents_more_than_twenty_delete_all PASSED                                          [ 50%]
test_document_store.py::TestDocumentStore::test_delete_documents_more_than_twenty_delete_ids PASSED                                          [ 51%]
test_document_store.py::TestDocumentStore::test_filter_documents_nested_filters PASSED                                                       [ 52%]
test_document_store.py::TestDocumentStore::test_filter_documents_by_id PASSED                                                                [ 54%]
test_document_store.py::TestDocumentStore::test_filter_documents_by_in_operator PASSED                                                       [ 55%]
test_document_store.py::TestDocumentStore::test_count_documents_by_filter PASSED                                                             [ 56%]
test_document_store.py::TestDocumentStore::test_count_unique_metadata_by_filter PASSED                                                       [ 58%]
test_document_store.py::TestDocumentStore::test_get_metadata_fields_info PASSED                                                              [ 59%]
test_document_store.py::TestDocumentStore::test_get_metadata_field_min_max PASSED                                                            [ 61%]
test_document_store.py::TestDocumentStore::test_get_metadata_field_unique_values PASSED                                                      [ 62%]
test_document_store.py::TestDocumentStore::test_not_operator SKIPPED (Unsupported filter operator not.)                                      [ 63%]
test_document_store.py::TestDocumentStore::test_comparison_not_equal_with_none SKIPPED (Unsupported filter operator $neq.)                   [ 65%]
test_document_store.py::TestDocumentStore::test_comparison_not_equal SKIPPED (Unsupported filter operator $neq.)                             [ 66%]
test_document_store.py::TestDocumentStore::test_comparison_not_in SKIPPED (Unsupported filter operator $nin.)                                [ 68%]
test_document_store.py::TestDocumentStore::test_comparison_not_in_with_with_non_list SKIPPED (Unsupported filter operator $nin.)             [ 69%]
test_document_store.py::TestDocumentStore::test_comparison_not_in_with_with_non_list_iterable SKIPPED (Unsupported filter operator $nin.)    [ 70%]
test_document_store.py::TestDocumentStore::test_comparison_greater_than_with_iso_date SKIPPED (Unsupported filter operator $gt.)             [ 72%]
test_document_store.py::TestDocumentStore::test_comparison_greater_than_with_string SKIPPED (Unsupported filter operator $gt.)               [ 73%]
test_document_store.py::TestDocumentStore::test_comparison_greater_than_with_list SKIPPED (Unsupported filter operator $gt.)                 [ 75%]
test_document_store.py::TestDocumentStore::test_comparison_greater_than_with_none SKIPPED (Unsupported filter operator $gt.)                 [ 76%]
test_document_store.py::TestDocumentStore::test_comparison_greater_than SKIPPED (Unsupported filter operator $gt.)                           [ 77%]
test_document_store.py::TestDocumentStore::test_comparison_greater_than_equal SKIPPED (Unsupported filter operator $gte.)                    [ 79%]
test_document_store.py::TestDocumentStore::test_comparison_greater_than_equal_with_none SKIPPED (Unsupported filter operator $gte.)          [ 80%]
test_document_store.py::TestDocumentStore::test_comparison_greater_than_equal_with_list SKIPPED (Unsupported filter operator $gte.)          [ 81%]
test_document_store.py::TestDocumentStore::test_comparison_greater_than_equal_with_string SKIPPED (Unsupported filter operator $gte.)        [ 83%]
test_document_store.py::TestDocumentStore::test_comparison_greater_than_equal_with_iso_date SKIPPED (Unsupported filter operator $gte.)      [ 84%]
test_document_store.py::TestDocumentStore::test_comparison_less_than_equal SKIPPED (Unsupported filter operator $lte.)                       [ 86%]
test_document_store.py::TestDocumentStore::test_comparison_less_than_equal_with_string SKIPPED (Unsupported filter operator $lte.)           [ 87%]
test_document_store.py::TestDocumentStore::test_comparison_less_than_equal_with_list SKIPPED (Unsupported filter operator $lte.)             [ 88%]
test_document_store.py::TestDocumentStore::test_comparison_less_than_equal_with_iso_date SKIPPED (Unsupported filter operator $lte.)         [ 90%]
test_document_store.py::TestDocumentStore::test_comparison_less_than_equal_with_none SKIPPED (Unsupported filter operator $lte.)             [ 91%]
test_document_store.py::TestDocumentStore::test_comparison_less_than_with_none SKIPPED (Unsupported filter operator $lt.)                    [ 93%]
test_document_store.py::TestDocumentStore::test_comparison_less_than_with_list SKIPPED (Unsupported filter operator $lt.)                    [ 94%]
test_document_store.py::TestDocumentStore::test_comparison_less_than_with_string SKIPPED (Unsupported filter operator $lt.)                  [ 95%]
test_document_store.py::TestDocumentStore::test_comparison_less_than_with_iso_date SKIPPED (Unsupported filter operator $lt.)                [ 97%]
test_document_store.py::TestDocumentStore::test_comparison_less_than SKIPPED (Unsupported filter operator $lt.)                              [ 98%]
test_embedding_retrieval.py::TestEmbeddingRetrieval::test_search_with_top_k PASSED                                                           [100%]

============================================================= short test summary info ==============================================================
SKIPPED [1] ../../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py:674: delete_all_documents has no recreate_index or recreate_collection parameter
SKIPPED [1] ../../../../../../Library/Application Support/hatch/env/virtual/astra-haystack/azrDRwGJ/test/lib/python3.14/site-packages/haystack/testing/document_store.py:696: delete_all_documents has no recreate_index or recreate_collection parameter
SKIPPED [1] test_document_store.py:382: Unsupported filter operator not.
SKIPPED [1] test_document_store.py:386: Unsupported filter operator $neq.
SKIPPED [1] test_document_store.py:390: Unsupported filter operator $neq.
SKIPPED [1] test_document_store.py:394: Unsupported filter operator $nin.
SKIPPED [1] test_document_store.py:398: Unsupported filter operator $nin.
SKIPPED [1] test_document_store.py:402: Unsupported filter operator $nin.
SKIPPED [1] test_document_store.py:406: Unsupported filter operator $gt.
SKIPPED [1] test_document_store.py:410: Unsupported filter operator $gt.
SKIPPED [1] test_document_store.py:414: Unsupported filter operator $gt.
SKIPPED [1] test_document_store.py:418: Unsupported filter operator $gt.
SKIPPED [1] test_document_store.py:422: Unsupported filter operator $gt.
SKIPPED [1] test_document_store.py:426: Unsupported filter operator $gte.
SKIPPED [1] test_document_store.py:430: Unsupported filter operator $gte.
SKIPPED [1] test_document_store.py:434: Unsupported filter operator $gte.
SKIPPED [1] test_document_store.py:438: Unsupported filter operator $gte.
SKIPPED [1] test_document_store.py:442: Unsupported filter operator $gte.
SKIPPED [1] test_document_store.py:446: Unsupported filter operator $lte.
SKIPPED [1] test_document_store.py:450: Unsupported filter operator $lte.
SKIPPED [1] test_document_store.py:454: Unsupported filter operator $lte.
SKIPPED [1] test_document_store.py:458: Unsupported filter operator $lte.
SKIPPED [1] test_document_store.py:462: Unsupported filter operator $lte.
SKIPPED [1] test_document_store.py:466: Unsupported filter operator $lt.
SKIPPED [1] test_document_store.py:470: Unsupported filter operator $lt.
SKIPPED [1] test_document_store.py:474: Unsupported filter operator $lt.
SKIPPED [1] test_document_store.py:478: Unsupported filter operator $lt.
SKIPPED [1] test_document_store.py:482: Unsupported filter operator $lt.
================================================== 44 passed, 28 skipped, 12 deselected in 45.15s ==================================================

Notes for the reviewer

  • The implementation intentionally avoids refactoring existing Astra document store methods and only adds the new requested operations.
  • The metadata-related methods are implemented using Astra’s available collection operations such as distinct values and projected reads, rather than a broader aggregation refactor.

Checklist

@Keyur-S-Patel Keyur-S-Patel requested a review from a team as a code owner March 2, 2026 06:06
@Keyur-S-Patel Keyur-S-Patel requested review from bogdankostic and removed request for a team March 2, 2026 06:06
@github-actions github-actions bot added the integration:astra and type:documentation labels Mar 2, 2026
@Keyur-S-Patel Keyur-S-Patel force-pushed the fix/2637-azure-ai-search branch from b47ce08 to 4371bc4 Compare March 2, 2026 06:11
@Keyur-S-Patel Keyur-S-Patel force-pushed the fix/2637-azure-ai-search branch from 4371bc4 to 578447a Compare March 2, 2026 06:14
@Keyur-S-Patel
Contributor Author

@julian-risch this PR is similar to the one you reviewed in #2903.

@davidsbatista davidsbatista self-requested a review March 5, 2026 11:07
@davidsbatista
Contributor

@Keyur-S-Patel thank you for the contribution! I made a few minor changes.

@Keyur-S-Patel
Contributor Author

@davidsbatista thank you. Can we merge this change and close this PR?

Labels

integration:astra type:documentation Improvements or additions to documentation

Development

Successfully merging this pull request may close these issues.

add the following operations to AstraDocumentStore

2 participants