GFJHogue (Collaborator) commented Jan 22, 2026

Partially replaces #97.

Reconfigured RRF (Reciprocal Rank Fusion) hybrid retrieval to use LLM query expansion.
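
For readers unfamiliar with RRF, here is a minimal sketch of how ranked lists from several expanded queries could be fused. It is not the code from this PR: the function name `reciprocal_rank_fusion` and the document IDs are hypothetical, and `k = 60` is the constant conventionally used with RRF, assumed here rather than taken from the repository. Only the `rrf_scores` name echoes the commit notes.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[Hashable]], k: int = 60
) -> List[Hashable]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document receives 1 / (k + rank) from every list it appears in
    (rank is 1-based); documents are returned by descending total score.
    """
    rrf_scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            rrf_scores[doc_id] += 1.0 / (k + rank)
    # sorted() is stable, so ties keep first-seen order.
    return sorted(rrf_scores, key=lambda doc_id: rrf_scores[doc_id], reverse=True)


# Hypothetical results: one ranked list per expanded query.
fused = reciprocal_rank_fusion(
    [
        ["a", "b", "c"],
        ["b", "a", "d"],
        ["b", "c", "a"],
    ]
)
print(fused)
```

In a hybrid setup, the BM25 and vector results for each expanded query would each contribute one ranked list, so a document that ranks well across many lists rises to the top.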

GFJHogue marked this pull request as ready for review February 6, 2026 20:16
GFJHogue requested a review from heliamoh February 6, 2026 20:16

heliamoh (Collaborator) left a comment

LGTM!

heliamoh merged commit 9232748 into ReactToMe_update-greg Feb 8, 2026
4 of 5 checks passed
GFJHogue added a commit that referenced this pull request Feb 9, 2026
GFJHogue deleted the multi-query branch February 9, 2026 15:44
GFJHogue added a commit that referenced this pull request Feb 9, 2026
…ss (#102)

* feat: expand preprocessing to a multi-step workflow.

- Implement parallel execution of safety and scope check, query expansion, and language detection

* feat: Add new runnables for checking question safety and scope, query expansion and conversation history management

* feat: improved hybrid retrieval
- Replace SelfQueryRetriever with efficient hybrid search (BM25 + vector)
- Add RRF (Reciprocal Rank Fusion) support for query expansion
- Implement parallel processing for improved performance

* feat: Add new runnables for checking question safety and scope, query expansion and conversation history management

* code quality check fixes

* fix: Resolve mypy linter errors

- Add type annotation for rrf_scores in retrieval_utils.py
- Fix metadata dictionary comprehension in csv_chroma.py
- Update retriever type annotations to use Any
- Add isinstance check for BM25Retriever
- Remove default values from TypedDict in base.py
- Fix TypedDict expansion in postprocess method

* remove: Remove reactome_kg directory from repository

* feat: expand preprocessing to a multi-step workflow.
- Implement parallel execution of safety and scope check, query expansion, and language detection

* feat: expand preprocessing to a multi-step workflow.
- Implement parallel execution of safety and scope check, query expansion, and language detection

* feat: improved hybrid retrieval
- Replace SelfQueryRetriever with efficient hybrid search (BM25 + vector)
- Add RRF (Reciprocal Rank Fusion) support for query expansion
- Implement parallel processing for improved performance

* feat: improved answer generation, in-line citation handling and hallucination mitigation

* remove irrelevant docs

* [WIP] clean up changes

* [WIP] clean up changes (2)

* revert retrieval changes

* macos-intel actions runner

* fix SafetyCheck type usage

* stream unsafe response to user

* black spacing

* cross-db use detected_language from base state

* pre-release docker push

* new HybridRetriever class (#102)

* rewrite HybridRetriever class

* fix types

* fix HybridRetriever class inheritance issues

* fix lint

* multithread, as in Helia's code

* fix typing for #102

---------

Co-authored-by: Helia Mohammadi <helia.mohammadi01@gmail.com>
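
The commit notes above mention running the safety and scope check, query expansion, and language detection in parallel. A minimal sketch of that idea follows, using standard-library threads as a stand-in for the repository's runnables. Every name here (`run_preprocessing`, `check_safety_and_scope`, `expand_query`, `detect_language`) and every return value is a hypothetical placeholder, not the project's API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict


def run_preprocessing(
    question: str, steps: Dict[str, Callable[[str], Any]]
) -> Dict[str, Any]:
    """Run every step on the same question concurrently; collect results by name."""
    if not steps:
        return {}
    with ThreadPoolExecutor(max_workers=len(steps)) as pool:
        futures = {name: pool.submit(fn, question) for name, fn in steps.items()}
        # result() blocks until the step finishes and re-raises its exception, if any.
        return {name: future.result() for name, future in futures.items()}


# Placeholder steps; in practice each would be an LLM or classifier call.
def check_safety_and_scope(question: str) -> Dict[str, bool]:
    return {"safe": True, "in_scope": True}


def expand_query(question: str) -> list:
    return [question, question.lower()]


def detect_language(question: str) -> str:
    return "English"


state = run_preprocessing(
    "What is TP53?",
    {
        "safety": check_safety_and_scope,
        "queries": expand_query,
        "language": detect_language,
    },
)
print(state)
```

Threads suit this pattern when the steps are I/O-bound (for example, waiting on model responses), since the total latency then approaches that of the slowest step rather than the sum of all three.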
GFJHogue added a commit that referenced this pull request Feb 9, 2026
GFJHogue added a commit that referenced this pull request Feb 9, 2026
…ss (#102)

* feat: expand preprocessing to a multi-step workflow.

- Implement parallel execution of safety and scope check, query expansion, and language detection

* feat: Add new runnables for checking question safety and scope, query expansion and conversation history management

* feat: improved hybrid retrieval
- Replace SelfQueryRetriever with efficient hybrid search (BM25 + vector)
- Add RRF (Reciprocal Rank Fusion) support for query expansion
- Implement parallel processing for improved performance

* feat: Add new runnables for checking question safety and scope, query expansion and conversation history management

* code quality check fixes

* fix: Resolve mypy linter errors

- Add type annotation for rrf_scores in retrieval_utils.py
- Fix metadata dictionary comprehension in csv_chroma.py
- Update retriever type annotations to use Any
- Add isinstance check for BM25Retriever
- Remove default values from TypedDict in base.py
- Fix TypedDict expansion in postprocess method

* remove: Remove reactome_kg directory from repository

* feat: expand preprocessing to a multi-step workflow.
- Implement parallel execution of safety and scope check, query expansion, and language detection

* feat: expand preprocessing to a multi-step workflow.
- Implement parallel execution of safety and scope check, query expansion, and language detection

* feat: improved hybrid retrieval
- Replace SelfQueryRetriever with efficient hybrid search (BM25 + vector)
- Add RRF (Reciprocal Rank Fusion) support for query expansion
- Implement parallel processing for improved performance

* feat: improved answer generation, in-line citation handling and hallucination mitigation

* remove irrelevant docs

* [WIP] clean up changes

* [WIP] clean up changes (2)

* revert retrieval changes

* macos-intel actions runner

* fix SafetyCheck type usage

* stream unsafe response to user

* black spacing

* cross-db use detected_language from base state

* pre-release docker push

* new HybridRetriever class (#102)

* rewrite HybridRetriever class

* fix types

* fix HybridRetriever class inheritance issues

* fix lint

* multithread, as in Helia's code

* fix typing for #102

---------

Co-authored-by: Helia Mohammadi <helia.mohammadi01@gmail.com>

* revert pre-release branch stuff

---------

Co-authored-by: Helia Mohammadi <helia.mohammadi01@gmail.com>