Python: Azure AI Search Citation Extraction - V2 #2588
giles17 wants to merge 1 commit into microsoft:main
Pull request overview
This PR enhances the Azure AI Search sample by adding citation extraction functionality to demonstrate how to retrieve and display citation information from agent responses.
Key Changes:
- Added `extract_citations_from_response` helper function to parse citations from agent responses
- Modified the query to request detailed winter hotel information
- Changed `query_type` from "simple" to "vector" mode
- Added citation display logic to show extracted citation URLs
    if citations:
        for i, citation in enumerate(citations, 1):
            print(f"Citation {i}:")
            print(f" URL: {citation['url']}")
The extract_citations_from_response function has incomplete citation output. Currently only the URL is printed (line 94), but the function extracts title, file_id, and positions which are not displayed. This makes the citation information incomplete for users who may need these additional details.
Consider printing all extracted citation information:
    print(f"Citation {i}:")
    print(f" URL: {citation['url']}")
    if citation.get('title'):
        print(f" Title: {citation['title']}")
    if citation.get('file_id'):
        print(f" File ID: {citation['file_id']}")
    if citation.get('positions'):
        print(f" Positions: {citation['positions']}")
    """Extract citation information from an AgentRunResponse."""
    citations: list[dict[str, Any]] = []

    if hasattr(response, "messages") and response.messages:
You can skip a bunch of these checks (it's a sample, so we can be less strict on typing to improve readability) and invert the logic for the rest:
    - if hasattr(response, "messages") and response.messages:
    + from itertools import chain
    + if not response.messages:
    +     return citations
    + for content in chain.from_iterable(message.contents for message in response.messages):
    +     if not content.annotations:
    +         continue
    +     # parse the annotations
Also why not just return the citations instead of this dict?
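Taken together, the two review points (early returns instead of nested checks, and returning the citations directly) might look like the sketch below. It is not the PR's implementation: the annotation attribute names (`url`, `title`, `file_id`) are assumptions for illustration, `extract_citations` is a hypothetical name, and `SimpleNamespace` objects stand in for a real agent response so the example runs on its own.

```python
from itertools import chain
from types import SimpleNamespace
from typing import Any


def extract_citations(response: Any) -> list[dict[str, Any]]:
    """Collect one dict per annotation found in the response's message contents."""
    citations: list[dict[str, Any]] = []
    if not response.messages:
        return citations
    # Flatten all contents across all messages into a single iterable.
    for content in chain.from_iterable(message.contents for message in response.messages):
        if not content.annotations:
            continue
        for annotation in content.annotations:
            # Attribute names are assumed; getattr tolerates ones that are absent.
            citations.append(
                {
                    "url": getattr(annotation, "url", None),
                    "title": getattr(annotation, "title", None),
                    "file_id": getattr(annotation, "file_id", None),
                }
            )
    return citations


# Stand-in objects that mimic the assumed response shape.
annotation = SimpleNamespace(url="https://example.com/doc", title="Winter hotels", file_id=None)
response = SimpleNamespace(
    messages=[
        SimpleNamespace(contents=[SimpleNamespace(annotations=[annotation])]),
        SimpleNamespace(contents=[SimpleNamespace(annotations=None)]),
    ]
)
print(extract_citations(response))
```

The guard clauses keep the main loop at one level of nesting, and the function returns the list itself rather than wrapping it in another dict.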
Is this still current @giles17, or can we close?

@eavanvalkenburg yes, I'll close it
Motivation and Context
Adds a citation extraction method to the Azure AI Search sample.