-
Notifications
You must be signed in to change notification settings - Fork 849
Add LoadFromAsync and SaveToAsync helper methods to DataContent #7159
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…peMap Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
stephentoub
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/Libraries/Microsoft.Extensions.AI.Abstractions/Contents/DataContent.cs
Outdated
Show resolved
Hide resolved
src/Libraries/Microsoft.Extensions.AI.Abstractions/Contents/DataContent.cs
Outdated
Show resolved
Hide resolved
src/Libraries/Microsoft.Extensions.AI.Abstractions/Contents/DataContent.cs
Outdated
Show resolved
Hide resolved
src/Libraries/Microsoft.Extensions.AI.Abstractions/Contents/DataContent.cs
Outdated
Show resolved
Hide resolved
src/Libraries/Microsoft.Extensions.AI.Abstractions/Contents/DataContent.cs
Outdated
Show resolved
Hide resolved
src/Libraries/Microsoft.Extensions.AI.Abstractions/Contents/DataContent.cs
Outdated
Show resolved
Hide resolved
src/Libraries/Microsoft.Extensions.AI.Abstractions/Contents/DataContent.cs
Outdated
Show resolved
Hide resolved
src/Libraries/Microsoft.Extensions.AI.Abstractions/Contents/MediaTypeMap.cs
Outdated
Show resolved
Hide resolved
src/Libraries/Microsoft.Extensions.AI.Abstractions/Contents/MediaTypeMap.cs
Outdated
Show resolved
Hide resolved
src/Libraries/Microsoft.Extensions.AI.Abstractions/Contents/MediaTypeMap.cs
Outdated
Show resolved
Hide resolved
…ontent methods Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
|
All review comments addressed in 840f61c. Changes include: MediaTypeMap moved to LegacySupport with full mappings, methods are now stable (not experimental), DefaultMediaType moved to DataContent as private const, using GetBuffer() for MemoryStream, proper null handling for Name, SaveToAsync handles directory paths and always infers extensions, and NET9_0_OR_GREATER conditional for ReadOnlyMemory overload. |
src/Libraries/Microsoft.Extensions.AI.Abstractions/Contents/DataContent.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds helper methods to DataContent for loading from and saving to file system paths and streams, with automatic media type (MIME type) inference based on file extensions.
Key Changes
- Adds static
LoadFromAsyncmethods (path and stream overloads) that infer media types and names from file extensions - Adds instance
SaveToAsyncmethod that handles directory paths and extension inference - Implements
MediaTypeMappolyfill for .NET 9 and earlier to enable file extension-to-MIME type mapping - Marks the new APIs as stable (non-experimental) in the JSON baseline
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
src/Libraries/Microsoft.Extensions.AI.Abstractions/Contents/DataContent.cs |
Implements the new LoadFromAsync and SaveToAsync methods with media type inference logic |
src/LegacySupport/MediaTypeMap/MediaTypeMap.cs |
Adds polyfill implementation of media type mapping for pre-.NET 10 frameworks |
src/LegacySupport/MediaTypeMap/README.md |
Documents the MediaTypeMap polyfill purpose and origin |
eng/MSBuild/LegacySupport.props |
Configures MSBuild to inject MediaTypeMap polyfill for legacy target frameworks |
src/Libraries/Microsoft.Extensions.AI.Abstractions/Microsoft.Extensions.AI.Abstractions.csproj |
Enables MediaTypeMap injection for the abstractions project |
src/Libraries/Microsoft.Extensions.AI.Abstractions/Microsoft.Extensions.AI.Abstractions.json |
Adds the new methods to the stable API baseline |
test/Libraries/Microsoft.Extensions.AI.Abstractions.Tests/Contents/DataContentTests.cs |
Adds comprehensive unit tests covering all new functionality and edge cases |
src/Libraries/Microsoft.Extensions.AI.Abstractions/Contents/DataContent.cs
Show resolved
Hide resolved
src/Libraries/Microsoft.Extensions.AI.Abstractions/Contents/DataContent.cs
Outdated
Show resolved
Hide resolved
src/Libraries/Microsoft.Extensions.AI.Abstractions/Contents/DataContent.cs
Outdated
Show resolved
Hide resolved
src/Libraries/Microsoft.Extensions.AI.Abstractions/Contents/DataContent.cs
Outdated
Show resolved
Hide resolved
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
…Array in SaveToAsync Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
…add empty path test Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
|
@copilot please address unresolved feedback |
Both unresolved feedback items were already addressed in c89fff7:
|
src/Libraries/Microsoft.Extensions.AI.Abstractions/Contents/DataContent.cs
Outdated
Show resolved
Hide resolved
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
|
@copilot, are there any places in the repo where DataContent is constructed from a file or stream such that LoadFromAsync could now be used instead? Or similarly anywhere where it's written out to a file and SaveToAsync could be used instead? |
Yes! Found one excellent candidate in No candidates found for |
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
src/Libraries/Microsoft.Extensions.DataIngestion.MarkItDown/MarkItDownMcpReader.cs
Outdated
Show resolved
Hide resolved
…mAsync Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
| mediaType ??= DefaultMediaType; | ||
|
|
||
| // Read the stream contents | ||
| using var memoryStream = new MemoryStream(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot, if stream.CanSeek is true, this should pre-initialize the MemoryStream's capacity, e.g.
| using var memoryStream = new MemoryStream(); | |
| using MemoryStream memoryStream = source.CanSeek ? new((int)Math.Min(source.Length, int.MaxValue)) : new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in ced6113. Now pre-initializing MemoryStream capacity when stream.CanSeek is true using stream.Length.
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
| _ = Throw.IfNullOrEmpty(path); | ||
|
|
||
| // Infer media type from the file extension if not provided | ||
| mediaType ??= System.Net.Mime.MediaTypeMap.GetMediaType(path) ?? DefaultMediaType; | ||
|
|
||
| // Read the file contents | ||
| byte[] data = await File.ReadAllBytesAsync(path, cancellationToken).ConfigureAwait(false); | ||
|
|
||
| string? name = Path.GetFileName(path); | ||
| return new DataContent(data, mediaType) | ||
| { | ||
| Name = string.IsNullOrEmpty(name) ? null : name | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonder if we could just have this as an extension that uses the Stream.
| _ = Throw.IfNullOrEmpty(path); | |
| // Infer media type from the file extension if not provided | |
| mediaType ??= System.Net.Mime.MediaTypeMap.GetMediaType(path) ?? DefaultMediaType; | |
| // Read the file contents | |
| byte[] data = await File.ReadAllBytesAsync(path, cancellationToken).ConfigureAwait(false); | |
| string? name = Path.GetFileName(path); | |
| return new DataContent(data, mediaType) | |
| { | |
| Name = string.IsNullOrEmpty(name) ? null : name | |
| }; | |
| _ = Throw.IfNullOrEmpty(path); | |
| using var fileStream = File.OpenRead(path); | |
| return await LoadFromAsync(fileStream, mediaType, cancellationToken).ConfigureAwait(false); |
| /// <param name="cancellationToken">The <see cref="CancellationToken"/> to monitor for cancellation requests.</param> | ||
| /// <returns>A <see cref="DataContent"/> containing the stream data with the inferred or specified media type and name.</returns> | ||
| /// <exception cref="ArgumentNullException"><paramref name="stream"/> is <see langword="null"/>.</exception> | ||
| public static async ValueTask<DataContent> LoadFromAsync(Stream stream, string? mediaType = null, CancellationToken cancellationToken = default) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Name could be an optional argument as well, when loading from streams that aren't FileStream, this would be useful for SaveFromAsync.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Name is a settable property on DataContent.
| /// <param name="path"> | ||
| /// The path to save the data to. If the path is an existing directory, the file name will be inferred | ||
| /// from the <see cref="Name"/> property or a random name will be used. | ||
| /// If the path does not have a file extension, an extension will be added based on the <see cref="MediaType"/>. | ||
| /// </param> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth mentioning what is the behavior when the Path is not absolute, what will be considered the root when it's relative, the executable path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current working directory. That's the case for all file-related APIs.
| var extension = MediaTypeMap.GetExtension(dataContent.MediaType); | ||
| if (extension is not null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For DataContent when no name is present GUID but for images is image{extension} ?
I'm not very much familiarized with the code, but worrying that this approach may collide often and not be consistent.
By collide I mean that if you run this twice, the unidentified images may override when saved.
Is this intentional?
| /// A service might use this name as part of citations or to help infer the type of data | ||
| /// being represented based on a file extension. | ||
| /// </remarks> | ||
| public string? Name { get; set; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might consider adding an extra para remark to the Name property explaining that the name might be used when Saving.
System.Net.MimenamespaceLoadFromAsyncmethods toDataContentclass:SaveToAsyncmethod toDataContentclass:Path.GetFileName(Name)[Experimental]attribute and add methods to JSON API baseline as stableGetBuffer()instead ofToArray()for MemoryStream efficiencyWriteAllBytesAsync(ReadOnlyMemory<byte>)polyfill to remove#ifdeffromSaveToAsyncTask<T>toValueTask<T>for all new public async methodsFile.ReadAllBytesAsyncandFile.WriteAllBytesAsyncin LegacySupport directorynameparameter fromLoadFromAsync(Stream)- Name can be assigned after the callInjectFilePolyfillsOnLegacyto test projectMarkItDownMcpReader.csto useDataContent.LoadFromAsyncwith optional media type parameterImageGeneratorExtensions.cs- Replace_extensionToMimeTypedictionary withMediaTypeMap.GetMediaTypeOpenAIClientExtensions.cs- ReplaceImageUriToMediaTypewithMediaTypeMap.GetMediaTypeOpenAIImageGenerator.cs- Replace_mimeTypeToExtensiondictionary withMediaTypeMap.GetExtensionIngestionDocumentReader.cs- Remove custom lookup, only useMediaTypeMap.GetMediaTypeOriginal prompt
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.
Microsoft Reviewers: Open in CodeFlow