From 9ed41897a1d2f06d302176132fcadee30207bb23 Mon Sep 17 00:00:00 2001 From: Mattt Zmuda Date: Wed, 11 Feb 2026 06:27:59 -0800 Subject: [PATCH 1/4] Restructure usage section of README --- README.md | 143 +++++++++++++++++++++++++++++++++++++++++++----------- 1 file changed, 115 insertions(+), 28 deletions(-) diff --git a/README.md b/README.md index efb9097..77f6caf 100644 --- a/README.md +++ b/README.md @@ -287,26 +287,14 @@ see OWASP's ## Usage -### Apple Foundation Models - -Uses Apple's [system language model](https://developer.apple.com/documentation/FoundationModels) -(requires macOS 26 / iOS 26 / visionOS 26 or later). - -```swift -let model = SystemLanguageModel.default -let session = LanguageModelSession(model: model) +### Guided Generation -let response = try await session.respond { - Prompt("Explain quantum computing in one sentence") -} -``` - -> [!NOTE] -> Image inputs are not yet supported by Apple Foundation Models. - -`SystemLanguageModel` supports guided generation, +All on-device models — Apple Foundation Models, Core ML, MLX, and llama.cpp — +support guided generation, letting you request strongly typed outputs using `@Generable` and `@Guide` instead of parsing raw strings. +Cloud providers (OpenAI, Open Responses, Anthropic, and Gemini) +also support guided generation. For more details, see [Generating Swift data structures with guided generation](https://developer.apple.com/documentation/foundationmodels/generating-swift-data-structures-with-guided-generation). @@ -323,7 +311,7 @@ struct CatProfile { var profile: String } -let session = LanguageModelSession(model: .default) +let session = LanguageModelSession(model: model) let response = try await session.respond( to: "Generate a cute rescue cat", generating: CatProfile.self @@ -331,6 +319,115 @@ let response = try await session.respond( print(response.content) ``` +> [!NOTE] +> Ollama does not currently support guided generation. + +### Image Inputs + +Many providers support image inputs, +letting you include images alongside text prompts. +Pass images using the `images:` or `image:` parameter on `respond`: + +```swift +let response = try await session.respond( + to: "Describe what you see", + images: [ + .init(url: URL(string: "https://example.com/photo.jpg")!), + .init(url: URL(fileURLWithPath: "/path/to/local.png")) + ] +) +``` + +Image support varies by provider: + +| Provider | Image Inputs | +| ----------------------- | :-------------: | +| Apple Foundation Models | — | +| Core ML | — | +| MLX | model-dependent | +| llama.cpp | — | +| Ollama | model-dependent | +| OpenAI | yes | +| Open Responses | yes | +| Anthropic | yes | +| Google Gemini | yes | + +For MLX and Ollama, +use a vision-capable model +(for example, a VLM or `-vl` variant). + +### Tool Calling + +Tool calling is supported by all providers except llama.cpp. 
+Define tools using the `Tool` protocol and pass them when creating a session: + +```swift +struct WeatherTool: Tool { + let name = "getWeather" + let description = "Retrieve the latest weather information for a city" + + @Generable + struct Arguments { + @Guide(description: "The city to fetch the weather for") + var city: String + } + + func call(arguments: Arguments) async throws -> String { + "The weather in \(arguments.city) is sunny and 72°F / 23°C" + } +} + +let session = LanguageModelSession(model: model, tools: [WeatherTool()]) + +let response = try await session.respond { + Prompt("How's the weather in Cupertino?") +} +print(response.content) +``` + +To observe or control tool execution, assign a delegate on the session: + +```swift +actor ToolExecutionObserver: ToolExecutionDelegate { + func didGenerateToolCalls(_ toolCalls: [Transcript.ToolCall], in session: LanguageModelSession) async { + print("Generated tool calls: \(toolCalls)") + } + + func toolCallDecision( + for toolCall: Transcript.ToolCall, + in session: LanguageModelSession + ) async -> ToolExecutionDecision { + // Return .stop to halt after tool calls, or .provideOutput(...) to bypass execution. + // This is a good place to ask the user for confirmation (for example, in a modal dialog). + .execute + } + + func didExecuteToolCall( + _ toolCall: Transcript.ToolCall, + output: Transcript.ToolOutput, + in session: LanguageModelSession + ) async { + print("Executed tool call: \(toolCall)") + } +} + +session.toolExecutionDelegate = ToolExecutionObserver() +``` + +### Apple Foundation Models + +Uses Apple's [system language model](https://developer.apple.com/documentation/FoundationModels) +(requires macOS 26 / iOS 26 / visionOS 26 or later). + +```swift +let model = SystemLanguageModel.default +let session = LanguageModelSession(model: model) + +let response = try await session.respond { + Prompt("Explain quantum computing in one sentence") +} +``` + ### Core ML Run [Core ML](https://developer.apple.com/documentation/coreml) models @@ -355,9 +452,6 @@ Enable the trait in Package.swift: ) ``` -> [!NOTE] -> Image inputs are not currently supported with `CoreMLLanguageModel`. - ### MLX Run [MLX](https://github.com/ml-explore/mlx-swift) models on Apple Silicon @@ -398,9 +492,6 @@ Enable the trait in Package.swift: ) ``` -> [!NOTE] -> MLX supports guided generation (structured output via `@Generable`). - ### llama.cpp (GGUF) Run GGUF quantized models via [llama.cpp](https://github.com/ggml-org/llama.cpp) @@ -451,10 +542,6 @@ let response = try await session.respond( ) ``` -> [!NOTE] -> Image inputs are not currently supported with `LlamaLanguageModel`. -> Guided generation (structured output via `@Generable`) is supported. 
- ### OpenAI Supports both From 71fc3863079f4fa73062ab2b056be8daf266355d Mon Sep 17 00:00:00 2001 From: Mattt Zmuda Date: Wed, 11 Feb 2026 06:35:13 -0800 Subject: [PATCH 2/4] Create separate Providers section and reorganize subheadings --- README.md | 128 ++++++++++++++++++++++++++++-------------------------- 1 file changed, 67 insertions(+), 61 deletions(-) diff --git a/README.md b/README.md index 77f6caf..d4e5f37 100644 --- a/README.md +++ b/README.md @@ -414,6 +414,8 @@ actor ToolExecutionObserver: ToolExecutionDelegate { session.toolExecutionDelegate = ToolExecutionObserver() ``` +## Providers + ### Apple Foundation Models Uses Apple's [system language model](https://developer.apple.com/documentation/FoundationModels) @@ -430,7 +432,7 @@ let response = try await session.respond { ### Core ML -Run [Core ML](https://developer.apple.com/documentation/coreml) models +Runs [Core ML](https://developer.apple.com/documentation/coreml) models (requires `CoreML` trait): ```swift @@ -454,7 +456,7 @@ Enable the trait in Package.swift: ### MLX -Run [MLX](https://github.com/ml-explore/mlx-swift) models on Apple Silicon +Runs [MLX](https://github.com/ml-explore/mlx-swift) models on Apple Silicon (requires `MLX` trait): ```swift @@ -494,7 +496,7 @@ Enable the trait in Package.swift: ### llama.cpp (GGUF) -Run GGUF quantized models via [llama.cpp](https://github.com/ggml-org/llama.cpp) +Runs GGUF quantized models via [llama.cpp](https://github.com/ggml-org/llama.cpp) (requires `Llama` trait): ```swift @@ -542,6 +544,56 @@ let response = try await session.respond( ) ``` +### Ollama + +Run models locally via Ollama's +[HTTP API](https://github.com/ollama/ollama/blob/main/docs/api.md): + +```swift +// Default: connects to http://localhost:11434 +let model = OllamaLanguageModel(model: "qwen3") // `ollama pull qwen3:8b` + +// Custom endpoint +let model = OllamaLanguageModel( + endpoint: URL(string: "http://remote-server:11434")!, + model: "llama3.2" +) + +let session = LanguageModelSession(model: model) +let response = try await session.respond { + Prompt("Tell me a joke") +} +``` + +For local models, make sure you're using a vision‑capable model +(for example, a `-vl` variant). +You can combine multiple images: + +```swift +let model = OllamaLanguageModel(model: "qwen3-vl") // `ollama pull qwen3-vl:8b` +let session = LanguageModelSession(model: model) +let response = try await session.respond( + to: "Compare these posters and summarize their differences", + images: [ + .init(url: URL(string: "https://example.com/poster1.jpg")!), + .init(url: URL(fileURLWithPath: "/path/to/poster2.jpg")) + ] +) +print(response.content) +``` + +Pass any model-specific parameters using custom generation options: + +```swift +var options = GenerationOptions(temperature: 0.8) +options[custom: OllamaLanguageModel.self] = [ + "seed": .int(42), + "repeat_penalty": .double(1.2), + "num_ctx": .int(4096), + "stop": .array([.string("###")]) +] +``` + ### OpenAI Supports both @@ -599,28 +651,32 @@ options[custom: OpenAILanguageModel.self] = .init( ### Open Responses -Connects to any API that conforms to the [Open Responses](https://www.openresponses.org) specification (e.g. OpenAI, OpenRouter, or other compatible providers). Base URL is required—use your provider’s endpoint: +Connects to any API that conforms to the +[Open Responses](https://www.openresponses.org) specification +(e.g. OpenAI, OpenRouter, or other compatible providers). 
+Base URL is required—use your provider’s endpoint: ```swift -// Example: OpenAI +// Example: OpenRouter (https://openrouter.ai/api/v1/) let model = OpenResponsesLanguageModel( - baseURL: URL(string: "https://api.openai.com/v1/")!, + baseURL: URL(string: "https://openrouter.ai/api/v1/")!, apiKey: ProcessInfo.processInfo.environment["OPEN_RESPONSES_API_KEY"]!, - model: "gpt-4o-mini" + model: "openai/gpt-4o-mini" ) -// Example: OpenRouter (https://openrouter.ai/api/v1/) +// Example: OpenAI let model = OpenResponsesLanguageModel( - baseURL: URL(string: "https://openrouter.ai/api/v1/")!, + baseURL: URL(string: "https://api.openai.com/v1/")!, apiKey: ProcessInfo.processInfo.environment["OPEN_RESPONSES_API_KEY"]!, - model: "openai/gpt-4o-mini" + model: "gpt-4o-mini" ) let session = LanguageModelSession(model: model) let response = try await session.respond(to: "Say hello") ``` -Custom options support Open Responses–specific fields such as `tool_choice` (including `allowed_tools`) and `extraBody`: +Custom options support Open Responses–specific fields, +such as `tool_choice` (including `allowed_tools`) and `extraBody`: ```swift var options = GenerationOptions(temperature: 0.8) @@ -754,56 +810,6 @@ let response = try await session.respond( > [!TIP] > Gemini server tools are not available as client tools (`Tool`) for other models. -### Ollama - -Run models locally via Ollama's -[HTTP API](https://github.com/ollama/ollama/blob/main/docs/api.md): - -```swift -// Default: connects to http://localhost:11434 -let model = OllamaLanguageModel(model: "qwen3") // `ollama pull qwen3:8b` - -// Custom endpoint -let model = OllamaLanguageModel( - endpoint: URL(string: "http://remote-server:11434")!, - model: "llama3.2" -) - -let session = LanguageModelSession(model: model) -let response = try await session.respond { - Prompt("Tell me a joke") -} -``` - -For local models, make sure you're using a vision‑capable model -(for example, a `-vl` variant). -You can combine multiple images: - -```swift -let model = OllamaLanguageModel(model: "qwen3-vl") // `ollama pull qwen3-vl:8b` -let session = LanguageModelSession(model: model) -let response = try await session.respond( - to: "Compare these posters and summarize their differences", - images: [ - .init(url: URL(string: "https://example.com/poster1.jpg")!), - .init(url: URL(fileURLWithPath: "/path/to/poster2.jpg")) - ] -) -print(response.content) -``` - -Pass any model-specific parameters using custom generation options: - -```swift -var options = GenerationOptions(temperature: 0.8) -options[custom: OllamaLanguageModel.self] = [ - "seed": .int(42), - "repeat_penalty": .double(1.2), - "num_ctx": .int(4096), - "stop": .array([.string("###")]) -] -``` - ## Testing Run the test suite to verify everything works correctly: From 730a8530f5b0d34f58657a21cd86dac06280a696 Mon Sep 17 00:00:00 2001 From: Mattt Zmuda Date: Wed, 11 Feb 2026 07:19:02 -0800 Subject: [PATCH 3/4] Remove now outdated note about Ollama not supporting guided generation --- README.md | 3 --- 1 file changed, 3 deletions(-) diff --git a/README.md b/README.md index d4e5f37..091760e 100644 --- a/README.md +++ b/README.md @@ -319,9 +319,6 @@ let response = try await session.respond( print(response.content) ``` -> [!NOTE] -> Ollama does not currently support guided generation. 
- ### Image Inputs Many providers support image inputs, From 0446484a27483233635e31fbc1737ed765f550e2 Mon Sep 17 00:00:00 2001 From: Mattt Zmuda Date: Wed, 11 Feb 2026 07:19:58 -0800 Subject: [PATCH 4/4] Bump version to 0.7.0 --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 091760e..d8de89d 100644 --- a/README.md +++ b/README.md @@ -98,7 +98,7 @@ Add this package to your `Package.swift`: ```swift dependencies: [ - .package(url: "https://github.com/mattt/AnyLanguageModel", from: "0.5.0") + .package(url: "https://github.com/mattt/AnyLanguageModel", from: "0.7.0") ] ``` @@ -126,7 +126,7 @@ To enable specific traits, specify them in your package's dependencies: dependencies: [ .package( url: "https://github.com/mattt/AnyLanguageModel.git", - from: "0.5.0", + from: "0.7.0", traits: ["CoreML", "MLX"] // Enable CoreML and MLX support ) ]
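Patch 1 introduces guided generation as a cross-provider feature and patch 3 drops the note that Ollama lacks it, but the series never shows the two pieces used together. The sketch below combines them, reusing only calls that appear in the hunks above (`@Generable`, `@Guide`, `OllamaLanguageModel(model:)`, `LanguageModelSession(model:)`, and `respond(to:generating:)`); the `TriviaQuestion` type, the `qwen3` model name, and the `import AnyLanguageModel` module name are illustrative assumptions, not part of the patches.

```swift
import AnyLanguageModel  // module name assumed from the package name

// Illustrative: a small @Generable type filled in by a local Ollama model,
// mirroring the CatProfile example from the Guided Generation section.
@Generable
struct TriviaQuestion {
    @Guide(description: "A one-sentence trivia question about space")
    var question: String

    @Guide(description: "The correct answer, kept short")
    var answer: String
}

// Assumes `ollama pull qwen3` has been run and an Ollama server is listening
// on the default endpoint (http://localhost:11434), as in the Ollama section.
let model = OllamaLanguageModel(model: "qwen3")
let session = LanguageModelSession(model: model)

let response = try await session.respond(
    to: "Write one trivia question with its answer",
    generating: TriviaQuestion.self
)
print(response.content)
```

Per the Guided Generation section, the same `respond(to:generating:)` call should work with any of the other providers; only the way `model` is constructed changes.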