## Installation

Add this package to your `Package.swift`:

```swift
dependencies: [
    .package(url: "https://github.com/mattt/AnyLanguageModel", from: "0.7.0")
]
```

To enable specific traits, specify them in your package's dependencies:

```swift
dependencies: [
    .package(
        url: "https://github.com/mattt/AnyLanguageModel.git",
        from: "0.7.0",
        traits: ["CoreML", "MLX"] // Enable CoreML and MLX support
    )
]
```

## Usage

### Guided Generation

All providers support guided generation:
the on-device models (Apple Foundation Models, Core ML, MLX, and llama.cpp)
as well as the cloud providers (OpenAI, Open Responses, Anthropic, and Google Gemini).
Guided generation lets you request strongly typed outputs
using `@Generable` and `@Guide`
instead of parsing raw strings.
For more details, see
[Generating Swift data structures with guided generation](https://developer.apple.com/documentation/foundationmodels/generating-swift-data-structures-with-guided-generation).

```swift
@Generable
struct CatProfile {
    // Other fields elided in this excerpt.
    @Guide(description: "A short profile of the cat's personality")
    var profile: String
}

let session = LanguageModelSession(model: model)
let response = try await session.respond(
to: "Generate a cute rescue cat",
generating: CatProfile.self
)
print(response.content)
```

### Image Inputs

Many providers support image inputs,
letting you include images alongside text prompts.
Pass images using the `images:` or `image:` parameter on `respond`:

```swift
let response = try await session.respond(
to: "Describe what you see",
images: [
.init(url: URL(string: "https://example.com/photo.jpg")!),
.init(url: URL(fileURLWithPath: "/path/to/local.png"))
]
)
```

Image support varies by provider:

| Provider | Image Inputs |
| ----------------------- | :-------------: |
| Apple Foundation Models | — |
| Core ML | — |
| MLX | model-dependent |
| llama.cpp | — |
| Ollama | model-dependent |
| OpenAI | yes |
| Open Responses | yes |
| Anthropic | yes |
| Google Gemini | yes |

For MLX and Ollama,
use a vision-capable model
(for example, a VLM or `-vl` variant).

### Tool Calling

Tool calling is supported by all providers except llama.cpp.
Define tools using the `Tool` protocol and pass them when creating a session:

```swift
struct WeatherTool: Tool {
let name = "getWeather"
let description = "Retrieve the latest weather information for a city"

@Generable
struct Arguments {
@Guide(description: "The city to fetch the weather for")
var city: String
}

func call(arguments: Arguments) async throws -> String {
"The weather in \(arguments.city) is sunny and 72°F / 23°C"
}
}

let session = LanguageModelSession(model: model, tools: [WeatherTool()])

let response = try await session.respond {
Prompt("How's the weather in Cupertino?")
}
print(response.content)
```

To observe or control tool execution, assign a delegate on the session:

```swift
actor ToolExecutionObserver: ToolExecutionDelegate {
func didGenerateToolCalls(_ toolCalls: [Transcript.ToolCall], in session: LanguageModelSession) async {
print("Generated tool calls: \(toolCalls)")
}

func toolCallDecision(
for toolCall: Transcript.ToolCall,
in session: LanguageModelSession
) async -> ToolExecutionDecision {
// Return .stop to halt after tool calls, or .provideOutput(...) to bypass execution.
// This is a good place to ask the user for confirmation (for example, in a modal dialog).
.execute
}

func didExecuteToolCall(
_ toolCall: Transcript.ToolCall,
output: Transcript.ToolOutput,
in session: LanguageModelSession
) async {
print("Executed tool call: \(toolCall)")
}
}

session.toolExecutionDelegate = ToolExecutionObserver()
```

## Providers

### Apple Foundation Models

Uses Apple's [system language model](https://developer.apple.com/documentation/FoundationModels)
(requires macOS 26 / iOS 26 / visionOS 26 or later).

```swift
let model = SystemLanguageModel.default
let session = LanguageModelSession(model: model)

let response = try await session.respond {
Prompt("Explain quantum computing in one sentence")
}
```
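
Before creating a session, you can check whether the system model is available. A sketch using Apple's Foundation Models availability API (assumed to be surfaced unchanged by this package):

```swift
let model = SystemLanguageModel.default

switch model.availability {
case .available:
    break // ready to use
case .unavailable(let reason):
    print("Model unavailable: \(reason)")
}
```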

### Core ML

Runs [Core ML](https://developer.apple.com/documentation/coreml) models
(requires `CoreML` trait):

```swift
// A sketch; the exact initializer may differ from the released API.
let modelURL = URL(fileURLWithPath: "/path/to/Model.mlmodelc") // compiled Core ML model
let model = try await CoreMLLanguageModel(url: modelURL)
let session = LanguageModelSession(model: model)

let response = try await session.respond {
    Prompt("Explain quantum computing in one sentence")
}
```

Enable the trait in `Package.swift`:

```swift
.package(
    url: "https://github.com/mattt/AnyLanguageModel.git",
    from: "0.7.0",
    traits: ["CoreML"]
)
```

### MLX

Runs [MLX](https://github.com/ml-explore/mlx-swift) models on Apple Silicon
(requires `MLX` trait):

```swift
// A sketch; the exact model type and initializer may differ.
let model = MLXLanguageModel(modelId: "mlx-community/Qwen3-4B-4bit")
let session = LanguageModelSession(model: model)

let response = try await session.respond {
    Prompt("Explain quantum computing in one sentence")
}
```

Enable the trait in `Package.swift`:

```swift
.package(
    url: "https://github.com/mattt/AnyLanguageModel.git",
    from: "0.7.0",
    traits: ["MLX"]
)
```

### llama.cpp (GGUF)

Runs GGUF quantized models via [llama.cpp](https://github.com/ggml-org/llama.cpp)
(requires `Llama` trait):

```swift
// A sketch; the exact initializer may differ from the released API.
let model = LlamaLanguageModel(modelPath: "/path/to/model.gguf")
let session = LanguageModelSession(model: model)

let response = try await session.respond(
    to: "Explain quantum computing in one sentence"
)
```

### Ollama

Runs models locally via Ollama's
[HTTP API](https://github.com/ollama/ollama/blob/main/docs/api.md):

```swift
// Default: connects to http://localhost:11434
let model = OllamaLanguageModel(model: "qwen3") // `ollama pull qwen3:8b`

// Custom endpoint
let model = OllamaLanguageModel(
endpoint: URL(string: "http://remote-server:11434")!,
model: "llama3.2"
)

let session = LanguageModelSession(model: model)
let response = try await session.respond {
Prompt("Tell me a joke")
}
```

For image inputs, make sure you're using a vision-capable model
(for example, a `-vl` variant).
You can combine multiple images:

```swift
let model = OllamaLanguageModel(model: "qwen3-vl") // `ollama pull qwen3-vl:8b`
let session = LanguageModelSession(model: model)
let response = try await session.respond(
to: "Compare these posters and summarize their differences",
images: [
.init(url: URL(string: "https://example.com/poster1.jpg")!),
.init(url: URL(fileURLWithPath: "/path/to/poster2.jpg"))
]
)
print(response.content)
```

Pass any model-specific parameters using custom generation options:

```swift
var options = GenerationOptions(temperature: 0.8)
options[custom: OllamaLanguageModel.self] = [
"seed": .int(42),
"repeat_penalty": .double(1.2),
"num_ctx": .int(4096),
"stop": .array([.string("###")])
]
```
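
Options are then passed per request, assuming `respond` takes an `options:` parameter as in Apple's Foundation Models API:

```swift
let response = try await session.respond(
    to: "Tell me a joke",
    options: options
)
```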

### OpenAI

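Connects to the OpenAI API. A minimal sketch (the initializer is assumed to mirror the package's other cloud providers):

```swift
// Assumed API shape; check the package documentation for the exact initializer.
let model = OpenAILanguageModel(
    apiKey: ProcessInfo.processInfo.environment["OPENAI_API_KEY"]!,
    model: "gpt-4o-mini"
)
let session = LanguageModelSession(model: model)
let response = try await session.respond(to: "Say hello")
```

Provider-specific parameters can be set with custom generation options via `options[custom: OpenAILanguageModel.self]`.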

### Open Responses

Connects to any API that conforms to the
[Open Responses](https://www.openresponses.org) specification
(e.g. OpenAI, OpenRouter, or other compatible providers).
Base URL is required—use your provider’s endpoint:

```swift
// Example: OpenRouter (https://openrouter.ai/api/v1/)
let model = OpenResponsesLanguageModel(
    baseURL: URL(string: "https://openrouter.ai/api/v1/")!,
    apiKey: ProcessInfo.processInfo.environment["OPEN_RESPONSES_API_KEY"]!,
    model: "openai/gpt-4o-mini"
)

// Example: OpenAI
let model = OpenResponsesLanguageModel(
    baseURL: URL(string: "https://api.openai.com/v1/")!,
    apiKey: ProcessInfo.processInfo.environment["OPEN_RESPONSES_API_KEY"]!,
    model: "gpt-4o-mini"
)

let session = LanguageModelSession(model: model)
let response = try await session.respond(to: "Say hello")
```

Custom options support Open Responses–specific fields,
such as `tool_choice` (including `allowed_tools`) and `extraBody`:

```swift
var options = GenerationOptions(temperature: 0.8)
options[custom: OpenResponsesLanguageModel.self] = .init(
    // Open Responses–specific fields such as tool_choice and extraBody
    // go here; the exact field names in this sketch are assumptions.
)
```
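
### Anthropic

Connects to the Anthropic API. A minimal sketch (the model type and initializer are assumed to follow this package's conventions):

```swift
// Assumed API shape, mirroring the package's other cloud providers.
let model = AnthropicLanguageModel(
    apiKey: ProcessInfo.processInfo.environment["ANTHROPIC_API_KEY"]!,
    model: "claude-sonnet-4-5"
)
let session = LanguageModelSession(model: model)
let response = try await session.respond(to: "Say hello")
```

### Google Gemini

Connects to the Gemini API, including Gemini's server-side tools. A minimal sketch (same assumptions as above):

```swift
// Assumed API shape, mirroring the package's other cloud providers.
let model = GeminiLanguageModel(
    apiKey: ProcessInfo.processInfo.environment["GEMINI_API_KEY"]!,
    model: "gemini-2.0-flash"
)
let session = LanguageModelSession(model: model)
let response = try await session.respond(to: "Say hello")
```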
> [!TIP]
> Gemini server tools are not available as client tools (`Tool`) for other models.

## Testing

Run the test suite to verify everything works correctly:
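
```bash
swift test
```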