Merged
30 commits
f6e3294
add StatsModels.jl to readme
ablaom Feb 19, 2025
d067068
Update README.md
ablaom Feb 20, 2025
22d5af4
fix formatting of link in README.md
ablaom Feb 20, 2025
18c154d
Update ROADMAP.md
ablaom Feb 20, 2025
b406b69
minor doc tweak
Mar 14, 2025
da305f5
update actions/cache to julia-actions/cache
Mar 14, 2025
f89288b
fix a bad link
Apr 6, 2025
719b7d7
Merge pull request #52 from JuliaAI/doc-fix
ablaom Apr 6, 2025
283de3f
spelling
ablaom May 13, 2025
920f7a5
add sees_features trait
ablaom Jul 29, 2025
6bed975
add new KindsOfLearner trait; and forgotten tweak to IID contract
ablaom Jul 29, 2025
6db67ea
tweak features/target/weights contracts; dump sees_features trait
ablaom Aug 10, 2025
35ee267
temporarily block out LearnTestAPI docs
ablaom Aug 10, 2025
146d614
roll out the kind_of(learner) additions and changes
ablaom Aug 10, 2025
49d3fe6
fix some whitespace
ablaom Aug 10, 2025
80b3aad
remove fallback for weights
ablaom Aug 21, 2025
8b01f86
doc updates
ablaom Aug 21, 2025
d925e51
more doc updates
ablaom Aug 21, 2025
50ec2fa
fix mistake in table of traits re `kind_of(learner)`
ablaom Aug 21, 2025
df811e3
doc tweak
ablaom Aug 21, 2025
4541243
bump 0.2.0
ablaom Aug 23, 2025
9885d58
doc tweak
ablaom Aug 24, 2025
65041b8
remove out-dated tests
ablaom Aug 24, 2025
5d504d7
fix typo
ablaom Aug 24, 2025
55c550b
add a test
ablaom Aug 24, 2025
67a900d
fix mistake in test
ablaom Aug 24, 2025
5e31043
temporarily remove LearnTestAPI from the docs/Project.toml
ablaom Aug 24, 2025
69d8d3e
add "verbosity" preference and `default_verbosity()`
ablaom Oct 18, 2025
e74f1d0
adjust docstrings/docs to reflect verbosity change
ablaom Oct 18, 2025
77de486
update Anatomy of an Implementation re verbosity
ablaom Oct 18, 2025
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -29,7 +29,7 @@ jobs:
with:
version: ${{ matrix.version }}
arch: ${{ matrix.arch }}
- uses: actions/cache@v1
- uses: julia-actions/cache@v1
env:
cache-name: cache-artifacts
with:
1 change: 1 addition & 0 deletions .gitignore
@@ -8,3 +8,4 @@ sandbox/
/docs/site/
/docs/Manifest.toml
.vscode
LocalPreferences.toml
10 changes: 7 additions & 3 deletions Project.toml
@@ -1,13 +1,17 @@
authors = ["Anthony D. Blaom <anthony.blaom@gmail.com>"]
name = "LearnAPI"
uuid = "92ad9a40-7767-427a-9ee6-6e577f1266cb"
authors = ["Anthony D. Blaom <anthony.blaom@gmail.com>"]
version = "1.0.1"
version = "2.0.0"

[compat]
Preferences = "1.5.0"
julia = "1.10"

[deps]
Preferences = "21216c6a-2e73-6563-6e65-726566657250"

[extras]
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

[targets]
test = ["Test",]
test = ["Test"]
4 changes: 3 additions & 1 deletion README.md
@@ -24,7 +24,7 @@ Here `learner` specifies the configuration the algorithm (the hyperparameters) w
`model` stores learned parameters and any byproducts of algorithm execution.

LearnAPI.jl is mostly method stubs and lots of documentation. It does not provide
meta-algorithms, such as cross-validation or hyperparameter optimization, but does aim to
meta-algorithms, such as cross-validation, hyperparameter optimization, or model composition, but does aim to
support such algorithms.

## Related packages
@@ -37,6 +37,8 @@ support such algorithms.

- [StatisticalMeasures.jl](https://github.com/JuliaAI/StatisticalMeasures.jl): Package providing metrics, compatible with LearnAPI.jl

- [StatsModels.jl](https://github.com/JuliaStats/StatsModels.jl): Provides the R-style formula implementation of data preprocessing handled by [LearnDataFrontEnds.jl](https://github.com/JuliaAI/LearnDataFrontEnds.jl)

### Selected packages providing alternative API's

The following alphabetical list of packages provide public base API's. Some provide
2 changes: 1 addition & 1 deletion ROADMAP.md
@@ -14,7 +14,7 @@
"Common Implementation Patterns". As real-world implementations roll out, we could
increasingly point to those instead, to conserve effort
- [x] regression
- [ ] classification
- [x] classification
- [ ] clustering
- [x] gradient descent
- [x] iterative algorithms
1 change: 0 additions & 1 deletion docs/Project.toml
@@ -2,7 +2,6 @@
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
DocumenterInterLinks = "d12716ef-a0f6-4df4-a9f1-a5a34e75c656"
LearnAPI = "92ad9a40-7767-427a-9ee6-6e577f1266cb"
LearnTestAPI = "3111ed91-c4f2-40e7-bb19-7f6c618409b8"
MLCore = "c2834f40-e789-41da-a90e-33b280584a8c"
ScientificTypesBase = "30f210dd-8aff-4c5f-94ba-8e64358c1161"
Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c"
5 changes: 3 additions & 2 deletions docs/make.jl
@@ -2,12 +2,12 @@ using Documenter
using LearnAPI
using ScientificTypesBase
using DocumenterInterLinks
using LearnTestAPI
# using LearnTestAPI

const REPO = Remotes.GitHub("JuliaAI", "LearnAPI.jl")

makedocs(
modules=[LearnAPI, LearnTestAPI],
modules=[LearnAPI, ], #LearnTestAPI],
format=Documenter.HTML(
prettyurls = true,#get(ENV, "CI", nothing) == "true",
collapselevel = 1,
@@ -18,6 +18,7 @@ makedocs(
"Reference" => [
"Overview" => "reference.md",
"Public Names" => "list_of_public_names.md",
"Kinds of learner" => "kinds_of_learner.md",
"fit/update" => "fit_update.md",
"predict/transform" => "predict_transform.md",
"Kinds of Target Proxy" => "kinds_of_target_proxy.md",
130 changes: 74 additions & 56 deletions docs/src/anatomy_of_an_implementation.md
@@ -1,6 +1,7 @@
# Anatomy of an Implementation

The core LearnAPI.jl pattern looks like this:
LearnAPI.jl supports three core patterns. The default pattern, known as the
[`LearnAPI.Descriminative`](@ref) pattern, looks like this:

```julia
model = fit(learner, data)
@@ -10,38 +10,51 @@ predict(model, newdata)
Here `learner` specifies [hyperparameters](@ref hyperparameters), while `model` stores
learned parameters and any byproducts of algorithm execution.

Variations on this pattern:
[Transformers](@ref) ordinarily implement `transform` instead of `predict`. For more on
`predict` versus `transform`, see [Predict or transform?](@ref)
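
A transformer's workflow then looks like this (a minimal sketch; `MyTransformer` is a hypothetical learner):

```julia
learner = MyTransformer()     # hypothetical transformer with default hyperparameters
model = fit(learner, X)       # learn the transformation from training data
W = transform(model, Xnew)    # apply it to new data
```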

- [Transformers](@ref) ordinarily implement `transform` instead of `predict`. For more on
`predict` versus `transform`, see [Predict or transform?](@ref)
Two other `fit`/`predict`/`transform` patterns supported by LearnAPI.jl are:
[`LearnAPI.Generative`](@ref), which has the form:

- ["Static" (non-generalizing) algorithms](@ref static_algorithms), which includes some
simple transformers and some clustering algorithms, have a `fit` that consumes no
`data`. Instead `predict` or `transform` does the heavy lifting.
```julia
model = fit(learner, data)
predict(model) # a single distribution, for example
```

- In [density estimation](@ref density_estimation), the `newdata` argument in `predict` is
missing.
and [`LearnAPI.Static`](@ref), which looks like this:

```julia
model = fit(learner) # no `data` argument
predict(model, data) # may mutate `model` to record byproducts of computation
```

These are the basic possibilities.
Do not read too much into the names for these patterns, which are formalized [here](@ref kinds_of_learner). Usage may not always correspond to prior associations.
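
A learner's pattern can also be queried programmatically (a sketch, assuming the `kind_of(learner)` trait rolled out in this release returns an instance of one of the three pattern types above):

```julia
# dispatch on the learner's pattern (sketch; `learner` is any LearnAPI.jl learner):
if LearnAPI.kind_of(learner) isa LearnAPI.Static
    model = fit(learner)          # `Static`: `fit` consumes no data
else
    model = fit(learner, data)    # `Descriminative` or `Generative`
end
```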

Elaborating on the core pattern above, this tutorial details an implementation of the
LearnAPI.jl for naive [ridge regression](https://en.wikipedia.org/wiki/Ridge_regression)
with no intercept. The kind of workflow we want to enable has been previewed in [Sample
workflow](@ref). Readers can also refer to the [demonstration](@ref workflow) of the
implementation given later.
Elaborating on the common `Descriminative` pattern above, this tutorial details an
implementation of LearnAPI.jl for naive [ridge
regression](https://en.wikipedia.org/wiki/Ridge_regression) with no intercept. The kind of
workflow we want to enable has been previewed in [Sample workflow](@ref). Readers can also
refer to the [demonstration](@ref workflow) of the implementation given later.

## A basic implementation
!!! tip "Quick Start for new implementations"

See [here](@ref code) for code without explanations.
1. From this tutorial, read at least "[A basic implementation](@ref)" below.
2. Looking over the examples in "[Common Implementation Patterns](@ref patterns)", identify the appropriate core learner pattern above for your algorithm.
3. Implement `fit` (probably following an existing example). Read the [`fit`](@ref) document string to see what else may need to be implemented, paying particular attention to the "New implementations" section.
4. Rinse and repeat with each new method implemented.
5. Identify any additional [learner traits](@ref traits) that have appropriate overloadings; use the [`@trait`](@ref) macro to define these in one block.
6. Ensure your implementation includes the compulsory method [`LearnAPI.learner`](@ref) and compulsory traits [`LearnAPI.constructor`](@ref) and [`LearnAPI.functions`](@ref). Read and apply "[Testing your implementation](@ref)".

We suppose our algorithm's `fit` method consumes data in the form `(X, y)`, where
`X` is a suitable table¹ (the features) and `y` a vector (the target).
If you get stuck, refer back to this tutorial and the [Reference](@ref reference) sections.

!!! important

Implementations wishing to support other data
patterns may need to take additional steps explained under
[Other data patterns](@ref di) below.
## A basic implementation

See [here](@ref code) for code without explanations.

Let us suppose our algorithm's `fit` method is to consume data in the form `(X, y)`, where
`X` is a suitable table¹ (the features, a.k.a. covariates or predictors) and `y` a vector
(the target, a.k.a. labels or response).
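
For example, such training data might look like this (a sketch, using a column table for `X`):

```julia
X = (height = [1.85, 1.67, 1.50], weight = [80.0, 65.0, 52.0])  # a Tables.jl-compatible table
y = [23.0, 25.5, 12.0]                                          # the target vector
data = (X, y)
```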

The first line below imports the lightweight package LearnAPI.jl whose methods we will be
extending. The second imports libraries needed for the core algorithm.
@@ -110,7 +124,7 @@ Note that we also include `learner` in the struct, for it must be possible to re
The implementation of `fit` looks like this:

```@example anatomy
function LearnAPI.fit(learner::Ridge, data; verbosity=1)
function LearnAPI.fit(learner::Ridge, data; verbosity=LearnAPI.default_verbosity())
X, y = data

# data preprocessing:
@@ -158,6 +172,22 @@ If the kind of proxy is omitted, as in `predict(model, Xnew)`, then a fallback g
first element of the tuple returned by [`LearnAPI.kinds_of_proxy(learner)`](@ref), which
we overload appropriately below.
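
In other words (a sketch, assuming `Point()` is the first element of that tuple, as in our ridge example):

```julia
predict(model, Point(), Xnew)  # explicitly specify the kind of proxy
predict(model, Xnew)           # equivalent, by the fallback
```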

### Data deconstructors: `target` and `features`

LearnAPI.jl is flexible about the form of training `data`. However, to buy into
meta-functionality, such as cross-validation, we'll need to say something about the
structure of this data. We implement [`LearnAPI.target`](@ref) to say what
part of the data constitutes a [target variable](@ref proxy), and
[`LearnAPI.features`](@ref) to say what the features are (i.e., valid `newdata` in a
`predict(model, newdata)` call):

```@example anatomy
LearnAPI.target(learner::Ridge, (X, y)) = y
LearnAPI.features(learner::Ridge, (X, y)) = X
```

Another data deconstructor, for learners that support per-observation weights in training,
is [`LearnAPI.weights`](@ref).
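
For example, a learner training on `(X, y, w)` triples might declare (a sketch; `MyWeightedLearner` is hypothetical):

```julia
LearnAPI.target(learner::MyWeightedLearner, (X, y, w)) = y
LearnAPI.features(learner::MyWeightedLearner, (X, y, w)) = X
LearnAPI.weights(learner::MyWeightedLearner, (X, y, w)) = w   # per-observation weights
```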

### [Accessor functions](@id af)

@@ -241,15 +271,11 @@ the *type* of the argument.
### The `functions` trait

The last trait, `functions`, above returns a list of all LearnAPI.jl methods that can be
meaningfully applied to the learner or associated model, with the exception of traits. You
always include the first five you see here: `fit`, `learner`, `clone` ,`strip`,
`obs`. Here [`clone`](@ref) is a utility function provided by LearnAPI that you never
overload, while [`obs`](@ref) is discussed under [Providing a separate data front
end](@ref) below and is always included because it has a meaningful fallback. The
`features` method, here provided by a fallback, articulates how the features `X` can be
extracted from the training data `(X, y)`. We must also include `target` here to flag our
model as supervised; again the method itself is provided by a fallback valid in the
present case.
meaningfully applied to the learner or the output of `fit` (denoted `model` above), with
the exception of traits. You always include the first five you see here: `fit`, `learner`,
`clone`, `strip`, `obs`. Here [`clone`](@ref) is a utility function provided by LearnAPI
that you never overload, while [`obs`](@ref) is discussed under [Providing a separate data
front end](@ref) below and is always included because it has a meaningful fallback.

See [`LearnAPI.functions`](@ref) for a checklist of what the `functions` trait needs to
return.
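
As a sketch, a declaration for our ridge example might look like this, made with the [`@trait`](@ref) convenience macro (the exact `functions` list depends on what is actually implemented):

```julia
@trait(
    Ridge,
    constructor = Ridge,
    kinds_of_proxy = (Point(),),
    functions = (
        :(LearnAPI.fit),
        :(LearnAPI.learner),
        :(LearnAPI.clone),
        :(LearnAPI.strip),
        :(LearnAPI.obs),
        :(LearnAPI.features),
        :(LearnAPI.target),
        :(LearnAPI.predict),
        :(LearnAPI.coefficients),
    )
)
```
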
@@ -340,11 +366,6 @@ assumptions about data from those made above.
under [Providing a separate data front end](@ref) below; or (ii) overload the trait
[`LearnAPI.data_interface`](@ref) to specify a more relaxed data API.

- Where the form of data consumed by `fit` is different from that consumed by
`predict/transform` (as in classical supervised learning) it may be necessary to
explicitly overload the functions [`LearnAPI.features`](@ref) and (if supervised)
[`LearnAPI.target`](@ref). The same holds if overloading [`obs`](@ref); see below.


## Providing a separate data front end

@@ -414,7 +435,7 @@ The [`obs`](@ref) methods exist to:

!!! important

While many new learner implementations will want to adopt a canned data front end, such as those provided by [LearnDataFrontEnds.jl](https://juliaai.github.io/LearnAPI.jl/dev/), we
While many new learner implementations will want to adopt a canned data front end, such as those provided by [LearnDataFrontEnds.jl](https://juliaai.github.io/LearnDataFrontEnds.jl/dev/), we
focus here on a self-contained implementation of `obs` for the ridge example above, to show
how it works.

@@ -448,14 +469,14 @@ newobservations = MLCore.getobs(observations, test_indices)
predict(model, newobservations)
```

which works for any non-static learner implementing `predict`, no matter how one is
supposed to accesses the individual observations of `data` or `newdata`. See also the
demonstration [below](@ref advanced_demo). Furthermore, fallbacks ensure the above pattern
still works if we choose not to implement a front end at all, which is allowed, if
supported `data` and `newdata` already implement `getobs`/`numobs`.
which works for any [`LearnAPI.Descriminative`](@ref) learner implementing `predict`, no
matter how one is supposed to access the individual observations of `data` or
`newdata`. See also the demonstration [below](@ref advanced_demo). Furthermore, fallbacks
ensure the above pattern still works if we choose not to implement a front end at all,
which is allowed, provided `data` and `newdata` already implement `getobs`/`numobs`.

Here we specifically wrap all the preprocessed data into single object, for which we
introduce a new type:
In the ridge regression example we specifically wrap all the preprocessed data into a
single object, for which we introduce a new type:

```@example anatomy2
struct RidgeFitObs{T,M<:AbstractMatrix{T}}
@@ -476,13 +497,13 @@ function LearnAPI.obs(::Ridge, data)
end
```

We informally refer to the output of `obs` as "observations" (see [The `obs`
contract](@ref) below). The previous core `fit` signature is now replaced with two
We informally refer to the output of `obs` as "observations" (see "[The `obs`
contract](@ref)" below). The previous core `fit` signature is now replaced with two
methods - one to handle "regular" input, and one to handle the pre-processed data
(observations) which appears first below:

```@example anatomy2
function LearnAPI.fit(learner::Ridge, observations::RidgeFitObs; verbosity=1)
function LearnAPI.fit(learner::Ridge, observations::RidgeFitObs; verbosity=LearnAPI.default_verbosity())

lambda = learner.lambda

@@ -545,13 +566,10 @@ LearnAPI.predict(model::RidgeFitted, ::Point, Xnew) =
predict(model, Point(), obs(model, Xnew))
```

### `features` and `target` methods
### Data deconstructors: `features` and `target`

Two methods [`LearnAPI.features`](@ref) and [`LearnAPI.target`](@ref) articulate how
features and target can be extracted from `data` consumed by LearnAPI.jl
methods. Fallbacks provided by LearnAPI.jl sufficed in our basic implementation
above. Here we must explicitly overload them, so that they also handle the output of
`obs(learner, data)`:
These methods must be able to handle any `data` supported by `fit`, which includes the
output of `obs(learner, data)`:

```@example anatomy2
LearnAPI.features(::Ridge, observations::RidgeFitObs) = observations.A
@@ -573,7 +591,7 @@ LearnAPI.target(learner::Ridge, data) = LearnAPI.target(learner, obs(learner, da

Since LearnAPI.jl provides fallbacks for `obs` that simply return the unadulterated data
argument, overloading `obs` is optional. This is provided the data in publicized
`fit`/`predict` signatures already consists only of objects implement the
`fit`/`predict` signatures already consists only of objects implementing the
[`LearnAPI.RandomAccess`](@ref) interface (most tables¹, arrays³, and tuples thereof).
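
For a learner that does not overload `obs`, the fallback behaves like this (a sketch):

```julia
observations = obs(learner, (X, y))   # fallback simply returns (X, y) unchanged
model = fit(learner, observations)    # equivalent to fit(learner, (X, y))
```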

To opt out of supporting the MLCore.jl interface altogether, an implementation must
6 changes: 4 additions & 2 deletions docs/src/common_implementation_patterns.md
@@ -9,8 +9,10 @@ This guide is intended to be consulted after reading [Anatomy of an Implementati
which introduces the main interface objects and terminology.

Although an implementation is defined purely by the methods and traits it implements, many
implementations fall into one (or more) of the following informally understood patterns or
tasks:
implementations fall into one (or more) of the informally understood patterns or tasks
below. While most fall into one of the core `Descriminative`, `Generative` or
`Static` patterns detailed [here](@ref kinds_of_learner), there are exceptions (such as
clustering, which has both `Descriminative` and `Static` variations).

- [Regression](@ref): Supervised learners for continuous targets

17 changes: 12 additions & 5 deletions docs/src/examples.md
@@ -24,7 +24,6 @@
Instantiate a ridge regression learner, with regularization of `lambda`.
"""
Ridge(; lambda=0.1) = Ridge(lambda)
LearnAPI.constructor(::Ridge) = Ridge

# struct for output of `fit`
struct RidgeFitted{T,F}
Expand All @@ -33,7 +32,7 @@ struct RidgeFitted{T,F}
named_coefficients::F
end

function LearnAPI.fit(learner::Ridge, data; verbosity=1)
function LearnAPI.fit(learner::Ridge, data; verbosity=LearnAPI.default_verbosity())
X, y = data

# data preprocessing:
@@ -58,6 +57,10 @@ end
LearnAPI.predict(model::RidgeFitted, ::Point, Xnew) =
Tables.matrix(Xnew)*model.coefficients

# data deconstructors:
LearnAPI.target(learner::Ridge, (X, y)) = y
LearnAPI.features(learner::Ridge, (X, y)) = X

# accessor functions:
LearnAPI.learner(model::RidgeFitted) = model.learner
LearnAPI.coefficients(model::RidgeFitted) = model.named_coefficients
@@ -126,7 +129,11 @@ function LearnAPI.obs(::Ridge, data)
end
LearnAPI.obs(::Ridge, observations::RidgeFitObs) = observations

function LearnAPI.fit(learner::Ridge, observations::RidgeFitObs; verbosity=1)
function LearnAPI.fit(
learner::Ridge,
observations::RidgeFitObs;
verbosity=LearnAPI.default_verbosity(),
)

lambda = learner.lambda

@@ -160,7 +167,7 @@ LearnAPI.predict(model::RidgeFitted, ::Point, observations::AbstractMatrix) =
LearnAPI.predict(model::RidgeFitted, ::Point, Xnew) =
predict(model, Point(), obs(model, Xnew))

# methods to deconstruct training data:
# training data deconstructors:
LearnAPI.features(::Ridge, observations::RidgeFitObs) = observations.A
LearnAPI.target(::Ridge, observations::RidgeFitObs) = observations.y
LearnAPI.features(learner::Ridge, data) = LearnAPI.features(learner, obs(learner, data))
@@ -223,7 +230,7 @@ frontend = FrontEnds.Saffron()
LearnAPI.obs(learner::Ridge, data) = FrontEnds.fitobs(learner, data, frontend)
LearnAPI.obs(model::RidgeFitted, data) = obs(model, data, frontend)

function LearnAPI.fit(learner::Ridge, observations::FrontEnds.Obs; verbosity=1)
function LearnAPI.fit(learner::Ridge, observations::FrontEnds.Obs; verbosity=LearnAPI.default_verbosity())

lambda = learner.lambda
