Merged
21 changes: 15 additions & 6 deletions README.md
@@ -5,22 +5,31 @@ The OpenML documentation in written in MarkDown. The sources are generated by [M

The overall structure (navigation) of the docs is configured in the `mkdocs.yml` file.

Some of the API's use other documentation generators, such as [Sphinx](https://restcoder.readthedocs.io/en/latest/sphinx-docgen.html) in openml-python. This documentation is pulled in via iframes to gather all docs into the same place, but they need to be edited in their own GitHub repo's.
The documentation of other APIs is pulled in using the [multirepo plugin](https://github.com/jdoiro3/mkdocs-multirepo-plugin) to gather all docs in one place, but it needs to be edited in the respective GitHub repositories. This allows the documentation to live closer to the code and follow the conventions of the respective community.

## Editing documentation
Documentation can be edited by simply editing the markdown files in the `docs` folder and creating a pull request.

End users can edit the docs by simply clicking the edit button (the pencil icon) on the top of every documentation page. It will open up an editing page on [GitHub](https://github.com/) (you do need to be logged in on GitHub). When you are done, add a small message explaining the change and click 'commit changes'. On the next page, just launch the pull request. We will then review it and approve the changes, or discuss them if necessary.

For other information on how to write and build documentation locally, see our [contributing](./contributing/OpenML-Docs.md#General-Documentation) page.

## Deployment
The documentation is hosted on GitHub pages.

To deploy the documentation, you need to have MkDocs and MkDocs-Material installed, and then run `mkdocs gh-deploy` in the top directory (with the `mkdocs.yml` file). This will build the HTML files and push them to the gh-pages branch of openml/docs. `https://docs.openml.org` is just a reverse proxy for `https://openml.github.io/docs/`.
To deploy the documentation, you need to have MkDocs installed locally, and then run `mkdocs gh-deploy` in the top directory (with the `mkdocs.yml` file). This will build the HTML files and push them to the gh-pages branch of openml/docs. `https://docs.openml.org` is just a reverse proxy for `https://openml.github.io/docs/`.

MkDocs and all required extensions can be installed as follows:
```
pip install -r requirements.txt
```

MKDocs and MkDocs-Material can be installed as follows:
To test the documentation locally, run
```
pip install mkdocs
pip install mkdocs-material
pip install -U fontawesome_markdown
mkdocs serve
```

To deploy to GitHub Pages, run
```
mkdocs gh-deploy
```
33 changes: 22 additions & 11 deletions docs/contributing/OpenML-Docs.md
@@ -1,23 +1,34 @@
## Documentation

Documentation of OpenML consists of the general information pages, such as these, that include common concepts.
Additionally, each software package, such as the Python, Java, and R connectors, has its own documentation.
For convenience, those documentation pages are also available through this common documentation portal.

We always value contributions to our documentation. If you notice any mistake in these documentation pages, click the :material-pencil: button (on the top right). It will open up an editing page on [GitHub](https://github.com/) (you do need to be logged in). When you are done, add a small message explaining the change and click 'commit changes'. On the next page, just launch the pull request. We will then review it and approve the changes, or discuss them if necessary.

Below you can find more information about how each set of documentation pages is built.

## General Documentation
High-quality and up-to-date documentation is crucial. If you notice any mistake in these documentation pages, click the :material-pencil: button (on the top right). It will open up an editing page on [GitHub](https://github.com/) (you do need to be logged in). When you are done, add a small message explaining the change and click 'commit changes'. On the next page, just launch the pull request. We will then review it and approve the changes, or discuss them if necessary.

The sources are generated by [MkDocs](http://www.mkdocs.org/), using the [Material theme](https://squidfunk.github.io/mkdocs-material/).
Check these docs to see what is possible in terms of styling.

OpenML is a big project with multiple repositories. To keep the documentation close to the code, it will always be kept in the relevant repositories (see below), and
OpenML is a big project with multiple repositories.
To keep the documentation close to the code, it will always be kept in the relevant repositories (see below), and
combined into these documentation pages using [MkDocs multirepo](https://github.com/jdoiro3/mkdocs-multirepo-plugin/issues/3).

!!! note "Developer note"
To work on the documentation locally, do the following:
```
git clone https://github.com/openml/docs.git
pip install -r requirements.txt
```
To build the documentation, run `mkdocs serve` in the top directory (with the `mkdocs.yml` file). Any changes made after that will be hot-loaded.
To build the documentation locally, first make sure all dependencies specified in `requirements.txt` are installed:

```bash
python -m venv .venv
source .venv/bin/activate
python -m pip install uv
uv pip install -r requirements.txt
```

The documentation will be auto-deployed with every push or merge with the master branch of `https://www.github.com/openml/docs/`. In the background, a CI job
will run `mkdocs gh-deploy`, which will build the HTML files and push them to the gh-pages branch of openml/docs. `https://docs.openml.org` is just a reverse proxy for `https://openml.github.io/docs/`.
After installing the dependencies, run `mkdocs serve -f mkdocs-local.yml` in the top directory (with the `mkdocs.yml` file). Any changes made after that will be hot-loaded.

To build the full documentation, including importing the documentation from other repositories, run `mkdocs serve` in the top directory (with the `mkdocs.yml` file). This can take a while to compile, so only use this when needed. You might also need to set `export NUMPY_EXPERIMENTAL_DTYPE_API=1` (or `set NUMPY_EXPERIMENTAL_DTYPE_API=1` on Windows).
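
As a rough illustration of the two build modes described above, the following Python sketch assembles the `mkdocs serve` command for either the quick local build or the full multi-repository build. The helper names (`mkdocs_serve_command`, `serve_environment`, `serve`) are hypothetical and not part of this repository. Setting `NUMPY_EXPERIMENTAL_DTYPE_API` for every full build is an assumption of this sketch, since the text only says it might be needed.

```python
import os
import subprocess


def mkdocs_serve_command(full_build=False):
    """Return the mkdocs command for a local (fast) or full build."""
    cmd = ["mkdocs", "serve"]
    if not full_build:
        # The lighter config avoids importing docs from other repositories.
        cmd += ["-f", "mkdocs-local.yml"]
    return cmd


def serve_environment(full_build=False, base_env=None):
    """Return a copy of the environment to run mkdocs in."""
    env = dict(os.environ if base_env is None else base_env)
    if full_build:
        # Assumption: always set the variable for full builds.
        env["NUMPY_EXPERIMENTAL_DTYPE_API"] = "1"
    return env


def serve(full_build=False):
    """Run mkdocs from the top directory (where mkdocs.yml lives)."""
    subprocess.run(
        mkdocs_serve_command(full_build),
        env=serve_environment(full_build),
        check=True,
    )


# Usage (blocks until the server is stopped):
#   serve()                 # quick local preview
#   serve(full_build=True)  # full build, can take a while
```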

## Python API
To edit the tutorial, you have to edit the `reStructuredText` files on [openml-python/doc](https://github.com/openml/openml-python/tree/master/doc). When done, you can do a pull request.
51 changes: 5 additions & 46 deletions docs/index.md
@@ -15,56 +15,15 @@ icon: material/creation
<p><i class="fa fa-graduation-cap fa-fw fa-lg"></i>&nbsp; Make your work more visible and reusable</p>
<p><i class="fa fa-bolt fa-fw fa-lg"></i>&nbsp; Built for automation: streamline your experiments and model building</p>

## Installation
## How to use OpenML

The OpenML package is available in many languages and across libraries. For more information about them, see the [Integrations](./ecosystem/index.md) page.<br><br>
OpenML is accessible to a wide range of people:

=== "Python/sklearn"
:computer: <a href="https://www.openml.org" target="_blank">Explore the OpenML website</a> to discover, download and upload ML resources.

- [Python/sklearn repository](https://github.com/openml/openml-python)
- `pip install openml`
:robot: [Install an OpenML library](intro/index.md) to access and share resources programmatically through our APIs. Select one of the detailed guides in the top menu.

=== "Pytorch"

- [Pytorch repository](https://github.com/openml/openml-pytorch)
- `pip install openml-pytorch`

=== "Keras"

- [Keras repository](https://github.com/openml/openml-keras)
- `pip install openml-keras`

=== "TensorFlow"

- [TensorFlow repository](https://github.com/openml/openml-tensorflow)
- `pip install openml-tensorflow`

=== "R"

- [R repository](https://github.com/openml/openml-R)
- `install.packages("mlr3oml")`
=== "Julia"

- [Julia repository](https://github.com/JuliaAI/OpenML.jl/tree/master)
- `using Pkg;Pkg.add("OpenML")`

=== "RUST"

- [RUST repository](https://github.com/mbillingr/openml-rust)
- Install from source

=== ".Net"

- [.Net repository](https://github.com/openml/openml-dotnet)
- `Install-Package openMl`


You might also need to set up the API key. For more information, see [Authentication](http://localhost:8000/concepts/openness/).

## Learning OpenML

Aside from the individual package documentations, you can learn more about OpenML through the following resources:<br>
The core concepts of OpenML are explained in the [Concepts](./concepts/index.md) page. These concepts include the principle behind using Datasets, Runs, Tasks, Flows, Benchmarking and much more. Going through them will help you leverage OpenML even better in your work.<br>
:mortar_board: [Get started](./concepts/index.md) by learning more about the structure and concepts behind OpenML, such as Datasets, Tasks, Flows, Runs, Benchmarking and much more. This will help you leverage OpenML even better in your work.

## Contributing to OpenML

107 changes: 107 additions & 0 deletions docs/intro/index.md
@@ -0,0 +1,107 @@
---
icon: material/rocket-launch
---

## :computer: Installation

The OpenML package is available in many languages and is deeply integrated into many machine learning libraries.

=== "Python/sklearn"

- [Python/sklearn repository](https://github.com/openml/openml-python)
- `pip install openml`

=== "Pytorch"

- [Pytorch repository](https://github.com/openml/openml-pytorch)
- `pip install openml-pytorch`

=== "TensorFlow"

- [TensorFlow repository](https://github.com/openml/openml-tensorflow)
- `pip install openml-tensorflow`

=== "R"

- [R repository](https://github.com/openml/openml-R)
- `install.packages("mlr3oml")`

=== "Julia"

- [Julia repository](https://github.com/JuliaAI/OpenML.jl/tree/master)
- `using Pkg;Pkg.add("OpenML")`

=== "RUST"

- [RUST repository](https://github.com/mbillingr/openml-rust)
- Install from source

=== ".Net"

- [.Net repository](https://github.com/openml/openml-dotnet)
- `Install-Package openMl`

You can find detailed guides for the different libraries in the top menu.


## :key: Authentication

OpenML is entirely open and you do not need an account to access data (rate limits apply). However, <a href="https://www.openml.org" target="_blank">signing up via the OpenML website</a> is easy and free, and an account is required to upload new resources to OpenML and to manage them online.

API authentication happens via an **API key**, which you can find in your profile after logging in to openml.org.

```python
import openml

openml.config.apikey = "YOUR KEY"
```
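
To avoid hard-coding the key in scripts or notebooks, one option is to read it from an environment variable. The sketch below is only an illustration: the variable name `OPENML_API_KEY` and the helpers `read_api_key` and `configure_openml` are assumptions made for this example, and only `openml.config.apikey` comes from the snippet above.

```python
import os


def read_api_key(env=os.environ, var_name="OPENML_API_KEY"):
    """Return the API key stored in `var_name`, or None if it is missing or empty."""
    key = env.get(var_name, "").strip()
    return key or None


def configure_openml(key):
    """Set the key on the openml package (imported lazily)."""
    import openml

    openml.config.apikey = key


# Usage:
#   key = read_api_key()
#   if key is not None:
#       configure_openml(key)
```

Keeping the key out of the source makes it less likely to end up in version control by accident.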

## :joystick: Minimal Example

:material-database: Use the following code to load the [credit-g](https://www.openml.org/search?type=data&sort=runs&status=active&id=31) [dataset](https://docs.openml.org/concepts/data/) directly into a pandas dataframe. Note that OpenML can automatically load all datasets, separate data X and labels y, and give you useful dataset metadata (e.g. feature names and which ones have categorical data).

```python
import openml

dataset = openml.datasets.get_dataset("credit-g") # or by ID get_dataset(31)
X, y, categorical_indicator, attribute_names = dataset.get_data(target="class")
```


:trophy: Get a [task](https://docs.openml.org/concepts/tasks/) for [supervised classification on credit-g](https://www.openml.org/search?type=task&id=31&source_data.data_id=31).
Tasks specify how a dataset should be used, e.g. including train and test splits.

```python
task = openml.tasks.get_task(31)
dataset = task.get_dataset()
X, y, categorical_indicator, attribute_names = dataset.get_data(target=task.target_name)
# get splits for the first fold of 10-fold cross-validation
train_indices, test_indices = task.get_train_test_split_indices(fold=0)
```
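
The returned index arrays are positional, so they can be used to slice `X` and `y` into training and test parts. A minimal sketch, assuming `X` and `y` are pandas objects (as returned by `get_data`) or plain sequences; the helpers `select_rows` and `apply_split` are hypothetical names for this example, not part of the OpenML API:

```python
def select_rows(table, indices):
    """Return the rows of `table` at the given positional indices."""
    if hasattr(table, "iloc"):
        # pandas DataFrame or Series: positional indexing
        return table.iloc[indices]
    # Fallback for plain Python sequences
    return [table[i] for i in indices]


def apply_split(X, y, train_indices, test_indices):
    """Split X and y into (X_train, X_test, y_train, y_test)."""
    return (
        select_rows(X, train_indices),
        select_rows(X, test_indices),
        select_rows(y, train_indices),
        select_rows(y, test_indices),
    )


# Usage with the variables from the snippet above:
#   X_train, X_test, y_train, y_test = apply_split(X, y, train_indices, test_indices)
```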

:bar_chart: Use an [OpenML benchmarking suite](https://docs.openml.org/concepts/benchmarking/) to get a curated list of machine-learning tasks:
```python
suite = openml.study.get_suite("amlb-classification-all") # Get a curated list of tasks for classification
for task_id in suite.tasks:
    task = openml.tasks.get_task(task_id)
```
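
Suites can contain many tasks, and each task is fetched individually, so it can help to cap the number of tasks while prototyping. The sketch below shows one way to do that; `iter_tasks` and its `limit` parameter are hypothetical, and the fetch function is passed in so the helper itself does not depend on the OpenML client.

```python
from itertools import islice


def iter_tasks(task_ids, get_task, limit=None):
    """Lazily fetch tasks for the given IDs, stopping after `limit` tasks if set."""
    for task_id in islice(task_ids, limit):
        yield get_task(task_id)


# Usage with the suite from the snippet above:
#   for task in iter_tasks(suite.tasks, openml.tasks.get_task, limit=3):
#       print(task)
```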

:star2: You can now benchmark your models easily across many datasets at once. Training and evaluating a model on a task is called a run:

```python
from sklearn import neighbors

task = openml.tasks.get_task(403)
clf = neighbors.KNeighborsClassifier(n_neighbors=5)
run = openml.runs.run_model_on_task(clf, task)
```

:raised_hands: You can now publish your experiment on OpenML so that others can build on it:

```python
myrun = run.publish()
print(f"kNN on {task.get_dataset().name}: {myrun.openml_url}")
```


## Learning more about OpenML

Next, check out the :rocket: [10 minute tutorial](notebooks/getting_started.ipynb) and the :mortar_board: [short description of OpenML concepts](concepts/index.md).
2 changes: 1 addition & 1 deletion docs/notebooks/getting_started.ipynb
@@ -49,7 +49,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Getting Started\n",
"# OpenML in 10 minutes\n",
"\n",
"This page will guide you through the process of getting started with OpenML. While this page is a good starting point, for more detailed information, please refer to the [integrations section](Scikit-learn/index.md) and the rest of the documentation.\n",
"\n"
13 changes: 10 additions & 3 deletions mkdocs-local.yml
@@ -82,6 +82,12 @@ markdown_extensions:
plugins:
- autorefs
- section-index
- mkdocs-jupyter:
ignore: ['temp_dir/**/*','docs/examples/**/*']
theme: light
remove_tag_config:
remove_input_tags:
- hide_code
- redirects:
redirect_maps:
'APIs.md': 'https://www.openml.org/apis'
@@ -98,9 +104,10 @@ plugins:
- git-committers:
repository: openml/docs
nav:
- OpenML:
- Introduction: index.md
- Getting Started: notebooks/getting_started.ipynb
- OpenML: index.md
- Get Started:
- OpenML: intro/index.md
- 10 Minute Tutorial: notebooks/getting_started.ipynb
- Concepts:
- Main concepts: concepts/index.md
- Data: concepts/data.md
10 changes: 7 additions & 3 deletions mkdocs.yml
@@ -120,6 +120,8 @@ plugins:
docstring_section_style: table
show_docstring_functions: true
docstring_style: numpy
follow_imports: false
show_submodules: false
- gen-files:
scripts:
- scripts/gen_python_ref_pages.py
@@ -131,9 +133,10 @@ plugins:
- git-committers:
repository: openml/docs
nav:
- OpenML:
- Introduction: index.md
- Getting Started: notebooks/getting_started.ipynb
- OpenML: index.md
- Get Started:
- OpenML: intro/index.md
- 10 Minute Tutorial: notebooks/getting_started.ipynb
- Concepts:
- Main concepts: concepts/index.md
- Data: concepts/data.md
@@ -213,6 +216,7 @@ extra_css:
- css/extra.css
extra_javascript:
- js/extra.js
- js/reset_nav.js
exclude_docs: |
scripts/
old/
21 changes: 11 additions & 10 deletions requirements.txt
@@ -5,14 +5,15 @@ mkdocs-redirects==1.2.1
mkdocs-jupyter==0.25.0
mkdocs-awesome-pages-plugin==2.9.3
mkdocs-multirepo-plugin==0.8.3
mkdocs-autorefs
mkdocs-section-index
mkdocs-gen-files
mkdocs-literate-nav
mkdocs-git-committers-plugin-2
mkdocs-git-revision-date-localized-plugin
mkdocstrings
mkdocstrings-python
markdown-include
mkdocs-autorefs==1.2.0
mkdocs-section-index==0.3.9
mkdocs-gen-files==0.5.0
mkdocs-literate-nav==0.6.1
mkdocs-git-committers-plugin-2==2.5.0
mkdocs-git-revision-date-localized-plugin==1.3.0
mkdocstrings==0.26.2
mkdocstrings-python==1.12.1
markdown-include==0.8.1
notebook==6.4.12
tqdm
jupyter_contrib_nbextensions==0.7.0
tqdm