Commit f67996d

populate quickstart
Signed-off-by: Avi Deitcher <avi@deitcher.net>
1 parent 8f28c37 commit f67996d

1 file changed: +138 -1 lines

docs/quickstart/index.md

Lines changed: 138 additions & 1 deletion
---
title: Quick Start
sidebar_position: 1
---

Welcome to the Clowder quickstart guide.

## Requirements

What do you need to run Clowder?

Clowder runs on top of Kubernetes, so you will need a Kubernetes cluster. You can use any Kubernetes distribution, such as:

- [Official Kubernetes](https://kubernetes.io/)
- [Minikube](https://minikube.sigs.k8s.io/docs/)
- [Kind](https://kind.sigs.k8s.io/)
- [K3s](https://k3s.io/)
- [OpenShift](https://www.openshift.com/)

You can also use a managed Kubernetes service, such as:

- [Amazon EKS](https://aws.amazon.com/eks/)
- [Google GKE](https://cloud.google.com/kubernetes-engine)
- [Azure AKS](https://azure.microsoft.com/en-us/services/kubernetes-service/)

Because Clowder runs inference on AI models, you will want hardware capable of running your model. That can be a CPU, a GPU, or a dedicated inference processor.

If you want to run on an [0xide sled](https://oxide.computer/), you will need to deploy a Kubernetes cluster to it.
The good folks at Ainekko have created
[a single-binary Kubernetes installer and controller for 0xide sleds](https://github.com/aifoundry-org/oxide-controller).

## Deployment

Once you have your Kubernetes cluster up and running, and credentials to access it,
deploy Clowder to the cluster.

You can deploy Clowder using either `helm` or `kubectl`. The recommended way is to use `helm`, but you can also use `kubectl` directly.

### With helm

The easiest way to install Clowder is with the [helm](https://helm.sh/) package manager:

```sh
helm install clowder oci://docker.io/aifoundryorg/clowder
```
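
To confirm the install succeeded, a quick sketch (assuming the default namespace; the exact pod names depend on the chart version):

```sh
# The "clowder" release should be listed with STATUS "deployed".
helm list

# The Clowder pods should reach STATUS "Running" after a short while.
kubectl get pods
```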

### With kubectl

Alternatively, you can use the Kubernetes CLI directly. For this to work, first clone the repository:

```sh
git clone https://github.com/clowder-dev/clowder.git
cd clowder
```

Then deploy all the parts with a single command:

```sh
kubectl apply -f k8s/
```

## Local Access

The Clowder API is exposed using a single Kubernetes `Service`. To reach it from your local
machine, use `kubectl port-forward`:

```sh
# Start this and leave it running in a separate terminal window.
kubectl port-forward svc/nekko-lb-svc 3090:3090
```

## Use it!

For our examples, we will assume you have forwarded the Clowder API to your local machine on port `3090`, per the above instructions.
If you are running it elsewhere, you will need to adjust the API URL accordingly.

Let's download a model and run inference on it. For access, we need the authentication token for the Clowder API.
Unless configured otherwise, the default token is `nekko-admin-token`. Since we did a default quickstart installation, we can use this token to access the API.

First, download the model. We will use the [SmolLM2-135M-Instruct](https://huggingface.co/unsloth/SmolLM2-135M-Instruct-GGUF) model from Hugging Face.

Let's get a list of physical nodes to decide which node should run our model and its runtime:

```sh
kubectl get nodes
NAME                                 STATUS   ROLES                       AGE   VERSION
ainekko-control-plane-0-10ahb6ro     Ready    control-plane,etcd,master   85m   v1.32.4+k3s1
ainekko-worker-1747986577-r7qql3ie   Ready    <none>                      85m   v1.32.4+k3s1
```

Note that the worker node is called `ainekko-worker-1747986577-r7qql3ie`. We will use this node to deploy our model.
In the future, you will be able to:

* use the Clowder API to get a list of nodes
* select a node based on its labels
* select a node based on its capabilities, such as GPU or CPU
* tell Clowder to automatically select a node for you

For now, we will use the node name directly.

```sh
curl -H "Authorization: Bearer nekko-admin-token" \
  -X POST \
  --data '{"modelUrl": "hf:///unsloth//SmolLM2-135M-Instruct-GGUF/SmolLM2-135M-Instruct-Q4_K_M.gguf", "modelAlias": "smol", "nodeName": "ainekko-worker-1747986577-r7qql3ie", "credentials": "YOUR_HUGGING_FACE_TOKEN"}' \
  -i \
  http://localhost:3090/api/v1/workers
```

This downloads the model and starts a worker runtime pod. Let's check:

```sh
curl -H "Authorization: Bearer nekko-admin-token" http://localhost:3090/api/v1/workers/list
```

This gives the result:

```json
{"count":1,"status":"success","workers":[{"name":"","model_name":"smol","model_alias":"smol"}]}
```

We can now use `http://localhost:3090/v1/chat_completions` as we would any
OpenAI API-compatible chat completions endpoint.
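
For example, a minimal chat request with `curl` (a sketch: the request body follows the OpenAI chat completions format, and `smol` is the model alias we registered above; the exact fields accepted may vary):

```sh
curl -H "Authorization: Bearer nekko-admin-token" \
  -H "Content-Type: application/json" \
  -X POST \
  --data '{"model": "smol", "messages": [{"role": "user", "content": "Say hello!"}]}' \
  http://localhost:3090/v1/chat_completions
```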

### Web UI

By default, the Open WebUI client app is deployed on the cluster.

Expose it locally with:

```sh
# Start this and leave it running in a separate terminal window.
kubectl port-forward svc/open-webui 4080:4080
```

Now we can open the UI at [http://localhost:4080](http://localhost:4080), select the model,
and have a chat.