title: Quick Start
sidebar_position: 1
---

Welcome to the Clowder quickstart guide.

## Requirements

What do you need to run Clowder?

Clowder runs on top of Kubernetes, so you will need a Kubernetes cluster. You can use any Kubernetes provider, such as:

- [Official Kubernetes](https://kubernetes.io/)
- [Minikube](https://minikube.sigs.k8s.io/docs/)
- [Kind](https://kind.sigs.k8s.io/)
- [K3s](https://k3s.io/)
- [OpenShift](https://www.openshift.com/)

as well as managed Kubernetes services, such as:

- [Amazon EKS](https://aws.amazon.com/eks/)
- [Google GKE](https://cloud.google.com/kubernetes-engine)
- [Azure AKS](https://azure.microsoft.com/en-us/services/kubernetes-service/)

Because Clowder runs inference on AI models, you will also need hardware capable of running your model: a CPU, a GPU, or dedicated inference processors.

If you want to run on an [0xide sled](https://oxide.computer/), you will need to deploy a Kubernetes cluster to it.
The good folks at Ainekko have created
[a single-binary Kubernetes installer and controller for 0xide sleds](https://github.com/aifoundry-org/oxide-controller).

## Deployment

Once you have your Kubernetes cluster up and running, and credentials to access it,
deploy Clowder to the cluster.

You can deploy Clowder using either `helm` or `kubectl`. The recommended way is to use `helm`, but you can also use `kubectl` directly.

### With helm

The easiest way to install Clowder is with the [helm](https://helm.sh/) package manager:

```sh
helm install clowder oci://docker.io/aifoundryorg/clowder
```

### With kubectl

Alternatively, you can use the Kubernetes CLI directly. For this to work, clone
the repository first:

```sh
git clone https://github.com/clowder-dev/clowder.git
cd clowder
```

Then deploy all the parts with a single command:

```sh
kubectl apply -f k8s/
```

## Local Access

The Clowder API is exposed using a single Kubernetes `Service`. If you want it exposed to your local
machine, use `kubectl port-forward`:

```sh
# Start this and leave it running in a separate terminal window.
kubectl port-forward svc/nekko-lb-svc 3090:3090
```

## Use it!

For our examples, we will assume you have forwarded the Clowder API to your local machine on port `3090`, per the instructions above.
If you are running it elsewhere, adjust the API URL accordingly.

Let's download a model and run inference on it. For access, we need the authentication token for the Clowder API.
Unless configured otherwise, the default token is `nekko-admin-token`. Since we did a default quickstart installation, we can use this token to
access the API.

First, download the model. We will use the [SmolLM2-135M-Instruct](https://huggingface.co/unsloth/SmolLM2-135M-Instruct-GGUF) model from Hugging Face.

Let's get a list of physical nodes to decide which node to deploy our model and runtime on:

```sh
kubectl get nodes
NAME                                 STATUS   ROLES                       AGE   VERSION
ainekko-control-plane-0-10ahb6ro     Ready    control-plane,etcd,master   85m   v1.32.4+k3s1
ainekko-worker-1747986577-r7qql3ie   Ready    <none>                      85m   v1.32.4+k3s1
```

Note that the worker node is called `ainekko-worker-1747986577-r7qql3ie`. We will use this node to deploy our model.
In the future, you will be able to:

* use the Clowder API to get a list of nodes
* select a node based on its labels
* select a node based on its capabilities, such as GPU or CPU
* tell Clowder to automatically select a node for you

For now, we will use the node name directly.

```sh
curl -H "Authorization: Bearer nekko-admin-token" \
  -X POST \
  --data '{"modelUrl": "hf:///unsloth//SmolLM2-135M-Instruct-GGUF/SmolLM2-135M-Instruct-Q4_K_M.gguf", "modelAlias": "smol", "nodeName": "ainekko-worker-1747986577-r7qql3ie", "credentials": "YOUR_HUGGING_FACE_TOKEN"}' \
  -i \
  http://localhost:3090/api/v1/workers
```

This call downloads the model and starts a worker runtime pod. Let's check:

```sh
curl -H "Authorization: Bearer nekko-admin-token" http://localhost:3090/api/v1/workers/list
```

This gives the result:

```json
{"count": 1, "status": "success", "workers": [{"name": "", "model_name": "smol", "model_alias": "smol"}]}
```
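
As a quick sanity check, that response can be consumed with any JSON parser. A minimal Python sketch, using the exact payload shown above as a literal (a real script would read it from the API):

```python
# Sketch: parsing the workers-list response from the quickstart output above.
import json

response_body = '{"count": 1, "status": "success", "workers": [{"name": "", "model_name": "smol", "model_alias": "smol"}]}'
response = json.loads(response_body)

# Collect the registered model aliases.
aliases = [w["model_alias"] for w in response["workers"]]
print(response["status"], response["count"], aliases)
# -> success 1 ['smol']
```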

We can now use `http://localhost:3090/v1/chat_completions` as we would any
OpenAI API-compatible chat completions endpoint.
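
A chat request against that endpoint can be sketched in Python as well. The model name `smol` is the alias registered earlier; the request body follows the standard OpenAI chat completions shape, which we assume Clowder accepts since the guide calls the endpoint OpenAI-compatible. Again, the request is only built, not sent.

```python
# Sketch: an OpenAI-style chat completions request for the endpoint above.
# The body shape is assumed from OpenAI compatibility; nothing is sent here.
import json
import urllib.request

body = {
    "model": "smol",  # the modelAlias registered earlier
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Say hello in one sentence."},
    ],
}

req = urllib.request.Request(
    "http://localhost:3090/v1/chat_completions",
    data=json.dumps(body).encode("utf-8"),
    headers={
        "Authorization": "Bearer nekko-admin-token",
        "Content-Type": "application/json",
    },
    method="POST",
)
# To actually run inference: resp = urllib.request.urlopen(req)
print(req.full_url)
```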

### Web UI

By default, the Open WebUI client app is deployed on the cluster.

Expose it locally with:

```sh
# Start this and leave it running in a separate terminal window.
kubectl port-forward svc/open-webui 4080:4080
```

Now we can open the UI at [http://localhost:4080](http://localhost:4080), select the model,
and have a chat.