This repository was archived by the owner on Jul 23, 2025. It is now read-only.

Commit bf14bd7

Add instructions for llama.cpp (#16)

* Add llama.cpp usage instructions
* Quick fix to admonition title syntax

1 parent 6eac9cd commit bf14bd7

2 files changed: 34 additions & 3 deletions

File tree

docs/how-to/use-with-continue.mdx

Lines changed: 33 additions & 2 deletions
````diff
@@ -273,8 +273,39 @@ Replace `YOUR_API_KEY` with your
 </TabItem>
 <TabItem value="llamacpp" label="llama.cpp">
 
-Replace `MODEL_NAME` with the name of a model you have available locally with
-`llama.cpp`, such as `qwen2.5-coder-1.5b-instruct-q5_k_m`.
+:::note Performance
+
+Docker containers on macOS cannot access the GPU, which impacts the performance
+of llama.cpp in CodeGate. For better performance on macOS, we recommend using a
+standalone Ollama installation.
+
+:::
+
+CodeGate has built-in support for llama.cpp. This is considered an advanced
+option, best suited to quick experimentation with various coding models.
+
+To use this provider, download your desired model file in GGUF format from the
+[Hugging Face library](https://huggingface.co/models?library=gguf&sort=trending).
+Then copy it into the `/app/codegate_volume/models` directory in the CodeGate
+container. To persist models between restarts, run CodeGate with a Docker
+volume as shown in the [recommended configuration](./install.md#recommended-settings).
+
+Example using huggingface-cli to download our recommended models for chat (at
+least a 7B model is recommended for best results) and autocomplete (a 1.5B or 3B
+model is recommended for performance):
+
+```bash
+# For chat functions
+huggingface-cli download Qwen/Qwen2.5-7B-Instruct-GGUF qwen2.5-7b-instruct-q5_k_m.gguf --local-dir .
+docker cp qwen2.5-7b-instruct-q5_k_m.gguf codegate:/app/codegate_volume/models/
+
+# For autocomplete functions
+huggingface-cli download Qwen/Qwen2.5-1.5B-Instruct-GGUF qwen2.5-1.5b-instruct-q5_k_m.gguf --local-dir .
+docker cp qwen2.5-1.5b-instruct-q5_k_m.gguf codegate:/app/codegate_volume/models/
+```
+
+In the Continue config file, replace `MODEL_NAME` with the file name without the
+.gguf extension, for example `qwen2.5-7b-instruct-q5_k_m`.
 
 ```json title="~/.continue/config.json"
 {
````
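The final step of the added instructions is wiring the model name into the Continue config. A minimal sketch of what such an entry might look like follows; the `title`, `provider` value, and `apiBase` URL are illustrative assumptions (the diff is truncated before the actual config), only the `model` value comes from the instructions above:

```json
{
  "models": [
    {
      "title": "CodeGate llama.cpp (hypothetical entry)",
      "provider": "llama.cpp",
      "model": "qwen2.5-7b-instruct-q5_k_m",
      "apiBase": "http://localhost:8989/llamacpp"
    }
  ]
}
```

Verify the exact provider name and endpoint against the full `~/.continue/config.json` example in the documentation before using this.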

docs/quickstart-copilot.mdx

Lines changed: 1 addition & 1 deletion
```diff
@@ -52,7 +52,7 @@ browser: [http://localhost:9090](http://localhost:9090)
 To enable CodeGate, you must install its Certificate Authority (CA) into your
 certificate trust store.
 
-:::info[Why is this needed?]
+:::info Why is this needed?
 
 The CA certificate allows CodeGate to securely intercept and modify traffic
 between GitHub Copilot and your IDE. Decrypted traffic never leaves your local
```

0 commit comments
