docs/rpc.md
# Building and Using the RPC Server with `stable-diffusion.cpp`
This guide covers how to build a version of [the RPC server from `llama.cpp`](https://github.com/ggml-org/llama.cpp/blob/master/tools/rpc/README.md) that is compatible with your version of `stable-diffusion.cpp` to manage multi-backend setups. RPC allows you to offload specific model components to a remote server.
> **Note on Model Location:** The model files (e.g., `.safetensors` or `.gguf`) remain on the **Client** machine. The client parses the file and transmits the necessary tensor data and computational graphs to the server. The server does not need to store the model files locally.
```bash
cmake .. # (configuration flags are elided in this diff; see the note below)
cmake --build . --config Release -j $(nproc)
```
> **Note:** Ensure you add the other flags you would normally use (e.g., `-DSD_VULKAN=ON`, `-DSD_CUDA=ON`, `-DSD_HIPBLAS=ON`, or `-DGGML_METAL=ON`). For more information about building `stable-diffusion.cpp` from source, please refer to the [docs/build.md](./build.md) documentation.
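As an illustration, a configure step for a Vulkan build might look like the following (a minimal sketch using one of the flags listed above; combine whichever flags apply to your hardware):

```bash
# Illustrative only: configure a Vulkan build of stable-diffusion.cpp
cmake .. -DSD_VULKAN=ON
cmake --build . --config Release -j $(nproc)
```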
To save on download time and storage, you can use a shallow clone to download only the target commit:

```bash
mkdir -p llama.cpp
cd llama.cpp
# The remaining steps are cut off in this diff; a shallow fetch of a pinned
# commit typically looks like this (replace <commit-sha> with the target commit):
git init
git remote add origin https://github.com/ggml-org/llama.cpp.git
git fetch --depth 1 origin <commit-sha>
git checkout FETCH_HEAD
```
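The detour through `git init` and `git fetch` is needed because `git clone --depth 1` can only target a branch or tag, not an arbitrary commit; fetching the commit by SHA sidesteps that limitation.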
The RPC server acts as the worker. You must explicitly enable the **backend** (the hardware interface, such as CUDA for NVIDIA, Metal for Apple Silicon, or Vulkan) when building; otherwise, the server will default to using only the CPU.
To find the correct flags, refer to the official [build documentation for the `llama.cpp` repository](https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md).
> **Crucial:** You must include the compiler flag required for API compatibility with `stable-diffusion.cpp` (`-DGGML_MAX_NAME=128`). Without this flag, `GGML_MAX_NAME` will default to `64` for the server, and data transfers between the client and server will fail. Of course, `-DGGML_RPC` must also be enabled.
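Taken together, a server build for an NVIDIA machine could look like the sketch below (illustrative; `-DGGML_CUDA=ON` is `llama.cpp`'s standard CUDA flag, but check the build documentation linked above for your backend and version):

```bash
# Illustrative llama.cpp configure: enable RPC, the CUDA backend, and the
# GGML_MAX_NAME value expected by stable-diffusion.cpp
cmake .. -DGGML_RPC=ON -DGGML_CUDA=ON -DGGML_MAX_NAME=128
cmake --build . --config Release -j $(nproc) --target rpc-server
```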
Example: A main machine (192.168.1.10) with 3 GPUs, with one GPU running CUDA and …
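The rest of this example is cut off in the diff, but a setup along those lines could be sketched as follows (hypothetical: the ports and binary paths are arbitrary, `-H`/`-p` are the `rpc-server` host/port options, and the client-side `--rpc` option is assumed to mirror `llama.cpp`'s):

```bash
# On the main machine (192.168.1.10): one rpc-server per backend/GPU,
# each listening on its own port. CUDA_VISIBLE_DEVICES pins the CUDA
# worker to GPU 0.
CUDA_VISIBLE_DEVICES=0 ./llama.cpp-cuda/build/bin/rpc-server -H 0.0.0.0 -p 50052
./llama.cpp-vulkan/build/bin/rpc-server -H 0.0.0.0 -p 50053

# On the client machine: point stable-diffusion.cpp at the remote workers.
./build/bin/sd -m model.safetensors -p "a lovely cat" \
  --rpc 192.168.1.10:50052,192.168.1.10:50053
```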