docs/concepts/services/index.html
50 lines changed: 50 additions & 0 deletions
@@ -4748,6 +4748,56 @@ <h3 id="replicas-and-scaling">Replicas and scaling<a class="headerlink" href="#r
<blockquote>
<p>The <code>scaling</code> property requires creating a <a href="../gateways/">gateway</a>.</p>
</blockquote>
+<details class="info">
+<summary>Replica groups</summary>
+<p>A service can include multiple replica groups. Each group can define its own <code>commands</code>, <code>resources</code> requirements, and <code>scaling</code> rules.</p>
+<p>Properties such as <code>regions</code>, <code>port</code>, <code>image</code>, <code>env</code>, and some others cannot yet be configured per replica group. Support for this is coming soon.</p>
+</blockquote>
+</details>
+<details class="info">
+<summary>Disaggregated serving</summary>
+<p>Native support for disaggregated prefill and decode, allowing both worker types to run within a single service, is coming soon.</p>