Commit a72e0f2
sdg: Add details on how to configure serving for custom pipelines
The design proposal in #109, and the corresponding implementation in instructlab/sdg#86, raised the importance of clearly defining how a custom pipeline that requires a model with custom adapters would be configured. This document explores that topic. It's possible this should just become a subsection of #109.

Signed-off-by: Russell Bryant <rbryant@redhat.com>
1 file changed: +100 −0
# Serve Config for `data generate` command
`ilab` currently automates model serving under the following conditions:

* `ilab model serve`
* `ilab model chat` without a custom API endpoint and without `ilab model serve` already running.
* `ilab data generate` without a custom API endpoint and without `ilab model serve` already running.
* `ilab model evaluate`
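
As a hypothetical illustration of the endpoint-related conditions above (the `--endpoint-url` option name is an assumption for illustration, not something this document defines), pointing a command at an already-running server is what skips the automated serving:

```shell
# With no custom endpoint configured, ilab serves the teacher model
# itself before generating data.
ilab data generate

# Pointing at an already-running OpenAI-compatible server skips the
# automated serving step.
ilab data generate --endpoint-url http://localhost:8000/v1
```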

As features are added to the `instructlab-sdg` library, the configuration
requirements are growing beyond what is currently available through the `ilab`
CLI's `data generate` command. This document reviews the requirements and makes
a proposal for how to configure `ilab` for the expanded SDG use cases.

## Requirements

In all other cases of automatically serving a model, `ilab` only serves a
single model. We now have a need to serve both a base model and that same
model with custom adapters. This is [supported by
vllm](https://docs.vllm.ai/en/latest/models/lora.html#serving-lora-adapters),
one of our model serving backends for `ilab`.
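
For reference, here is a minimal sketch of what that looks like with vllm's OpenAI-compatible server, following the documentation linked above (the model and adapter paths are placeholders, not values defined by this proposal):

```shell
# Serve a base model plus a LoRA adapter registered under its own model
# ID, per the vllm docs linked above; paths and names are placeholders.
python -m vllm.entrypoints.openai.api_server \
    --model path/to/model_directory \
    --enable-lora \
    --lora-modules my_custom_adapter=path/to/my_custom_adapter
```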

In addition to specifying which LoRA adapter(s) to serve, we must also be able
to configure the model ID used for each adapter in the OpenAI API. There is a
related design [proposing a configuration format for SDG
flows](https://github.com/instructlab/sdg/pull/86). A flow configuration file
will expect one or more model IDs to be accessible, so we need a way to ensure
our serve config matches the flow's expectations.

## Proposal

### Use Case

`ilab data generate` with a custom workflow that requires a custom model
adapter in addition to the base model without an adapter.

### Configuration

First, let's put the custom adapter aside and review how we would configure the
teacher model today.

The `serve:` section of `config.yaml` includes a `model_path` field. The model
path is also used as the model ID to request this model via the OpenAI API.
This same model ID must be set in the `generate.model` configuration option for
it to be used as the default teacher model.

```yaml
serve:
  model_path: "path/to/model_directory"  # both a path and the model ID used in the API
  ...
generate:
  model: "path/to/model_directory"  # the default model ID to request from the API
  ...
```

If we want to serve a model with a custom adapter, we can do so using custom
`vllm_args` in the configuration file.

```yaml
serve:
  model_path: "path/to/model_directory"  # both a path and the model ID used in the API
  backend: "vllm"
  backend_args:
    vllm_args:
      - ...
      - "--lora-modules"
      - "my_custom_adapter=path/to/my_custom_adapter"
      - ...
  ...
generate:
  model: "path/to/model_directory"  # the default model ID to request from the API
  ...
```

In this example, we have added another model ID, `my_custom_adapter`, to the
OpenAI API endpoint served by `vllm`. This model ID can match the expectation
of a custom flow configuration file. Using a potential configuration example
from [an open PR](https://github.com/instructlab/sdg/pull/86), here is how the
expectation of such a model ID could be expressed. Note that the details of
this configuration format are pending the resolution of the [corresponding
design proposal](https://github.com/instructlab/dev-docs/pull/109).

```yaml
version: "1.0"
models:
  - name: myfunkyadaptor
    description: a funky adaptor for generating questions
block_configs:
  - block_type: LLMBlock
    block_config:
      block_name: gen_questions
      config_path: configs/skills/freeform_questions.yaml
      add_num_samples: True
      model: myfunkyadaptor
      output_cols:
        - question
    drop_duplicates:
      - question
```
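
Note that for the two configs to line up in practice, the flow's `model:` value and the ID registered via `--lora-modules` would need to be the same string; the examples above use `my_custom_adapter` on the serve side and `myfunkyadaptor` on the flow side purely as illustrations. As a hypothetical smoke test (the listen address `http://localhost:8000` is an assumption, not part of this proposal), the adapter can then be requested through the standard OpenAI-compatible API:

```shell
# Hypothetical checks against the locally served endpoint; the address
# and model ID come from the serve example above and are assumptions.

# List the model IDs the server exposes; we expect both the base model
# path and "my_custom_adapter" to appear.
curl http://localhost:8000/v1/models

# Request a completion from the adapter-backed model ID.
curl http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "my_custom_adapter", "prompt": "...", "max_tokens": 64}'
```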
