Implement vLLM FSDP LoRA hot-swapping integration #10

Open

jacobthebanana wants to merge 92 commits into master from …
Conversation

This pull request enables vLLM to run in parallel with VectorLM on the same set of GPUs. It also includes an example of LoRA adapter hot-swapping for tracking the behavior of the model during training.
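At a high level, the hot-swapping example amounts to reloading the freshly saved LoRA adapter into a running vLLM engine between training steps. Below is a minimal sketch using vLLM's public LoRA API; the checkpoint paths, prompt, and step cadence are illustrative assumptions, not the PR's actual code.

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Assumed local model path, mirroring the example config mentioned below.
llm = LLM(model="/models/llama-2-7b", enable_lora=True)
sampling_params = SamplingParams(max_tokens=64)

for step in (100, 200, 300):
    # Hypothetical directory where the trainer saved the adapter at `step`.
    adapter_dir = f"/checkpoints/lora/step_{step}"
    # A fresh integer ID prompts vLLM to load the newly saved adapter weights.
    lora_request = LoRARequest(f"step_{step}", step, adapter_dir)
    outputs = llm.generate(
        ["The capital of Canada is"], sampling_params, lora_request=lora_request
    )
    print(step, outputs[0].outputs[0].text)
```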
Added support for non-FSDP models.
trainer: replaced clip_grad_norm_ with nn.utils.clip_grad_norm_ for LoRA compatibility.
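The shape of that change, as a hedged sketch: FSDP-sharded modules must use FSDP's own `clip_grad_norm_` method, while plain (e.g., LoRA/PEFT-wrapped) modules use the generic utility. The dispatch helper and `max_grad_norm` argument below are assumptions about the trainer's internals.

```python
import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def clip_gradients(model: torch.nn.Module, max_grad_norm: float) -> None:
    """Sketch of the clipping dispatch; not the PR's exact code."""
    if isinstance(model, FSDP):
        # FSDP shards parameters across ranks, so its own method is required.
        model.clip_grad_norm_(max_grad_norm)
    else:
        # Non-FSDP modules can use the standard utility.
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
```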
Set model path to local copy of llama-2-7b in example config.
…is method no longer wraps load_model_and_tokenizer) test_modelling: revised base model fixture scope since torch FSDP wrap is in-place. launch_benchmark: added confirmation before launching.
…enchmarking
* added changes to implement low cpu mem usage feature
* implemented new ruff linting changes and ran a fix across files
…s/config.md accordingly.
…ng configs and documentations.
Still need to move barrier logic into _VLLMCallbackWrapper.
Cleanup is required.
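The barrier logic mentioned above keeps the training ranks and the sampling engine in lockstep. A minimal sketch, assuming sampling runs only on rank 0; `generate_fn` is a hypothetical callable wrapping the vLLM sampling step.

```python
import torch.distributed as dist

def sample_with_barriers(generate_fn, rank: int) -> None:
    """Sketch of barrier placement around sampling; not the PR's exact code."""
    # Wait until every rank has finished saving the latest LoRA adapter.
    dist.barrier()
    if rank == 0:
        generate_fn()  # run vLLM sampling with the hot-swapped adapter
    # Resume training only after sampling on rank 0 has completed.
    dist.barrier()
```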
…mize changes required in llama_example.py.
adil-a reviewed on May 28, 2024
docs/config.md (Outdated)

> ### Sampling during Training
>
> To disable sampling during training, delete the entire "sampling" section.
adil-a (Collaborator):

Are we "deleting" the section or just commenting out?
jacobthebanana (Author):

"Comment out" might be sufficient, as it allows the user to easily re-enable the sampling engine as needed.
adil-a reviewed on May 28, 2024

configs/config_gemma.yaml (Outdated)
adil-a (Collaborator):

Is this file required to be a part of the main codebase?
jacobthebanana (Author):

That config file was included by mistake. I will delete it from version control.
adil-a reviewed on May 28, 2024
configs/config_gemma.yaml (Outdated)

```yaml
wandb_config:
  project: vector-lm-verify
  name: benchmark-lora
  # tags: ["20240418-1a-preemption"]
```
adil-a reviewed on May 28, 2024
examples/__init__.py (Outdated)
adil-a (Collaborator):

Don't need an init file in examples. It's not part of the package installation.
jacobthebanana (Author):

Sounds good. I have also added some verification logic to ensure that users are invoking the wrapper correctly.
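A rough illustration of that kind of guard follows; the environment variable and error message are assumptions, not the PR's actual code.

```python
import os

def ensure_launched_via_wrapper() -> None:
    """Hypothetical check that the example was launched through the wrapper."""
    # torchrun (and torch.distributed launchers generally) set WORLD_SIZE;
    # a bare `python` invocation usually does not, suggesting the wrapper
    # was bypassed.
    if "WORLD_SIZE" not in os.environ:
        raise RuntimeError(
            "Launch this example through the vLLM/FSDP wrapper "
            "(e.g., via torchrun), not directly."
        )
```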
adil-a reviewed on May 28, 2024
…d importing vLLM when not required. Ruff formatting fixes.
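The deferred-import pattern that commit refers to might look like the sketch below; the helper name and call site are assumptions. Importing vLLM only when sampling is configured avoids its import cost (and CUDA side effects) for training runs that never sample.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Type checkers see the symbol without importing vLLM at runtime.
    from vllm import LLM

def build_sampling_engine(model_path: str) -> "LLM":
    """Construct the vLLM engine only when sampling is actually enabled."""
    from vllm import LLM  # deferred import
    return LLM(model=model_path, enable_lora=True)
```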