Commit ba72d46 (parent b352a55)

respond to comments on kimi-k2 user guide

1 file changed: 2 additions & 2 deletions

File tree

tests/end_to_end/tpu/kimi/Run_Kimi.md

```diff
@@ -1,5 +1,5 @@
 <!--
-# Copyright 2023–2025 Google LLC
+# Copyright 2026 Google LLC
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
```
```diff
@@ -49,7 +49,7 @@ python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml \
 
 ## Checkpoint conversion
 To get started, download the model from [HuggingFace](https://huggingface.co/moonshotai/Kimi-K2-Instruct). Kimi K2 uses a trillion-parameter architecture that requires efficient sharding.
-* Run the conversion script to transform the HuggingFace weights into the MaxText-compatible [Orbax](https://orbax.readthedocs.io/en/latest/guides/checkpoint/orbax_checkpoint_101.html) format.
+* Run [convert_deepseek_family_ckpt.py](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/maxtext/checkpoint_conversion/standalone_scripts/convert_deepseek_family_ckpt.py) to convert the checkpoint to the MaxText-compatible [Orbax](https://orbax.readthedocs.io/en/latest/guides/checkpoint/orbax_checkpoint_101.html) format for training and fine-tuning.
 * Note that Kimi K2 utilizes **YaRN** for context window extension to 128k; ensure your configuration reflects these positional embedding settings during conversion for decoding.
 
 ## Fine-tuning
```
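The bullet added in this hunk points at a conversion script; as a rough sketch, an invocation might look like the shell snippet below. The flag names (`--input-dir`, `--output-dir`) and the paths are assumptions for illustration only, not the script's documented interface; check the script's `--help` output before running it.

```shell
# Sketch of a conversion run -- flag names and paths are assumptions,
# not the script's documented interface.
HF_CKPT_DIR=/tmp/kimi-k2-instruct    # weights downloaded from HuggingFace
ORBAX_CKPT_DIR=/tmp/kimi-k2-orbax    # Orbax output consumed by MaxText

CMD="python3 src/maxtext/checkpoint_conversion/standalone_scripts/convert_deepseek_family_ckpt.py \
  --input-dir ${HF_CKPT_DIR} \
  --output-dir ${ORBAX_CKPT_DIR}"

# Print instead of executing: a real run needs the full trillion-parameter
# checkpoint on local disk, which this sketch does not assume.
echo "${CMD}"
```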
