To get started, download the model from [HuggingFace](https://huggingface.co/moonshotai/Kimi-K2-Instruct). Kimi K2 is a trillion-parameter model, so its checkpoint must be sharded efficiently.
* Run [convert_deepseek_family_ckpt.py](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/maxtext/checkpoint_conversion/standalone_scripts/convert_deepseek_family_ckpt.py) to convert the checkpoint into the MaxText-compatible [Orbax](https://orbax.readthedocs.io/en/latest/guides/checkpoint/orbax_checkpoint_101.html) format for training and fine-tuning.
* Note that Kimi K2 uses **YaRN** to extend its context window to 128K tokens; make sure your configuration carries these positional-embedding settings through conversion so that decoding behaves correctly.
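To see why those settings matter, the YaRN adjustment can be sketched as a per-dimension blend of the RoPE inverse frequencies: high-frequency dimensions keep their original values (extrapolation) while low-frequency dimensions are divided by the context-scaling factor (interpolation), with a linear ramp in between. This is a minimal illustrative sketch; the hyperparameter values below (`orig_ctx`, `scale`, `beta_fast`, `beta_slow`) are placeholder assumptions, not Kimi K2's actual configuration — take the real values from the model's `config.json`.

```python
import numpy as np

def yarn_inv_freq(dim, base=10000.0, orig_ctx=4096, scale=32.0,
                  beta_fast=32.0, beta_slow=1.0):
    """Sketch of YaRN-adjusted RoPE inverse frequencies (values are illustrative)."""
    # Base RoPE inverse frequencies for the even dimensions.
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)

    # Dimension index at which `num_rot` full rotations fit inside orig_ctx.
    def dim_for_rotations(num_rot):
        return (dim * np.log(orig_ctx / (num_rot * 2 * np.pi))) / (2 * np.log(base))

    low = max(np.floor(dim_for_rotations(beta_fast)), 0)
    high = min(np.ceil(dim_for_rotations(beta_slow)), dim // 2 - 1)

    # Ramp is 0 for high-frequency dims (keep original freq, i.e. extrapolate)
    # and 1 for low-frequency dims (divide by scale, i.e. interpolate).
    ramp = np.clip((np.arange(dim // 2) - low) / max(high - low, 1e-3), 0.0, 1.0)
    return inv_freq * (1 - ramp) + (inv_freq / scale) * ramp
```

If the conversion drops these settings, decoding falls back to plain RoPE frequencies and long-context quality degrades well before 128K tokens.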