How to use GitHub
- Please use the 👍 reaction to show that you are interested in the same feature.
- Please don't comment if you have no relevant information to add. It's just extra noise for everyone subscribed to this issue.
- Subscribe to receive notifications on status change and new comments.
Feature request
Microsoft has released BitNet b1.58-2B-4T under the MIT license. This is a 1.58-bit model with 2 billion parameters, natively trained at that precision (instead of being created by quantizing a larger model).
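For context on the "1.58 bit" figure: each weight in these models is ternary, taking one of the three values {-1, 0, +1}, so the information content per weight is log2(3) ≈ 1.58 bits. A quick check:

```python
import math

# A ternary weight has 3 possible values: -1, 0, +1.
# Information content per weight = log2(number of states).
bits_per_weight = math.log2(3)
print(round(bits_per_weight, 2))  # 1.58
```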
The implication is that BitNet is fairly small, yet powerful, and can run solely on a CPU with acceptable inference speed.
Furthermore, Microsoft provides its own inference framework, BitNet.cpp, based on llama.cpp. My understanding is that this would allow a fairly straightforward integration into the existing codebase.