Microsoft unveils powerful lightweight AI model for CPUs

BitNet b1.58 2B4T beats several rival models on benchmarks and runs twice as fast on some tasks, requiring less memory and no GPU.

Microsoft researchers have introduced the largest 1-bit AI model to date, called BitNet b1.58 2B4T, designed to run efficiently on standard CPUs instead of relying on GPUs. This ‘bitnet’ model, now openly available under the MIT license, can even operate on Apple’s M2 chips.

Bitnets use extreme weight quantisation, representing each weight with just three values: -1, 0, or 1. That works out to roughly 1.58 bits of information per weight (log2 3 ≈ 1.58, hence the model's name), making bitnets far more memory- and compute-efficient than conventional models that store 16 or 32 bits per weight.
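
As a rough illustration of the idea, the sketch below applies absmean-style ternary quantisation of the kind described in the BitNet research to a small weight matrix. The function name, epsilon, and example values here are illustrative, not Microsoft's actual implementation:

```python
import numpy as np

def quantise_ternary(weights: np.ndarray, eps: float = 1e-8):
    """Map real-valued weights to {-1, 0, 1} with a per-tensor scale.

    Absmean scheme: divide by the mean absolute weight, round, clip.
    The originals are approximately recovered as scale * ternary.
    """
    scale = np.abs(weights).mean() + eps           # absmean scaling factor
    ternary = np.clip(np.round(weights / scale), -1, 1)
    return ternary.astype(np.int8), scale

rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=(4, 4)).astype(np.float32)
q, s = quantise_ternary(w)
print(q)        # every entry is -1, 0, or 1
print(s)        # one float32 scale is stored for the whole tensor
```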

Trained on 4 trillion tokens, roughly the equivalent of 33 million books, the 2-billion-parameter BitNet b1.58 2B4T outperforms several similarly sized models on key benchmarks.

Microsoft claims it beats Meta’s Llama 3.2 1B, Google’s Gemma 3 1B, and Alibaba’s Qwen 2.5 1.5B on tasks like grade-school maths and physical reasoning. It also runs up to twice as fast while using significantly less memory, offering a potential edge for lower-end or energy-constrained devices.
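
The memory claim is easy to sanity-check with back-of-the-envelope arithmetic: a ternary weight carries log2(3) ≈ 1.58 bits, against 16 bits for a conventional half-precision weight. The figures below are illustrative lower bounds for weight storage alone, not Microsoft's published measurements:

```python
import math

# Back-of-the-envelope weight-storage estimate for a 2B-parameter model.
# Illustrative only: activations, KV cache, and packing overhead are ignored.
params = 2_000_000_000
bits_fp16 = 16                 # conventional half-precision weight
bits_ternary = math.log2(3)    # ~1.58 bits to encode {-1, 0, 1}

def gib(total_bits: float) -> float:
    return total_bits / 8 / 2**30

print(f"FP16 weights:    {gib(params * bits_fp16):.2f} GiB")     # ~3.73 GiB
print(f"Ternary weights: {gib(params * bits_ternary):.2f} GiB")  # ~0.37 GiB
```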

The main limitation lies in its dependence on Microsoft’s custom bitnet.cpp framework, which supports only select hardware and does not yet work with GPUs.

Rather than slotting into existing AI toolchains, BitNet delivers its performance only on this narrower infrastructure, a hurdle that may limit adoption despite its promise for lightweight AI deployment.
