Maia 200 AI inference accelerator unveiled by Microsoft

Advanced silicon design, high-bandwidth networking, and optimised memory systems enable scalable, cost-efficient AI inference at global cloud scale.

Microsoft has launched Maia 200, a new AI inference accelerator designed to deliver higher performance, improved efficiency, and faster deployment of large-scale generative models.

Built on TSMC’s 3-nanometre process, the chip improves speed, power efficiency, and memory throughput for advanced AI models.

The new accelerator will power Microsoft’s cloud infrastructure across Azure, Microsoft Foundry, and Microsoft 365 Copilot, including workloads for OpenAI’s latest GPT-5.2 models.

Internal teams will use Maia 200 for synthetic data generation and reinforcement learning, accelerating AI development. Maia 200 is being rolled out in Microsoft’s US Central data centre region, with further deployments planned across additional global locations.

A preview version of the Maia software development kit is also being released, offering developers access to PyTorch integration, optimised compilers, and low-level programming tools to fine-tune AI models across heterogeneous computing environments.
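The announcement does not detail the SDK’s actual API, but the sketch below illustrates what PyTorch integration with an accelerator SDK typically looks like in practice. The “maia” device name and the compile step are assumptions for illustration only, not confirmed Maia SDK calls.

```python
# Hypothetical sketch of PyTorch integration with an accelerator SDK.
# The "maia" device name is a placeholder, not the actual Maia SDK API.
import torch
import torch.nn as nn

# A small model standing in for a real inference workload.
model = nn.Sequential(
    nn.Linear(1024, 4096),
    nn.GELU(),
    nn.Linear(4096, 1024),
).eval()

# Vendor SDKs usually register a custom PyTorch device; fall back to CPU
# here so the example still runs without the (hypothetical) backend.
device = torch.device("maia" if hasattr(torch, "maia") else "cpu")
model = model.to(device)

# torch.compile lets an SDK plug its optimising compiler in as a backend;
# the default backend is used here since the Maia backend name is unknown.
compiled = torch.compile(model)

with torch.inference_mode():
    x = torch.randn(8, 1024, device=device)
    print(compiled(x).shape)
```

In a real deployment, the SDK’s compiler and low-level tools would replace the generic compile step, letting developers tune kernels and memory layout for the accelerator.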

The system introduces a redesigned networking and memory architecture optimised for high-bandwidth data movement and large-scale inference clusters.

Microsoft says the platform delivers significant improvements in performance per dollar, scalability, and power efficiency, positioning Maia 200 as a cornerstone of its long-term AI infrastructure strategy.
