Trainium2 power surges as AWS’s Project Rainier enters service for Anthropic

AWS built an EC2 UltraCluster of Trainium2 UltraServers so Anthropic can iterate frontier models faster at lower latency.

Anthropic begins training Claude on AWS Project Rainier, tapping nearly half a million Trainium2 chips across multiple data centres.

Anthropic and AWS switched on Project Rainier, a vast Trainium2 cluster spanning multiple US sites to accelerate Claude’s evolution.

Project Rainier is now fully operational, less than a year after its announcement. AWS engineered an EC2 UltraCluster of Trainium2 UltraServers to deliver massive training capacity. Anthropic says it offers more than five times the compute used for prior Claude models.

UltraServers bind four Trainium2 servers with high-speed NeuronLinks so 64 chips act as one. Tens of thousands of networks are connected through Elastic Fabric Adapter across buildings. The design reduces latency within racks while preserving flexible scale across data centres.

Anthropic is already training and serving Claude on Rainier across the US and plans to exceed one million Trainium2 chips by year’s end. More computing should raise model accuracy, speed evaluations, and shorten iteration cycles for new frontier releases.

AWS controls the stack from chip to data centre for reliability and efficiency. Teams tune power delivery, cooling, and software orchestration. New sites add water-wise cooling, contributing to the company’s renewable energy and net-zero goals.

Would you like to learn more about AI, tech, and digital diplomacy? If so, ask our Diplo chatbot!