Tokens-at-scale with Intel’s Crescent Island and Xe architecture

Designed for air-cooled servers, Intel’s Crescent Island aims to cut inference costs while boosting memory capacity and bandwidth.

Intel’s heterogeneous strategy pairs Xeon 6 CPUs with GPUs to match the right silicon to each AI task.

Intel has unveiled its ‘Crescent Island’ data-centre GPU at OCP. The chip targets real-time inference wherever it is deployed, with high memory capacity and energy-efficient performance for agentic AI.

Sachin Katti said that scaling complex inference requires heterogeneous systems and an open, developer-first stack. Intel is positioning its Xe architecture GPUs to deliver efficient headroom as token volumes surge.

Intel’s approach spans AI PCs, the data centre, and the edge, pairing Xeon 6 and GPUs with workload-centric orchestration to simplify deployment, scaling, and developer continuity.

Crescent Island is designed for air-cooled enterprise servers, optimised for power and cost, and tuned for inference with large memory capacity and bandwidth.

Key features include the Xe3P microarchitecture for performance-per-watt gains, 160 GB of LPDDR5X memory, broad data-type support for ‘tokens-as-a-service’, and a unified software stack proven on Arc Pro B-Series. Customer sampling is slated for the second half of 2026.
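
To give a rough sense of why memory capacity and data-type support matter together for inference, the sketch below estimates how much memory a model's weights alone would occupy at a few common precisions and checks the result against the 160 GB figure. The precisions listed (FP16, FP8, INT4) and the model sizes are illustrative assumptions, not a list of formats or models confirmed for Crescent Island, and the estimate ignores KV cache, activations, and runtime overhead.

```python
# Illustrative arithmetic only: weight footprint versus a 160 GB memory budget.
# Bytes-per-parameter values are the standard widths of these formats.
# The formats and model sizes are assumptions for illustration, not Intel figures.

MEMORY_GB = 160  # capacity quoted in the article

BYTES_PER_PARAM = {"FP16": 2.0, "FP8": 1.0, "INT4": 0.5}


def weight_footprint_gb(params_billions: float, dtype: str) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes), weights only."""
    return params_billions * BYTES_PER_PARAM[dtype]


def fits(params_billions: float, dtype: str, memory_gb: float = MEMORY_GB) -> bool:
    """True if the weights alone fit in the given memory budget."""
    return weight_footprint_gb(params_billions, dtype) <= memory_gb


if __name__ == "__main__":
    for size in (8, 70, 120):  # hypothetical model sizes, in billions of parameters
        for dtype in BYTES_PER_PARAM:
            gb = weight_footprint_gb(size, dtype)
            print(f"{size}B @ {dtype}: {gb:.0f} GB, fits={fits(size, dtype)}")
```

Under these assumptions, a hypothetical 120-billion-parameter model would not fit at FP16 (240 GB) but would at FP8 (120 GB), which is one reason narrower data types matter for serving large models from a single card.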
