FuriosaAI unveils efficient AI inference chip

RNGD chip showcases FuriosaAI’s rapid innovation, offering a sustainable and accessible AI computing solution.

Disagreements over business strategy led to the breakdown of FuriosaAI’s acquisition talks with Meta, despite an offer reportedly worth $800 million.

FuriosaAI has launched its latest AI inference chip, RNGD, which promises to be a significant accelerator for data centres handling large language models (LLMs) and multimodal model inference. Founded in 2017 by former AMD, Qualcomm, and Samsung engineers, FuriosaAI has rapidly developed cutting-edge technology, culminating in the RNGD chip.

The RNGD chip, developed with the support of TSMC, has demonstrated impressive performance in early tests, particularly with models such as GPT-J and Llama 3.1. The chip’s architecture, featuring a Tensor Contraction Processor (TCP) and 48GB of HBM3 memory, delivers high efficiency and programmability, achieving throughput of 2,000 to 3,000 tokens per second on models with around 10 billion parameters.
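As a rough illustration of why 48GB of HBM3 comfortably serves models in this size class, here is a minimal back-of-envelope sketch. The byte-per-parameter figure and the headroom calculation are illustrative assumptions, not vendor data:

```python
# Back-of-envelope check (illustrative assumptions, not vendor figures):
# does a ~10-billion-parameter model fit in 48GB of HBM3 with room
# left over for KV cache and activations?

def model_weight_gb(params: float, bytes_per_param: int = 2) -> float:
    """Weight memory in GB, assuming FP16/BF16 (2 bytes per parameter)."""
    return params * bytes_per_param / 1e9

params = 10e9   # ~10 billion parameters, the size class cited for RNGD tests
hbm_gb = 48     # RNGD's stated HBM3 capacity

weights_gb = model_weight_gb(params)
print(f"FP16 weights: {weights_gb:.0f} GB")                 # 20 GB
print(f"Headroom for KV cache etc.: {hbm_gb - weights_gb:.0f} GB")  # 28 GB
```

Under these assumptions, roughly 20GB of weights leaves about 28GB for KV cache and runtime buffers on a single card, which is consistent with serving ~10B-parameter models without sharding.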

FuriosaAI’s approach to innovation is evident in its quick development and optimisation cycles. Within weeks of receiving silicon for its first-generation chip in 2021, the company achieved notable results in MLPerf benchmarks, improving performance by as much as 113% across subsequent submissions. The RNGD chip is the next step in this strategy, offering a sustainable solution with a lower power draw than leading GPUs.

The RNGD chip is currently being sampled by early-access customers, with a broader release anticipated in early 2025. FuriosaAI’s CEO, June Paik, expressed pride in the team’s dedication and excitement for the future as the company continues to push the boundaries of AI computing.