- Meta’s 1700W superchip offers 30 PFLOPs and 512 GB of HBM
- MTIA 450 and 500 prioritize inference over pre-training workloads
- Future generations of MTIA will support GenAI inference and classification workloads
Meta is advancing its AI infrastructure with a portfolio of custom MTIA chips designed specifically for the inference workloads that power its applications.
The company is developing a 1700W superchip capable of 30 PFLOPs and equipped with 512 GB of HBM, integrated into the same MTIA infrastructure to handle large-scale inference tasks.
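For a rough sense of scale, those two figures imply a performance-per-watt number; Meta hasn't stated the precision (FP8, INT8, etc.) behind the 30 PFLOPs claim, so the back-of-envelope sketch below is purely illustrative.

```python
# Back-of-envelope efficiency from the reported figures. The precision
# behind the 30 PFLOPs number isn't stated, so this is illustrative
# rather than a spec.
pflops = 30                  # reported peak compute, PFLOPs
watts = 1700                 # reported chip power, W
tflops_per_watt = pflops * 1_000 / watts
print(f"~{tflops_per_watt:.1f} TFLOPs per watt")  # ~17.6 TFLOPs per watt
```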
Notably, it achieves this feat without any of the usual partners: no Nvidia, AMD, Intel or ARM.
According to Meta, hundreds of thousands of MTIA chips are already deployed in production, supporting ranking, recommendations and ad serving workloads.
These chips are part of a complete system optimized for Meta’s specific requirements, achieving higher compute efficiency than general-purpose hardware on its intended workloads.
Unlike other hyperscalers such as Google, AWS, Microsoft and Apple, Meta is pursuing a completely in-house custom silicon strategy.
The design prioritizes efficiency over general-purpose flexibility, allowing inference to run more cost-effectively than on traditional GPUs or CPUs.
The chips remain compatible with industry-standard software such as PyTorch, vLLM and Triton.
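To illustrate what that compatibility looks like in practice, here is a minimal, hypothetical sketch assuming a PyTorch build that exposes the torch.mtia accelerator hook (upstream PyTorch ships one, though availability varies by build); the toy model and shapes are invented, and the code falls back to CPU where MTIA isn't present.

```python
import torch
import torch.nn as nn

# Use the MTIA device if this PyTorch build exposes torch.mtia;
# otherwise fall back to CPU so the sketch still runs anywhere.
if getattr(torch, "mtia", None) is not None and torch.mtia.is_available():
    device = "mtia"
else:
    device = "cpu"

# A toy ranking-style scorer standing in for a real recommendation model.
model = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 1)).to(device)
model.eval()

# torch.compile lowers the model through whichever backend is registered
# for the device, which is how a custom accelerator can slot in without
# changes to the model code itself.
compiled = torch.compile(model)

with torch.no_grad():
    scores = compiled(torch.randn(32, 256, device=device))

print(scores.shape)  # torch.Size([32, 1])
```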
Meta’s MTIA roadmap foresees four new generations of chips over the next two years, including the MTIA 300, currently in production for ranking and recommendations.
Future generations—MTIA 400, 450, and 500—will expand support for GenAI inference workloads, with designs capable of integrating into existing rack infrastructure.
Meta emphasizes rapid, iterative development, releasing new chips approximately every six months through modular, reusable designs.
The modular design allows new chips to slot into existing rack systems, reducing deployment friction and shortening production timelines.
This approach lets the company adopt emerging AI techniques and hardware improvements more quickly than competitors, which typically operate on one-to-two-year generation cycles.
Unlike most commercial AI chips, which prioritize large-scale GenAI pre-training and are then scaled for inference, Meta’s MTIA 450 and 500 focus on inference workloads first.
The chips can also support other tasks, including ranking and recommendation training or GenAI training, but their design keeps them aligned with the expected growth in demand for inference.
Meta’s system-level design aligns with Open Compute Project standards, enabling frictionless deployment in data centers while maintaining high compute efficiency.
The company recognizes that no single chip can handle all of its AI workloads.
That’s why it’s deploying multiple generations of MTIA alongside complementary silicon from other vendors.
The strategy aims to balance flexibility and performance while accelerating innovation towards personal superintelligence.