SambaNova hits 198 tokens per second on the full, non-distilled DeepSeek-R1 671B with only 16 SN40L RDU chips


  • SambaNova runs DeepSeek-R1 at 198 tokens/sec using 16 custom chips
  • The SN40L RDU chip is reportedly 3x faster and 5x more efficient than GPUs
  • A 5x speed boost is promised soon, with 100x capacity on the cloud by the end of the year

Chinese AI upstart DeepSeek quickly made a name for itself in 2025 with its open-source R1 large language model, built for advanced reasoning tasks, which performs on par with the industry's top models while being more cost-efficient.

SambaNova Systems, an AI startup founded in 2017 by experts from Sun/Oracle and Stanford University, has now announced what it claims is the world's fastest deployment of the DeepSeek-R1 671B LLM to date.
