DeepSeek x Cerebras: how the most controversial AI model of the moment is being supercharged by the most powerful superchip ever built


  • The world's fastest AI inference chip maker makes a splash with DeepSeek integration
  • Cerebras says its solution will run inference 57x faster than on GPUs, but doesn't say which GPU
  • DeepSeek R1 will run on Cerebras Cloud, and the data will remain in the United States

In a not-so-surprising move, Cerebras has announced that it will support DeepSeek, specifically the R1 70B reasoning model. The decision comes after Groq and Microsoft confirmed that they would also bring the AI world's new darling to their respective clouds. AWS and Google Cloud have yet to do so, but anyone can run the open-source model anywhere, even locally.

The AI inference chip specialist will run DeepSeek R1 70B at 1,600 tokens/second, which it claims is 57x faster than any R1 provider using GPUs; we can deduce that roughly 28 tokens/second is what the GPU-in-the-cloud solution (in this case DeepInfra) apparently reaches. Coincidentally, the latest Cerebras chip is 57x larger than the H100. I contacted Cerebras to find out more about this claim.
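The implied GPU baseline above is a simple back-of-the-envelope deduction from Cerebras' two published numbers, which can be sketched as follows (assuming the 57x figure was measured against the same 70B model):

```python
# Back-of-the-envelope check of the implied GPU baseline.
# Assumption: Cerebras' 57x claim compares like-for-like R1 70B inference.
cerebras_tokens_per_s = 1600   # Cerebras' claimed throughput for R1 70B
claimed_speedup = 57           # "57x faster than any R1 provider using GPUs"

# Dividing the claimed throughput by the claimed speedup yields the
# throughput Cerebras implicitly attributes to the fastest GPU provider.
gpu_tokens_per_s = cerebras_tokens_per_s / claimed_speedup
print(f"Implied GPU baseline: ~{gpu_tokens_per_s:.0f} tokens/s")  # ~28 tokens/s
```

Note that since Cerebras doesn't name the GPU or the provider's configuration, this figure is only an inference from its own marketing numbers.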
