- UK-based Fractile is backed by NATO and aims to compute faster and more cheaply in memory
- Nvidia's brute-force GPU approach consumes too much power and is memory bound, Fractile says
- Fractile's published figures compare against Nvidia's H100 GPUs, not the newer H200
Nvidia sits comfortably at the top of the AI hardware food chain, dominating the market with its high-performance GPUs and CUDA software stack, which quickly became the default tools for training and running large AI models. But that dominance comes at a cost: a growing target on its back.
Hyperscalers like Amazon, Google, Microsoft and Meta are pouring resources into developing their own custom silicon to reduce their dependence on Nvidia's chips and cut costs. At the same time, a wave of AI hardware startups is trying to capitalize on surging demand for specialized accelerators, hoping to offer more efficient or affordable alternatives and, ultimately, to unseat Nvidia.
You may not have heard of UK-based Fractile yet, but the startup, which claims its revolutionary approach to computing can run the world's largest language models 100x faster and at 1/10 the cost of existing systems, has some rather remarkable backers, including NATO and former Intel CEO Pat Gelsinger.
Removing every bottleneck
"We are building the hardware that will remove every bottleneck to the fastest possible inference of the largest transformer networks," Fractile explains.
"This means the biggest LLMs in the world running faster than you can read, and a universe of completely new capabilities and possibilities for how we work that will be unlocked by near-instant inference of models with superhuman intelligence."
It should be stressed, before you get too excited, that Fractile's performance numbers are based on comparisons with clusters of Nvidia H100 GPUs running Llama 2 70B with 8-bit quantization and TensorRT-LLM, not the newer H200 chips.
In a LinkedIn post, Gelsinger, who recently joined VC firm Playground Global as a general partner, wrote: "Inference of frontier AI models is bottlenecked by hardware. Even before test-time compute scaling, cost and latency were huge challenges for large-scale LLM deployments… if we are to achieve our aspirations for AI."
"I am pleased to share that I recently invested in Fractile, a UK-founded AI hardware company pursuing a fairly radical path to deliver such a leap," he then revealed.
"Their in-memory compute approach to inference acceleration jointly attacks the two bottlenecks to scaling inference, overcoming both the memory bottleneck that holds back today's GPUs and the power draw that will constrain the next decade of data center capacity."
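
The memory bottleneck Gelsinger describes is easy to see with a rough calculation. Below is a back-of-envelope sketch (not Fractile's methodology, and using approximate, publicly quoted H100 figures as assumptions): at small batch sizes, generating each token of an 8-bit Llama 2 70B model means streaming all ~70 GB of weights from memory, so the token rate is capped by memory bandwidth long before the GPU's arithmetic units are saturated.

```python
# Back-of-envelope sketch: why single-stream LLM decoding is memory bound.
# All figures are approximate public specs, used here only for illustration.

PARAMS = 70e9                 # Llama 2 70B parameter count
BYTES_PER_PARAM = 1           # 8-bit quantized weights
WEIGHT_BYTES = PARAMS * BYTES_PER_PARAM   # ~70 GB read per generated token

H100_BANDWIDTH = 3.35e12      # ~3.35 TB/s HBM3 bandwidth (H100 SXM, approx.)
H100_INT8_FLOPS = 1.0e15      # ~1 PFLOP/s dense 8-bit throughput (approx.)

FLOPS_PER_TOKEN = 2 * PARAMS  # roughly 2 FLOPs per parameter per token

# Ceiling imposed by memory bandwidth: every weight is read once per token.
bandwidth_bound_tok_s = H100_BANDWIDTH / WEIGHT_BYTES

# Ceiling imposed by arithmetic throughput.
compute_bound_tok_s = H100_INT8_FLOPS / FLOPS_PER_TOKEN

print(f"bandwidth-bound ceiling: {bandwidth_bound_tok_s:6.0f} tokens/s")
print(f"compute-bound ceiling:   {compute_bound_tok_s:6.0f} tokens/s")
# Bandwidth allows only ~50 tokens/s per stream, while the arithmetic units
# could in principle handle thousands; that gap is the bottleneck that
# in-memory compute designs target.
```

In practice, batching many requests raises arithmetic intensity and shifts this balance, which is one reason reported speed-up figures depend heavily on the benchmark configuration being compared against.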