- GSI Gemini-I APU reduces constant data shuffling between processor and memory systems
- Performs retrieval tasks up to 80% faster than comparable processors
- The GSI Gemini-II APU will offer ten times greater throughput
GSI Technology promotes a new approach to artificial intelligence processing that places computation directly in memory.
A new study from Cornell University draws attention to this design, known as the associative processing unit (APU).
The design aims to overcome long-standing performance and efficiency limitations, and the study suggests it could challenge the dominance of the top GPUs currently used in AI tools and data centers.
A new competitor in AI hardware
Published by ACM and presented at the recent MICRO ’25 conference, the Cornell research evaluated GSI’s Gemini-I APU against leading CPUs and GPUs, including Nvidia’s A6000, on retrieval-augmented generation (RAG) workloads.
Testing covered datasets ranging from 10 to 200 GB, representing realistic inference conditions for AI.
By performing calculations in static RAM, the APU reduces the constant shuffling of data between the processor and memory.
This is a key source of energy loss and latency in conventional GPU architectures.
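To see why data movement dominates, consider the retrieval kernel at the heart of a RAG workload. The toy sketch below (hypothetical data and function names, not GSI's API) scores every stored vector against a query; on a conventional architecture, each candidate vector must first be copied from memory into the processor before it can be scored, which is exactly the traffic an in-memory design avoids by comparing vectors where they are stored.

```python
# Brute-force top-k retrieval by dot-product similarity.
# On a CPU/GPU, every vector in `corpus` streams through the processor;
# an in-memory design performs the comparisons inside the memory array.

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def retrieve(query, corpus, k=2):
    """Return indices of the k corpus vectors most similar to the query."""
    scored = sorted(enumerate(dot(query, v) for v in corpus),
                    key=lambda t: t[1], reverse=True)
    return [idx for idx, _ in scored[:k]]

corpus = [
    [1.0, 0.0, 0.0],  # doc 0
    [0.9, 0.1, 0.0],  # doc 1
    [0.0, 1.0, 0.0],  # doc 2
]
query = [1.0, 0.0, 0.0]

print(retrieve(query, corpus))  # docs 0 and 1 score highest
```

At the 10–200 GB dataset sizes tested in the study, this scan touches every byte of the corpus per query, so memory bandwidth rather than arithmetic becomes the bottleneck.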
The results showed that the APU could achieve GPU-class throughput while consuming significantly less power.
GSI said its APU uses up to 98% less power than a standard GPU and completes retrieval tasks up to 80% faster than comparable processors.
Such efficiency could make it attractive for edge devices such as drones, IoT systems and robotics, as well as defense and aerospace, where limits on power and cooling are tight.
Despite these results, it remains unclear whether in-memory computing technology can achieve the same level of maturity and support that top GPU platforms enjoy.
GPUs currently benefit from well-developed software ecosystems that enable seamless integration with leading AI tools.
For in-memory computing devices, optimization and scheduling remain emerging areas that could slow broader adoption, particularly in large data center operations.
GSI Technology says it continues to refine its hardware, with the Gemini-II generation expected to deliver ten times greater throughput and lower latency.
Another design, named Plato, is under development to further expand the computing performance of embedded edge systems.
“Cornell’s independent validation confirms what we have long believed: in-memory computing has the potential to disrupt the $100 billion AI inference market,” said Lee-Lean Shu, president and CEO of GSI Technology.
“The APU delivers GPU-class performance at a fraction of the energy cost, thanks to its highly efficient memory-centric architecture. Our recently launched second-generation APU silicon, Gemini-II, can deliver approximately 10x faster throughput and even lower latency for memory-intensive AI workloads.”
Via TechPowerUp