- HBM 3D Design on GPU Achieves Record Computing Density for Demanding AI Workloads
- Maximum GPU temperatures exceeded 140°C without thermal mitigation strategies
- Halving the GPU clock rate reduced temperatures but slowed down AI training by 28%
At the 2025 IEEE International Electron Devices Meeting (IEDM), Imec presented a 3D HBM-on-GPU design aimed at increasing computing density for demanding AI workloads.
The thermal system-technology co-optimization approach places four high-bandwidth memory stacks directly on top of a GPU via microbump connections.
Each stack consists of twelve hybrid-bonded DRAM dies, with cooling applied on top of the HBM stacks.
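The layout described above (four stacks of twelve dies each, mounted directly on the GPU) can be turned into a back-of-envelope capacity estimate. The per-die capacity below is an assumed placeholder, not a figure from Imec's paper:

```python
# Illustrative capacity estimate for the stack-on-GPU layout described above.
# The per-die capacity is an ASSUMED placeholder, not a figure from Imec.
STACKS_ON_GPU = 4    # HBM stacks mounted directly on top of the GPU
DIES_PER_STACK = 12  # hybrid-bonded DRAM dies per stack
GB_PER_DIE = 2       # assumed capacity per DRAM die (placeholder)

total_capacity_gb = STACKS_ON_GPU * DIES_PER_STACK * GB_PER_DIE
print(f"Total memory per GPU: {total_capacity_gb} GB")  # 96 GB with these assumptions
```

With different die capacities the total scales linearly, but the point stands: stacking memory vertically multiplies capacity within the GPU's own footprint.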
Thermal mitigation attempts and performance trade-offs
The simulation framework applies power maps derived from industry-relevant workloads to test how the configuration responds under realistic AI training conditions.
This 3D arrangement promises an increase in computing density and memory per GPU.
It also offers higher GPU memory bandwidth than 2.5D integration, where HBM stacks are placed around the GPU on a silicon interposer.
However, thermal simulations reveal serious challenges for 3D HBM design on GPU.
Without mitigation, maximum GPU temperatures reached 141.7°C, well above operational limits, while the 2.5D benchmark peaked at 69.1°C under the same cooling conditions.
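The gap between the two configurations follows from basic steady-state thermal physics: in 2.5D, GPU heat flows straight into the cold plate, while in 3D it must first cross the entire DRAM stack. A minimal one-dimensional sketch, with all power and resistance values assumed for illustration (they are not Imec's numbers, though they land near the reported temperatures):

```python
# Minimal 1-D steady-state thermal model: T_junction = T_coolant + P * R_thermal.
# All power and resistance values below are ASSUMED for illustration only.
T_COOLANT = 25.0  # °C, coolant temperature (assumed)
P_GPU = 500.0     # W, GPU power (assumed)

# 2.5D: GPU heat flows directly into the cold plate.
r_25d = 0.09      # K/W, GPU-to-coolant thermal resistance (assumed)

# 3D: heat must additionally cross the 12-die DRAM stack and its bond layers.
r_stack = 0.14    # K/W, added resistance of the top-mounted HBM stack (assumed)
r_3d = r_25d + r_stack

t_25d = T_COOLANT + P_GPU * r_25d
t_3d = T_COOLANT + P_GPU * r_3d
print(f"2.5D GPU junction: {t_25d:.1f} °C")  # 70.0 °C
print(f"3D   GPU junction: {t_3d:.1f} °C")   # 140.0 °C
```

The added series resistance of the memory stack, not any change in GPU power, is what pushes the 3D junction temperature far past operational limits.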
Imec explored technology-level strategies such as hybrid bonding within the HBM stack and thermal optimization of the silicon.
System-level strategies included dual-sided cooling and GPU frequency scaling.
Reducing the GPU clock rate by 50% lowered maximum temperatures to below 100°C, but slowed AI training workloads by 28%.
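It may seem odd that halving the clock costs only 28% in training time. Under a simple Amdahl-style split (an assumed model, not Imec's analysis), this is consistent with only part of the workload being GPU-clock-bound, the rest being limited by memory:

```python
# Amdahl-style runtime model: only the compute-bound fraction of training
# time stretches when the GPU clock is scaled down; the memory-bound phase
# is unaffected. This inversion is an ASSUMED model, not Imec's analysis.
def slowdown(compute_frac, clock_scale):
    """Relative runtime when the GPU clock is multiplied by clock_scale."""
    mem_frac = 1.0 - compute_frac
    return mem_frac + compute_frac / clock_scale

# A 1.28x runtime at 0.5x clock implies: 1.28 = (1 - c) + c / 0.5  =>  c = 0.28,
# i.e. only ~28% of training time would be clock-bound under this model.
c = 0.28
print(f"Runtime at half clock: {slowdown(c, 0.5):.2f}x")  # 1.28x
```

A fully compute-bound workload (`slowdown(1.0, 0.5)`) would take twice as long, so the modest 28% penalty hints at a substantially memory-bound training profile.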
Despite these limitations, Imec claims that the 3D framework can deliver higher computational density and performance than the 2.5D reference design.
“Halving the GPU core frequency brought the maximum temperature from 120°C to below 100°C, achieving a key goal for memory operation. Although this step comes with a 28% workload penalty…” said James Myers, system technology program director at Imec.
“…the overall package outperforms the 2.5D baseline thanks to the higher throughput density offered by the 3D configuration. We are currently using this approach to study other GPU and HBM configurations…”
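The throughput-density argument in the quote can be illustrated with rough numbers: the 2.5D layout spreads the GPU and four side-mounted HBM stacks across an interposer, while the 3D package fits everything in the GPU's own footprint. The area figures below are assumed placeholders, not Imec's data:

```python
# Why a slower 3D package can still win on throughput density.
# Area figures are ASSUMED placeholders; only the 28% penalty is from the article.
AREA_25D_MM2 = 2200.0  # GPU + interposer + 4 side-mounted HBM stacks (assumed)
AREA_3D_MM2 = 900.0    # GPU footprint only, HBM stacked on top (assumed)
PERF_25D = 1.0         # normalized throughput of the 2.5D baseline
PERF_3D = 1.0 - 0.28   # 28% workload penalty from halving the clock

density_25d = PERF_25D / AREA_25D_MM2
density_3d = PERF_3D / AREA_3D_MM2
print(f"3D vs 2.5D throughput density: {density_3d / density_25d:.2f}x")  # 1.76x
```

With these assumed areas, the footprint saving outweighs the clock penalty, which is the trade-off Imec is arguing for.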
The organization suggests that this approach could support thermally resilient hardware for AI tools in dense data centers.
Imec presents this work as part of a broader effort to link technology decisions to system behavior.
This includes the Cross-Technology Co-Optimization (XTCO) program, launching in 2025, which combines STCO and DTCO mindsets to align technology roadmaps with system scaling challenges.
Imec said XTCO enables collaborative problem solving for critical bottlenecks in the semiconductor ecosystem, bringing in fabless and systems companies.
However, these technologies will likely remain confined to specialized installations with controlled energy and thermal budgets.
Via TechPowerUp