- Raspberry Pi AI HAT+ 2 allows Raspberry Pi 5 to run LLMs locally
- Hailo-10H accelerator delivers 40 TOPS of INT4 inference power
- PCIe interface enables high-bandwidth communication between the board and the Raspberry Pi 5
Raspberry Pi has expanded its edge computing ambitions with the release of the AI HAT+ 2, an add-on board designed to bring generative AI workloads to the Raspberry Pi 5.
Previous AI HAT hardware focused almost entirely on accelerating computer vision, handling tasks like object detection and scene segmentation.
The new card expands this scope by supporting large language models and vision language models that run locally, without relying on cloud infrastructure or persistent network access.
Hardware changes that enable local language models
At the center of the upgrade is the Hailo-10H neural network accelerator, which delivers 40 TOPS of INT4 inference performance.
Unlike its predecessor, the AI HAT+ 2 has 8 GB of dedicated onboard memory, allowing larger models to run without consuming system RAM on the Raspberry Pi host.
This change allows LLMs and VLMs to run directly on the device while keeping latency low and data local, a key requirement for many edge deployments.
Using a standard Raspberry Pi distribution, users can install supported models and access them through familiar interfaces such as browser-based chat tools.
The AI HAT+ 2 mounts on the Raspberry Pi 5’s GPIO header and relies on the system’s PCIe interface for data transfer, an interface the Raspberry Pi 4 does not expose, so the board is incompatible with earlier models.
This PCIe link provides the high-bandwidth connection between the accelerator and the host that is essential for efficiently moving camera frames, model inputs, and inference outputs.
Demonstrations include answering text questions with Qwen2, generating code using Qwen2.5-Coder, basic translation tasks, and visual descriptions of scenes from live camera feeds.
These workloads leverage AI tools designed to work within the Pi software stack, including containerized backends and local inference servers.
All processing takes place on the device, without external computing resources.
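Interacting with such a local model typically means talking to an inference server running on the Pi itself. The sketch below assumes a hypothetical local server exposing an OpenAI-compatible chat endpoint; the URL, port, and model name are illustrative, not Hailo’s documented interface.

```python
import json
import urllib.request

# Hypothetical endpoint: assumes a local inference server on the Pi that
# speaks the widely used OpenAI-compatible chat API. URL and model name
# are placeholders for illustration only.
SERVER_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "qwen2-1.5b-instruct") -> dict:
    """Construct an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }

def ask(prompt: str) -> str:
    """Send the prompt to the local server and return the reply text."""
    payload = json.dumps(build_chat_request(prompt)).encode()
    req = urllib.request.Request(
        SERVER_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because everything stays on localhost, no prompt or camera data ever leaves the device.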
Supported models range from one to one and a half billion parameters, which is modest compared to cloud-based systems that operate at much larger scales.
These smaller LLMs target limited memory and power envelopes rather than broad, general knowledge.
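A rough back-of-the-envelope calculation shows why models in this size class fit: INT4 quantization stores each weight in 4 bits, so weight storage alone for a 1.5B-parameter model is well under the board’s 8 GB. This sketch ignores activations and KV-cache overhead, which also consume memory at runtime.

```python
def model_memory_gb(params_billion: float, bits_per_weight: int = 4) -> float:
    """Approximate weight-storage footprint; ignores activations and KV cache."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# A 1.5B-parameter model quantized to INT4 needs about 0.75 GB for weights,
# leaving headroom within the board's 8 GB of dedicated memory.
print(round(model_memory_gb(1.5), 2))  # -> 0.75
```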
To address this constraint, AI HAT+ 2 supports tuning methods such as low-rank adaptation, which allows developers to customize models for narrow tasks while keeping most parameters unchanged.
Vision models can also be retrained using application-specific datasets through Hailo’s toolchain.
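The core idea of low-rank adaptation can be sketched in a few lines: rather than updating a full weight matrix, training adjusts two small factors whose product is added to the frozen weights. The dimensions below are illustrative, not taken from any Hailo-supported model.

```python
import numpy as np

# LoRA sketch: a frozen weight matrix W (d_out x d_in) gets a trainable
# low-rank update B @ A, so only rank * (d_out + d_in) parameters train.
d_out, d_in, rank = 512, 512, 8

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))        # frozen pretrained weights
A = rng.standard_normal((rank, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, rank))                   # trainable, zero-initialized

def adapted_forward(x: np.ndarray) -> np.ndarray:
    """Forward pass with the low-rank update W + B @ A applied."""
    return (W + B @ A) @ x

full_params = d_out * d_in
lora_params = rank * (d_out + d_in)
print(f"trainable params: {lora_params} vs {full_params} "
      f"({lora_params / full_params:.1%})")
```

With rank 8 on a 512x512 matrix, the trainable parameter count drops to about 3% of the original, which is why fine-tuning stays feasible on constrained hardware.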
The AI HAT+ 2 is available for $130, priced above previous vision-focused accessories while offering similar computer vision throughput.
For workloads focused solely on image processing, the upgrade therefore offers limited gains; its appeal lies chiefly in local LLM execution and privacy-sensitive applications.
In practical terms, the hardware shows that generative AI on Raspberry Pi hardware is now feasible, although limited memory headroom and small model sizes remain real constraints.