- Tiiny AI Pocket Lab runs large models locally, avoiding cloud dependency
- Mini PC runs advanced inference tasks without discrete GPU support
- Models from 10B to 120B parameters run offline within a 65W power envelope
Tiiny, an American startup, has launched AI Pocket Lab, a pocket-sized AI supercomputer capable of running large language models locally.
The device is a mini PC designed to run advanced inference workloads without cloud access, external servers, or discrete accelerators.
The company says all processing remains offline, removing network latency and limiting external data exposure.
Built to run large models without the cloud
“Cloud AI has brought remarkable advancements, but it has also created issues of dependency, vulnerability and sustainability,” said Samar Bhoj, Director GTM of Tiiny AI.
“With Tiiny AI Pocket Lab, we believe that intelligence should not belong to data centers, but to people. This is the first step towards truly accessible, private and personal advanced AI, bringing the power of big cloud models to every individual device.”
The Pocket Lab targets large personal models designed for complex reasoning and long context tasks while operating within a limited 65W power envelope.
Tiiny claims consistent performance for models in the 10B-100B parameter range, with support extending up to 120B.
This upper limit approximates the capacity of leading cloud systems, allowing advanced reasoning and broad context to be executed locally.
Guinness World Records has reportedly certified the device for local execution of a 100B-class model.
The system uses a 12-core ARMv9.2 processor coupled with a custom heterogeneous AI module that provides approximately 190 TOPS of computation.
The system includes 80GB of LPDDR5X memory as well as a 1TB SSD, with total power consumption apparently remaining within a 65W system envelope.
Its physical size resembles a large external drive more than a workstation, reinforcing its pocket-oriented branding.
Although the specs resemble a Houmo Manjie M50-style chip, independent real-world performance data is not yet available.
Tiiny also emphasizes an open source ecosystem that supports one-click installation of major agent models and frameworks.
The company says it will provide ongoing updates, including what it describes as OTA hardware upgrades.
This phrasing is questionable, because over-the-air update mechanisms traditionally apply to software. The statement likely reflects imprecise wording or a marketing error rather than literal modification of the hardware.
The technical approach relies on two software optimizations rather than raw silicon performance scaling.
TurboSparse focuses on selectively activating neurons to reduce inference cost without changing the model structure.
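The idea behind selective neuron activation can be illustrated with a minimal sketch. This is not TurboSparse's actual implementation (which relies on a trained low-cost predictor); here the "predictor" is simply the exact pre-activation sign of a ReLU feed-forward layer, so the sketch demonstrates correctness of skipping inactive neurons rather than the real compute savings:

```python
import numpy as np

def dense_ffn(x, w_in, w_out):
    """Standard ReLU feed-forward layer: computes every neuron."""
    return np.maximum(x @ w_in, 0.0) @ w_out

def sparse_ffn(x, w_in, w_out):
    """Sketch of predictor-gated inference: only neurons predicted to
    fire are carried through the second projection. Here the predictor
    is exact (the true pre-activation sign), standing in for the cheap
    learned predictor a real system would use."""
    pre = x @ w_in
    idx = np.flatnonzero(pre > 0.0)        # neurons predicted to activate
    y = pre[idx] @ w_out[idx]              # compute only the active rows
    return y, len(idx) / w_in.shape[1]     # output + activation ratio

rng = np.random.default_rng(0)
x = rng.standard_normal(64)
w_in = rng.standard_normal((64, 256))
w_out = rng.standard_normal((256, 64))

y_sparse, ratio = sparse_ffn(x, w_in, w_out)
y_dense = dense_ffn(x, w_in, w_out)
print(f"fraction of neurons activated: {ratio:.2f}")
print("outputs match:", np.allclose(y_sparse, y_dense))
```

Because ReLU zeroes out the inactive neurons anyway, skipping them leaves the output unchanged while, in a real implementation, avoiding most of the second matrix multiplication.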
PowerInfer distributes workloads across heterogeneous components, coordinating the CPU with a dedicated NPU to approach server-level throughput with lower consumption.
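A core element of this heterogeneous approach is placing frequently firing ("hot") neurons on the fast accelerator and the long tail of rarely firing ("cold") neurons on the CPU. The sketch below shows that partitioning step only, using made-up activation counts; the function name and capacity parameter are illustrative, not PowerInfer's API:

```python
def partition_neurons(activation_counts, npu_capacity):
    """Assign the most frequently firing ('hot') neurons to the
    accelerator, up to its capacity, and the rest to the CPU."""
    order = sorted(range(len(activation_counts)),
                   key=lambda i: activation_counts[i], reverse=True)
    hot = set(order[:npu_capacity])    # served by the NPU
    cold = set(order[npu_capacity:])   # served by the CPU
    return hot, cold

# Hypothetical activation statistics for 8 neurons (assumed data)
counts = [120, 5, 98, 3, 77, 50, 2, 110]
hot, cold = partition_neurons(counts, npu_capacity=3)
print("NPU (hot):", sorted(hot))
print("CPU (cold):", sorted(cold))
```

The design intuition is that a small subset of neurons accounts for most activations, so a modest accelerator handling only that subset can deliver a large share of the throughput.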
The system includes no discrete GPU, with the company claiming that careful workload scheduling eliminates the need for expensive accelerators.
These claims indicate that efficiencies, rather than brute force hardware, are the key differentiator.
Tiiny AI positions the Pocket Lab as a response to the sustainability, privacy and cost pressures affecting centralized AI services.
Running large language models locally could reduce recurring cloud expenses and limit exposure of sensitive data.
However, claims about capacity, server-grade performance, and seamless scaling on such constrained hardware remain difficult to independently verify.
Via TechPowerUp