In the first few months since the Arduino UNO Q launched, people have formed many different opinions about it. Some love the improved computational horsepower and the ability to run Linux, while others find the App Lab environment confusing and restrictive. Whichever side of the fence you find yourself on, one thing is certain: it is very different from the Arduino boards that came before.
Along with the change has come a lot of uncertainty about what this board is really good for. With its STM32U585 coprocessor, it can do all of the things an UNO is typically used for. However, given the extra cost and complexity, you probably would not want to use an UNO Q just to blink some LEDs. If you are going to invest in this new board, you will want to use it for more complex projects.
More than just blinking LEDs
Along these lines, Edge Impulse's Marc Pous has just demonstrated a very interesting way to use the UNO Q that would have been unthinkable before the addition of the Dragonwing processor. He has written up a brief tutorial explaining how you can run LLMs, and even VLMs, locally on the board.
The project is built around yzma, a Go wrapper for llama.cpp created by Ron Evans, well-known for projects like Gobot and TinyGo. yzma provides a clean interface that lets developers integrate high-performance inference into Go applications without wrestling with CGo bindings. This offers a streamlined path to running modern AI models directly in the UNO Q's Debian-based Linux environment.
AI on the edge
The tutorial walks users through installing Go on the board, setting up yzma, and pulling in compatible GGUF models from Hugging Face. For text-only inference, Pous demonstrates the compact SmolLM2-135M-Instruct model, which weighs in at roughly 135 million parameters. Thanks to quantization and the efficiency of llama.cpp, the model can run locally on the UNO Q's Arm-based system, enabling fully offline chat interactions.
This image was used to test the VLM (📷: Marc Pous)
Even more impressive is the demonstration of a multimodal model: SmolVLM2-500M-Video-Instruct. At around 500 million parameters, it is small by modern AI standards but still capable of processing images and short video inputs alongside text prompts. In Pous' example, the board analyzes a photo of markers scattered across a desk and generates a detailed description, all without sending data to the cloud.
Instead of relying on remote APIs, developers can build privacy-conscious edge systems that interpret images, respond to voice commands, or analyze sensor data locally. For robotics and smart home experiments in particular, the ability to combine real-time microcontroller control with Linux-based AI inference opens up new design possibilities. If you build some of your own great ideas with an UNO Q, be sure to let us know.


