Darml
model.tflite → Darml (compile · check · flash) → AVR · STM32 · ESP32 · RPi · Jetson

Model in. Firmware out.

One CLI from 8 KB AVRs to Jetson Orin. No vendor SDK switching. TFLite, ONNX, scikit-learn in — flashable firmware out.

$ pip install darml
$ darml build model.tflite --target esp32-s3
$ darml flash darml-build.zip --port /dev/ttyUSB0
$ screen /dev/ttyUSB0 115200
[Darml] Ready.
[Darml] pred=2 conf=0.94 latency=12µs
[Darml] pred=2 conf=0.93 latency=11µs
# Model running on hardware. Total wall time: ~6 minutes (first build).

Built by an embedded engineer who got tired of writing the same toolchain glue for every project. Maintained out in the open.

For

Embedded engineers who already have a trained model and need it running on a real device. ML researchers shipping a demo to a chip. Teams replacing custom toolchain glue with one binary.

Not for

Training new models — bring your own. Cloud-only inference — we ship to silicon. AutoML — we don't pick architectures, we deploy yours.

11 targets. 5 hardware tiers. One CLI.

AVR for the smallest sensors, Cortex-M for industrial control, Xtensa for connected devices, Linux-class for vision and audio. Edge Impulse covers some of these. STM32Cube.AI covers one. Darml covers them all — through a single uniform pipeline.

2–8 KB RAM · AVR (Arduino-class)
emlearn · pure C · <2 KB at runtime
avr-mega328 · avr-mega2560

320 KB – 1.5 MB RAM · STM32 (Cortex-M)
TFLite Micro · ARM GCC
stm32f4 · stm32h7 · stm32n6

520 KB – 8.5 MB RAM · ESP32 (Xtensa)
TFLite Micro · WiFi/BT
esp32 · esp32-s3 (PSRAM)

4–8 GB RAM · Raspberry Pi
TFLite · Python + Docker
rpi4 · rpi5

4–8 GB RAM · NVIDIA Jetson
TensorRT / TFLite · Docker
jetson-nano · jetson-orin
Don't see your board? If it runs PlatformIO, we can probably add it. Open an issue with the chip + dev-board name and we'll prioritize.

From your model to your device, in five steps.

No notebooks. No configuration files. No vendor SDKs. The CLI takes your model file and your target board, tells you whether it'll fit, and hands you something flashable.

01 / UPLOAD

Hand it your model

.tflite, .onnx, or .pkl — Darml auto-detects the format.
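If your model is still a framework object rather than a file, the export is ordinary upstream tooling, not a Darml command. A sketch of the standard TensorFlow path (model.h5 is a placeholder for your own trained model):

# Plain TensorFlow export -- not a Darml API. "model.h5" is a placeholder.
import tensorflow as tf

model = tf.keras.models.load_model("model.h5")
converter = tf.lite.TFLiteConverter.from_keras_model(model)

with open("model.tflite", "wb") as f:
    f.write(converter.convert())

From there, darml build model.tflite --target <board> picks up the format automatically.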

02 / PICK

Pick your board

11 supported targets across 5 hardware tiers. darml targets lists them all.

03 / CHECK

Get warned before you wait

Will it fit in flash? Will it fit in RAM? You find out before the toolchain spins up.

04 / BUILD

Get a binary

firmware.bin for MCUs, .tar.gz for Linux boards. Plus a manifest with target-specific flash instructions.

05 / FLASH

Run it

darml flash <artifact.zip> --port … — auto-detects the right tool: esptool, STM32_Programmer_CLI, avrdude, ssh+docker.
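Under the hood that auto-detection is a target-family-to-tool dispatch. A minimal sketch of the idea (the mapping and function names are illustrative, not Darml's internals):

# Illustrative sketch, not Darml's internals: map target family -> flash tool.
FLASHERS = {
    "esp32":  ["esptool.py", "write_flash", "0x0"],   # Espressif targets
    "stm32":  ["STM32_Programmer_CLI", "-w"],         # ST-LINK targets
    "avr":    ["avrdude", "-U", "flash:w"],           # Arduino-class AVRs
    "rpi":    ["ssh"],                                # copy artifact, run via Docker
    "jetson": ["ssh"],                                # same ssh+docker path
}

def pick_flasher(target: str) -> list[str]:
    family = next(f for f in FLASHERS if target.startswith(f))
    return FLASHERS[family]

print(pick_flasher("esp32-s3"))  # ['esptool.py', 'write_flash', '0x0']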

Step 03 — the part that saves your afternoon

Find out it won't fit before you wait 8 minutes for it to fail.

✗ Without Darml
00:00   start build
00:15   fetch toolchain
03:42   compile sources…
08:11   linker…
08:27   overflow: .text exceeds region by 156 KB
✓ With Darml
00:00   parse model
00:00.2   won't fit. model is 412 KB, stm32f4 has 256 KB flash.
00:00.2   suggested: stm32h7 (2 MB) or quantize to INT8 (~110 KB)

You move on. No toolchain spins up.
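The pre-flight itself is conceptually tiny: compare the model's footprint with the target's flash budget before any toolchain starts. A minimal sketch of the idea (budgets and messages are illustrative; the real estimator also has to account for runtime overhead):

import os

# Illustrative flash budgets in bytes -- not Darml's actual target database.
FLASH_BUDGET = {"stm32f4": 256 * 1024, "stm32h7": 2 * 1024 * 1024}

def preflight(model_path: str, target: str) -> None:
    size_kb = os.path.getsize(model_path) // 1024
    budget_kb = FLASH_BUDGET[target] // 1024
    if size_kb > budget_kb:
        raise SystemExit(f"won't fit: model is {size_kb} KB, {target} has {budget_kb} KB flash")
    print(f"fits: {size_kb} KB of {budget_kb} KB flash")

preflight("model.tflite", "stm32f4")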

Real models, real targets, real numbers.

Every release runs through a 5-model × 11-target acceptance matrix (55 parse + size-check combinations) plus end-to-end firmware builds. These are the actual outputs from the latest passing run.

Model                               Target        Format → output                 Output size   Build time
micro_speech (1D-Conv kw spotter)   esp32-s3      tflite → firmware.bin           566 KB        273 s
MNIST CNN (Conv2D + Dense)          stm32f4       tflite → firmware.bin           345 KB        176 s
MLP (multi-layer perceptron)        esp32-s3      onnx → tflite → firmware.bin    522 KB        270 s
Random forest (iris)                avr-mega328   sklearn → C library             1.5 KB        ~1 s
MobileNetV2 INT8                    rpi5          tflite → tarball + Dockerfile   597 KB        ~2 s


Plus 73 passing unit tests covering parser, size estimator, pipeline, build cache, license validation, and inference roundtrip (FP32 ONNX vs INT8 TFLite, argmax-stable on ≥ 80% of samples). Run the same matrix yourself: ./scripts/pre_publish.sh.
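The argmax-stability check is straightforward to reproduce outside the suite. A hedged sketch using onnxruntime plus TensorFlow's TFLite interpreter (file names, the MLP input shape, and the harness are illustrative; only the ≥ 80% threshold comes from the matrix above):

import numpy as np
import onnxruntime as ort
import tensorflow as tf

# Illustrative harness for the FP32-vs-INT8 argmax-stability check.
# Assumes an MLP with a 64-wide float input and that the INT8 model kept
# float32 I/O (TFLite's default for post-training quantization).
samples = np.random.rand(100, 64).astype(np.float32)

sess = ort.InferenceSession("model_fp32.onnx")
inp_name = sess.get_inputs()[0].name
fp32 = [int(np.argmax(sess.run(None, {inp_name: s[None]})[0])) for s in samples]

interp = tf.lite.Interpreter(model_path="model_int8.tflite")
interp.allocate_tensors()
i, o = interp.get_input_details()[0], interp.get_output_details()[0]
int8 = []
for s in samples:
    interp.set_tensor(i["index"], s[None])
    interp.invoke()
    int8.append(int(np.argmax(interp.get_tensor(o["index"]))))

agreement = sum(a == b for a, b in zip(fp32, int8)) / len(samples)
assert agreement >= 0.80, f"argmax stable on only {agreement:.0%} of samples"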

Pricing

Free to start. Self-host the same code your customers are running.

Core

€0 / forever

MIT — fully open source. Use it for anything, commercial or not.

  • All 11 hardware targets
  • CLI with serial output
  • 5 builds/day (per-machine)
  • Community support
View on GitHub

Pro Cloud

€49 / month ≈ $52 / mo

We host the build farm. No PlatformIO, no toolchain setup, no maintenance.

  • Unlimited builds
  • INT8 post-training quantization (PTQ, no QAT)
  • ONNX → TFLite auto-convert
  • Web dashboard + shared cache
  • HTTP / MQTT result output
  • Email support
Start free trial

Pro Self-Hosted

€499 / year / seat ≈ $530 / yr

Air-gapped · on-prem · offline. Run inside your network. For factory floors, secure dev environments, on-prem CI.

  • Everything in Pro Cloud
  • Runs offline / air-gapped
  • Offline license key (HMAC)
  • 14-day free trial built in
  • Commercial license
  • Email support, 1-day SLA
Start free trial

Early-access pricing — locked in for renewal. Industrial buyers get a real PO and a one-page security questionnaire response on request.

Enterprise

Custom

Dedicated build farm, signed firmware, SOC 2, fleet-management add-ons.

  • Everything in Pro Self-Hosted
  • Signed firmware artifacts
  • Dedicated build runners
  • SSO / SAML
  • Volume seat pricing
  • Phone + email SLA
Talk to us

What people ask before they install.

If your question isn't here, hello@darml.dev goes straight to the maintainer.

How is this different from Edge Impulse / STM32Cube.AI / TensorFlow Lite for Microcontrollers?

Edge Impulse is end-to-end (data + training + deployment) but biased toward their cloud and a curated subset of boards. STM32Cube.AI deploys to STM32 only. TFLite-Micro is a runtime, not a build tool — you wire the toolchain yourself. Darml is the missing piece: bring-your-own-model, span every reasonable target, one CLI. If you've already trained your model and don't want to live inside any vendor's ecosystem, that's the gap Darml fills.

Why these specific STM32 / AVR boards and not others?

We picked one board per performance tier — Cortex-M4 (stm32f4), Cortex-M7 (stm32h7), M55 + NPU (stm32n6); Arduino Uno-class (avr-mega328) and Mega-class (avr-mega2560) — rather than enumerate every variant. The full STM32 family has hundreds of chips (F0 / F1 / F3 / G0 / G4 / L0 / L4 / L5 / U5 / WL / WB / MP1 / N6 …) that differ in core, FPU presence, peripherals, and memory layout, so each one needs a verified runtime template and CI build, not just a board-ID change. We add on demand: open an issue with chip + dev-board name and we'll prioritize.

Is the firmware I get auditable? Can I read what gets flashed?

Yes. The output is a standard PlatformIO project (for MCU targets) or a Dockerfile + Python script (for Linux targets). No precompiled blobs, no obfuscation. The C/C++ glue is generated from templates you can read in the source tree (darml/infrastructure/templates/). Inspect, diff, fork as you want.

Does my model leave my machine?

On the free Core tier, no — every build runs locally. On Pro Cloud, your model is uploaded to our build farm (EU/Frankfurt), held only for the duration of the build, and deleted when the artifact is downloaded. On Pro Self-Hosted, the entire stack runs in your network; nothing ever leaves.

What's the accuracy delta after INT8 quantization?

We do post-training quantization (PTQ) only — no quantization-aware training, no fine-tuning, no knowledge distillation. Expect 1–3 % accuracy drop for dense networks, 3–8 % for typical CNNs, 5–15 % for larger CNNs without per-channel calibration. The verified-builds matrix above includes an inference roundtrip test (FP32 vs INT8) that asserts argmax-stable on ≥ 80 % of samples — if your model deviates more than that, the test fails loudly. Bring your own calibration data via --calibration-data for tighter results.
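In raw TensorFlow terms, that PTQ step corresponds to something like the following (a sketch, not Darml's internals; the saved-model path and input shape are placeholders, and --calibration-data plays the representative-dataset role):

import numpy as np
import tensorflow as tf

def representative_dataset():
    # A few hundred real samples -- this is what --calibration-data supplies.
    for _ in range(200):
        yield [np.random.rand(1, 64).astype(np.float32)]  # placeholder input shape

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model/")  # placeholder path
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]

with open("model_int8.tflite", "wb") as f:
    f.write(converter.convert())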

Can I bring my own quantized model?

Yes. Darml auto-detects already-quantized .tflite and .onnx (QDQ format) inputs and skips its own quantize step — you'll see a "Model is already quantized; skipping." warning, then convert / compile / package run normally. If you've done QAT or your own PTQ with real calibration data in your training pipeline, hand us the result and we deploy it as-is. The free Core tier is enough to ship if you quantize upstream.

Can I use Darml in safety-critical code (IEC 62304 / ISO 26262 / DO-178C)?

The runtime libraries we wrap (TFLite Micro, emlearn) have been used in regulated environments, but Darml itself is not certified. For functional-safety pipelines, the path is: use Pro Self-Hosted (for traceability), pin a Darml release, and keep the generated code under your normal change-control process. Talk to hello@darml.dev if you need a signed reproducibility statement.

What does "5 builds per day" actually mean on the free tier?

A counter in ~/.darml/counter increments when a build hits the compile step. Failed pre-flight checks (won't-fit) don't count. Rolling 24-hour window. The cap exists to keep the free tier sustainable, not to gate features — every target, every model format, every architecture is available on Core.
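For the curious, a rolling window like that fits in a few lines (an illustrative sketch, not Darml's actual bookkeeping; it stores build timestamps rather than a bare count):

import json, time
from pathlib import Path

COUNTER = Path.home() / ".darml" / "counter"   # path named above
LIMIT, WINDOW = 5, 24 * 3600                   # 5 builds per rolling 24 h

def may_build() -> bool:
    stamps = json.loads(COUNTER.read_text()) if COUNTER.exists() else []
    stamps = [t for t in stamps if time.time() - t < WINDOW]  # drop old builds
    if len(stamps) >= LIMIT:
        return False                           # cap reached for this window
    stamps.append(time.time())                 # record this build
    COUNTER.parent.mkdir(parents=True, exist_ok=True)
    COUNTER.write_text(json.dumps(stamps))
    return True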

Why MIT for Core but proprietary for Pro?

Core is genuinely free software — fork it, ship it, sell derivatives, no strings. Pro contains the parts that take real ongoing work to maintain (auto-quantization across formats, ONNX→TFLite conversion, hosted build farm, license signing). Selling Pro is what funds Core. Standard open-core.

What happens if Darml the company disappears?

Core stays MIT — fork it, run it forever. Pro Self-Hosted licenses are HMAC-signed and validated locally; they keep working as long as your existing key hasn't expired. The license verification code is in the public Pro client, so you can audit it. We don't phone home.
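The shape of that local check is standard HMAC: recompute the signature over the license payload with a shared secret, compare in constant time, then enforce expiry. An illustrative sketch (not Darml's actual key format):

import hashlib, hmac, json, time

def verify_license(payload: str, signature_hex: str, secret: bytes) -> bool:
    # payload: vendor-issued JSON such as '{"seat": "a@b.co", "expires": 1767225600}'
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature_hex):
        return False                                 # tampered payload or wrong secret
    return json.loads(payload)["expires"] > time.time()  # key still within validity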