Debugging Multi-MCU Firmware with Claude Code, OpenOCD and STLink: A Workflow That Actually Ships Fixes

How our firmware team debugs STM32WB55, Nuvoton M031 and ESP32 code using Claude Code, OpenOCD, STLink and GCC. A practical multi-MCU debugging workflow with know-how files, bug protocols and live register reads.

Firmware | Apr 29, 2026

Firmware debugging is where most AI coding assistants quietly fall apart. A register value flips, a peripheral misbehaves, two MCUs disagree on timing, and suddenly the "helpful assistant" starts inventing code paths that do not exist. We have spent the last several months hardening a debugging workflow that treats this problem seriously, and we want to share what finally worked.

Most of our current stack runs on a three-MCU architecture: an STM32WB55 handling BLE and core logic, a Nuvoton M031 managing low-level control, and an ESP32 for WiFi and cloud connectivity. Debugging across three silicon vendors, three toolchains and three debug probes used to cost us a full working day of setup before we even reached the actual bug. With Claude Code integrated into VS Code and a proper tool chain underneath, that setup now takes minutes, and the debugging itself is noticeably faster.

Here is how we run it.

The tool chain we settled on

On the IDE side, we run Claude Code as a VS Code extension. That matters more than it sounds, because VS Code can read the COM ports directly, which means the AI has a live window into what the MCU is actually doing rather than guessing from static code.

Underneath the IDE:

OpenOCD as the debug server. This is the workhorse. It talks to the probe, exposes the registers, and lets the AI query live memory state.
STLink for the STM32WB55. Rock solid. Build, flash, read registers, read the COM port, repeat. No surprises.
NuLink for the Nuvoton M031. Works with OpenOCD, but we still have an open issue where the probe resets the target on debug start, which means any state the code had built up before the reset is lost. We are working around it for now and will write that up separately.
GCC as the compiler, even for the Nuvoton. We used to run Keil for the M031, and we still do for some projects, but porting to GCC opened up the whole Linux/open-source debugging flow. Binaries are slightly larger, but the workflow gain is worth it.

With this combination, Claude Code can compile, flash, read live register values, read the COM port, and reason about what it sees, all inside one loop.
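To make the "read live register values" part of that loop concrete, here is a minimal Python sketch of querying a running OpenOCD instance over its Tcl RPC port. It assumes OpenOCD is listening on its default Tcl port (6666) on localhost and that memory dumps come back in the usual `mdw` shape (`0xADDR: WORD WORD ...`). The helper names `openocd_command` and `parse_mdw` are hypothetical, and the exact command needed to get the dump text back may differ between OpenOCD versions.

```python
import re
import socket

# OpenOCD's Tcl RPC server frames every command and reply with a 0x1a byte.
TCL_RPC_TERMINATOR = b"\x1a"


def parse_mdw(reply: str) -> dict:
    """Parse 'mdw'-style text ('0xADDR: WORD WORD ...') into {address: value}."""
    words = {}
    for line in reply.splitlines():
        match = re.match(r"\s*(0x[0-9a-fA-F]+):\s*((?:[0-9a-fA-F]{8}\s*)+)$", line)
        if not match:
            continue  # skip log noise or empty lines
        base = int(match.group(1), 16)
        for index, word in enumerate(match.group(2).split()):
            words[base + 4 * index] = int(word, 16)  # consecutive 32-bit words
    return words


def openocd_command(command: str, host: str = "127.0.0.1", port: int = 6666,
                    timeout: float = 2.0) -> str:
    """Send one command to the Tcl RPC port and return the raw reply text.

    Host, port and timeout are assumptions; adjust them to the local setup.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii") + TCL_RPC_TERMINATOR)
        buffer = b""
        while not buffer.endswith(TCL_RPC_TERMINATOR):
            chunk = sock.recv(4096)
            if not chunk:
                break  # server closed the connection
            buffer += chunk
    return buffer.rstrip(TCL_RPC_TERMINATOR).decode("ascii", errors="replace")
```

With a target attached, something like `parse_mdw(openocd_command('capture "mdw 0xE000ED28 2"'))` would return the two words starting at that address. Whether the `capture` wrapper is needed depends on the OpenOCD build, so treat that call as an assumption to verify locally.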

The know-how file is not optional

The single biggest quality jump came from writing a proper know-how file before touching any bug.

Large language models hallucinate aggressively in firmware work. Peripheral register names, bit positions, clock tree dependencies, interrupt priorities, vendor quirks. None of this is safe to assume. So before we debug anything, we make the AI read the architecture first.

The know-how file contains:

A map of the three MCUs and what each one owns
The inter-MCU communication protocol and timing expectations
Clock configuration and power domain dependencies
Known vendor quirks, especially for BLE and WiFi coexistence on the STM32WB55
The bug workflow itself (see below)
Any previously fixed bugs with the root cause, so the same class of issue does not come back
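One way to keep that list honest is a small pre-flight check that refuses to start a session when a section is missing. The sketch below is illustrative only: the section titles in `REQUIRED_SECTIONS` are hypothetical names for the items listed above, and it assumes the know-how file is plain text or Markdown with one heading per line.

```python
# Hypothetical section titles mirroring the list above; rename to match the real file.
REQUIRED_SECTIONS = [
    "MCU ownership map",
    "Inter-MCU protocol and timing",
    "Clocks and power domains",
    "Vendor quirks",
    "Bug workflow",
    "Fixed bugs and root causes",
]


def missing_sections(knowhow_text: str, required=REQUIRED_SECTIONS) -> list:
    """Return the required section titles that do not appear as a heading line."""
    # Normalise each line: drop Markdown '#' markers, surrounding spaces, and case.
    headings = {
        line.strip().lstrip("#").strip().lower()
        for line in knowhow_text.splitlines()
    }
    return [name for name in required if name.lower() not in headings]
```

An empty result means every expected section is present. It says nothing about whether the content is correct, which still needs a human review.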

We always instruct the AI to read the know-how file before it executes anything. If a new lesson comes up during a debug session, we write it back into the know-how file immediately. Context that lives only in a chat session disappears when the context window compacts. Context that lives in a file survives.

One bug at a time, with evidence

Early on we made the obvious mistake: hand the AI a list of five bugs and ask it to fix them in one pass. What happens is predictable in hindsight. It fixes bug one, modifies the code, then moves to bug two while still reasoning about the original code it had in its head. The fixes collide. You end up with more bugs than you started with.

Our bug workflow now enforces a strict protocol:

Analyze one bug in isolation. No code changes yet. List every related function, every caller, every dependency.
Predict the blast radius. If this fix goes in, what else could break? Which state machines touch this code path?
Find the evidence. The fix only proceeds once there is concrete proof the bug exists where we think it exists. Proof can come from three places:
- The code itself, when the logic is clearly wrong on inspection
- Debug serial prints on the COM port
- Live register values read through OpenOCD
Apply the smallest viable fix, confirm it against the same kind of evidence, and only then move to the next bug.
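The protocol above can be sketched as a small record that blocks a fix until the analysis, blast radius and evidence fields are filled in. This is an illustration of the gating idea, not our actual tooling; the class and field names are hypothetical, and the sketch assumes the blast radius is always written down explicitly, even if the entry is "nothing else affected".

```python
from dataclasses import dataclass, field
from enum import Enum


class EvidenceSource(Enum):
    # The three accepted kinds of proof from the protocol.
    CODE_INSPECTION = "code inspection"
    SERIAL_LOG = "debug serial print"
    REGISTER_READ = "OpenOCD register read"


@dataclass
class Evidence:
    source: EvidenceSource
    detail: str


@dataclass
class BugRecord:
    title: str
    related_functions: list = field(default_factory=list)
    blast_radius: list = field(default_factory=list)
    evidence: list = field(default_factory=list)

    def blockers(self) -> list:
        """List the protocol steps that are still incomplete for this bug."""
        reasons = []
        if not self.related_functions:
            reasons.append("analysis incomplete: no related functions listed")
        if not self.blast_radius:
            reasons.append("blast radius not predicted")
        if not self.evidence:
            reasons.append("no evidence that the bug is where we think it is")
        return reasons

    def ready_for_fix(self) -> bool:
        """A fix may proceed only when no blockers remain."""
        return not self.blockers()
```

Keeping one such record per bug also makes it obvious when a session is drifting toward fixing several bugs at once.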

The evidence requirement is the most important rule. The AI is instructed never to assume. If it cannot prove the bug, it goes and gathers proof. In practice this means Claude will add temporary debug prints to the firmware, recompile, flash, open the COM port for a defined window, read the output, and only then propose a fix. We have watched it catch its own wrong theories this way, which is exactly what you want.
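Opening the COM port "for a defined window" can be as simple as the sketch below. It assumes a port object whose `readline()` returns bytes and returns an empty value when its read timeout expires, which is how a pyserial `Serial` object opened with a timeout behaves. The function name `capture_window` is hypothetical.

```python
import time


def capture_window(port, seconds, clock=time.monotonic):
    """Collect decoded text lines from `port` until `seconds` have elapsed.

    `port` is any object with a readline() method that returns bytes.
    `clock` is injectable so the loop can be tested without real waiting.
    """
    deadline = clock() + seconds
    lines = []
    while clock() < deadline:
        raw = port.readline()  # empty bytes means the read timed out with no data
        if raw:
            lines.append(raw.decode("utf-8", errors="replace").rstrip("\r\n"))
    return lines
```

With pyserial installed, a call such as `capture_window(serial.Serial("COM5", 115200, timeout=0.2), seconds=10)` would capture ten seconds of output. The port name and baud rate here are placeholders, not values from our setup.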

Live register reads are the unlock

OpenOCD exposing live registers to the AI changes the nature of debugging. Hard faults, stack pointers, peripheral status bits, interrupt pending flags, all of it becomes queryable in real time. For a hard fault on the STM32, instead of guessing from a stack trace, the AI pulls the fault status registers, the faulting address, the link register, and usually lands on the root cause in one pass.
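As an illustration of what "pulling the fault status registers" involves, the sketch below decodes the Cortex-M Configurable Fault Status Register (CFSR) and HardFault Status Register (HFSR) into named flags. The addresses and bit positions are the standard ARMv7-M System Control Block definitions, so this applies to the Cortex-M4 application core of the STM32WB55; Cortex-M0 parts do not expose the same set of registers. The function names are hypothetical, and the register values are assumed to have been read separately through OpenOCD.

```python
# Standard ARMv7-M System Control Block addresses.
CFSR_ADDR = 0xE000ED28   # Configurable Fault Status Register
HFSR_ADDR = 0xE000ED2C   # HardFault Status Register
MMFAR_ADDR = 0xE000ED34  # MemManage Fault Address Register
BFAR_ADDR = 0xE000ED38   # BusFault Address Register

CFSR_BITS = {
    0: "IACCVIOL", 1: "DACCVIOL", 3: "MUNSTKERR", 4: "MSTKERR",
    5: "MLSPERR", 7: "MMARVALID",
    8: "IBUSERR", 9: "PRECISERR", 10: "IMPRECISERR", 11: "UNSTKERR",
    12: "STKERR", 13: "LSPERR", 15: "BFARVALID",
    16: "UNDEFINSTR", 17: "INVSTATE", 18: "INVPC", 19: "NOCP",
    24: "UNALIGNED", 25: "DIVBYZERO",
}
HFSR_BITS = {1: "VECTTBL", 30: "FORCED", 31: "DEBUGEVT"}


def set_flags(value, names):
    """Return the names of all bits set in `value`, lowest bit first."""
    return [name for bit, name in sorted(names.items()) if value & (1 << bit)]


def decode_fault(cfsr, hfsr, mmfar=None, bfar=None):
    """Summarise fault registers; addresses are reported only when flagged valid."""
    report = {"cfsr": set_flags(cfsr, CFSR_BITS), "hfsr": set_flags(hfsr, HFSR_BITS)}
    if cfsr & (1 << 7) and mmfar is not None:   # MMARVALID
        report["memmanage_address"] = hex(mmfar)
    if cfsr & (1 << 15) and bfar is not None:   # BFARVALID
        report["busfault_address"] = hex(bfar)
    return report
```

A report showing `FORCED` in the HFSR together with `PRECISERR` and a valid bus fault address, for example, points at a specific bad memory access rather than a vague "it crashed somewhere".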

We are adding a logic analyzer to this loop next. The goal is to give the AI decoded bus traffic alongside register state, so SPI, I2C and UART issues between the three MCUs can be diagnosed with signal-level proof instead of inference. Early experiments are promising.

Where it falls down, and how we manage it

Being honest about the failure modes matters more than the wins.

It over-engineers simple fixes. Left unchecked, Claude will refactor three files to handle a problem that needed a one-line change. We push back explicitly in the workflow: prefer the smallest viable fix, and justify anything larger. This alone cuts a lot of noise.

It gets optimistic. It will declare a bug fixed before the evidence actually supports that conclusion. We cross-check significant fixes with a second pass, sometimes using a different model, and we require the debug logs or register state to confirm the fix before we close the loop.

It forgets. After enough turns, the context window compacts and detail is lost. When we feel a session getting heavy, we fork into a new session and carry forward only the essential state plus a pointer to the know-how file. This is cheaper than fighting a confused context.

NuLink resets on debug start. Specific to the Nuvoton flow. Still open. STLink does not have this problem, which tells us the fix is probably in the OpenOCD config for NuLink. We will post the resolution once we have it verified.

A bonus that was not expected: test script generation

Once the AI has a live COM port, live registers and the know-how file, it turns out to be genuinely good at writing Python test scripts on the fly. We describe a feature we want to stress, and it produces a script that exercises the device over serial, captures the output, and reports pass or fail against defined criteria. This has collapsed a lot of our manual bench-testing time, especially during regression checks after a fix.
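The pass/fail part of such a script usually reduces to checking captured output against patterns. The sketch below is a hedged illustration of that step only, not one of the generated scripts: the `Criteria` structure and the `evaluate` function are hypothetical names, and the patterns a real test uses depend on what the firmware prints.

```python
import re
from dataclasses import dataclass


@dataclass
class Criteria:
    must_match: list      # regex patterns that have to appear in the output
    must_not_match: list  # regex patterns that must never appear


def evaluate(lines, criteria):
    """Return (passed, failures) for captured serial lines against the criteria."""
    text = "\n".join(lines)
    failures = []
    for pattern in criteria.must_match:
        if not re.search(pattern, text, re.MULTILINE):
            failures.append(f"missing: {pattern}")
    for pattern in criteria.must_not_match:
        if re.search(pattern, text, re.MULTILINE):
            failures.append(f"forbidden: {pattern}")
    return (not failures, failures)
```

Returning the list of failed patterns alongside the boolean keeps the result usable as evidence, which matters more for regression checks than a bare pass or fail.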

What this means for a firmware team

If you are running a multi-MCU product and still debugging the old way, there is a real productivity gap opening up. The combination that works for us is specific:

Claude Code inside VS Code
OpenOCD as the debug server
STLink as the primary probe, NuLink as secondary
GCC as the compiler, even on MCUs that traditionally shipped with Keil
A written know-how file that survives across sessions
A strict one-bug-at-a-time workflow with evidence requirements
Register-level and serial-level proof before any fix is accepted

None of the individual pieces are new. What is new is the discipline of wiring them together in a way the AI can actually operate inside, and the honesty about where it still fails.

For teams building hardware products with IoT and BLE stacks on STM32, Nuvoton or ESP32, this workflow is worth the week it takes to set up. It pays for itself within the first sprint.