Running Bare-Metal Rust Alongside ESP-IDF on the ESP32-S3's Second Core
Building a Hot-Swappable, Dual-Paradigm Environment on Espressif Silicon
I've been working with the RP2350 and no_std Rust for a while now, and I've really come to appreciate how Rust is designed — safe yet surprisingly straightforward. But my latest project needs Wi-Fi and BLE, and the RP2350 doesn't have wireless hardware built in. That meant switching to the ESP32-S3.
The ESP32-S3 is a great chip, but here's the catch: most Wi-Fi and Bluetooth functionality lives inside Espressif's ESP-IDF framework, a C-based SDK built on top of FreeRTOS. There are community Rust wrappers for parts of ESP-IDF, and Espressif themselves offer some Rust support, but both are moving targets — documentation is sparse compared to the mature C API, and there are always one or two critical features missing.
So I was stuck choosing between two imperfect options:
- Go all-in on Rust. I'd get the language features and crates I love, but the no_std ecosystem on the ESP32-S3 is still young. In a shipping product, I didn't want to risk hitting undefined behavior in an immature HAL at 2 AM.
- Go all-in on ESP-IDF (C). I'd get battle-tested Wi-Fi and BLE stacks, but I'd be writing C for everything — including the business logic, audio processing, and data handling where Rust really shines.
Then I remembered something: the ESP32-S3 has two CPU cores.
There's an option buried in ESP-IDF's Kconfig called CONFIG_FREERTOS_UNICORE. When you enable it, FreeRTOS only runs on Core 0. Core 1 just... sits there, stalled, doing nothing. That got me thinking: what if I let ESP-IDF own Core 0 for all the Wi-Fi, BLE, and system tasks, and then wake up Core 1 to run my own bare-metal Rust code — completely outside the RTOS?
Both cores share the same memory space, so passing data between them should be straightforward (though it does require some unsafe Rust). And since Core 1 wouldn't be managed by FreeRTOS, there'd be no scheduler preempting my time-critical audio processing loop.
After convincing myself this wasn't completely insane, I got to work. Here's how it all fits together.
Background: Why Not Just Pin a FreeRTOS Task?
Before diving in, it's worth addressing the obvious question: ESP-IDF already provides xTaskCreatePinnedToCore, which can pin a task to a specific core:
// FreeRTOS provides this function to create a task on a specific core.
// You could pin a Rust function to Core 1 this way — but FreeRTOS
// would still manage the scheduler on that core.
BaseType_t xTaskCreatePinnedToCore(
TaskFunction_t pvTaskCode, // Function that implements the task
const char * const pcName, // Human-readable name for debugging
const uint32_t usStackDepth, // Stack size in words (not bytes)
void * const pvParameters, // Arbitrary pointer passed to the task
UBaseType_t uxPriority, // Priority (higher = more CPU time)
TaskHandle_t * const pvCreatedTask, // Output: handle to the created task
const BaseType_t xCoreID // 0 = PRO core, 1 = APP core
);
You could absolutely compile your Rust code as a static library, export a pub extern "C" fn, and have FreeRTOS run it on Core 1 via this API. The ESP-IDF build system would statically link your Rust .a file into the firmware.
The problem is that FreeRTOS's scheduler is still running on Core 1. Your task can be preempted at any time by higher-priority tasks or system ticks. For a high-performance audio processing loop where every microsecond of jitter matters, that's a non-starter. I needed a guarantee that nothing would interrupt my code once it started running.
By disabling FreeRTOS on Core 1 entirely (via CONFIG_FREERTOS_UNICORE=y), we get an empty CPU that we can control directly at the hardware level — no scheduler, no context switching, no surprises.
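For reference, this is a single Kconfig option, typically placed in sdkconfig.defaults or set via idf.py menuconfig:

```
# Run FreeRTOS on Core 0 only; Core 1 comes out of reset stalled.
CONFIG_FREERTOS_UNICORE=y
```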
Part 0: Statically Linked Rust on a Bare Core
Let's start with the simpler approach: building Rust as a static library, linking it into the ESP-IDF firmware at compile time, and manually booting Core 1 to run it. This is the foundation everything else builds on.
Step 1: Reserve Memory for the Bare-Metal Core (C Side)
When Core 1 wakes up outside of FreeRTOS, it doesn't get a dynamically allocated stack from the OS — because there is no OS on that core. We need to manually set aside a chunk of RAM that ESP-IDF's heap allocator won't touch.
ESP-IDF provides the SOC_RESERVE_MEMORY_REGION macro for exactly this. It tells the bootloader and memory allocator to treat a specific address range as off-limits:
#include "heap_memory_layout.h"
// Reserve 128KB of internal SRAM for Core 1's stack and data.
// The two hex values define the start and end addresses of the reserved region.
// 0x3FCE9710 - 0x3FCC9710 = 0x20000 = 131072 bytes = 128KB.
// "rust_app" is just a label for debugging — it shows up in boot logs.
SOC_RESERVE_MEMORY_REGION(0x3FCC9710, 0x3FCE9710, rust_app);
Why 128KB? It's a reasonable default for an embedded stack plus some working memory. You can adjust this range depending on how much RAM your Rust code needs — just make sure the addresses fall within the ESP32-S3's internal SRAM region and don't overlap with anything ESP-IDF is using.
Step 2: Wake Up Core 1 from the C Side
This is the main ESP-IDF application running on Core 0. Its job is to:
- Set up the system (Wi-Fi, peripherals, etc. — or in our test case, just boot).
- Wake up Core 1 and point it at our Rust code.
- Go about its normal FreeRTOS business.
Instead of using xTaskCreatePinnedToCore, we're talking directly to the ESP32-S3's hardware registers to boot Core 1. We set a boot address, enable the clock, release the stall, and pulse the reset line. Core 1 wakes up completely independent of FreeRTOS.
To verify that everything is working, Core 0 will read a shared counter variable (RUST_CORE1_COUNTER) that the Rust code on Core 1 increments in a loop.
#include <stdio.h>
#include <stdint.h>
#include "esp_log.h"
#include "esp_cpu.h"
#include "heap_memory_layout.h"
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "soc/system_reg.h"
#include "soc/soc.h"
static const char *TAG = "rust_app_core";
// Reserve memory so ESP-IDF's heap allocator doesn't use it.
// (Same macro from Step 1 — it must appear in a compiled C file.)
SOC_RESERVE_MEMORY_REGION(0x3FCC9710, 0x3FCE9710, rust_app);
// ---- External symbols ----
// These are defined in other files and resolved at link time:
// rust_app_core_entry — the Rust function (from our .a library)
// app_core_trampoline — tiny assembly stub that sets the stack pointer
// _rust_stack_top — address from our linker script (top of reserved 128KB)
// ets_set_appcpu_boot_addr — ROM function that tells Core 1 where to start
extern void rust_app_core_entry(void);
extern void ets_set_appcpu_boot_addr(uint32_t);
extern uint32_t _rust_stack_top;
extern void app_core_trampoline(void);
/*
* Boot Core 1 by directly manipulating ESP32-S3 hardware registers.
* This bypasses FreeRTOS entirely — Core 1 will run our code with
* no scheduler, no interrupts (unless we set them up), and no OS.
*/
static void start_rust_on_app_core(void)
{
ESP_LOGI(TAG, "Starting Rust on Core 1...");
ESP_LOGI(TAG, " Stack: 0x3FCC9710 - 0x3FCE9710 (128K)");
/* 1. Tell Core 1 where to begin executing after it resets.
* This ROM function writes the address into a register that the
* CPU reads on boot. We point it at our assembly trampoline. */
ets_set_appcpu_boot_addr((uint32_t)app_core_trampoline);
/* 2. Hardware-level wake-up sequence for Core 1.
* These register writes control the clock, stall, and reset
* signals for the second CPU core. */
// Enable the clock gate — Core 1 can't run without a clock signal.
SET_PERI_REG_MASK(SYSTEM_CORE_1_CONTROL_0_REG,
SYSTEM_CONTROL_CORE_1_CLKGATE_EN);
// Clear the RUNSTALL bit. While stalled, the core is frozen mid-instruction.
CLEAR_PERI_REG_MASK(SYSTEM_CORE_1_CONTROL_0_REG,
SYSTEM_CONTROL_CORE_1_RUNSTALL);
// Pulse the reset line: assert it, then immediately de-assert.
// This causes Core 1 to reboot and jump to the address we set above.
SET_PERI_REG_MASK(SYSTEM_CORE_1_CONTROL_0_REG,
SYSTEM_CONTROL_CORE_1_RESETING);
CLEAR_PERI_REG_MASK(SYSTEM_CORE_1_CONTROL_0_REG,
SYSTEM_CONTROL_CORE_1_RESETING);
ESP_LOGI(TAG, "Core 1 released");
}
// This counter lives in the Rust code. Because it's declared with
// #[no_mangle], the C linker can find it by this exact name; an
// AtomicU32 has the same in-memory representation as a uint32_t.
extern volatile uint32_t RUST_CORE1_COUNTER;
void app_main(void)
{
ESP_LOGI(TAG, "Core 0: Starting IDF app");
// Wake up Core 1 and start the Rust code
start_rust_on_app_core();
// Core 0 continues running FreeRTOS as normal.
// Here we just monitor the shared counter to prove both cores are alive.
while (1)
{
ESP_LOGI(TAG, "Rust Core 1 counter: %lu", (unsigned long)RUST_CORE1_COUNTER);
vTaskDelay(pdMS_TO_TICKS(1000)); // Print once per second
}
}
Step 3: The Assembly Trampoline
When a CPU core wakes up from reset, it doesn't have a stack yet. And without a stack, it can't call any C or Rust functions — function calls need somewhere to store return addresses and local variables.
The ESP32-S3 uses the Xtensa instruction set architecture, where register a1 serves as the stack pointer. Our tiny assembly stub loads the address of our reserved memory into a1, then jumps into Rust. That's all it does — just two instructions.
We place this code in the .iram1 section, which maps to Internal RAM. This is important because when a core first boots, it may not have flash caching set up yet. Code in IRAM is always accessible.
app_core_trampoline.S
/*
* app_core_trampoline.S
*
* Minimal startup code for Core 1. Sets the stack pointer to our
* reserved memory region, then jumps to the Rust entry point.
*
* Placed in IRAM (.iram1) so it's available immediately after core
* reset, before flash cache is configured.
*/
.section .iram1, "ax" /* "ax" = allocatable + executable */
.global app_core_trampoline
.type app_core_trampoline, @function
.align 4 /* Xtensa requires 4-byte alignment */
app_core_trampoline:
/* Load the top of our 128KB reserved stack into register a1.
* Stacks grow downward on Xtensa, so "top" means the highest
* address — the stack will grow toward lower addresses from here. */
movi a1, _rust_stack_top
/* Jump to the Rust entry function. call0 is a "windowless" call
* (no register window rotation), suitable for bare-metal startup.
* This function never returns — it contains an infinite loop. */
call0 rust_app_core_entry
.size app_core_trampoline, . - app_core_trampoline
Step 4: Gluing It Together with CMake and a Linker Script
ESP-IDF uses CMake as its build system. We need to tell it about three extra things: our assembly file, our pre-compiled Rust library, and a custom linker script that defines where _rust_stack_top lives.
CMakeLists.txt
# Register our C source and the assembly trampoline as component sources.
# ESP-IDF builds each directory under "main/" as a "component."
idf_component_register(
SRCS "main.c" "app_core_trampoline.S"
INCLUDE_DIRS "."
)
# Tell the linker about our pre-compiled Rust static library.
# This .a file is produced by `cargo build` and copied into main/lib/.
add_prebuilt_library(rust_app "${CMAKE_CURRENT_SOURCE_DIR}/lib/libesp_rust_app.a")
# Link the Rust library into our component. INTERFACE means anything
# that depends on this component also gets the Rust symbols.
target_link_libraries(${COMPONENT_LIB} INTERFACE rust_app)
# Inject our custom linker script. This is how the assembly trampoline
# knows the numeric value of _rust_stack_top.
target_link_options(${COMPONENT_LIB}
INTERFACE "-T${CMAKE_CURRENT_SOURCE_DIR}/rust_stack.ld")
rust_stack.ld
/*
* Custom linker script fragment.
*
* Defines _rust_stack_top as the END of our reserved 128KB block.
* Stacks grow downward, so the "top" is the highest address.
* The assembly trampoline loads this value into register a1.
*/
_rust_stack_top = 0x3FCE9710;
The connection here is: the linker script provides a symbol (_rust_stack_top) → the assembly trampoline references that symbol to set the stack pointer → the C code triggers the hardware boot sequence that starts Core 1 at the trampoline.
Step 5: The Bare-Metal Rust Application
Finally, here's the code that actually runs on Core 1. It's entirely no_std — there's no operating system, no allocator, no standard library. Just raw hardware access.
The key technique here is AtomicU32. Atomics are special CPU instructions that read and write memory in a way that's safe even when two cores access the same address simultaneously. By using AtomicU32 for our shared counter, we avoid race conditions without needing a mutex (which wouldn't work easily across the OS/bare-metal boundary anyway).
The spin_loop hint tells the CPU "I'm intentionally busy-waiting" — on some architectures this reduces power consumption or yields resources to other hardware threads. Here it also serves as a simple delay so the counter doesn't overflow instantly.
// no_std: we're running without the Rust standard library.
// There's no OS below us — no heap, no threads, no println!.
#![no_std]
// no_main: we don't use Rust's normal main() entry point.
// Instead, Core 1 enters via rust_app_core_entry(), called from assembly.
#![no_main]
use core::panic::PanicInfo;
use core::sync::atomic::{AtomicU32, Ordering};
// Every no_std binary needs a panic handler. When something goes wrong
// (array out of bounds, unwrap on None, etc.), this function is called.
// On a bare-metal core with no debugger attached, there's not much we
// can do — so we just loop forever. A production system might toggle
// an LED or write to a shared error flag that Core 0 can read.
#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
loop {}
}
// The shared counter. Both cores can see this variable because it lives
// in the same memory space.
//
// #[unsafe(no_mangle)] prevents Rust from renaming this symbol during
// compilation. Without it, Rust would generate something like
// "_ZN12esp_rust_app18RUST_CORE1_COUNTER17h..." — and the C code
// wouldn't be able to find it by name.
//
// AtomicU32 ensures that reads and writes are atomic at the CPU level,
// so Core 0 will never see a "torn" (half-written) value.
#[unsafe(no_mangle)]
pub static RUST_CORE1_COUNTER: AtomicU32 = AtomicU32::new(0);
// The entry point called by the assembly trampoline after it sets
// up the stack pointer. The `-> !` return type means "this function
// never returns" — it runs an infinite loop.
//
// `extern "C"` uses the C calling convention so the assembly code
// (and the C linker) can call this function correctly.
#[unsafe(no_mangle)]
pub extern "C" fn rust_app_core_entry() -> ! {
loop {
// Atomically increment the counter by 1.
// Ordering::Relaxed means we don't need any memory ordering
// guarantees beyond the atomicity of this single operation.
// (For a simple counter, Relaxed is sufficient.)
RUST_CORE1_COUNTER.fetch_add(1, Ordering::Relaxed);
// Busy-wait loop as a simple delay. spin_loop() is a CPU hint
// that says "I'm spinning, not doing real work" — on some
// architectures this saves power or avoids starving other
// hardware threads.
for _ in 0..1_000_000 {
core::hint::spin_loop();
}
}
}
Step 6: Configuring the Rust Build (Cargo.toml)
ESP-IDF's build system expects a standard C-compatible static archive (.a file). By default, cargo build produces Rust-specific .rlib files that only the Rust toolchain understands. We need to tell Cargo to output a staticlib instead.
We also apply aggressive size optimizations — on a microcontroller with limited flash, every kilobyte matters.
Cargo.toml
[package]
edition = "2024"
name = "esp_rust_app"
rust-version = "1.88"
version = "0.1.0"
# Output a C-compatible static library (.a file).
# This is what lets us link Rust code into an ESP-IDF project
# the same way you'd link any C library.
[lib]
crate-type = ["staticlib"]
[dependencies]
# esp-hal provides low-level hardware access for the ESP32-S3.
# Even though we're not using most of its features yet, it sets up
# the critical-section implementation we need for atomics.
esp-hal = { version = "~1.0", features = ["esp32s3"] }
# Provides the critical-section implementation needed for safe
# interrupt handling in no_std environments.
critical-section = "1.2.0"
[profile.dev]
# Rust's default debug builds are unoptimized and produce huge binaries.
# On embedded, even dev builds should use "s" (optimize for size) to
# keep things manageable. Without this, you might overflow flash.
opt-level = "s"
[profile.release]
# Force the compiler to use a single codegen unit. This is slower to
# compile, but allows LLVM to see the entire crate at once and perform
# better cross-function optimizations (inlining, dead code elimination).
codegen-units = 1
debug = 2 # Keep debug symbols (useful for GDB on-device)
debug-assertions = false # Disable assert!() checks in release
incremental = false # Disable incremental compilation for cleaner builds
# "fat" Link-Time Optimization. The linker analyzes ALL code (including
# dependencies) as a single unit, aggressively removing unused functions
# and inlining across crate boundaries. This can dramatically reduce
# binary size — often 30-50% smaller than without LTO.
lto = 'fat'
opt-level = 's' # Optimize for size over speed
overflow-checks = false # Disable integer overflow checks in release
Building and Testing
Build the Rust library, then copy it into the ESP-IDF project:
# Build the Rust code targeting the ESP32-S3's Xtensa CPU.
# This produces a .a file in target/xtensa-esp32s3-none-elf/release/
cargo build --release --target xtensa-esp32s3-none-elf
# Copy the compiled library to where our CMakeLists.txt expects it.
cp target/xtensa-esp32s3-none-elf/release/libesp_rust_app.a \
/path/to/idf-project/main/lib/
Then build and flash the ESP-IDF project as usual (idf.py build flash monitor). You should see the counter incrementing on your serial monitor — proof that Core 1 is running your Rust code independently of FreeRTOS.
Part 1: Loading Rust at Runtime (Hot-Swappable Programs)
The static linking approach from Part 0 works well, but it has a limitation: the Rust code is baked into the firmware at compile time. Every time you change the Rust program, you have to rebuild the entire ESP-IDF project, re-link everything, and reflash the whole firmware.
What if the Rust program could be swapped at runtime? Imagine this: the ESP-IDF firmware acts like a bootloader, setting up the hardware environment (Wi-Fi, BLE, peripherals). The Rust program lives in its own flash partition and can be updated independently. Core 0 could even write a new Rust program to flash and reset Core 1 to run it — no full firmware rebuild required.
This is especially useful if the Rust code is user-provided content — for example, a customizable audio processing pipeline that end users can update.
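To make the idea concrete, here is a rough, untested sketch of how Core 0 might perform the swap, using ESP-IDF's standard partition API together with the same Core 1 control registers used throughout this article. The hot_swap_rust_app function and its arguments are hypothetical:

```c
// Hypothetical hot-swap routine on Core 0. Assumes new_bin/len hold a
// freshly received Rust binary, and reuses start_rust_on_app_core()
// and the partition layout described later in this article.
static void hot_swap_rust_app(const uint8_t *new_bin, size_t len)
{
    // 1. Freeze Core 1 so it stops executing the old program.
    SET_PERI_REG_MASK(SYSTEM_CORE_1_CONTROL_0_REG,
                      SYSTEM_CONTROL_CORE_1_RUNSTALL);

    // 2. Rewrite the rust_app partition with the new binary.
    const esp_partition_t *part =
        esp_partition_find_first(ESP_PARTITION_TYPE_DATA, 0x40, "rust_app");
    ESP_ERROR_CHECK(esp_partition_erase_range(part, 0, part->size));
    ESP_ERROR_CHECK(esp_partition_write(part, 0, new_bin, len));

    // 3. Re-run the boot sequence: remap the partition, re-read the
    //    4-byte entry header, and pulse Core 1's reset line.
    start_rust_on_app_core();
}
```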
To make this work, we need to change several things.
Step 1: Build Rust as a Standalone Binary
In Part 0, Cargo built a static library (.a file) that got linked into the ESP-IDF binary. Now we need Cargo to produce a standalone executable binary with its own entry point — something that can be loaded and jumped to at a specific memory address.
First, remove the [lib] section from Cargo.toml so Cargo builds a binary instead of a library:
Cargo.toml
[package]
edition = "2024"
name = "esp_rust_app"
rust-version = "1.88"
version = "0.1.0"
# No [lib] section — we want a standalone binary, not a library.
# Cargo will look for src/main.rs as the entry point.
[dependencies]
esp-hal = { version = "~1.0", features = ["esp32s3"] }
critical-section = "1.2.0"
[profile.dev]
# Even dev builds need size optimization on embedded — unoptimized Rust
# produces enormous binaries that won't fit in flash.
opt-level = "s"
[profile.release]
codegen-units = 1 # Single codegen unit for best LLVM optimization
debug = 2
debug-assertions = false
incremental = false
lto = 'fat' # Full link-time optimization across all crates
opt-level = 's' # Optimize for size
overflow-checks = false
Next, we need a .cargo/config.toml to tell the Rust toolchain how to link our binary. Since we're not linking into ESP-IDF anymore, we need to supply our own linker script and disable the standard startup code:
.cargo/config.toml
[target.xtensa-esp32s3-none-elf]
rustflags = [
"-Clink-arg=-Tlink.x", # Use our custom linker script
"-Clink-arg=-nostdlib", # Don't link the C standard library
"-Clink-arg=-nostartfiles", # Don't include default startup code
"-Clink-arg=-Wl,--no-gc-sections", # Keep all sections (don't garbage-collect)
"-Clink-arg=-Wl,--no-check-sections", # Skip section overlap checks
"-Clink-arg=-mtext-section-literals", # Xtensa-specific: inline literal pools
"-Clink-arg=-Wl,--entry=rust_app_core_entry", # Set the ELF entry point
]
[env]
[build]
# Default build target — no need to pass --target every time
target = "xtensa-esp32s3-none-elf"
[unstable]
# Build the `core` library from source for our target.
# The Xtensa target doesn't ship prebuilt standard libraries,
# so Cargo needs to compile `core` itself.
build-std = ["core"]
The Linker Script
In Part 0, the .bss (uninitialized global variables) and .data (initialized global variables) sections from our Rust code were handled by the ESP-IDF linker — they became part of the main firmware's memory layout. But now that we're building a standalone binary, we need our own linker script to tell the toolchain where everything goes.
This is a critical piece of the puzzle. The linker script defines two memory regions: FLASH_TEXT (where our code lives in flash, mapped to a virtual address via the MMU) and DRAM (our reserved 128KB of RAM from the SOC_RESERVE_MEMORY_REGION macro).
link.x
/* Declare our Rust entry function as the ELF entry point */
ENTRY(rust_app_core_entry)
MEMORY
{
/*
* FLASH_TEXT: Where our code will be mapped in the address space.
* 0x42400000 is a virtual address — the MMU will map our flash
* partition to this region at runtime (we'll set that up in C).
* 512K should be plenty for most Rust programs.
*/
FLASH_TEXT (rx) : ORIGIN = 0x42400000, LENGTH = 512K
/*
* DRAM: The 128KB block we reserved with SOC_RESERVE_MEMORY_REGION.
* This is physical SRAM that both cores can access directly.
* Our stack, .data, and .bss all live here.
*/
DRAM (rw) : ORIGIN = 0x3FCC9710, LENGTH = 128K
}
SECTIONS
{
/*
* 4-byte header at offset 0 of the binary.
* This is a simple convention: the first 4 bytes of our binary
* contain the address of rust_app_core_entry. The C bootloader
* reads this to know where to jump.
*/
.header : {
LONG(rust_app_core_entry)
} > FLASH_TEXT
/*
* Xtensa puts function literal pools (constants used by instructions)
* in .literal sections. We place the entry function's literals and
* code first to ensure they're near the beginning of the binary.
*/
.entry_lit : {
KEEP(*(.literal.rust_app_core_entry))
} > FLASH_TEXT
.entry : {
KEEP(*(.text.rust_app_core_entry))
} > FLASH_TEXT
/* All remaining code and read-only data goes into flash */
.text : {
*(.literal .literal.*) /* Xtensa literal pools */
*(.text .text.*) /* Executable code */
*(.rodata .rodata.*) /* Read-only data (strings, constants) */
} > FLASH_TEXT
/*
* .data: Initialized global/static variables.
* These live in DRAM at runtime (VMA), but their initial values
* are stored in flash (LMA). Our Rust startup code must copy
* them from flash to RAM before using them.
*
* The "AT> FLASH_TEXT" part means: "put the content in flash,
* but assign addresses as if it's in DRAM."
*/
.data : {
_data_start = .;
*(.data .data.*)
_data_end = .;
} > DRAM AT> FLASH_TEXT
_data_load = LOADADDR(.data); /* Flash address where .data content lives */
/*
* .bss: Uninitialized global/static variables.
* NOLOAD means the linker doesn't store anything in the binary for
* this section — our startup code just zeroes the region at boot.
*/
.bss (NOLOAD) : {
_bss_start = .;
*(.bss .bss.* COMMON)
_bss_end = .;
} > DRAM
/* Discard sections we don't need — saves space in the binary */
/DISCARD/ : {
*(.eh_frame) /* Exception handling frames (unused in no_std) */
*(.eh_frame_hdr)
*(.stack)
*(.xtensa.info) /* Xtensa toolchain metadata */
*(.comment) /* Compiler version strings */
}
}
Initializing .data and .bss from Rust
When our Rust code was a library linked into ESP-IDF, the IDF startup code handled copying .data from flash to RAM and zeroing .bss. Now that we're standalone, we have to do it ourselves. This must happen before any static or global variables are accessed, or we'll read garbage.
// These symbols are defined by our linker script (link.x).
// They don't contain data — their *addresses* ARE the data.
// For example, &_data_start gives us the RAM address where .data begins.
unsafe extern "C" {
static _data_start: u8; // Start of .data in RAM
static _data_end: u8; // End of .data in RAM
static _data_load: u8; // Start of .data's initial values in flash
static _bss_start: u8; // Start of .bss in RAM
static _bss_end: u8; // End of .bss in RAM
}
/// Copy .data initial values from flash to RAM, and zero .bss.
/// MUST be called before accessing any static/global variables.
unsafe fn init_sections() {
// Calculate how many bytes the .data section occupies
let data_size = &raw const _data_end as usize - &raw const _data_start as usize;
if data_size > 0 {
// Copy initial values from flash (where the linker stored them)
// to RAM (where the program expects them at runtime).
core::ptr::copy_nonoverlapping(
&raw const _data_load, // Source: flash
&raw const _data_start as *mut u8, // Destination: RAM
data_size,
);
}
// Calculate how many bytes the .bss section occupies
let bss_size = &raw const _bss_end as usize - &raw const _bss_start as usize;
if bss_size > 0 {
// Zero out .bss. C and Rust both assume uninitialized globals
// start as zero. Without this, they'd contain whatever was
// previously in RAM — likely garbage from the bootloader.
core::ptr::write_bytes(&raw const _bss_start as *mut u8, 0, bss_size);
}
}
The Updated Rust Entry Point
Since our Rust binary is no longer linked into the ESP-IDF project, we can't share global variables by name across the C/Rust boundary (there's no shared linker pass). Instead, both sides agree on a fixed memory address for the shared counter. The C side reads from that address; the Rust side writes to it.
For this demo, I'm using the start of our reserved memory region (0x3FCC9710) as the counter address. In a real system, you'd want a more structured approach — perhaps a shared header at a fixed address that defines the layout of all shared data.
// Fixed memory address for the shared counter.
// Both the C side and Rust side must agree on this address.
// We're using the very start of our reserved DRAM region.
const COUNTER_ADDR: usize = 0x3FCC9710;
// #[unsafe(link_section = ".text.rust_app_core_entry")] places this
// function in a dedicated linker section, so link.x can pin it (and
// its literal pool) to the very start of the binary.
#[unsafe(no_mangle)]
#[unsafe(link_section = ".text.rust_app_core_entry")]
pub extern "C" fn rust_app_core_entry() -> ! {
// FIRST THING: initialize .data and .bss before touching any statics.
// If we skip this, any global variable could contain garbage.
unsafe {
init_sections();
}
// Create an atomic reference to our shared counter.
// We cast the raw memory address to an AtomicU32 pointer.
// This is unsafe because we're asserting that this address is:
// 1. Valid and aligned
// 2. Not being used for anything else
// 3. Accessible by both cores
let counter = unsafe { &*(COUNTER_ADDR as *const AtomicU32) };
// Initialize the counter to zero (in case there was leftover data)
counter.store(0, Ordering::Relaxed);
loop {
// Increment the shared counter atomically
counter.fetch_add(1, Ordering::Relaxed);
// Busy-wait delay (same as before)
for _ in 0..1_000_000 {
core::hint::spin_loop();
}
}
}
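As an aside, the "shared header" idea mentioned above could look something like this. It's a hypothetical layout sketch, not part of the demo: a #[repr(C)] struct that both sides would place at the base of the reserved region, with the C side mirroring it as a struct of three uint32_t fields. The field names are illustrative, and the layout checks run on the host:

```rust
use core::mem::{align_of, offset_of, size_of};
use core::sync::atomic::AtomicU32;

/// Hypothetical shared-data header. Both sides would place this layout
/// at the base of the reserved region (0x3FCC9710); the C side mirrors
/// it with a plain struct of three uint32_t fields.
#[repr(C)]
pub struct SharedHeader {
    pub magic: AtomicU32,      // sanity marker: proves Rust booted
    pub counter: AtomicU32,    // the demo counter from this article
    pub error_flag: AtomicU32, // could be set by the panic handler
}

fn main() {
    // repr(C) fixes the field order and uses C layout rules, so the
    // offsets are predictable from both languages.
    assert_eq!(offset_of!(SharedHeader, magic), 0);
    assert_eq!(offset_of!(SharedHeader, counter), 4);
    assert_eq!(offset_of!(SharedHeader, error_flag), 8);
    assert_eq!(size_of::<SharedHeader>(), 12);
    assert_eq!(align_of::<SharedHeader>(), 4);
    println!("SharedHeader: {} bytes", size_of::<SharedHeader>());
}
```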
Step 2: Update the ESP-IDF Project to Load the Binary at Runtime
Now that our Rust code is a standalone binary instead of a linked library, the ESP-IDF side needs several changes.
Create a Flash Partition
The Rust binary needs its own partition in flash. We add a rust_app entry after the factory partition (where the main ESP-IDF firmware lives):
partitions.csv
nvs, data, nvs, 0x9000, 0x6000,
phy_init, data, phy, 0xf000, 0x1000,
factory, app, factory, 0x10000, 0x1F0000,
rust_app, data, 0x40, 0x200000, 0x80000,
The rust_app partition starts at offset 0x200000 (2MB into flash) and is 0x80000 (512KB) in size. The subtype 0x40 is an arbitrary custom value — it just needs to be something ESP-IDF doesn't already use, so we can find the partition by name and type later.
Map the Partition into Memory via the MMU
On the ESP32-S3, code in flash isn't directly executable — it needs to be mapped into the CPU's address space via the Memory Management Unit (MMU). This is normally handled automatically by ESP-IDF for the main firmware, but for our separate Rust binary, we need to do it manually.
The function below finds our rust_app partition in flash and maps it page-by-page to virtual address 0x42400000 (the same address our linker script targets). After mapping, the CPU can execute code from this region as if it were regular memory.
#include <string.h>
#include "esp_partition.h"
#include "hal/mmu_hal.h"
#include "hal/cache_hal.h"
// Virtual address where the Rust binary will be mapped.
// This MUST match the FLASH_TEXT origin in link.x.
#define RUST_VADDR 0x42400000
// Will hold the entry point address read from the binary's header
uint32_t rust_entry_addr = 0;
static void load_rust_app(void)
{
// Find the "rust_app" partition we defined in partitions.csv.
// We search by type (DATA) and subtype (0x40, our custom value).
const esp_partition_t *part =
esp_partition_find_first(ESP_PARTITION_TYPE_DATA, 0x40, "rust_app");
if (!part)
{
ESP_LOGE(TAG, "rust_app partition not found!");
return;
}
// Map the partition into the CPU's address space page by page.
// The MMU works in pages (typically 64KB on ESP32-S3), so we
// calculate how many pages we need and map each one.
uint32_t page_size = CONFIG_MMU_PAGE_SIZE;
uint32_t pages = (part->size + page_size - 1) / page_size; // Round up
uint32_t actual_mapped_size = 0;
for (uint32_t i = 0; i < pages; i++)
{
uint32_t mapped = 0;
// Map one page: virtual address → physical flash address
mmu_hal_map_region(0, MMU_TARGET_FLASH0,
RUST_VADDR + (i * page_size), // Virtual addr
part->address + (i * page_size), // Flash addr
page_size, &mapped);
actual_mapped_size += mapped;
}
// Invalidate the cache for this region so the CPU doesn't serve
// stale data from a previous mapping.
cache_hal_invalidate_addr(RUST_VADDR, part->size);
ESP_LOGI(TAG, "Rust app mapped at 0x%lx (%lu bytes, flash 0x%lx)",
(unsigned long)RUST_VADDR, (unsigned long)actual_mapped_size,
(unsigned long)part->address);
}
Update the Boot Function
The start_rust_on_app_core function now loads the Rust binary from flash before waking Core 1. It reads the entry point address from the first 4 bytes of the binary (that's the .header section from our linker script) and stores it in a global variable that the assembly trampoline will read.
static void start_rust_on_app_core(void)
{
// Step 1: Map the Rust binary from flash into the address space
load_rust_app();
// Step 2: Read the entry point from the binary's 4-byte header.
// Our linker script placed LONG(rust_app_core_entry) at offset 0,
// so the first 4 bytes at RUST_VADDR contain the function's address.
uint32_t entry = *(volatile uint32_t *)RUST_VADDR;
rust_entry_addr = entry; // Store globally for the trampoline to read
ESP_LOGI(TAG, "Rust entry at 0x%lx", (unsigned long)entry);
// Step 3: Same hardware boot sequence as before
ets_set_appcpu_boot_addr((uint32_t)app_core_trampoline);
SET_PERI_REG_MASK(SYSTEM_CORE_1_CONTROL_0_REG,
SYSTEM_CONTROL_CORE_1_CLKGATE_EN);
CLEAR_PERI_REG_MASK(SYSTEM_CORE_1_CONTROL_0_REG,
SYSTEM_CONTROL_CORE_1_RUNSTALL);
SET_PERI_REG_MASK(SYSTEM_CORE_1_CONTROL_0_REG,
SYSTEM_CONTROL_CORE_1_RESETING);
CLEAR_PERI_REG_MASK(SYSTEM_CORE_1_CONTROL_0_REG,
SYSTEM_CONTROL_CORE_1_RESETING);
ESP_LOGI(TAG, "Core 1 released");
}
Update the Main Function
Since we can no longer reference RUST_CORE1_COUNTER by name (the Rust binary isn't linked into our C project anymore), we read the counter from its known memory address directly:
// The Rust code writes its counter to this fixed address.
// Both sides must agree on this — it's defined as COUNTER_ADDR in the Rust code.
#define RUST_COUNTER_ADDR 0x3FCC9710
void app_main(void)
{
ESP_LOGI(TAG, "Core 0: Starting IDF app");
start_rust_on_app_core();
// Create a volatile pointer to the shared counter.
// "volatile" tells the C compiler: "this value can change at any time
// (because another CPU core is writing to it), so always read from
// memory — don't cache it in a register."
volatile uint32_t *counter = (volatile uint32_t *)RUST_COUNTER_ADDR;
while (1)
{
ESP_LOGI(TAG, "Rust Core 1 counter: %lu", (unsigned long)*counter);
vTaskDelay(pdMS_TO_TICKS(1000));
}
}
Update the Assembly Trampoline
The trampoline can no longer use call0 rust_app_core_entry because that symbol doesn't exist in the C project's link stage. Instead, it reads the entry address from the rust_entry_addr global variable (which start_rust_on_app_core populated) and does an indirect jump:
/*
* app_core_trampoline.S (updated for runtime loading)
*
* Same job as before: set the stack pointer, then jump to Rust.
* But now the Rust entry address isn't known at link time — it's
* stored in the rust_entry_addr global variable by the C code.
*/
.section .iram1, "ax"
.global app_core_trampoline
.type app_core_trampoline, @function
.align 4
app_core_trampoline:
/* Set up the stack pointer (same as before) */
movi a1, _rust_stack_top
/* Load the entry address from the global variable.
* movi loads the ADDRESS of rust_entry_addr into a2,
* then l32i loads the VALUE at that address into a0. */
movi a2, rust_entry_addr
l32i a0, a2, 0 /* a0 = *(rust_entry_addr) */
/* Indirect jump to the Rust entry point */
jx a0
.size app_core_trampoline, . - app_core_trampoline
Step 3: Build and Flash
Now we have two separate build steps — one for the Rust binary, one for the ESP-IDF firmware — and two separate flash steps.
Build and flash the ESP-IDF side:
# Build the ESP-IDF project (which no longer includes any Rust code)
idf.py build
# Flash the main firmware and partition table
idf.py flash
Build and flash the Rust binary:
# Build the standalone Rust binary
cargo build --release --target xtensa-esp32s3-none-elf
# Convert from ELF format to raw binary.
# The ELF file contains metadata (section headers, debug info, etc.)
# that we don't need — objcopy strips all of that and outputs just
# the raw machine code that the CPU will execute.
xtensa-esp32s3-elf-objcopy -O binary \
'target/xtensa-esp32s3-none-elf/release/esp_rust_app' \
rust_app.bin
# Flash the raw binary to the rust_app partition.
# 0x200000 is the offset we defined in partitions.csv.
esptool.py --port /dev/ttyACM0 write_flash 0x200000 rust_app.bin
The two flash steps are independent. You can update the Rust binary without rebuilding or reflashing the ESP-IDF firmware — just flash the new rust_app.bin to the same partition offset.
Verifying It Works
Open your serial monitor (idf.py monitor or any terminal at 115200 baud) and you should see output like this:
ESP-ROM:esp32s3-20210327
Build:Mar 27 2021
rst:0x1 (POWERON),boot:0x8 (SPI_FAST_FLASH_BOOT)
...
I (47) boot: Partition Table:
I (50) boot: ## Label Usage Type ST Offset Length
I (56) boot: 0 nvs WiFi data 01 02 00009000 00006000
I (62) boot: 1 phy_init RF data 01 01 0000f000 00001000
I (69) boot: 2 factory factory app 00 00 00010000 001f0000
I (75) boot: 3 rust_app Unknown data 01 40 00200000 00080000
I (82) boot: End of partition table
...
I (202) heap_init: Initializing. RAM available for dynamic allocation:
I (209) heap_init: At 3FC93BD8 len 00035B38 (214 KiB): RAM
I (214) heap_init: At 3FCE9710 len 00005724 (21 KiB): RAM
I (219) heap_init: At 3FCF0000 len 00008000 (32 KiB): DRAM
I (224) heap_init: At 600FE000 len 00001FE8 (7 KiB): RTCRAM
...
I (279) main_task: Calling app_main()
I (279) rust_app_core: Core 0: Starting IDF app
I (280) rust_app_core: Rust app mapped at 0x42400000 (524288 bytes, flash 0x200000)
I (283) rust_app_core: Rust entry at 0x42400024
I (287) rust_app_core: Core 1 released
I (291) rust_app_core: Rust Core 1 counter: 34538
I (1295) rust_app_core: Rust Core 1 counter: 12369571
I (2295) rust_app_core: Rust Core 1 counter: 24670917
I (3295) rust_app_core: Rust Core 1 counter: 36972284
I (4295) rust_app_core: Rust Core 1 counter: 49273651
There are several things to confirm in this output:
- The partition table shows our rust_app partition at offset 0x200000.
- The heap_init logs show that our reserved 128KB region (starting at 0x3FCC9710) is not listed as available for dynamic allocation — SOC_RESERVE_MEMORY_REGION worked.
- The MMU mapping succeeded — the Rust binary is mapped at 0x42400000.
- The counter is incrementing — Core 1 is alive, running Rust, and sharing data with Core 0 through the atomic counter at the agreed-upon memory address.
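The heap_init lines also confirm the reservation arithmetically: the heap resumes at 0x3FCE9710, which is exactly the reserved base plus 128KB. A quick sanity check of the addresses:

```rust
fn main() {
    const RESERVED_BASE: u32 = 0x3FCC_9710; // start of our reserved RAM (COUNTER_ADDR)
    const RESERVED_LEN: u32 = 128 * 1024;   // the 128KB we carved out

    // heap_init's next region ("At 3FCE9710" in the log) starts exactly
    // where the reserved region ends — no overlap with the heap.
    assert_eq!(RESERVED_BASE + RESERVED_LEN, 0x3FCE_9710);
    println!("heap resumes at 0x{:08X}", RESERVED_BASE + RESERVED_LEN);
}
```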
What's Next
This setup gives you the best of both worlds: ESP-IDF and FreeRTOS manage Wi-Fi, BLE, and system tasks on Core 0, while Core 1 runs your bare-metal Rust code at full speed with zero scheduler interference. Data flows between them through shared memory using atomics.
From here, there are a lot of directions you could take this: setting up interrupts on Core 1, building a proper shared memory protocol between the cores, implementing error recovery if the Rust program crashes, or even adding the ability for Core 0 to update the Rust binary over Wi-Fi and hot-restart Core 1.
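One concrete direction from that list — a proper shared-memory protocol — could start as a lock-free single-producer single-consumer ring buffer built from the same atomics as the counter. The sketch below is hypothetical and host-runnable, not part of the original project; on the target, the struct would live at a reserved address just like the counter, with Core 1 pushing and Core 0 popping:

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicUsize, Ordering};

const CAP: usize = 8; // ring capacity; a power of two keeps the math simple

// SPSC ring: `tail` is only advanced by the producer (Core 1) and `head`
// only by the consumer (Core 0), so each index has exactly one writer
// and no locks are needed — a good fit when one side has no RTOS.
struct SpscRing {
    buf: [UnsafeCell<u32>; CAP],
    head: AtomicUsize, // next slot to read
    tail: AtomicUsize, // next slot to write
}

// SAFETY: only sound under the one-producer/one-consumer discipline above.
unsafe impl Sync for SpscRing {}

impl SpscRing {
    fn new() -> Self {
        SpscRing {
            buf: std::array::from_fn(|_| UnsafeCell::new(0)),
            head: AtomicUsize::new(0),
            tail: AtomicUsize::new(0),
        }
    }

    fn push(&self, v: u32) -> bool {
        let tail = self.tail.load(Ordering::Relaxed);
        if tail.wrapping_sub(self.head.load(Ordering::Acquire)) == CAP {
            return false; // full
        }
        unsafe { *self.buf[tail % CAP].get() = v };
        // Release pairs with the consumer's Acquire load of `tail`,
        // publishing the slot write before the new index becomes visible.
        self.tail.store(tail.wrapping_add(1), Ordering::Release);
        true
    }

    fn pop(&self) -> Option<u32> {
        let head = self.head.load(Ordering::Relaxed);
        if head == self.tail.load(Ordering::Acquire) {
            return None; // empty
        }
        let v = unsafe { *self.buf[head % CAP].get() };
        self.head.store(head.wrapping_add(1), Ordering::Release);
        Some(v)
    }
}

fn main() {
    let ring = SpscRing::new();
    assert!(ring.push(1) && ring.push(2));
    assert_eq!(ring.pop(), Some(1));
    assert_eq!(ring.pop(), Some(2));
    assert_eq!(ring.pop(), None);
    for i in 0..CAP as u32 {
        assert!(ring.push(i));
    }
    assert!(!ring.push(99)); // ring is full
    println!("spsc ring ok");
}
```

The Acquire/Release pairing matters on a real dual-core part: it guarantees the data written into a slot is visible to the other core before the index update that announces it.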
The dual-core architecture of the ESP32-S3 turns out to be a surprisingly clean boundary for separating concerns — and for running two very different software paradigms side by side.