ML Inference
The oveRTOS inference module provides a portable C API for loading pre-trained `.tflite` FlatBuffer models and running inference via LiteRT for Microcontrollers (formerly TensorFlow Lite Micro, TFLM). The same model binary runs unchanged across all four oveRTOS backends — FreeRTOS, Zephyr, NuttX, and POSIX. Models are typically compiled into flash as `const` C arrays using `xxd` or `flatc`.
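For illustration, `xxd -i model.tflite` emits a flash-resident byte array like the one below (array name and contents here are placeholders, not from the oveRTOS tree — a real FlatBuffer starts with a 4-byte root offset followed by the `TFL3` file identifier):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical output of `xxd -i model.tflite`, heavily abbreviated.
 * Bytes 0-3 are the FlatBuffer root offset; bytes 4-7 spell "TFL3". */
const unsigned char my_model_data[] = {
    0x1c, 0x00, 0x00, 0x00, 0x54, 0x46, 0x4c, 0x33, /* ....TFL3 */
    /* ... remaining model bytes ... */
};
const unsigned int my_model_data_len = sizeof(my_model_data);
```

The array and its length are then referenced from `struct ove_model_config`.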
Pipeline
```mermaid
graph LR
    LOAD["Load model<br/><small>ove_model_create() / init()</small>"]
    CREATE["Interpreter ready<br/><small>arena allocated, ops registered</small>"]
    INPUT["Populate input tensor<br/><small>write to info.data pointer</small>"]
    INVOKE["ove_model_invoke()"]
    OUTPUT["Read output tensor<br/><small>read from info.data pointer</small>"]
    AGAIN["invoke() again<br/><small>new input data</small>"]
    LOAD --> CREATE --> INPUT --> INVOKE --> OUTPUT --> AGAIN
    AGAIN --> INPUT
    style LOAD fill:#4a9,stroke:#333,color:#fff
    style CREATE fill:#48b,stroke:#333,color:#fff
    style INPUT fill:#48b,stroke:#333,color:#fff
    style INVOKE fill:#666,stroke:#333,color:#fff
    style OUTPUT fill:#a54,stroke:#333,color:#fff
    style AGAIN fill:#666,stroke:#333,color:#fff
```
Tensor Types
| Enum Value | Description |
|---|---|
| `OVE_TENSOR_FLOAT32` | 32-bit IEEE 754 float — full-precision models |
| `OVE_TENSOR_INT8` | Signed 8-bit integer — standard quantised MCU models |
| `OVE_TENSOR_UINT8` | Unsigned 8-bit integer — legacy TFLite quantisation |
| `OVE_TENSOR_INT16` | Signed 16-bit integer — higher-precision quantisation |
| `OVE_TENSOR_INT32` | Signed 32-bit integer — accumulator and index tensors |
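As a quick sketch of how `OVE_TENSOR_INT8` values relate to real numbers: TFLite uses affine quantisation, `real = scale * (q - zero_point)`. The scale and zero-point below are made-up example values — every quantised tensor carries its own parameters:

```c
#include <assert.h>
#include <stdint.h>

/* Dequantise: recover the approximate real value from an int8 code. */
static float dequantize_int8(int8_t q, float scale, int32_t zero_point)
{
    return scale * (float)(q - zero_point);
}

/* Quantise: round to the nearest code and saturate to the int8 range. */
static int8_t quantize_int8(float real, float scale, int32_t zero_point)
{
    float scaled = real / scale;
    int32_t q = (int32_t)(scaled + (scaled >= 0.0f ? 0.5f : -0.5f)) + zero_point;
    if (q < -128) q = -128;
    if (q > 127)  q = 127;
    return (int8_t)q;
}
```

With `scale = 1/128` and `zero_point = 0`, the real value `0.5` maps to the code `64`, and values outside roughly ±1 saturate at ±127/-128.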
Model Lifecycle
```mermaid
stateDiagram-v2
    [*] --> INIT: create() / init()
    INIT --> READY: model parsed,<br/>arena allocated,<br/>tensors mapped
    READY --> INVOKED: invoke()
    INVOKED --> READY: read outputs,<br/>write new inputs
    INVOKED --> INVOKED: invoke() again<br/>with same session
    READY --> [*]: destroy() / deinit()
```
- **INIT** — `ove_model_create()` or `ove_model_init()` parses the FlatBuffer, allocates the tensor arena, and registers TFLM operators. Fails with `OVE_ERR_ML_FAILED` if the model is malformed or the arena is too small.
- **READY** — The session is ready. Call `ove_model_input()` to get a pointer into the tensor arena, write input data there, then call `ove_model_invoke()`.
- **INVOKED** — Inference has run. Call `ove_model_output()` to read results. The session can be reused — write new inputs and call `invoke()` again without re-creating the model.
API Reference
| Function | Signature | Description |
|---|---|---|
| `ove_model_init` | `(model, storage, arena, cfg) → int` | Initialise using caller-supplied storage and arena; no heap allocation |
| `ove_model_deinit` | `(model) → void` | Release resources; static storage and arena are not freed |
| `ove_model_create` | `(model, cfg) → int` | Allocate and initialise from heap (or static per-call-site in zero-heap mode) |
| `ove_model_destroy` | `(model) → void` | Destroy and free a heap-allocated model session |
| `ove_model_invoke` | `(model) → int` | Run inference on the currently populated input tensors |
| `ove_model_input` | `(model, index, info) → int` | Get a descriptor for the input tensor at `index`; write to `info->data` before invoking |
| `ove_model_output` | `(model, index, info) → int` | Get a descriptor for the output tensor at `index`; read from `info->data` after invoking |
| `ove_model_last_inference_us` | `(model) → uint64_t` | Return the duration of the last `invoke()` in microseconds (requires `CONFIG_OVE_TIME`) |
Model Config Struct
```c
struct ove_model_config {
    const void *model_data; /* pointer to .tflite FlatBuffer (typically in flash) */
    size_t model_size;      /* size of model_data in bytes */
    size_t arena_size;      /* tensor arena size in bytes — must fit all layers */
};
```
`arena_size` controls how much memory is reserved for intermediate tensors. The required size depends on the model architecture. Use the TFLite interpreter's profiling output, or start with a generous estimate and shrink it until `ove_model_create()` fails, then step back up to the smallest size that still succeeds. For CMSIS-NN kernels, the arena should be 16-byte aligned.
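A minimal sketch of declaring a 16-byte-aligned static arena (the `32768` figure is only a starting point to tune per model; C11 `_Alignas` is used here, and GCC/Clang's `__attribute__((aligned(16)))` is equivalent):

```c
#include <assert.h>
#include <stdint.h>

#define ARENA_SIZE 32768

/* 16-byte alignment satisfies CMSIS-NN's requirements for its
 * optimised kernels; the arena itself is just raw bytes. */
static _Alignas(16) uint8_t tensor_arena[ARENA_SIZE];
```

The buffer and `sizeof(tensor_arena)` are then passed via `ove_model_init()` and `cfg.arena_size`.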
Tensor Info Struct
```c
struct ove_tensor_info {
    void *data;                /* pointer into the tensor arena — valid for model lifetime */
    size_t size;               /* total size of tensor data in bytes */
    enum ove_tensor_type type; /* element type (FLOAT32, INT8, etc.) */
    unsigned int ndims;        /* number of dimensions */
    int dims[5];               /* shape, e.g. {1, 49, 40, 1} for a spectrogram */
};
```
Write input data directly to `info->data` before calling `ove_model_invoke()`. Read output data from `info->data` after the call returns. The pointer remains valid for the lifetime of the model session.
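The relationship between `dims`, `ndims`, and `size` can be sketched with a stand-in struct (this mirrors, but is not, the oveRTOS definition): the element count is the product of the dimensions, and `size` is that count times the element width.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the shape fields of ove_tensor_info, for illustration only. */
struct tensor_shape {
    unsigned int ndims;
    int dims[5];
};

/* Total element count = product of all dimensions. */
static size_t tensor_elements(const struct tensor_shape *t)
{
    size_t n = 1;
    for (unsigned int i = 0; i < t->ndims; i++)
        n *= (size_t)t->dims[i];
    return n;
}
```

For the spectrogram shape `{1, 49, 40, 1}` with an int8 element type, this gives 1960 elements and therefore 1960 bytes in `size`.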
Allocation Strategies
Static (zero-heap) — init / deinit:
Supply a caller-allocated storage struct and arena buffer. Both must remain valid for the session lifetime. The arena should be 16-byte aligned for CMSIS-NN.
```c
static ove_model_storage_t model_storage;
static uint8_t __attribute__((aligned(16))) arena[32768];

ove_model_t model;
struct ove_model_config cfg = {
    .model_data = my_model_data,
    .model_size = my_model_data_size,
    .arena_size = sizeof(arena),
};
ove_model_init(&model, &model_storage, arena, &cfg);
```
Heap / unified — create / destroy:
`ove_model_create()` works in both heap and zero-heap modes. In heap mode it allocates storage and arena from the RTOS heap. In zero-heap mode it expands to a macro that generates per-call-site static storage; `arena_size` must be a compile-time constant.
```c
ove_model_t model;
ove_model_create(&model, &cfg);
/* ... run inference ... */
ove_model_destroy(model);
```
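One way a zero-heap `create()` macro can work is by declaring a static buffer inside each expansion, so every call site gets its own storage. The macro below is a simplified stand-alone illustration of that trick, not the actual oveRTOS macro:

```c
#include <stddef.h>

/* Each expansion declares its own static buffer inside a block scope,
 * so two call sites yield two distinct buffers — no heap involved.
 * SIZE must be a compile-time constant, which is exactly why zero-heap
 * mode requires a constant arena_size. */
#define GET_STATIC_BUFFER(out, SIZE)     \
    do {                                 \
        static unsigned char _buf[SIZE]; \
        (out) = _buf;                    \
    } while (0)
```

Invoking `GET_STATIC_BUFFER` twice in a function produces two separate static arrays, one per expansion.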
Example: Keyword Detection with Audio Preprocessor and Classifier
The example_keyword_live application runs a two-stage inference pipeline on live DMIC audio. Stage 1 converts 30ms audio windows into log-mel spectral features; stage 2 classifies the 49-frame spectrogram as silence, unknown, "yes", or "no".
```mermaid
graph LR
    AUDIO["1 second of<br/>16kHz PCM audio<br/><small>from ring buffer</small>"]
    PREPROC["Audio Preprocessor<br/><small>int8 model<br/>480-sample windows x 49<br/>produces 49 x 40 spectral features</small>"]
    FEATS["int8 feature map<br/><small>49 x 40 bytes</small>"]
    CLASSIFY["Keyword Classifier<br/><small>micro_speech int8 model<br/>outputs: silence, unknown, yes, no</small>"]
    RESULT["Detection Result<br/><small>label + confidence</small>"]
    AUDIO --> PREPROC --> FEATS --> CLASSIFY --> RESULT
    style AUDIO fill:#4a9,stroke:#333,color:#fff
    style PREPROC fill:#48b,stroke:#333,color:#fff
    style FEATS fill:#666,stroke:#333,color:#fff
    style CLASSIFY fill:#48b,stroke:#333,color:#fff
    style RESULT fill:#a54,stroke:#333,color:#fff
```
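The window arithmetic behind the 49-frame spectrogram checks out as follows (the 20 ms stride is inferred from the 320-sample step in the example code below):

```c
#include <assert.h>

/* At 16 kHz: a 30 ms window is 480 samples and a 20 ms stride is
 * 320 samples. 49 windows span (49 - 1) * 320 + 480 = 15840 samples,
 * which fits inside the 1-second (16000-sample) capture buffer. */
enum {
    SAMPLE_RATE_HZ = 16000,
    WINDOW_SAMPLES = 480, /* 30 ms */
    STRIDE_SAMPLES = 320, /* 20 ms */
    NUM_FRAMES     = 49,
    SPAN_SAMPLES   = (NUM_FRAMES - 1) * STRIDE_SAMPLES + WINDOW_SAMPLES,
};
```

Each of the 49 windows yields one 40-byte spectrogram row, giving the 49 × 40 int8 feature map consumed by the classifier.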
```c
#include "ove/ove.h"
#include "ove/infer.h"

#include <string.h> /* memcpy */

#define ARENA_SIZE 32768

/* Static storage reused for both model stages */
static ove_model_storage_t model_storage;
static uint8_t __attribute__((aligned(16))) arena[ARENA_SIZE];

/* Stage 1: raw audio → spectral features (int8) */
static int generate_features(const int16_t *audio, unsigned int len,
                             int8_t *features_out)
{
    struct ove_model_config cfg = {
        .model_data = g_audio_preprocessor_int8_model_data,
        .model_size = g_audio_preprocessor_int8_model_data_size,
        .arena_size = ARENA_SIZE,
    };
    ove_model_t preproc;

    /* 49 windows of 480 samples at a 320-sample stride */
    if (len < 48 * 320 + 480)
        return OVE_ERR_ML_FAILED;

    if (ove_model_init(&preproc, &model_storage, arena, &cfg) != OVE_OK)
        return OVE_ERR_ML_FAILED;

    struct ove_tensor_info in_info, out_info;
    ove_model_input(preproc, 0, &in_info);
    ove_model_output(preproc, 0, &out_info);

    /* Run one 30 ms window per spectrogram row */
    for (unsigned int frame = 0; frame < 49; frame++) {
        memcpy(in_info.data, audio + frame * 320, 480 * sizeof(int16_t));
        ove_model_invoke(preproc);
        memcpy(features_out + frame * 40, out_info.data, 40);
    }

    ove_model_deinit(preproc);
    return OVE_OK;
}

/* Stage 2: spectral features → keyword label */
static int classify(const int8_t *features, const char **label_out)
{
    static const char *labels[] = { "silence", "unknown", "yes", "no" };
    struct ove_model_config cfg = {
        .model_data = g_micro_speech_quantized_model_data,
        .model_size = g_micro_speech_quantized_model_data_size,
        .arena_size = ARENA_SIZE,
    };
    ove_model_t classifier;

    if (ove_model_init(&classifier, &model_storage, arena, &cfg) != OVE_OK)
        return OVE_ERR_ML_FAILED;

    struct ove_tensor_info in_info, out_info;
    ove_model_input(classifier, 0, &in_info);
    ove_model_output(classifier, 0, &out_info);

    memcpy(in_info.data, features, 49 * 40);
    ove_model_invoke(classifier);

    /* Find the highest-scoring class */
    const int8_t *scores = (const int8_t *)out_info.data;
    int best = 0;
    for (int i = 1; i < 4; i++)
        if (scores[i] > scores[best])
            best = i;
    *label_out = labels[best];

    OVE_LOG_INF("Inference time: %llu us",
                (unsigned long long)ove_model_last_inference_us(classifier));

    ove_model_deinit(classifier);
    return OVE_OK;
}
```
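If the classifier's output is a quantised softmax with scale 1/256 and zero-point -128 — typical for micro_speech-style int8 models, but check your model's actual quantisation parameters — the raw int8 scores can be mapped to probabilities for the "confidence" field:

```c
#include <assert.h>
#include <stdint.h>

/* Dequantise an int8 softmax output, assuming scale = 1/256 and
 * zero_point = -128. These are assumed values, not guaranteed. */
static float score_to_probability(int8_t score)
{
    return ((float)score + 128.0f) / 256.0f;
}
```

Under this assumption a score of -128 means probability 0, 0 means 0.5, and 127 means roughly 0.996.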
Kconfig Options
| Option | Default | Description |
|---|---|---|
| `CONFIG_OVE_INFER` | `n` | Enable the ML inference subsystem (requires LiteRT/TFLM) |
| `CONFIG_OVE_INFER_CMSIS_NN` | `n` | Use CMSIS-NN optimised kernels on Arm Cortex-M (requires CMSIS-NN library) |
| `CONFIG_OVE_INFER_ARENA_SIZE` | `32768` | Default tensor arena size in bytes when using the unified `create()` macro without an explicit config |
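A configuration fragment enabling the subsystem with CMSIS-NN kernels on Cortex-M might look like the following (the 64 KiB arena value is illustrative, not a recommendation):

```
CONFIG_OVE_INFER=y
CONFIG_OVE_INFER_CMSIS_NN=y
CONFIG_OVE_INFER_ARENA_SIZE=65536
```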
Headers
| Header | Contents |
|---|---|
| `ove/infer.h` | `ove_tensor_type` enum, `ove_tensor_info` struct, `ove_model_config` struct, all model lifecycle functions |
| `ove/storage.h` | `ove_model_storage_t` opaque type (selected per backend) |