Inference performance metrics

This is an overview of the performance metrics (time per inference, RAM and ROM usage) of typical models built with Edge Impulse, covering DSP code, neural networks, and other ML blocks. It should give some guidance on which microcontroller to use for which task. Note that this page only applies to general-purpose microcontrollers; performance numbers on specialized silicon like the Eta Compute ECM3532 will look different.

Some notes:

  • The memory usage numbers exclude boot code, peripheral drivers, printf, and memory tracking functions. This is done by first compiling a basic benchmarking application and subtracting the RAM and ROM it uses from the totals.
  • The models were compiled in bare-metal mode (no RTOS) with a release profile.
  • All neural networks are 8-bit quantized, and were compiled with the Edge Impulse EON compiler.
  • On the Cortex-M4F and Cortex-M7 MCUs, CMSIS-DSP and CMSIS-NN are enabled to take advantage of the vector extensions on the platform (the Edge Impulse SDK does this automatically).
  • All DSP code uses floating point math.
  • RAM usage denotes the combined static RAM and peak heap usage. The Edge Impulse SDK frees all memory it allocated on the heap after each inference.
  • The RAM usage does not include the input buffer, which contains your raw sensor data. Depending on your device you can either keep this in RAM, or in (external) flash and page the data in (the signal_t structure has methods to do so).
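Because the RAM figures on this page leave out the input buffer, it helps to budget for it separately. The following is a minimal sketch of that arithmetic, not part of the Edge Impulse SDK. The function name, the 4 bytes per value, and the 62.5 Hz sample rate are all assumptions chosen for illustration.

```python
def input_buffer_bytes(window_ms: int, sample_rate_hz: float, axes: int,
                       bytes_per_value: int = 4) -> int:
    """Estimate the bytes needed to hold one window of raw sensor data."""
    # Samples per axis in one window (assumes a fixed sample rate)
    samples = round(window_ms / 1000 * sample_rate_hz)
    return samples * axes * bytes_per_value


# Hypothetical example: a 2 second window of 3-axis accelerometer data
# sampled at 62.5 Hz, stored as 4-byte values.
accel_bytes = input_buffer_bytes(2000, 62.5, 3)
print(accel_bytes)  # 1500
```

If the result does not fit next to the model's RAM usage, that is the case where keeping the window in flash and paging it in becomes relevant.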

Continuous gestures

The model built in the Continuous gestures tutorial. It consists of a spectral analysis DSP block (lowpass filter, FFT length 128), a neural network (33x20x10x4 fully connected layers), and an anomaly detection block (3 axes selected), analyzing 2 seconds of accelerometer data.
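To get a feel for how small this network is, the sketch below counts its parameters. It assumes that "33x20x10x4" means 33 input features followed by dense layers of 20, 10, and 4 neurons, each with one bias per neuron. The helper name is hypothetical, and the count says nothing about how the parameters are stored after quantization.

```python
def dense_param_count(layer_sizes):
    """Count (weights, biases) for a stack of fully connected layers."""
    # One weight per connection between consecutive layers
    weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    # One bias per neuron in every layer after the input
    biases = sum(layer_sizes[1:])
    return weights, biases


weights, biases = dense_param_count([33, 20, 10, 4])
print(weights, biases)  # 900 34
```

Under that reading the network has fewer than a thousand parameters, so most of the ROM listed for this model is presumably taken up by code rather than weights.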

RAM: 6.4K
ROM: 42.5K

MCU                 DSP Latency   Neural Network Latency   Anomaly Latency   Total Latency
Cortex-M0+ 48 MHz   370 ms        2 ms                     4 ms              376 ms
Cortex-M4F 80 MHz   15 ms         1 ms                     1 ms              17 ms
Cortex-M7 216 MHz   2 ms          <1 ms                    <1 ms             2 ms
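Clock speed alone does not explain the spread in DSP latency in the table above. As a rough sketch, multiplying each latency by the clock frequency gives an approximate cycle count per DSP run. This ignores flash wait states, caches, and measurement rounding, so treat the numbers as order-of-magnitude estimates. The function name is hypothetical.

```python
def approx_cycles(latency_ms: int, clock_mhz: int) -> int:
    """Rough cycle count: milliseconds times cycles per millisecond."""
    return latency_ms * clock_mhz * 1000


# DSP latency (ms) and clock speed (MHz) taken from the table above
dsp = {
    "Cortex-M0+": approx_cycles(370, 48),
    "Cortex-M4F": approx_cycles(15, 80),
    "Cortex-M7": approx_cycles(2, 216),
}

for mcu, cycles in dsp.items():
    print(f"{mcu}: ~{cycles / 1e6:.2f}M cycles")
```

Under these assumptions the Cortex-M0+ spends roughly 15 times as many cycles on the DSP block as the Cortex-M4F, which is what you would expect when floating point DSP code runs on a core without an FPU.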

Keyword spotting / scene recognition

A model similar to the one in the Recognize sounds from audio tutorial, for detecting keywords or recognizing scenes in a real-time audio stream. It consists of an MFCC DSP block (13 coefficients, 0.02 frame length / stride, FFT length 256) and a neural network (two 2D convolutional / pooling layers of 10 and 5 neurons, and two dense layers of 12 and 3 neurons), analyzing 1 second of audio data.
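The MFCC settings determine how many features the neural network receives. The sketch below assumes the frame length and stride are given in seconds (0.02 s = 20 ms) and uses the usual sliding-window count of floor((window - frame) / stride) + 1 frames. The exact framing used by the MFCC block may differ, and the function name is hypothetical.

```python
def mfcc_feature_count(window_ms: int, frame_ms: int, stride_ms: int,
                       coefficients: int) -> tuple:
    """Return (frames, total features) for a sliding-window MFCC front end."""
    # Number of complete frames that fit in the window
    frames = (window_ms - frame_ms) // stride_ms + 1
    return frames, frames * coefficients


# 1 second of audio, 20 ms frames, 20 ms stride, 13 coefficients
frames, features = mfcc_feature_count(1000, 20, 20, 13)
print(frames, features)  # 50 650
```

Working in integer milliseconds avoids floating point rounding when dividing by 0.02.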

You can disable filterbank quantization for targets with more memory. This takes ~12K more RAM, but reduces inference time significantly. To disable filterbank quantization, define the macro EIDSP_QUANTIZE_FILTERBANK=0 when compiling.

Filterbank quantization enabled

RAM: 19.6K
ROM: 47.3K

MCU                 DSP Latency   Neural Network Latency   Total Latency
Cortex-M4F 80 MHz   168 ms        57 ms                    225 ms
Cortex-M7 216 MHz   39 ms         15 ms                    54 ms

Filterbank quantization disabled

RAM: 31.7K
ROM: 47.3K

MCU                 DSP Latency   Neural Network Latency   Total Latency
Cortex-M4F 80 MHz   284 ms        57 ms                    341 ms
Cortex-M7 216 MHz   61 ms         15 ms                    76 ms

Continuous audio inferencing

See Continuous audio sampling to run real-time audio classification multiple times a second, even on the Cortex-M4F listed above.
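As a back-of-the-envelope check on "multiple times a second", the sketch below turns a total latency into an upper bound on classifications per second. It assumes inferences run back to back and that audio sampling happens in the background at no extra cost, so real throughput will be lower. The function name is hypothetical.

```python
def max_classifications_per_second(total_latency_ms: float) -> float:
    """Upper bound on classifications per second for back-to-back inference."""
    return 1000.0 / total_latency_ms


# Total latencies from the "Filterbank quantization enabled" table above
latencies_ms = {"Cortex-M4F 80MHz": 225, "Cortex-M7 216MHz": 54}

for mcu, latency_ms in latencies_ms.items():
    rate = max_classifications_per_second(latency_ms)
    print(f"{mcu}: up to {rate:.1f} classifications per second")
```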

Image recognition (32x32 grayscale)

A model similar to the one built in the Adding sight to your sensors tutorial. It takes a 32x32 grayscale input image and was trained with the MobileNetV2 0.05 transfer learning block, followed by two dense layers of 10 and 3 neurons, analyzing a single image.

RAM: 70.2K
ROM: 164.2K

MCU                 Neural Network Latency
Cortex-M4F 80 MHz   186 ms
Cortex-M7 216 MHz   39 ms
Cortex-M7 480 MHz   13 ms

Image recognition (96x96 color)

A model similar to the one built in the Adding sight to your sensors tutorial. It takes a 96x96 RGB input image and was trained with the MobileNetV2 0.35 transfer learning block, followed by two dense layers of 10 and 3 neurons, analyzing a single image.

RAM: 297.0K
ROM: 577.5K

MCU                 Neural Network Latency
Cortex-M7 480 MHz   140 ms
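The RAM and ROM figures on this page can be used for a quick first-pass check of which models might fit on a given microcontroller. The sketch below is one way to do that. The memory budget in the example is a hypothetical device, the names are made up for illustration, and the figures still exclude the input buffer and your own application code.

```python
from dataclasses import dataclass


@dataclass
class Footprint:
    ram_k: float  # RAM usage in K, as listed on this page
    rom_k: float  # ROM usage in K, as listed on this page


# Figures copied from the sections above
MODELS = {
    "continuous gestures": Footprint(6.4, 42.5),
    "keyword spotting (filterbank quantization enabled)": Footprint(19.6, 47.3),
    "image recognition 32x32 grayscale": Footprint(70.2, 164.2),
    "image recognition 96x96 color": Footprint(297.0, 577.5),
}


def models_that_fit(free_ram_k: float, free_rom_k: float) -> list:
    """List the models whose RAM and ROM usage fit in the given budget."""
    return [name for name, fp in MODELS.items()
            if fp.ram_k <= free_ram_k and fp.rom_k <= free_rom_k]


# Hypothetical device with 128K RAM and 512K flash left for the model
print(models_that_fit(128, 512))
```

With this hypothetical budget, everything except the 96x96 color model passes the check.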

