Performance calibration

Performance calibration allows you to test, fine-tune, and simulate running event detection models using continuous real-world or synthetically generated streams of data. It is designed to provide an immediate understanding of how your model is expected to perform in the field.

Currently only available for Audio data projects

Performance calibration is currently only available for projects that contain audio data. It's designed for use with projects that are detecting specific events (such as spoken keywords), as opposed to classifying ambient conditions. Please stay tuned for future information on support for other types of sensor data!

What does Performance Calibration do?

Performance Calibration is a tool for testing and configuring embedded machine learning pipelines for event detection. It provides insight into how your pipeline will perform on streaming data, which is what your application will encounter in the real world. It works within Studio, and does not require you to deploy to a physical device.

After testing is complete, you can use Performance Calibration to configure a post-processing algorithm that will interpret the output of your ML pipeline, transforming it into a stream of actionable events. The results of testing are used to help guide selection of the optimal post-processing algorithm for your use case.

For example, a developer working on a keyword spotting application could use Performance Calibration to understand how well their ML pipeline detects keywords in a sample of real world audio, and to select the post-processing algorithm that provides the best quality output.

Understanding Post-processing:

Post-processing is the technique used to refine the raw outputs from your impulse, transforming them into actionable insights. Here’s how it’s done:

Averaging Scores Over a Window: Before any decisions are made, the model’s output scores are averaged over a specified duration to smooth out any abrupt fluctuations.

Applying a Threshold: Only the top score, after averaging, is considered. If this score surpasses a predetermined threshold, it indicates the presence of a detected event.

Suppression Period: After a positive event detection, there's a period where any subsequent detections are temporarily ignored. This avoids rapid repeated detections of the same event and reduces false positives.
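
The three steps above can be sketched in Python as follows. This is a minimal illustration, not the code that Studio or the deployed library runs: the names `PostProcessingConfig` and `detect_events`, the frame format, and the way noise labels are skipped are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class PostProcessingConfig:
    # Hypothetical field names for the three parameters described above.
    averaging_window_ms: int
    detection_threshold: float
    suppression_ms: int


def detect_events(frames, noise_labels, config):
    """Turn raw per-frame scores into a list of (timestamp_ms, label) events.

    frames: iterable of (timestamp_ms, {label: score}) in time order.
    """
    window = deque()                     # frames inside the averaging window
    suppressed_until = float("-inf")     # end of the current suppression period
    events = []
    for timestamp_ms, scores in frames:
        window.append((timestamp_ms, scores))
        # Drop frames that have fallen out of the averaging window.
        while len(window) > 1 and window[0][0] <= timestamp_ms - config.averaging_window_ms:
            window.popleft()
        # Step 1: average each label's score over the window.
        averaged = {
            label: sum(s[label] for _, s in window) / len(window)
            for label in scores
        }
        # Step 2: only the top averaged score is compared to the threshold.
        top_label = max(averaged, key=averaged.get)
        if top_label in noise_labels:
            continue
        if averaged[top_label] < config.detection_threshold:
            continue
        # Step 3: ignore detections during the suppression period.
        if timestamp_ms < suppressed_until:
            continue
        events.append((timestamp_ms, top_label))
        suppressed_until = timestamp_ms + config.suppression_ms
    return events
```

A longer averaging window or a higher threshold makes the sketch less sensitive (fewer false activations, more missed events), while a longer suppression period reduces repeated detections of the same event.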

Post-processing algorithm

The post-processing algorithm has a configurable set of parameters that determine the overall performance of the pipeline. These parameters can be adjusted to control the trade-off between false acceptance rate (how often an event is mistakenly detected) and false rejection rate (how often an event is mistakenly ignored). This allows you to determine how sensitive your impulse is to inputs.

Your impulse can be tailored using a specific post-processing configuration. This configuration can be adjusted to minimize either false activations (False Acceptance Rate, FAR) or false rejections (False Rejection Rate, FRR). The UI provides a chart showing a range of recommended configurations. Once you find a configuration that meets your criteria, you can save it so that it is applied whenever the impulse is deployed.

The bottom left corner of the graph represents the best possible performance, where both FAR and FRR are zero. The top right corner represents the worst possible performance, where both FAR and FRR are one.

Each configuration is described by two metrics (mean FAR and mean FRR) and by the three parameters that control the post-processing algorithm:

Mean FAR (False Acceptance Rate):

The percentage of times the system wrongly identified an event.

Mean FRR (False Rejection Rate):

The percentage of times the system failed to identify an event.

Averaging window duration:

The duration over which the model’s output scores are averaged to smooth out any abrupt fluctuations.

Detection threshold:

The minimum score required to indicate the presence of a detected event.

Suppression period:

The duration after a positive event detection where any subsequent detections are temporarily ignored. This avoids rapid repeated detections of the same event and reduces false positives.

Why is it useful?

Performance Calibration gives you an accurate prediction of how your ML pipeline will perform when it is deployed in the real world. Analyzing real world performance before deployment in the field allows you to iterate on your pipeline much more quickly, helping you identify and solve common performance issues much earlier in the process.

Interpreting the output of an ML pipeline on streaming data requires a post-processing algorithm, which edge ML developers have traditionally had to write and tune by hand, balancing the trade-off between false positives and false negatives to fit their particular use case. By quantifying and automating this process, Performance Calibration gives developers precise control over the trade-offs they select for their application.

How does it work?

Performance can be measured using either recordings of real-world data, or with realistic synthetic recordings generated using samples from your test dataset. This allows you to easily test your model’s performance under various scenarios, such as varying levels of background noise, or with different environmental sounds that might occur in your deployment environment.

When Performance Calibration runs, your ML pipeline is run across the input data with the same latency as is predicted for the target selected on the Dashboard page of your project. This results in a set of raw predictions which must be filtered by a post-processing algorithm to produce a signal every time a particular event class is detected.

The post-processing algorithm has configurable parameters that determine the overall performance of the pipeline. These parameters can be adjusted to control the trade-off between false acceptance rate (how often an event is mistakenly detected) and false rejection rate (how often an event is mistakenly ignored). This allows you to determine how sensitive your application is to inputs.

False positives and false negatives

No ML model is perfect, so developers using ML for event detection always need to pick a trade-off between false positives and false negatives. The appropriate trade-off depends on the application. For example, if you're attempting to detect a dangerous situation in an industrial facility, it may be important to minimize false negatives. On the other hand, if you're concerned about annoying users with unintentional activations of a smart home device, you may wish to minimize false positives.
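
As a rough illustration of this trade-off, the sketch below picks the configuration with the lowest FRR among those that stay within a FAR budget. In Studio you make this choice by clicking a suggested config on the chart; the function `pick_config`, the dictionary keys, and the numbers in the usage example are hypothetical and only serve to show the reasoning.

```python
def pick_config(suggested, max_far):
    """Return the config with the lowest FRR whose FAR is within budget.

    suggested: list of dicts with 'far' and 'frr' values in [0, 1].
    Returns None when no config satisfies the FAR budget.
    """
    eligible = [c for c in suggested if c["far"] <= max_far]
    if not eligible:
        return None
    # Prefer the lowest FRR; break ties with the lowest FAR.
    return min(eligible, key=lambda c: (c["frr"], c["far"]))


# Made-up example values, not output from a real project.
suggested = [
    {"name": "strict", "far": 0.01, "frr": 0.30},
    {"name": "balanced", "far": 0.05, "frr": 0.12},
    {"name": "sensitive", "far": 0.20, "frr": 0.03},
]
choice = pick_config(suggested, max_far=0.05)
```

A smart home device that must not annoy users would use a small `max_far`; a safety application that must not miss events could instead cap FRR and minimize FAR by swapping the two keys.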

The rest of this page walks through the process of using Performance Calibration with an example project. Check out our blog post for more information!

Test configuration

First, make sure you have an audio project in your Edge Impulse account. No projects yet? Follow one of our tutorials to get started: Recognizing sounds from audio or Keyword spotting.

Or, clone the "Bird sound classifier" project that is used in this documentation to your Edge Impulse account: https://studio.edgeimpulse.com/public/16060/latest

Once you've trained your impulse, select the Performance calibration tab and set your testing configuration settings:

  1. Select noise labels. Which label is used to represent generic background noise or "silence"?

  2. Select any other labels that should be ignored by your application, i.e. other classes that are equivalent to background noise or "silence".

  3. Then, click Run test.

Simulated real world audio

Simulated real world audio is a synthetically generated audio stream consisting of samples taken from your testing dataset and layered on top of artificial background noise. For free Edge Impulse projects, you can choose to generate either 10 minutes or 30 minutes of simulated real world audio.
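
To make the idea concrete, here is a small sketch of how such a stream could be assembled: clips are layered onto low-amplitude noise at known positions, and those positions are kept as ground truth. This is not how Studio generates its audio. The function `simulate_stream`, the uniform noise, and the fixed gap between clips are assumptions chosen to keep the example short.

```python
import random


def simulate_stream(samples, duration_s, sample_rate,
                    noise_amplitude=0.01, gap_s=1.0, seed=0):
    """Layer labeled clips on top of artificial background noise.

    samples: non-empty list of (label, [float, ...]) clips.
    Returns (stream, ground_truth), where ground_truth is a list of
    (start_s, end_s, label) tuples recording where each clip was placed.
    """
    rng = random.Random(seed)
    total = int(duration_s * sample_rate)
    # Artificial background noise (assumption: uniform, low amplitude).
    stream = [rng.uniform(-noise_amplitude, noise_amplitude) for _ in range(total)]
    ground_truth = []
    gap = int(gap_s * sample_rate)
    cursor = gap
    while True:
        label, clip = rng.choice(samples)
        if cursor + len(clip) > total:
            break  # the chosen clip no longer fits in the stream
        # Layer the clip on top of the noise.
        for i, value in enumerate(clip):
            stream[cursor + i] += value
        ground_truth.append(
            (cursor / sample_rate, (cursor + len(clip)) / sample_rate, label)
        )
        cursor += len(clip) + gap
    return stream, ground_truth
```

Because the placement of every clip is known, each detection can later be compared against the ground truth to count true and false positives.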

Upload your own in a zip file

Select and save a config

Your impulse can be configured with a post-processing algorithm that will minimize either false activations or false rejections. The chart shows a range of suggested configs. If you save one, it will be used when your impulse is deployed.

Selected config

  • Mean FAR: The mean False Acceptance Rate. Measures how often labels are mistakenly detected. Does not include statistics for noise labels.

  • Mean FRR: The mean False Rejection Rate. Measures how often events are mistakenly missed. Does not include statistics for noise labels.

  • Averaging window duration (ms): The raw inference results are averaged across this length of time.

  • Detection threshold: A class is considered a positive match when its averaged score exceeds this threshold.

  • Suppression period (ms): Matches are ignored for this length of time following a positive result.

Performance overview

Shows the performance statistics for each label.

  • FAR: False Acceptance Rate. Measures how often a label is mistakenly detected.

  • FRR: False Rejection Rate. Measures how often a label is mistakenly missed.

  • True Positives: The number of times each label was correctly triggered.

  • False Positives: The number of times each label was incorrectly triggered.

  • True Negatives: The number of times each label was correctly not triggered.

  • False Negatives: The number of times each label was incorrectly not triggered.

False acceptance rate and false rejection rate

FAR is also sometimes known as the False Positive Rate, and FRR as the False Negative Rate. These industry-standard metrics are calculated as follows:

$$
\text{False Acceptance Rate} = \frac{\text{false positives}}{\text{total negatives in dataset}}
$$

$$
\text{False Rejection Rate} = \frac{\text{false negatives}}{\text{total positives in dataset}}
$$
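
As a worked example, the two rates can be computed from the per-label counts shown in the performance overview. The sketch assumes that the total negatives are false positives plus true negatives and that the total positives are false negatives plus true positives. The function names and the `mean_rate` helper, which skips noise labels as described for Mean FAR and Mean FRR, are illustrative.

```python
def false_acceptance_rate(false_positives, true_negatives):
    """FAR = false positives / total negatives in dataset."""
    total_negatives = false_positives + true_negatives
    return false_positives / total_negatives if total_negatives else 0.0


def false_rejection_rate(false_negatives, true_positives):
    """FRR = false negatives / total positives in dataset."""
    total_positives = false_negatives + true_positives
    return false_negatives / total_positives if total_positives else 0.0


def mean_rate(rates_by_label, noise_labels):
    """Average a per-label rate, leaving out the noise labels."""
    values = [r for label, r in rates_by_label.items() if label not in noise_labels]
    return sum(values) / len(values) if values else 0.0
```

For instance, a label with 2 false positives and 98 true negatives has a FAR of 2 / 100 = 0.02, and a label with 3 false negatives and 17 true positives has an FRR of 3 / 20 = 0.15.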

Results for selected config

Shows any errors your impulse makes on a sample of data, with a table of results.

  • Error: False positives are displayed in red while false negatives are displayed in blue.

  • Type: Spurious match, incorrect match, duplicate match, or blank.

  • Label: The data label the model predicted in the audio stream.

  • Start time: The timestamp at which the selected error starts in the audio data stream.

  • Play button: Preview the audio stream at the error's start time.

Types of false positives

What we refer to as "Ground Truth" in this context is the sound/label association that the synthetically generated audio contains at a given time.

  • Incorrect match: A detection that matches the wrong ground truth.

  • Spurious match: A detection that is not associated with any ground truth.

  • Duplicate match: The same ground truth was detected more than once. The first correct detection is considered a true positive but subsequent detections are considered false positives.
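
The three categories above can be illustrated with a short sketch. The matching rule used here (a detection is associated with a ground truth when its timestamp falls inside that ground truth's time range) is an assumption made for the example, as are the function name `classify_detections` and the tuple formats. The rule Studio actually applies may differ.

```python
def classify_detections(detections, ground_truth):
    """Label each detection as a true positive or a type of false positive.

    detections: list of (time_s, label) in time order.
    ground_truth: list of (start_s, end_s, label).
    Returns one result string per detection.
    """
    already_detected = set()  # indexes of ground truths that were matched correctly
    results = []
    for time_s, label in detections:
        # Find the ground truth whose time range contains this detection.
        match = next(
            (i for i, (start, end, _) in enumerate(ground_truth)
             if start <= time_s <= end),
            None,
        )
        if match is None:
            results.append("spurious match")
        elif ground_truth[match][2] != label:
            results.append("incorrect match")
        elif match in already_detected:
            results.append("duplicate match")
        else:
            already_detected.add(match)
            results.append("true positive")
    return results
```

With ground truths `[(1.0, 2.0, "bird"), (4.0, 5.0, "dog")]`, two "bird" detections at 1.5 s and 1.8 s yield a true positive followed by a duplicate match, a "bird" detection at 3.0 s is a spurious match, and a "bird" detection at 4.5 s is an incorrect match.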


Last updated 3 months ago


Choose an audio sample type: simulated real world audio or upload your own in a zip file.

Already have a long, real-world recording of background noise which includes your target model's classes? Upload your own audio sample (.wav) in a zip file, along with its Label Tracks in Audacity format (.txt).

Selecting from the various "Suggested config" icons on the FRR/FAR chart will update the Selected config information. Click on Save selected config to use the selected FAR and FRR trade-off when your impulse is deployed. This config information is also accessible in the deployed Edge Impulse library.