The Audio MFCC block extracts coefficients from an audio signal. Like the Audio MFE block, it uses a non-linear scale called the Mel scale. It is the reference block for speech recognition and can also perform well on some non-human-voice use cases.
GitHub repository containing all DSP block code: edgeimpulse/processing-blocks.
The "Processed features" array has the following format:
Column major, from low cepstrum to high.
Number of rows will be equal to the parameter "Number of coefficients" (or number of cepstra)
Each column represents a single frame
Compatible with the DSP Autotuner
Picking the right parameters for DSP algorithms can be difficult. It often requires a lot of experience and experimentation. The autotuning function makes this process easier by looking at the entire dataset and recommending a set of parameters that is tuned for your dataset.
Mel Frequency Cepstral Coefficients
Number of coefficients: Number of cepstral coefficients to keep after applying Discrete Cosine Transform
Frame length: The length of each frame in seconds
Frame stride: The step between successive frames, in seconds
Filter number: The number of triangular filters applied to the spectrogram
FFT length: The FFT size
Low frequency: Lowest band edge of Mel-scale filterbanks
High frequency: Highest band edge of Mel-scale filterbanks
Window size: The size of the sliding window for local cepstral mean normalization. The window size must be odd.
Pre-emphasis
Coefficient: The pre-emphasis coefficient to apply to the input signal (0 means no filtering)
Note: The Shift parameter has been removed and is set to 1 for all new projects. Older and existing projects can still change this value or keep their existing value.
The feature extraction adds one extra step to the MFE pipeline, resulting in a compressed representation of the filterbanks. A Discrete Cosine Transform is applied to each frame's filterbank energies to extract cepstral coefficients. Usually 13 coefficients are retained; the rest are discarded, as they represent fast changes that are not useful for speech recognition.
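As a minimal sketch of this extra step (not the block's exact implementation), the following applies a DCT-II to one frame of log Mel filterbank energies and keeps the first 13 coefficients; the input values are made up for illustration.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Minimal sketch of the extra MFCC step: a DCT-II applied to one frame of
// log Mel filterbank energies, keeping only the first `num_coefficients`.
static const float kPi = 3.14159265358979f;

std::vector<float> dct_cepstra(const std::vector<float> &log_energies,
                               size_t num_coefficients) {
    const size_t n = log_energies.size();
    std::vector<float> cepstra(num_coefficients, 0.0f);
    for (size_t k = 0; k < num_coefficients; k++) {
        float sum = 0.0f;
        for (size_t i = 0; i < n; i++) {
            // DCT-II basis: cos(pi / N * (i + 0.5) * k)
            sum += log_energies[i] * std::cos(kPi / n * (i + 0.5f) * k);
        }
        cepstra[k] = sum;
    }
    return cepstra;
}

int main() {
    // One frame of (made-up) log filterbank energies from 32 Mel filters
    std::vector<float> frame(32);
    for (size_t i = 0; i < frame.size(); i++) {
        frame[i] = std::log(1.0f + 10.0f / (1.0f + i));
    }
    // Keep 13 cepstral coefficients, as is typical for speech
    for (float c : dct_cepstra(frame, 13)) {
        printf("%.4f ", c);
    }
    printf("\n");
    return 0;
}
```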
Like the Spectrogram block, the Audio MFE processing block extracts time and frequency features from a signal. However, it uses a non-linear scale in the frequency domain, called the Mel scale. It performs well on audio data, mostly for non-voice recognition use cases where the sounds to be classified can be distinguished by the human ear.
GitHub repository containing all DSP block code: edgeimpulse/processing-blocks.
The "Processed features" array has the following format:
Column major, from low frequency to high.
Number of rows will be equal to the filter number
Each column represents a single frame
Consider a toy example where the signal is a pure tone and Filter number is set to 6:
The output would begin as shown: the tone is low frequency, so it falls into the first two Mel bins, and the higher-frequency bins are 0. The pattern repeats at the 7th element, which is the 1st row of the 2nd column.
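The sketch below illustrates this layout with made-up values: 6 filters (rows) per frame and column-major flattening, so the value for filter `row` in frame `col` sits at index `col * num_filters + row`.

```cpp
#include <cstdio>

int main() {
    // Toy MFE output for a low-frequency pure tone with Filter number = 6.
    // The flattened "Processed features" array is column major: the first 6
    // values are frame 0 (low to high frequency), the next 6 are frame 1, ...
    const int num_filters = 6;  // rows
    const int num_frames  = 3;  // columns (illustrative)
    const float features[num_filters * num_frames] = {
        0.81f, 0.42f, 0.0f, 0.0f, 0.0f, 0.0f,  // frame 0
        0.80f, 0.41f, 0.0f, 0.0f, 0.0f, 0.0f,  // frame 1
        0.82f, 0.43f, 0.0f, 0.0f, 0.0f, 0.0f,  // frame 2
    };

    // The value for filter `row` in frame `col` lives at col * num_filters + row
    for (int col = 0; col < num_frames; col++) {
        for (int row = 0; row < num_filters; row++) {
            printf("%.2f ", features[col * num_filters + row]);
        }
        printf("\n");  // one printed line per frame
    }
    return 0;
}
```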
Compatible with the DSP Autotuner
Picking the right parameters for DSP algorithms can be difficult. It often requires a lot of experience and experimentation. The autotuning function makes this process easier by looking at the entire dataset and recommending a set of parameters that is tuned for your dataset.
Mel-filterbank energy features
Frame length: The length of each frame in seconds
Frame stride: The step between successive frames, in seconds
Filter number: The number of triangular filters applied to the spectrogram
FFT length: The FFT size
Low frequency: Lowest band edge of Mel-scale filterbanks
High frequency: Highest band edge of Mel-scale filterbanks
Normalization
Noise floor (dB): Signal below this level will be dropped
The feature extraction is similar to the Spectrogram block (the Frame length, Frame stride, and FFT length parameters are the same), but it adds two extra steps.
After computing the spectrogram, triangular filters are applied on a Mel scale to extract frequency bands. They are configured with the Filter number, Low frequency, and High frequency parameters to select the frequency band and the number of frequency features to be extracted. The Mel scale is a perceptual scale of pitches judged by listeners to be equal in distance from one another. The idea is to extract more features (more filter banks) in the lower frequencies and fewer in the higher frequencies; as a result, it performs well on sounds that can be distinguished by the human ear.
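As an illustration, the sketch below uses the common textbook Hz-to-Mel conversion (m = 2595 * log10(1 + f/700)) to place band edges evenly on the Mel scale; the exact constants used by the block are not stated here, so treat this as an assumption.

```cpp
#include <cmath>
#include <cstdio>

// Common textbook Hz <-> Mel conversion; the block's exact constants are an
// assumption here, but this shows why low frequencies get more filters.
static float hz_to_mel(float hz)  { return 2595.0f * std::log10(1.0f + hz / 700.0f); }
static float mel_to_hz(float mel) { return 700.0f * (std::pow(10.0f, mel / 2595.0f) - 1.0f); }

int main() {
    const int   filter_number  = 6;        // number of triangular filters
    const float low_frequency  = 80.0f;    // Hz, lowest band edge (illustrative)
    const float high_frequency = 8000.0f;  // Hz, highest band edge (illustrative)

    // Band edges are spaced evenly in Mel, which is non-linear in Hz: edges are
    // close together at low frequencies and far apart at high frequencies.
    const float mel_low  = hz_to_mel(low_frequency);
    const float mel_high = hz_to_mel(high_frequency);
    for (int i = 0; i < filter_number + 2; i++) {
        float mel = mel_low + (mel_high - mel_low) * i / (filter_number + 1);
        printf("band edge %d: %.1f Hz\n", i, mel_to_hz(mel));
    }
    return 0;
}
```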
The graph titled "FFT Bin Weighting" shows how the FFT bins are scaled and summed into the output columns for your chosen parameters.
The last step clips the MFE output for noise reduction: any sample below the Noise floor is set to zero.
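A small sketch of one plausible reading of this step, assuming the filter energies are expressed in dB before being compared against the noise floor; the block's exact scaling and normalization are not reproduced here.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>

// One plausible reading of the clipping step: express each Mel filter energy
// in dB and drop everything below the configured noise floor.
float clip_to_noise_floor(float energy, float noise_floor_db) {
    float db = 10.0f * std::log10(std::max(energy, 1e-10f)); // avoid log(0)
    return (db < noise_floor_db) ? 0.0f : db;
}

int main() {
    printf("%.1f\n", clip_to_noise_floor(0.5f, -52.0f));   // -3.0 dB, kept
    printf("%.1f\n", clip_to_noise_floor(1e-8f, -52.0f));  // -80 dB, dropped to 0
    return 0;
}
```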
Extracting meaningful features from your data is crucial to building small and reliable machine learning models, and in Edge Impulse this is done through processing blocks. We ship a number of processing blocks for common sensor data (such as vibration and audio):
The source code of these blocks is available in the edgeimpulse/processing-blocks GitHub repository.
If you have a very specific sensor, want to apply custom filters, or are implementing the latest research in digital signal processing, follow our tutorial on building custom processing blocks.
In most of our DSP blocks, you have the option to calculate the feature importance. Edge Impulse Studio will then output a Feature Importance list that will help you determine which axes generated from your DSP block are most significant to analyze when you want to train a model.
Feature importance
For feature importance to work, you must have at least two labeled classes in your training dataset
This process of generating features and determining the most important features of your data will further reduce the amount of signal analysis needed on the device with new and unseen data.
To calculate the feature importance, a classifier is trained on the data and the feature importances are extracted from the trained classifier.
The Flatten block performs statistical analysis on the signal. It is useful for slow-moving averages like temperature data, in combination with other blocks.
GitHub repository containing all DSP block code: edgeimpulse/processing-blocks.
Scaling
Scale axes: Multiplies axes by this number
Method
Average: Calculates the average value for the window
Minimum: Calculates the minimum value in the window
Maximum: Calculates the maximum value in the window
Root-mean square: Calculates the RMS value of the window
Standard deviation: Calculates the standard deviation of the window
Skewness: Calculates the skewness of the window
Kurtosis: Calculates the kurtosis of the window
Moving Average Number of Windows: Calculates the moving average by maintaining a rolling average of the last N windows. Note that there is no zero padding; the block accumulates averages for up to N windows (for example, for the first window in a sample, the moving average equals the average). The moving average resets for each sample during training, and during inference when run_classifier_init() is called. Note that if you enable this, you probably don't want overlapping windows for training.
The Flatten block first rescales the axes of the signal if the Scale axes value is different from 1. Then statistical analysis is performed on each window, computing between 1 and 8 features for each axis, depending on the number of selected methods.
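For illustration, here is a sketch of these per-window statistics for a single axis, using the usual moment-based definitions of skewness and kurtosis (the block's exact conventions may differ).

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

// Sketch of the Flatten block's per-window statistics for a single axis.
struct FlattenFeatures {
    float average, minimum, maximum, rms, stdev, skewness, kurtosis;
};

FlattenFeatures flatten_window(const std::vector<float> &w, float scale = 1.0f) {
    std::vector<float> x(w);
    for (float &v : x) v *= scale;  // "Scale axes" step

    const float n = static_cast<float>(x.size());
    float sum = 0.0f, sum_sq = 0.0f;
    for (float v : x) { sum += v; sum_sq += v * v; }
    const float mean = sum / n;

    float m2 = 0.0f, m3 = 0.0f, m4 = 0.0f;
    for (float v : x) {
        float d = v - mean;
        m2 += d * d; m3 += d * d * d; m4 += d * d * d * d;
    }
    m2 /= n; m3 /= n; m4 /= n;
    const float sd = std::sqrt(m2);

    FlattenFeatures f;
    f.average  = mean;
    f.minimum  = *std::min_element(x.begin(), x.end());
    f.maximum  = *std::max_element(x.begin(), x.end());
    f.rms      = std::sqrt(sum_sq / n);
    f.stdev    = sd;
    f.skewness = (sd > 0) ? m3 / (sd * sd * sd) : 0.0f;  // third standardized moment
    f.kurtosis = (m2 > 0) ? m4 / (m2 * m2) : 0.0f;       // fourth standardized moment
    return f;
}

int main() {
    // Slow-moving temperature-like window (illustrative values)
    FlattenFeatures f = flatten_window({21.0f, 21.5f, 22.0f, 22.5f, 23.0f});
    printf("avg %.2f min %.2f max %.2f rms %.2f std %.2f\n",
           f.average, f.minimum, f.maximum, f.rms, f.stdev);
    return 0;
}
```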
The Audio Syntiant processing block extracts time and frequency features from a signal. It is similar to the Audio MFE but performs additional processing specific to the Syntiant NDP101/120 chip. This block can be used only with Syntiant targets.
Log Mel-filterbank energy features
Frame length: The length of each frame in seconds
Frame stride: The step between successive frames, in seconds
Filter number (fixed): The number of triangular filters applied to the spectrogram
FFT length (fixed): The FFT size
Low frequency (fixed): Lowest band edge of Mel-scale filterbanks
High frequency (fixed): Highest band edge of Mel-scale filterbanks
Preemphasis
Coefficient: Pre-emphasis coefficient
Chip
Features extractor: Syntiant method to generate features, choose according to your chip
The feature extraction is a proprietary Syntiant algorithm; however, its parameters are very close to those of the Audio MFE block. The pre-emphasis coefficient is applied first to amplify higher frequencies. The signal is then divided into overlapping frames, defined by the Frame length and Frame stride, to extract speech features.
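Although the extractor itself is proprietary, pre-emphasis and framing are standard steps. The sketch below shows a generic version of these two steps only; it does not reproduce Syntiant's feature extractor.

```cpp
#include <cstddef>
#include <vector>

// Generic pre-emphasis: y[n] = x[n] - coeff * x[n-1], which amplifies higher
// frequencies relative to lower ones.
std::vector<float> pre_emphasis(const std::vector<float> &x, float coeff) {
    std::vector<float> y(x.size());
    if (x.empty()) return y;
    y[0] = x[0];
    for (size_t n = 1; n < x.size(); n++) {
        y[n] = x[n] - coeff * x[n - 1];
    }
    return y;
}

// Split the signal into overlapping frames defined by Frame length / Frame stride.
std::vector<std::vector<float>> frame_signal(const std::vector<float> &x,
                                             float frame_length_s,
                                             float frame_stride_s,
                                             float sample_rate_hz) {
    const size_t frame_len = static_cast<size_t>(frame_length_s * sample_rate_hz);
    const size_t stride    = static_cast<size_t>(frame_stride_s * sample_rate_hz);
    std::vector<std::vector<float>> frames;
    for (size_t start = 0; start + frame_len <= x.size(); start += stride) {
        frames.emplace_back(x.begin() + start, x.begin() + start + frame_len);
    }
    return frames;
}
```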
Sampling frequency
The Audio Syntiant block only supports a 16 kHz frequency. You can adjust the sampling frequency in the "Create Impulse" section.
The IMU Syntiant block rescales raw data to 8 bits values to match the NDP101/120 chip input requirements.
Scaling
Scale 16 bits to 8 bits: Scales data to 8-bit values in the [-1, 1] range; raw data is divided by 2G (2 * 9.80665). When using the Edge Impulse official firmware, this parameter should be enabled, as raw data is not rescaled. If this parameter is disabled, the data samples will not be rescaled; disable it only if your raw data samples are already normalized to the [-1, 1] range.
The IMU Syntiant block retrieves raw samples and applies the Scale 16 bits to 8 bits parameter.
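A small sketch of this scaling, assuming accelerometer samples in m/s²; the clamp to [-1, 1] is added here for illustration only.

```cpp
#include <algorithm>
#include <cstdio>

// Sketch of the "Scale 16 bits to 8 bits" step described above: accelerometer
// samples in m/s^2 are divided by 2G (2 * 9.80665) so that a +/-2g range maps
// to [-1, 1].
int main() {
    const float two_g = 2.0f * 9.80665f;
    const float raw[] = { -19.6f, -4.9f, 0.0f, 9.8f, 19.6f };  // m/s^2, illustrative
    for (float sample : raw) {
        float scaled = std::max(-1.0f, std::min(1.0f, sample / two_g));
        printf("%.2f -> %.3f\n", sample, scaled);
    }
    return 0;
}
```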
The Image block is dedicated to computer vision applications. It normalizes image data and optionally reduces the color depth.
GitHub repository containing all DSP block code: edgeimpulse/processing-blocks.
Color depth: Color depth to use (RGB or grayscale)
The Image block performs normalization, converting each channel of each pixel to a float value between 0 and 1. If Grayscale is selected, each pixel is converted to a single value using the ITU-R BT.601 conversion (Y' component only).
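For illustration, here is a sketch of both operations using the ITU-R BT.601 luma weights (0.299, 0.587, 0.114).

```cpp
#include <cstdint>
#include <cstdio>

// Sketch of the two operations described above: normalizing 8-bit channels to
// [0, 1] floats, and (optionally) collapsing RGB to a single grayscale value
// using the ITU-R BT.601 luma weights.
float normalize_channel(uint8_t c) {
    return c / 255.0f;
}

float rgb_to_grayscale(uint8_t r, uint8_t g, uint8_t b) {
    // Y' = 0.299 R' + 0.587 G' + 0.114 B'
    return 0.299f * normalize_channel(r)
         + 0.587f * normalize_channel(g)
         + 0.114f * normalize_channel(b);
}

int main() {
    printf("white -> %.3f\n", rgb_to_grayscale(255, 255, 255));  // 1.000
    printf("red   -> %.3f\n", rgb_to_grayscale(255, 0, 0));      // 0.299
    return 0;
}
```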
The HR/HRV Features block processes physiological signals like photoplethysmogram (PPG) or electrocardiogram (ECG), with optional accelerometer inputs for enhanced accuracy in motion-prone applications, to extract key metrics such as heart rate (HR) and heart rate variability (HRV). HR measures the number of beats per minute, while HRV measures the time variance between successive heartbeats, also known as the interbeat interval (IBI).
The block offers real-time HR estimation and HRV analysis on resource-constrained edge devices and leverages cutting-edge algorithms for precise feature extraction. The extracted features can be used on their own or to inform downstream machine learning tasks such as stress detection or heart health analysis.
To see a demonstration of how to use the HR/HRV Features block, refer to our tutorial: .
Evaluation available for everyone; deployment only available with Enterprise Plan
All users (Community, Professional, and Enterprise) can extract heart rate and HRV features using this block for testing purposes. However, the deployment option is only available for Enterprise users. Please contact your Solutions Engineer to enable it.
Tip: Use accelerometer data whenever possible to enhance the accuracy of heart rate and HRV estimation in dynamic environments.
By configuring the HR/HRV Features block, you can obtain critical metrics like heart rate and HRV in real-time, enabling applications such as fitness tracking, stress monitoring, and health diagnostics. The extracted features can be fine-tuned to match the performance and constraints of edge devices, ensuring both efficiency and accuracy.
When using the HR/HRV Features block, it is important to also configure the input block for your impulse appropriately.
There are minimum input block window size requirements depending on your configuration of the HR/HRV Features block. If you are using HRV features, the input block window size must be greater than or equal to the HRV update interval. If you are not using HRV features, the input block window size must be greater than or equal to 2 seconds.
The minimum window size of 2 seconds is determined by the fact that the heart rate calculation is performed once for every 2 second period. When the window size is increased beyond 2 seconds, more heart rate values will be provided to the learning block. For example, a 2 second window will pass 1 heart rate value per window whereas a 10 second window will pass 5 heart rate values per window.
For optimal performance, it is recommended to set the window increase (stride) equal to the window size.
All input signals (PPG or ECG, and accelerometer) must have a frequency of either 25 Hz (tolerance +/- 1 Hz) or 50 Hz (tolerance +/- 3 Hz).
Heart rate values, HRV features, or both can be passed to the learning block. To send only heart rate values, select none for the HRV features parameter. To send only HRV features, select your desired HRV features parameter value (other than none) and deselect the include calculated heart rates parameter. To send both heart rate values and HRV features, select your desired HRV features parameter value (other than none) and select the include calculated heart rates parameter.
Compatible with the DSP Autotuner
Picking the right parameters for DSP algorithms can be difficult. It often requires a lot of experience and experimentation. The autotuning function makes this process easier by looking at the entire dataset and recommending a set of parameters that is tuned for your dataset.
Note that this applies to the heart rate and accelerometer settings, not the HRV settings.
The following parameters are available for configuring the HR/HRV Features block. Note that all heart rate and accelerometer settings can be estimated using parameter autotuning, which is the suggested approach.
The HR/HRV Features block outputs heart rate values and HRV features based on your configuration. The HRV features contain time-domain and frequency-domain features as shown below.
Heart rate
Heart Rate Values
HRV time-domain features
IBI Slope
HR Mean
HR Slope
RMSSD Slope
RMSSD
AVNN
SDNN
Range NN
MAD NN
pNN50
NN Percentile (10)
NN Percentile (25)
NN Percentile (75)
NN Percentile (90)
IQR
SDSD
SD1
SD2
SD2/SD1 Ratio
HRV frequency-domain features
Raw VLF Energy
Raw LF Energy
Raw HF Energy
Raw Total Energy
Relative VLF Energy
Relative LF Energy
Relative HF Energy
LF/HF Ratio
Peak VLF Energy
Peak LF Energy
Peak HF Energy
Instead of individually selecting the HRV features that are output, you can select a group that contains multiple features. The HRV features associated with each group are defined below.
The number of processed features will depend on your configuration of the input block and the HR/HRV Features block. For example, if you have an input window of 90 seconds, selected all for the HRV features group (30 features), enabled heart rate values to be passed to the learning block, and have an HRV update interval of 30 seconds, there will be 135 processed features: 45 heart rate values (90 seconds input window / 2 seconds per heart rate value) and 90 HRV features (90 seconds input window / 30 seconds update interval x 30 features).
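The same arithmetic, written out as a small sketch using the numbers from the example above.

```cpp
#include <cstdio>

// Arithmetic from the example above: a 90 s window, heart rate computed every
// 2 s, an HRV update interval of 30 s, and 30 HRV features per update.
int main() {
    const int window_s             = 90;
    const int hr_period_s          = 2;
    const int hrv_update_s         = 30;
    const int hrv_features_per_upd = 30;

    const int hr_values    = window_s / hr_period_s;                           // 45
    const int hrv_features = (window_s / hrv_update_s) * hrv_features_per_upd; // 90
    printf("processed features: %d\n", hr_values + hrv_features);              // 135
    return 0;
}
```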
If you are an Enterprise customer, please contact your Solutions Engineer to enable deployment.
The HR/HRV Features block has industry-leading efficiency in RAM and flash usage and can be deployed to a wide range of devices, including fitness trackers, health monitors, and stress detection systems. The functionality can be deployed as either C++ or C bindings.
To optimize for MCU-based systems, your enterprise representative can provide a MAP file. This file contains a detailed breakdown of the memory footprint (flash and RAM) for the HR/HRV Features block, including the IBI processing components. This data is critical for fine-tuning and optimizing the deployment of the block on resource-constrained devices.
One important note when working with the HR/HRV Features block is that you can extract heart rate values even when running a classifier. This is particularly useful if your model is performing classification tasks but you'd also like to access heart rate data. The code snippet below demonstrates how to access the heart rate information during inference and print the results in a C++ application.
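A minimal sketch of this pattern is shown below, using the standard run_classifier() call; the field used to read back the heart rate (result.hr_calcs.heart_rate) is a placeholder and should be checked against the ei_impulse_result_t definition in your deployed SDK.

```cpp
#include <cstdio>
#include "edge-impulse-sdk/classifier/ei_run_classifier.h"

// Hedged sketch: run the impulse and read back both the classification result
// and the heart rate computed by the HR/HRV Features block. The exact field
// exposing the heart rate (`result.hr_calcs.heart_rate` below) is a
// placeholder; check the ei_impulse_result_t definition in your deployed SDK.
void infer_and_print(signal_t *signal) {
    ei_impulse_result_t result = { 0 };
    EI_IMPULSE_ERROR err = run_classifier(signal, &result, false);
    if (err != EI_IMPULSE_OK) {
        printf("run_classifier failed (%d)\n", err);
        return;
    }

    // Classification output (present when a learning block is attached)
    for (size_t i = 0; i < EI_CLASSIFIER_LABEL_COUNT; i++) {
        printf("%s: %.3f\n", result.classification[i].label,
               result.classification[i].value);
    }

    // Heart rate from the HR/HRV Features block -- field name is illustrative
    printf("heart rate: %.1f bpm\n", result.hr_calcs.heart_rate);
}
```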
If you want to use the HR/HRV Features block solely for heart rate values without any classification in Studio, you can configure a regression learning block to "pass through" the result. This can be achieved by using expert mode for the block to set up a simple neural network.
After saving and training the model (though there's effectively no training needed in this case), you can then use Model Testing or Live Classification to evaluate the heart rate estimation.
If you would also like to deploy the HR/HRV Features block without running a classifier on the device, define the following macro via CMake or Makefile when compiling, to avoid flash overhead for the unused learn block.
When running classifiers that use a large window size, such as for HRV features, you can avoid buffering the entire window of PPG or ECG data by leveraging the callback structure of signal_t. get_data() will only ask for 2 seconds of samples on each invocation, so if you block (either via an RTOS sleep or a while loop on bare metal) while waiting for each 2 seconds of PPG or ECG data, you can avoid allocating the entire input window. Note also that the SDK does not internally buffer the entire window; each 2 seconds is immediately processed down to IBIs.
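A minimal sketch of this streaming approach, assuming a user-provided, blocking read_ppg_samples() function (a hypothetical name) that supplies samples to the SDK on demand.

```cpp
#include "edge-impulse-sdk/classifier/ei_run_classifier.h"

// Hypothetical, user-provided function: blocks (RTOS sleep or busy-wait) until
// `length` new PPG/ECG samples are available, then writes them to out_ptr.
extern int read_ppg_samples(size_t offset, size_t length, float *out_ptr);

int run_streaming_inference(void) {
    signal_t signal;
    // Full impulse window length in raw samples, as generated for your project.
    signal.total_length = EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE;
    // The SDK asks get_data() for 2 seconds of samples at a time and processes
    // each chunk down to IBIs immediately, so no full-window buffer is needed
    // on the application side.
    signal.get_data = &read_ppg_samples;

    ei_impulse_result_t result = { 0 };
    return run_classifier(&signal, &result, false);
}
```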
The HR/HRV Features block enables real-time extraction of key metrics such as heart rate and heart rate variability from physiological signals like PPG or ECG. These metrics are critical for applications in fitness tracking, stress detection, and medical diagnostics. To go further, follow our step-by-step guides in the tutorials section.