Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
The Spectrogram processing block extracts time and frequency features from a signal. It performs well on audio data for non-voice recognition use cases, or on any sensor data with continuous frequencies.
GitHub repository containing all DSP block code: edgeimpulse/processing-blocks.
Spectrogram
Frame length: The length of each frame in seconds
Frame stride: The step between successive frame in seconds
FFT size: The size of the FFT for each frame. Will zero pad or clip if frame length in samples does not equal FFT size.
Normalization
Noise floor (dB): signal lower than this level will be dropped
It first divides the window in multiple overlapping frames. The size and number of frames can be adjusted with the parameters Frame length and Frame stride. For example with a window of 1 second, frame length of 0.02s and stride of 0.01s, it will create 99 time frames.
An FFT is then calculated for each frame. The number of frequency features for each frame is equal to the FFT size parameter divided by 2 plus 1. We recommend keeping the FFT size a power of 2 for performances purpose. Finally the Noise floor value is applied to the power spectrum.
The features generated by the Spectrogram block are equal to the number of generated time frames times the number of frequency features.
Frequency bands and frame length
There is a connection between the FFT size parameter and the frame length. The frame length will be cropped or padded to the FFT size value before applying the FFT. For example, with a 8kHz sampling frequency and a time frame of 0.02s, each time frame contains 160 samples (8k * 0.02). If your FFT size is set 128, time frames will be cropped to 128 samples. If your FFT size is set to 256, time frames will be padded with zeros.
The Image block is dedicated to computer vision applications. It normalizes image data, and optionally reduce the color depth.
GitHub repository containing all DSP block code: edgeimpulse/processing-blocks.
Color depth: Color depth to use (RGB or grayscale)
The Image performs normalization, converting each pixel's channel of the image to a float value between 0 and 1. If Grayscale is selected, each pixel is converted to a single value following the ITU-R BT.601 conversion (Y' component only).
Extracting meaningful features from your data is crucial to building small and reliable machine learning models, and in Edge Impulse this is done through processing blocks. We ship a number of processing blocks for common sensor data (such as vibration and audio):
The source code of these blocks are available in the .
If you have a very specific sensor, want to apply custom filters, or are implementing the latest research in digital signal processing, follow our tutorial on .
The Spectral features block extracts frequency and power characteristics of a signal. Low-pass and high-pass filters can also be applied to filter out unwanted frequencies. It is great for analyzing repetitive patterns in a signal, such as movements or vibrations from an accelerometer.
GitHub repository containing all DSP block code: edgeimpulse/processing-blocks.
Prior to calculating the Fast Fourier Transform (FFT), the time-series data inside the window of your sample can be filtered, which often helps to smooth out the signal or drop unwanted artifacts. In the image above, a "window" is shown inside the white box; only the readings inside that box will be used for filtering and calculating the FFT.
Edge Impulse will slide the window over your sample, as given by the time series input block parameters during Impulse creation in order to generate several training/test samples from your longer time series sample.
Scale - Multiply all raw input values by this number
Type - The type of filter to apply to the raw data (low-pass, high-pass, or none)
Cut-off frequency - Cut-off frequency of the filter in hertz. Also, this will remove unwanted frequency bins from the generated features.
Order - Order of the Butterworth filter. Must be an even number. A higher order has a sharper cutoff at the expense of latency. You can also set to zero, in which case, the signal won't be filtered, but unwanted frequency bins will still be removed from the output.
Removing frequency bins beyond the cut off reduces model size, which saves resources, and also leads to models that train well with less data
After filtering via a Butterworth IIR filter (if enabled), the mean is subtracted from the signal. Several statistical features (RMS, skewness, kurtosis) are calculated from the filtered signal after the mean has been removed. This filtered signal is passed to the Spectral power section, which computes the FFT in order to compute the spectral features.
This section controls how the FFT is applied to each filtered window from your sample. If the window from your sample is larger than the FFT size, then the window will be broken into frames (or "sub-windows"), and the FFT is calculated from each frame.
FFT length - The FFT size. This determines the number of FFT bins as well as the resolution of frequency peaks that you can separate. A lower number means more signals will average together in the same FFT bin, but also reduces the number of features and model size. A higher number will separate more signals into separate bins, but generates a larger model.
Take log of spectrum? - When selected, log (base 10) will be applied to each FFT bin. This gives more range to (ie, captures more information about) low intensity signals at the expense of range for higher intensity signals. It is enabled by default and is generally a good choice, but it ultimately depends on the kind if signal sampled.
Overlap FFT frames? - Successive frames (sub-windows) overlap by 1/2 within the larger window (given by the white box in the image) if this is checked. If unchecked, frames will not overlap. This "sliding frame" method can prevent transient events from being missed if they happen to appear on a frame boundary. Enabled by default. Disabling improves latency. No impact on model size or RAM usage.
Note that a number of FFTs will be computed, depending on the settings. For example, if you have 100 readings for a single axis in your window and set the FFT length to 16 with no overlap, then 6 FFTs will be computed (for that single axis), as we have 6 full frames (each with 16 points) that will fully cover those 100 readings/points.
For each FFT bin (i.e. range of frequencies), the maximum value from all of the frames is kept as the feature. Continuing with the example above, we throw away 1/2 of every FFT (as it's simply a mirror image of the other half). We also throw away the bin at 0 Hz (as we filter out the DC bias anyway when we subtracted the mean), but we keep the Nyquist bin. As a result, we end up with 8 usable bins from each of our 16-point FFTs. For each bin, we find the maximum value from our 6 FFTs that we computed (in that particular bin). So, the number of features would be 8.
Note that you may see fewer spectral features if you enable filtering, as we throw away any frequency bins higher than the cutoff frequency (for the low-pass filter) or lower than the cutoff frequency (for the high-pass filter).
See this video to learn more about the FFT.
Filter response - If filtering is enabled, and order is non-zero, then the frequency response of the filter is shown. This shows how much attenuation there will be across the frequency spectrum.
After filter - Shows the current window after filtering is applied (in the time domain).
Spectral power - Shows power vs. frequency as computed by the chosen FFT size. Power is either linear or log based on settings.
The spectral analysis block generates 2 types of features per axis/channel:
Statistical features
RMS
Skewness
Kurtosis
Spectral features
Maximum value from FFT frames for each bin that was not filtered out
Note that the standard deviation is not calculated because when the mean is subtracted from a signal, the RMS equals the standard deviation.
The total number of features will change, depending on how you set the filter and FFT parameters.
Let's consider an input signal sampled at 62.5 Hz with 3 axis and the following parameters:
Low-pass filter
Filter cutoff set to 3 Hz
The number of generated features per axis is:
3 values for statistics (RMS, Skewness, Kurtosis)
1 value for the FFT bin capturing 1.95 to 5.86 Hz
With 3 axes/channels, that gives us a total of 12 features are generated in total for the input signal.
The Audio MFCC blocks extracts coefficients from an audio signal. Similarly to the , it uses a non-linear scale called Mel-scale. It is the reference block for speech recognition and can also performs well on some non-human voice use cases.
GitHub repository containing all DSP block code: .
Mel Frequency Cepstral Coefficients
Number of coefficients: Number of cepstral coefficients to keep after applying Discrete Cosine Transform
Frame length: The length of each frame in seconds
Frame stride: The step between successive frame in seconds
Filter number: The number of triangular filters applied to the spectrogram
FFT length: The FFT size
Low frequency: Lowest band edge of Mel-scale filterbanks
High frequency: Highest band edge of Mel-scale filterbanks
Window size: The size of sliding window for local cepstral mean normalization. Windows size must be odd.
Pre-emphasis
Coefficient: The pre-emphasizing coefficient to apply to the input signal (0 equals to no filtering)
Note: Shift has been removed and set to 1 for all future projects. Older & existing projects can still change this value or use an existing value.
The Raw Data block generates windows from data samples without any specific signal processing. It is great for signals that have already been pre-processed and if you just need to feed your data into the Neural Network block.
GitHub repository containing all DSP block code: .
Scaling
Scale axes: Multiplies each axis by this number. This can be used to normalize your data between 0 and 1.
The Raw Data block retrieves raw samples and applies the Scaling parameter.
The Flatten block performs statistical analysis on the signal. It is useful for slow-moving averages like temperature data, in combination with other blocks.
GitHub repository containing all DSP block code: .
Scaling
Scale axes: Multiplies axes by this number
Method
Average: Calculates the average value for the window
Minimum: Calculates the minimum value in the window
Maximum: Calculates the maximum value in the window
Root-mean square: Calculates the RMS value of the window
Standard deviation: Calculates the standard deviation of the window
Skewness: Calculates the skewness of the window
Kurtosis: Calculates the kurtosis of the window
The Flatten block first rescales axes of the signal if value is different than 1. Then statistical analysis is performed on each window, computing between 1 and 7 features for each axis, depending on the number of selected methods.
Similarly to the , the Audio MFE processing block extracts time and frequency features from a signal. However it uses a non-linear scale in the frequency domain, called Mel-scale. It performs well on audio data, mostly for non-voice recognition use cases when sounds to be classified can be distinguished by human ear.
GitHub repository containing all DSP block code: .
Mel-filterbank energy features
Frame length: The length of each frame in seconds
Frame stride: The step between successive frame in seconds
Filter number: The number of triangular filters applied to the spectrogram
FFT length: The FFT size
Low frequency: Lowest band edge of Mel-scale filterbanks
High frequency: Highest band edge of Mel-scale filterbanks
Normalization
Noise floor (dB): signal lower than this level will be dropped
The last step is to perform a local mean normalization of the signal, applying the Noise floor value to the power spectrum.
The features' extractions adds one extra step to the resulting in a compressed representation of the filterbanks. A Discrete Cosine Transform is applied on each filterbank to extract cepstral coefficients. 13 coefficients are usually retained, the rest are discarded as they represent fast changes not useful for speech recognition.
The features' extractions is similar to the (Frame length, Frame stride, and FFT length parameters are the same) but it adds 2 extra steps.
After computing the spectrogram, triangular filters are applied on a Mel-scale to extract frequency bands. They are configured with parameters Filter number, Low frequency and High frequency to select the frequency band and the number of frequency features to be extracted. The Mel-scale is . The idea is to extract more features (more filter banks) in the lower frequencies, and less in the high frequencies, thus it performs well on sounds that can be distinguished by human ear.
The IMU Syntiant block rescales raw data to 8 bits values to match the NDP101 chip input requirements.
Scaling
Scale 16 bits to 8 bits: Scale data to 8-bits values in the [-1, 1] range, raw data is divided by 2G (2 * 9.80665). Using Edge Impulse official firmwares, this parameter should be enabled as raw data is not rescaled. If this parameter is disabled the data samples will not be rescaled, you should disable this parameter if your raw data samples are already normalized to the [-1, 1] range.
The IMU Syntiant block retrieves raw samples and applies the Scale 16 bits to 8 bits parameter.
The Audio Syntiant processing block extracts time and frequency features from a signal. It is similar to the Audio MFE but performs additional processing specific to the Syntiant NDP101 chip. This block can be used only with Syntiant targets.
Log Mel-filterbank energy features
Frame length: The length of each frame in seconds
Frame stride: The step between successive frame in seconds
Filter number (fixed): The number of triangular filters applied to the spectrogram
FFT length (fixed): The FFT size
Low frequency (fixed): Lowest band edge of Mel-scale filterbanks
High frequency (fixed): Highest band edge of Mel-scale filterbanks
Coefficient: Pre-emphasis coefficient
The features' extractions is a proprietary algorithm from Syntiant. However parameters are very close to the Audio MFE. Pre-emphasis coefficient is applied first to amplify higher frequencies. The signal is then divided in overlapping frames, defined by the Frame length and Frame stride to extract speech features.
Sampling frequency
The Audio Syntiant block only supports a 16 kHz frequency. You can adjust the sampling frequency in the "Create Impulse" section.
Building custom processing blocks is available for everyone but has to be self-hosted. If you want to host it on Edge Impulse infrastructures, you can do that within your organization interface.
In this tutorial, you'll learn how to use Edge Impulse CLI to push your custom DSP block to your organisation and how to make this processing block available in the Studio for all users in the organization.
The Custom Processing block we are using for this tutorial can be found here: https://github.com/edgeimpulse/edge-detection-processing-block. It is written in Python. Please note that one of the beauties with custom blocks is that you can write them in any language as we will host a Docker container and we are not tied to a specific runtime.
Only available for enterprise customers
Organizational features are only available for enterprise customers. View our pricing for more information.
You'll need:
The Edge Impulse CLI. If you receive any warnings that's fine. Run edge-impulse-blocks
afterwards to verify that the CLI was installed correctly.
Docker desktop installed on your machine. Custom blocks use Docker containers, a virtualization technique which lets developers package up an application with all dependencies in a single package. If you want to test your blocks locally you'll also need (this is not a requirement):
A Custom Processing block running with Docker.
Inside your Custom DSP block folder, run the following command:
The output will look like this:
Modify or update your custom code if needed and run the following command:
The output will look similar to this:
That's it, now your custom DSP block is hosted on your organization. To make sure it is up and running, in your organisation, go to Custom blocks->DSP and you will see the following screen:
To use your DSP block, simply add it as a processing block in the Create impulse view:
Full instruction on how to build processing blocks: Building custom processing blocks
When running edge-impulse-blocks init
for hosting a custom DSP block, ensure you log into an Edge Impulse account that is a member of an Organization. If you are logged into a personal account, you will be presented with the following CLI output:
Extracting meaningful features from your data is crucial to building small and reliable machine learning models, and in Edge Impulse this is done through processing blocks. We ship a number of processing blocks for common sensor data (such as vibration and audio), but they might not be suitable for all applications. Perhaps you have a very specific sensor, want to apply custom filters, or are implementing the latest research in digital signal processing. In this tutorial you'll learn how to support these use cases by adding custom processing blocks to the studio.
There is also a complete video covering how to implement your custom DSP block:
Make sure you followed the Continuous motion recognition tutorial, and have a trained impulse.
Development flow
This tutorial shows you the development flow of building custom processing blocks, and requires you to run the processing block on your own machine or server. Enterprise customers can share processing blocks within their organization, and run these on our infrastructure. See Hosting custom DSP blocks for more details.
Processing blocks take data and configuration parameters in, and return features and visualizations like graphs or images. To communicate to custom processing blocks, Edge Impulse studio will make HTTP calls to the block, and then use the response both in the UI, while generating features, or when training a machine learning model. Thus, to load a custom processing block we'll need to run a small server that responds to these HTTP calls. You can write this in any language, but we have created an example in Python. To load this example, open a terminal and run:
This creates a copy of the example project locally. Then, you can run the example either through Docker or locally via:
Docker
Locally
Then go to http://localhost:4446 and you should be shown some information about the block.
As this block is running locally the studio cannot reach the block. To resolve this we can use ngrok which can make a local port accessible from a public URL. After you've finished development you can move the processing block to a server with a publicly accessible address (or run it on our infrastructure through your enterprise account). To set up a tunnel:
Sign up for ngrok.
Install the ngrok binary for your platform.
Get a URL to access the processing block from the outside world via:
This yields a public URL for your block under Forwarding
. Note down the address that includes https://
.
Now that the custom processing block was created, and you've made it accessible to the outside world, you can add this block to Edge Impulse. In a project, go to Create Impulse, click Add a processing block, choose Add custom block (in the bottom left corner of the modal), and paste in the public URL of the block:
After you click Add block the block will show like any other processing block.
Add a learning bloc, then click Save impulse to store the impulse.
Processing blocks have configuration options which are rendered on the block parameter page. These could be filter configurations, scaling options, or control which visualizations are loaded. These options are defined in the parameters.json
file. Let's add an option to smooth raw data. Open example-custom-processing-block-python/parameters.json
and add a new section under parameters
:
Then, open example-custom-processing-block-python/dsp.py
and replace its contents with:
Restart the Python script, and then click Custom block in the studio (in the navigation bar). You now have a new option 'Smooth'. Every time an option changes we'll re-run the block, but as we have not written any code to respond to these changes nothing will happen.
We support a number of different types for configuration fields. These are:
int
- renders a numeric textbox that expects integers.
float
- renders a numeric textbox that expects floating point numbers.
string
- renders a textbox that expects a string.
boolean
- renders a checkbox.
select
- renders a dropdown box. This also requires the parameter valid
which should be an array of valid values. E.g. this renders a dropdown box with options 'low', 'high' and 'none':
To show the user what is happening we can also draw visuals in the processing block. Right now we support graphs (linear and logarithmic) and arbitrary images. By showing a graph of the smoothed sample we can quickly identify what effect the smooth option has on the raw signal. Open dsp.py
and replace the content with the following script. It contains a very basic smoothing algorithm and draws a graph:
Restart the script, and click the Smooth toggle to observe the difference. Congratulations! You have just created your first custom processing block.
If you extract set features from the signal, like the mean, that you that return, you can also label these features. These labels will be used in the feature explorer. To do so, add a labels
array that contains strings that map back to the features you return (labels
and features
should have the same length).
In the previous step we drew a linear graph, but you can also draw logarithmic graphs or even full images. This is done through the type
parameter:
This draws a graph with a logarithmic scale:
To show an image you should return the base64 encoded image and its MIME type. Here's how you draw a small PNG image:
If you output high-dimensional data (like a spectrogram or an image) you can enable dimensionality reduction for the feature explorer. This will run UMAP over the data to compress the features into three dimensions. To do so, set:
On the info
object in parameters.json
.
For all options that you can return in a graph, see the Run DSP return types in the API documentation.
Your custom block behaves exactly the same as any of the built-in blocks. You can process all your data, train neural networks or anomaly blocks, and validate that your model works.
However, we cannot automatically generate optimized native code for the block, like we do for built-in processing blocks, but we try to help you write this code as much as possible.
In your custom DSP code, open the parameters.json
file, you should have something similar to the following:
The cppType
field is used to generate a function that you can implement on your custom C++ library that you get from the deployment page.
When you export your project to a C++ library we generate structures for all the configuration options in the model-parameters/dsp_blocks.h
header file. You only need to implement the extract_custom_block_features
function. It takes your {cppType}
and generates the following extract_{cppType}_features
function.
For example with the above cppType
parameter:
Implement your function in the main.cpp file (or somewhere else, just make sure it is referenced).
An example of this function for the spectral analysis block is listed in the inferencing sdk.
Also, please have a look at the video on the top of this page (around minute 25) where Jan explains how to how to implement your custom DSP block in with your C++ library.
Blog post: Utilize Custom Processing Blocks in Your Image ML Pipelines
With good feature extraction you can make your machine learning models smaller and more reliable, which are both very important when you want to deploy your model on embedded devices. With custom processing blocks you can now develop new feature extraction pipelines straight from Edge Impulse. Whether you're following the latest research, want to implement proprietary algorithms, or are just exploring data.
For inspiration we have published all our own blocks here: edgeimpulse/processing-blocks. If you've made an interesting block that you think is valuable for the community, please let us know on the forums or by opening a pull request. We'd be happy to help write efficient native code for the block, and then publish it as a standard block!