The Audio MFCC block extracts coefficients from an audio signal. Like the Audio MFE block, it uses a non-linear scale called the Mel scale. It is the reference block for speech recognition and can also perform well on some non-human-voice use cases.
GitHub repository containing all DSP block code: edgeimpulse/processing-blocks.
The "Processed features" array has the following format:
Column major, from low cepstrum to high.
Number of rows will be equal to the parameter "Number of coefficients" (or number of cepstra)
Each column represents a single frame
Compatible with the DSP Autotuner
Picking the right parameters for DSP algorithms can be difficult. It often requires a lot of experience and experimentation. The autotuning function makes this process easier by looking at the entire dataset and recommending a set of parameters that is tuned for your dataset.
Mel Frequency Cepstral Coefficients
Number of coefficients: Number of cepstral coefficients to keep after applying Discrete Cosine Transform
Frame length: The length of each frame in seconds
Frame stride: The step between successive frames, in seconds
Filter number: The number of triangular filters applied to the spectrogram
FFT length: The FFT size
Low frequency: Lowest band edge of Mel-scale filterbanks
High frequency: Highest band edge of Mel-scale filterbanks
Window size: The size of the sliding window for local cepstral mean normalization. The window size must be odd.
Pre-emphasis
Coefficient: The pre-emphasis coefficient to apply to the input signal (0 means no filtering)
Note: The Shift parameter has been removed and is set to 1 for all new projects. Older and existing projects can still change this value or keep their existing value.
The feature extraction adds one extra step to the MFE pipeline, resulting in a compressed representation of the filterbanks. A Discrete Cosine Transform is applied to each frame's filterbank energies to extract cepstral coefficients. Usually 13 coefficients are retained; the rest are discarded, as they represent fast changes that are not useful for speech recognition.
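As a minimal sketch of this extra step (not the block's exact implementation), the following applies a DCT-II to one frame of log Mel filterbank energies and keeps the first 13 coefficients; the input values are made up for illustration.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Minimal sketch of the extra MFCC step: a DCT-II applied to one frame of
// log Mel filterbank energies, keeping only the first `num_coefficients`.
static const float kPi = 3.14159265358979f;

std::vector<float> dct_cepstra(const std::vector<float> &log_energies,
                               size_t num_coefficients) {
    const size_t n = log_energies.size();
    std::vector<float> cepstra(num_coefficients, 0.0f);
    for (size_t k = 0; k < num_coefficients; k++) {
        float sum = 0.0f;
        for (size_t i = 0; i < n; i++) {
            // DCT-II basis: cos(pi / N * (i + 0.5) * k)
            sum += log_energies[i] * std::cos(kPi / n * (i + 0.5f) * k);
        }
        cepstra[k] = sum;
    }
    return cepstra;
}

int main() {
    // One frame of (made-up) log filterbank energies from 32 Mel filters
    std::vector<float> frame(32);
    for (size_t i = 0; i < frame.size(); i++) {
        frame[i] = std::log(1.0f + 10.0f / (1.0f + i));
    }
    // Keep 13 cepstral coefficients, as is typical for speech
    for (float c : dct_cepstra(frame, 13)) {
        printf("%.4f ", c);
    }
    printf("\n");
    return 0;
}
```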
Like the Spectrogram block, the Audio MFE processing block extracts time and frequency features from a signal. However, it uses a non-linear scale in the frequency domain, called the Mel scale. It performs well on audio data, mostly for non-voice recognition use cases where the sounds to be classified can be distinguished by the human ear.
GitHub repository containing all DSP block code: edgeimpulse/processing-blocks.
The "Processed features" array has the following format:
Column major, from low frequency to high.
Number of rows will be equal to the filter number
Each column represents a single frame
Consider a toy example where the signal is a pure tone and Filter number is set to 6:
The output would begin as shown: the tone is low frequency, so it falls into the first two Mel bins, and the higher-frequency bins are 0. The pattern repeats at the 7th element, which is the 1st row of the 2nd column.
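The sketch below illustrates this layout with made-up values: 6 filters (rows) per frame and column-major flattening, so the value for filter `row` in frame `col` sits at index `col * num_filters + row`.

```cpp
#include <cstdio>

int main() {
    // Toy MFE output for a low-frequency pure tone with Filter number = 6.
    // The flattened "Processed features" array is column major: the first 6
    // values are frame 0 (low to high frequency), the next 6 are frame 1, ...
    const int num_filters = 6;  // rows
    const int num_frames  = 3;  // columns (illustrative)
    const float features[num_filters * num_frames] = {
        0.81f, 0.42f, 0.0f, 0.0f, 0.0f, 0.0f,  // frame 0
        0.80f, 0.41f, 0.0f, 0.0f, 0.0f, 0.0f,  // frame 1
        0.82f, 0.43f, 0.0f, 0.0f, 0.0f, 0.0f,  // frame 2
    };

    // The value for filter `row` in frame `col` lives at col * num_filters + row
    for (int col = 0; col < num_frames; col++) {
        for (int row = 0; row < num_filters; row++) {
            printf("%.2f ", features[col * num_filters + row]);
        }
        printf("\n");  // one printed line per frame
    }
    return 0;
}
```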
Compatible with the DSP Autotuner
Picking the right parameters for DSP algorithms can be difficult. It often requires a lot of experience and experimentation. The autotuning function makes this process easier by looking at the entire dataset and recommending a set of parameters that is tuned for your dataset.
Mel-filterbank energy features
Frame length: The length of each frame in seconds
Frame stride: The step between successive frames, in seconds
Filter number: The number of triangular filters applied to the spectrogram
FFT length: The FFT size
Low frequency: Lowest band edge of Mel-scale filterbanks
High frequency: Highest band edge of Mel-scale filterbanks
Normalization
Noise floor (dB): Signal below this level will be dropped
The feature extraction is similar to the Spectrogram block (the Frame length, Frame stride, and FFT length parameters are the same), but it adds two extra steps.
After computing the spectrogram, triangular filters are applied on a Mel scale to extract frequency bands. They are configured with the Filter number, Low frequency, and High frequency parameters to select the frequency band and the number of frequency features to be extracted. The Mel scale is a perceptual scale of pitches judged by listeners to be equal in distance from one another. The idea is to extract more features (more filter banks) in the lower frequencies and fewer in the higher frequencies; as a result, it performs well on sounds that can be distinguished by the human ear.
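As an illustration, the sketch below uses the common textbook Hz-to-Mel conversion (m = 2595 * log10(1 + f/700)) to place band edges evenly on the Mel scale; the exact constants used by the block are not stated here, so treat this as an assumption.

```cpp
#include <cmath>
#include <cstdio>

// Common textbook Hz <-> Mel conversion; the block's exact constants are an
// assumption here, but this shows why low frequencies get more filters.
static float hz_to_mel(float hz)  { return 2595.0f * std::log10(1.0f + hz / 700.0f); }
static float mel_to_hz(float mel) { return 700.0f * (std::pow(10.0f, mel / 2595.0f) - 1.0f); }

int main() {
    const int   filter_number  = 6;        // number of triangular filters
    const float low_frequency  = 80.0f;    // Hz, lowest band edge (illustrative)
    const float high_frequency = 8000.0f;  // Hz, highest band edge (illustrative)

    // Band edges are spaced evenly in Mel, which is non-linear in Hz: edges are
    // close together at low frequencies and far apart at high frequencies.
    const float mel_low  = hz_to_mel(low_frequency);
    const float mel_high = hz_to_mel(high_frequency);
    for (int i = 0; i < filter_number + 2; i++) {
        float mel = mel_low + (mel_high - mel_low) * i / (filter_number + 1);
        printf("band edge %d: %.1f Hz\n", i, mel_to_hz(mel));
    }
    return 0;
}
```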
The graph titled "FFT Bin Weighting" shows how the FFT bins are scaled and summed into the output columns for your chosen parameters.
The last step clips the MFE output for noise reduction: any sample below the Noise floor is set to zero.
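A small sketch of one plausible reading of this step, assuming the filter energies are expressed in dB before being compared against the noise floor; the block's exact scaling and normalization are not reproduced here.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>

// One plausible reading of the clipping step: express each Mel filter energy
// in dB and drop everything below the configured noise floor.
float clip_to_noise_floor(float energy, float noise_floor_db) {
    float db = 10.0f * std::log10(std::max(energy, 1e-10f)); // avoid log(0)
    return (db < noise_floor_db) ? 0.0f : db;
}

int main() {
    printf("%.1f\n", clip_to_noise_floor(0.5f, -52.0f));   // -3.0 dB, kept
    printf("%.1f\n", clip_to_noise_floor(1e-8f, -52.0f));  // -80 dB, dropped to 0
    return 0;
}
```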
Extracting meaningful features from your data is crucial to building small and reliable machine learning models, and in Edge Impulse this is done through processing blocks. We ship a number of processing blocks for common sensor data (such as vibration and audio):
The source code of these blocks is available in the edgeimpulse/processing-blocks GitHub repository.
If you have a very specific sensor, want to apply custom filters, or are implementing the latest research in digital signal processing, follow our tutorial on building custom processing blocks.
In most of our DSP blocks, you have the option to calculate the feature importance. Edge Impulse Studio will then output a Feature Importance list that will help you determine which axes generated from your DSP block are most significant to analyze when you want to train a model.
Feature importance
For feature importance to work, you must have at least two labeled classes in your training dataset
This process of generating features and determining the most important features of your data will further reduce the amount of signal analysis needed on the device with new and unseen data.
To calculate the feature importance, a classifier is trained on the data and the feature importances are extracted from the trained classifier.
The Flatten block performs statistical analysis on the signal. It is useful for slow-moving averages like temperature data, in combination with other blocks.
GitHub repository containing all DSP block code: edgeimpulse/processing-blocks.
Scaling
Scale axes: Multiplies axes by this number
Method
Average: Calculates the average value for the window
Minimum: Calculates the minimum value in the window
Maximum: Calculates the maximum value in the window
Root-mean square: Calculates the RMS value of the window
Standard deviation: Calculates the standard deviation of the window
Skewness: Calculates the skewness of the window
Kurtosis: Calculates the kurtosis of the window
Moving Average Number of Windows: Calculates the moving average by maintaining a rolling average of the last N windows. Note that there is no zero padding; the block accumulates averages for up to N windows (for example, for the first window in a sample, the moving average equals the average). The moving average resets for each sample during training, and during inference when run_classifier_init() is called. Note that if you enable this, you probably don't want overlapping windows for training.
The Flatten block first rescales the axes of the signal if the Scale axes value is different from 1. Then statistical analysis is performed on each window, computing between 1 and 8 features for each axis, depending on the number of selected methods.
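For illustration, here is a sketch of these per-window statistics for a single axis, using the usual moment-based definitions of skewness and kurtosis (the block's exact conventions may differ).

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

// Sketch of the Flatten block's per-window statistics for a single axis.
struct FlattenFeatures {
    float average, minimum, maximum, rms, stdev, skewness, kurtosis;
};

FlattenFeatures flatten_window(const std::vector<float> &w, float scale = 1.0f) {
    std::vector<float> x(w);
    for (float &v : x) v *= scale;  // "Scale axes" step

    const float n = static_cast<float>(x.size());
    float sum = 0.0f, sum_sq = 0.0f;
    for (float v : x) { sum += v; sum_sq += v * v; }
    const float mean = sum / n;

    float m2 = 0.0f, m3 = 0.0f, m4 = 0.0f;
    for (float v : x) {
        float d = v - mean;
        m2 += d * d; m3 += d * d * d; m4 += d * d * d * d;
    }
    m2 /= n; m3 /= n; m4 /= n;
    const float sd = std::sqrt(m2);

    FlattenFeatures f;
    f.average  = mean;
    f.minimum  = *std::min_element(x.begin(), x.end());
    f.maximum  = *std::max_element(x.begin(), x.end());
    f.rms      = std::sqrt(sum_sq / n);
    f.stdev    = sd;
    f.skewness = (sd > 0) ? m3 / (sd * sd * sd) : 0.0f;  // third standardized moment
    f.kurtosis = (m2 > 0) ? m4 / (m2 * m2) : 0.0f;       // fourth standardized moment
    return f;
}

int main() {
    // Slow-moving temperature-like window (illustrative values)
    FlattenFeatures f = flatten_window({21.0f, 21.5f, 22.0f, 22.5f, 23.0f});
    printf("avg %.2f min %.2f max %.2f rms %.2f std %.2f\n",
           f.average, f.minimum, f.maximum, f.rms, f.stdev);
    return 0;
}
```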
The Audio Syntiant processing block extracts time and frequency features from a signal. It is similar to the Audio MFE but performs additional processing specific to the Syntiant NDP101/120 chip. This block can be used only with Syntiant targets.
Log Mel-filterbank energy features
Frame length: The length of each frame in seconds
Frame stride: The step between successive frames, in seconds
Filter number (fixed): The number of triangular filters applied to the spectrogram
FFT length (fixed): The FFT size
Low frequency (fixed): Lowest band edge of Mel-scale filterbanks
High frequency (fixed): Highest band edge of Mel-scale filterbanks
Preemphasis
Coefficient: Pre-emphasis coefficient
Chip
Features extractor: Syntiant method to generate features, choose according to your chip
The feature extraction is a proprietary Syntiant algorithm; however, its parameters are very close to those of the Audio MFE block. The pre-emphasis coefficient is applied first to amplify higher frequencies. The signal is then divided into overlapping frames, defined by the Frame length and Frame stride, to extract speech features.
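Although the extractor itself is proprietary, pre-emphasis and framing are standard steps. The sketch below shows a generic version of these two steps only; it does not reproduce Syntiant's feature extractor.

```cpp
#include <cstddef>
#include <vector>

// Generic pre-emphasis: y[n] = x[n] - coeff * x[n-1], which amplifies higher
// frequencies relative to lower ones.
std::vector<float> pre_emphasis(const std::vector<float> &x, float coeff) {
    std::vector<float> y(x.size());
    if (x.empty()) return y;
    y[0] = x[0];
    for (size_t n = 1; n < x.size(); n++) {
        y[n] = x[n] - coeff * x[n - 1];
    }
    return y;
}

// Split the signal into overlapping frames defined by Frame length / Frame stride.
std::vector<std::vector<float>> frame_signal(const std::vector<float> &x,
                                             float frame_length_s,
                                             float frame_stride_s,
                                             float sample_rate_hz) {
    const size_t frame_len = static_cast<size_t>(frame_length_s * sample_rate_hz);
    const size_t stride    = static_cast<size_t>(frame_stride_s * sample_rate_hz);
    std::vector<std::vector<float>> frames;
    for (size_t start = 0; start + frame_len <= x.size(); start += stride) {
        frames.emplace_back(x.begin() + start, x.begin() + start + frame_len);
    }
    return frames;
}
```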
Sampling frequency
The Audio Syntiant block only supports a 16 kHz frequency. You can adjust the sampling frequency in the "Create Impulse" section.
The IMU Syntiant block rescales raw data to 8 bits values to match the NDP101/120 chip input requirements.
Scaling
Scale 16 bits to 8 bits: Scales data to 8-bit values in the [-1, 1] range; raw data is divided by 2G (2 * 9.80665). When using the Edge Impulse official firmware, this parameter should be enabled, as raw data is not rescaled. If this parameter is disabled, the data samples will not be rescaled; disable it only if your raw data samples are already normalized to the [-1, 1] range.
The IMU Syntiant block retrieves raw samples and applies the Scale 16 bits to 8 bits parameter.
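A small sketch of this scaling, assuming accelerometer samples in m/s²; the clamp to [-1, 1] is added here for illustration only.

```cpp
#include <algorithm>
#include <cstdio>

// Sketch of the "Scale 16 bits to 8 bits" step described above: accelerometer
// samples in m/s^2 are divided by 2G (2 * 9.80665) so that a +/-2g range maps
// to [-1, 1].
int main() {
    const float two_g = 2.0f * 9.80665f;
    const float raw[] = { -19.6f, -4.9f, 0.0f, 9.8f, 19.6f };  // m/s^2, illustrative
    for (float sample : raw) {
        float scaled = std::max(-1.0f, std::min(1.0f, sample / two_g));
        printf("%.2f -> %.3f\n", sample, scaled);
    }
    return 0;
}
```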
The Image block is dedicated to computer vision applications. It normalizes image data and optionally reduces the color depth.
GitHub repository containing all DSP block code: edgeimpulse/processing-blocks.
Color depth: Color depth to use (RGB or grayscale)
The Image block performs normalization, converting each channel of each pixel to a float value between 0 and 1. If Grayscale is selected, each pixel is converted to a single value using the ITU-R BT.601 conversion (Y' component only).
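For illustration, here is a sketch of both operations using the ITU-R BT.601 luma weights (0.299, 0.587, 0.114).

```cpp
#include <cstdint>
#include <cstdio>

// Sketch of the two operations described above: normalizing 8-bit channels to
// [0, 1] floats, and (optionally) collapsing RGB to a single grayscale value
// using the ITU-R BT.601 luma weights.
float normalize_channel(uint8_t c) {
    return c / 255.0f;
}

float rgb_to_grayscale(uint8_t r, uint8_t g, uint8_t b) {
    // Y' = 0.299 R' + 0.587 G' + 0.114 B'
    return 0.299f * normalize_channel(r)
         + 0.587f * normalize_channel(g)
         + 0.114f * normalize_channel(b);
}

int main() {
    printf("white -> %.3f\n", rgb_to_grayscale(255, 255, 255));  // 1.000
    printf("red   -> %.3f\n", rgb_to_grayscale(255, 0, 0));      // 0.299
    return 0;
}
```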
The HR/HRV Features block processes physiological signals like photoplethysmogram (PPG) or electrocardiogram (ECG), with optional accelerometer inputs for enhanced accuracy in motion-prone applications, to extract key metrics such as heart rate (HR) and heart rate variability (HRV). HR measures the number of beats per minute, while HRV measures the time variance between successive heartbeats, also known as the interbeat interval (IBI).
The block offers real-time HR estimation and HRV analysis on resource-constrained edge devices and leverages cutting-edge algorithms for precise feature extraction. The extracted features can be used on their own or to inform downstream machine learning tasks such as stress detection or heart health analysis.
To see a demonstration of how to use the HR/HRV Features block, refer to our tutorial: .
Evaluation available for everyone; deployment only available with Enterprise Plan
All users (Community, Professional, and Enterprise) can extract heart rate and HRV features using this block for testing purposes. However, the deployment option is only available for Enterprise users. Please contact your Solutions Engineer to enable it.
Tip: Use accelerometer data whenever possible to enhance the accuracy of heart rate and HRV estimation in dynamic environments.
By configuring the HR/HRV Features block, you can obtain critical metrics like heart rate and HRV in real-time, enabling applications such as fitness tracking, stress monitoring, and health diagnostics. The extracted features can be fine-tuned to match the performance and constraints of edge devices, ensuring both efficiency and accuracy.
When using the HR/HRV Features block, it is important to also configure the input block for your impulse appropriately.
There are minimum input block window size requirements depending on your configuration of the HR/HRV Features block. If you are using HRV features, the input block window size must be greater than or equal to the HRV update interval. If you are not using HRV features, the input block window size must be greater than or equal to 2 seconds.
The minimum window size of 2 seconds is determined by the fact that the heart rate calculation is performed once for every 2 second period. When the window size is increased beyond 2 seconds, more heart rate values will be provided to the learning block. For example, a 2 second window will pass 1 heart rate value per window whereas a 10 second window will pass 5 heart rate values per window.
For optimal performance, it is recommended to set the window increase (stride) equal to the window size.
All input signals (PPG or ECG, and accelerometer) must have a frequency of either 25 Hz (tolerance +/- 1 Hz) or 50 Hz (tolerance +/- 3 Hz).
Heart rate values, HRV features, or both can be passed to the learning block. To send only heart rate values, select none for the HRV features parameter. To send only HRV features, select your desired HRV features parameter value (other than none) and deselect the include calculated heart rates parameter. To send both heart rate values and HRV features, select your desired HRV features parameter value (other than none) and select the include calculated heart rates parameter.
Compatible with the DSP Autotuner
Picking the right parameters for DSP algorithms can be difficult. It often requires a lot of experience and experimentation. The autotuning function makes this process easier by looking at the entire dataset and recommending a set of parameters that is tuned for your dataset.
Note that this applies to the heart rate and accelerometer settings, not the HRV settings.
The following parameters are available for configuring the HR/HRV Features block. Note that all heart rate and accelerometer settings can be estimated using parameter autotuning, which is the suggested approach.
The HR/HRV Features block outputs heart rate values and HRV features based on your configuration. The HRV features contain time-domain and frequency-domain features as shown below.
Heart rate
Heart Rate Values
HRV time-domain features
IBI Slope
HR Mean
HR Slope
RMSSD Slope
RMSSD
AVNN
SDNN
Range NN
MAD NN
pNN50
NN Percentile (10)
NN Percentile (25)
NN Percentile (75)
NN Percentile (90)
IQR
SDSD
SD1
SD2
SD2/SD1 Ratio
HRV frequency-domain features
Raw VLF Energy
Raw LF Energy
Raw HF Energy
Raw Total Energy
Relative VLF Energy
Relative LF Energy
Relative HF Energy
LF/HF Ratio
Peak VLF Energy
Peak LF Energy
Peak HF Energy
Instead of individually selecting the HRV features that are output, you can select a group that contains multiple features. The HRV features associated with each group are defined below.
The number of processed features will depend on your configuration of the input block and the HR/HRV Features block. For example, if you have an input window of 90 seconds, selected all for the HRV features group (30 features), enabled heart rate values to be passed to the learning block, and have an HRV update interval of 30 seconds, there will be 135 processed features: 45 heart rate values (90 seconds input window / 2 seconds per heart rate value) and 90 HRV features (90 seconds input window / 30 seconds update interval x 30 features).
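The same arithmetic, written out as a small sketch using the numbers from the example above.

```cpp
#include <cstdio>

// Arithmetic from the example above: a 90 s window, heart rate computed every
// 2 s, an HRV update interval of 30 s, and 30 HRV features per update.
int main() {
    const int window_s             = 90;
    const int hr_period_s          = 2;
    const int hrv_update_s         = 30;
    const int hrv_features_per_upd = 30;

    const int hr_values    = window_s / hr_period_s;                           // 45
    const int hrv_features = (window_s / hrv_update_s) * hrv_features_per_upd; // 90
    printf("processed features: %d\n", hr_values + hrv_features);              // 135
    return 0;
}
```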
If you are an Enterprise customer, please contact your Solutions Engineer to enable deployment.
The HR/HRV Features block has industry-leading efficiency in RAM and flash usage and can be deployed to a wide range of devices, including fitness trackers, health monitors, and stress detection systems. The functionality can be deployed as either C++ or C bindings.
To optimize for MCU-based systems, your enterprise representative can provide a MAP file. This file contains a detailed breakdown of the memory footprint (flash and RAM) for the HR/HRV Features block, including the IBI processing components. This data is critical for fine-tuning and optimizing the deployment of the block on resource-constrained devices.
One important note when working with the HR/HRV Features block is that you can extract heart rate values even when running a classifier. This is particularly useful if your model is performing classification tasks but you'd also like to access heart rate data. The code snippet below demonstrates how to access the heart rate information during inference and print the results in a C++ application.
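A minimal sketch of this pattern is shown below, using the standard run_classifier() call; the field used to read back the heart rate (result.hr_calcs.heart_rate) is a placeholder and should be checked against the ei_impulse_result_t definition in your deployed SDK.

```cpp
#include <cstdio>
#include "edge-impulse-sdk/classifier/ei_run_classifier.h"

// Hedged sketch: run the impulse and read back both the classification result
// and the heart rate computed by the HR/HRV Features block. The exact field
// exposing the heart rate (`result.hr_calcs.heart_rate` below) is a
// placeholder; check the ei_impulse_result_t definition in your deployed SDK.
void infer_and_print(signal_t *signal) {
    ei_impulse_result_t result = { 0 };
    EI_IMPULSE_ERROR err = run_classifier(signal, &result, false);
    if (err != EI_IMPULSE_OK) {
        printf("run_classifier failed (%d)\n", err);
        return;
    }

    // Classification output (present when a learning block is attached)
    for (size_t i = 0; i < EI_CLASSIFIER_LABEL_COUNT; i++) {
        printf("%s: %.3f\n", result.classification[i].label,
               result.classification[i].value);
    }

    // Heart rate from the HR/HRV Features block -- field name is illustrative
    printf("heart rate: %.1f bpm\n", result.hr_calcs.heart_rate);
}
```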
If you want to use the HR/HRV Features block solely for heart rate values without any classification in Studio, you can configure a regression learning block to "pass through" the result. This can be achieved by using expert mode for the block to set up a simple neural network.
After saving and training the model (though there's effectively no training needed in this case), you can then use Model Testing or Live Classification to evaluate the heart rate estimation.
If you would also like to deploy the HR/HRV Features block without running a classifier on the device, define the following macro via CMake or Makefile when compiling, to avoid flash overhead for the unused learn block.
When running classifiers that use a large window size, such as for HRV features, you can avoid buffering the entire window of PPG or ECG data by leveraging the callback structure of signal_t. get_data() will only ask for 2 seconds of samples on each invocation, so if you block (either via an RTOS sleep or a while loop on bare metal) while waiting for each 2 seconds of PPG or ECG data, you can avoid allocating the entire input window. Note also that the SDK does not internally buffer the entire window; each 2 seconds is immediately processed down to IBIs.
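A minimal sketch of this streaming approach, assuming a user-provided, blocking read_ppg_samples() function (a hypothetical name) that supplies samples to the SDK on demand.

```cpp
#include "edge-impulse-sdk/classifier/ei_run_classifier.h"

// Hypothetical, user-provided function: blocks (RTOS sleep or busy-wait) until
// `length` new PPG/ECG samples are available, then writes them to out_ptr.
extern int read_ppg_samples(size_t offset, size_t length, float *out_ptr);

int run_streaming_inference(void) {
    signal_t signal;
    // Full impulse window length in raw samples, as generated for your project.
    signal.total_length = EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE;
    // The SDK asks get_data() for 2 seconds of samples at a time and processes
    // each chunk down to IBIs immediately, so no full-window buffer is needed
    // on the application side.
    signal.get_data = &read_ppg_samples;

    ei_impulse_result_t result = { 0 };
    return run_classifier(&signal, &result, false);
}
```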
The HR/HRV Features block enables real-time extraction of key metrics such as heart rate and heart rate variability from physiological signals like PPG or ECG. These metrics are critical for applications in fitness tracking, stress detection, and medical diagnostics. To go further, follow our step-by-step guides in the tutorials section.