Custom learning blocks
Want to use a novel ML architecture, or load your own transfer learning models into Edge Impulse? Create a custom learning block! It's easy to bring any training pipeline into the Studio, as long as you can output TFLite or ONNX files. We have end-to-end examples of doing this in Keras, PyTorch and scikit-learn.
This page describes the input and output formats if you want to bring your own model, but a good way to start building a custom learning block is by modifying one of the following example repositories:
- Wraps the Ultralytics YOLOv5 repository (trained with PyTorch) to train a custom transfer learning model.
- A Keras implementation of transfer learning with EfficientNet B0.
- A basic multi-layer perceptron in Keras and TensorFlow.
- A basic multi-layer perceptron in PyTorch.
- Trains a logistic regression model using scikit-learn, then outputs a TFLite file for inferencing using jax.
Any built-in block in the Edge Impulse Studio (e.g. classifiers, regression models or FOMO blocks) can be edited locally, and then pushed back as a custom block. This is great if you want to make heavy modifications to these training pipelines, for example to do custom data augmentation. To download a block, go to any ML block in your project, click the three dots, select Edit block locally, and follow the instructions in the README.
Training pipelines in Edge Impulse are built on top of Docker containers, a virtualization technique that lets developers package up an application with all its dependencies in a single bundle. To train your own model you'll need to wrap all the required packages, your scripts, and (if you use transfer learning) your pre-trained weights into this container. When running in Edge Impulse the container does not have network access, so make sure you don't download dependencies at runtime (downloading them while building the container is fine).
A typical Dockerfile might look like the sketch below (see the example repositories for more information).
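A minimal example, assuming a Python-based training script named train.py and a requirements.txt listing your dependencies (both names are illustrative; real blocks will differ):

```Dockerfile
# Base image with Python; pick one that matches your framework's requirements
FROM python:3.10

WORKDIR /app

# Install all dependencies at build time - the container has no network
# access when it runs inside Edge Impulse
COPY requirements.txt ./
RUN pip3 install --no-cache-dir -r requirements.txt

# Copy in your training scripts and (optionally) pre-trained weights
COPY . ./

# The ENTRYPOINT specifies which file to run (see the note below)
ENTRYPOINT ["python3", "-u", "train.py"]
```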
Important: ENTRYPOINT
It's important to create an ENTRYPOINT at the end of the Dockerfile to specify which file to run.
GPU Support
If you want to have GPU support (only for enterprise customers), you'll need the CUDA packages installed. If you export a learning block from the Studio it will already have the right base packages, so use that Dockerfile as a starting point.
The entrypoint (see above in the Dockerfile) will be called with these parameters:
- --data-directory - where you can find the data (see below for the input/output formats).
- --out-directory - where to write the TFLite or ONNX files (see below for the input/output formats).
If you do not specify a parameters.json file, two default elements will be rendered ("Learning rate" and "Number of training cycles"), which are passed in as:
- --learning-rate - learning rate to train with (set by the user in the UI).
- --epochs - number of epochs to train for (set by the user in the UI).
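For reference, a minimal Python entrypoint that accepts these arguments might look like this (a sketch; the actual training code is omitted):

```python
import argparse

# Arguments passed in by Edge Impulse (plus any custom ones declared in parameters.json)
parser = argparse.ArgumentParser(description='Custom learning block')
parser.add_argument('--data-directory', type=str, required=True,
                    help='Directory with the X_*.npy / Y_*.npy files')
parser.add_argument('--out-directory', type=str, required=True,
                    help='Directory to write model.tflite / model.onnx to')
parser.add_argument('--epochs', type=int, required=True)
parser.add_argument('--learning-rate', type=float, required=True)
args, unknown = parser.parse_known_args()

print('Training for', args.epochs, 'epochs with learning rate', args.learning_rate)
```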
The data directory contains your dataset, after running any DSP blocks, and already split into a train/validation set:
- X_split_train.npy
- Y_split_train.npy
- X_split_test.npy
- Y_split_test.npy
The X_*.npy files are float32 Numpy arrays, already in the right shape (e.g. if you're training on 96x96 RGB images this will be of shape (n, 96, 96, 3)). You can typically load these without any modification into your training pipeline (see the notes after this section for caveats).
The Y_*.npy files are either:
1. int32 Numpy arrays, with four columns (label_index, sample_id, sample_slice_start_ms, sample_slice_end_ms).
2. A JSON array in the form of: [{ "sampleId": 234731, "boundingBoxes": [{ "label": 1, "x": 260, "y": 313, "w": 234, "h": 261 }] }]
2) is sent if your dataset has bounding boxes, in all other cases 1) is sent.
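As an illustration, here is one way to load the classification-style data (case 1 above) in Python; the directory name is a placeholder for the value of --data-directory:

```python
import os
import numpy as np

data_directory = 'data'  # value of --data-directory

X_train = np.load(os.path.join(data_directory, 'X_split_train.npy'))
Y_train = np.load(os.path.join(data_directory, 'Y_split_train.npy'))
X_test = np.load(os.path.join(data_directory, 'X_split_test.npy'))
Y_test = np.load(os.path.join(data_directory, 'Y_split_test.npy'))

# For non-object-detection datasets Y has four int32 columns:
# label_index, sample_id, sample_slice_start_ms, sample_slice_end_ms.
# Most training pipelines only need the label index:
y_train_labels = Y_train[:, 0]
y_test_labels = Y_test[:, 0]

print('X_train shape:', X_train.shape, '- labels:', np.unique(y_train_labels))
```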
The input features for vision models are a 4D vector of shape (n, WIDTH, HEIGHT, CHANNELS), where the channel data is in RGB format. We support three ways of scaling the input:
- Pixels ranging 0..1 - the raw pixels scaled to 0..1, without any further normalization.
- Pixels ranging 0..255 - the raw pixels, without any normalization.
- PyTorch - the default way that inputs are scaled in most torchvision models: first the raw pixels are scaled to 0..1, then normalized per-channel using the ImageNet mean and standard deviation.
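In NumPy terms the PyTorch-style scaling looks roughly like this (the mean and standard deviation values are the standard ImageNet statistics used by torchvision):

```python
import numpy as np

# Standard ImageNet statistics (RGB order), as used by torchvision
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def pytorch_scale(images_0_255):
    """Scale raw 0..255 RGB images of shape (n, H, W, 3) the way torchvision models expect."""
    x = images_0_255.astype(np.float32) / 255.0   # first: pixels 0..1
    return (x - IMAGENET_MEAN) / IMAGENET_STD      # then: per-channel normalization
```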
The input scaling is applied:
- In the input features vector; the inputs are already scaled correctly, so no need to re-scale yourself. If you're converting the input features vector back into images before training (because your training pipeline requires this), then make sure to un-normalize first.
- When running inference, both in the Studio and on-device - so again, no need to re-scale yourself.
You can control the image input scaling when you create the block with the Edge Impulse CLI (version 1.19.1 or higher), or by editing the block in the UI.
If you need data in channels-first (NCHW) mode, then you'll need to transpose the input feature vector yourself before training. You can still write out an NCHW model; Edge Impulse supports both NHWC and NCHW models.
Edge Impulse only supports RGB models. If you have a model that requires BGR input rather than RGB input (e.g. ResNet50), you'll need to transpose the first and last channels (see the notes below for how to do this in Keras and PyTorch).
The training pipeline can output either TFLite or ONNX files:
If you output TFLite files:
- model.tflite - a TFLite file with float32 inputs and outputs.
- model_quantized_int8_io.tflite - a quantized TFLite file with int8 inputs and outputs.
- saved_model.zip - a TensorFlow saved model (optional).
At least one of the TFLite files is required.
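For example, if you train with Keras, something like the following sketch writes both the float32 and the int8-quantized files (the model and representative dataset are placeholders):

```python
import os
import numpy as np
import tensorflow as tf

def save_tflite_files(model, X_train, out_directory):
    # Float32 model -> model.tflite
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    with open(os.path.join(out_directory, 'model.tflite'), 'wb') as f:
        f.write(converter.convert())

    # Quantized model with int8 inputs/outputs -> model_quantized_int8_io.tflite
    def representative_dataset():
        for i in range(min(100, X_train.shape[0])):
            yield [X_train[i:i + 1].astype(np.float32)]

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8
    with open(os.path.join(out_directory, 'model_quantized_int8_io.tflite'), 'wb') as f:
        f.write(converter.convert())
```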
If you output ONNX files:
- model.onnx - an ONNX file with float16 or float32 inputs and outputs.
We automatically convert this file to both unquantized and quantized TFLite files after training.
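If you train in PyTorch, a minimal export might look like this (a sketch; the model and input shape are placeholders):

```python
import torch

def export_onnx(model, out_path='model.onnx', input_shape=(1, 3, 96, 96)):
    model.eval()
    dummy_input = torch.randn(*input_shape)
    # Export the trained model; Edge Impulse converts the ONNX file to
    # unquantized and quantized TFLite files after training
    torch.onnx.export(model, dummy_input, out_path,
                      input_names=['input'], output_names=['output'],
                      opset_version=13)
```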
I'm using scikit-learn, I don't have TFLite or ONNX files... - see the note below on using jax to compile your inference function to TFLite.
To edit the block, go to:
- Enterprise: go to your organization, then Custom blocks > Machine learning.
- Developers: click on your photo in the top right corner, then select Custom blocks > Machine learning.
The block is now available from inside any of your Edge Impulse projects. Add it via Create impulse > Add a learning block.
Unfortunately object detection models typically don't have a standard way to go from neural network output layer to bounding boxes. Currently we support the following types of output layers:
- MobileNet SSD
- Edge Impulse FOMO
- YOLOv5 (compatible with Ultralytics YOLOv5 v6)
- YOLOv5 for Renesas DRP-AI
- YOLOv7
- YOLOX
If you have an object detection model with a different output layer then please contact your user success engineer (enterprise) or let us know on the forums (free users) with an example of how to interpret the output, and we can add it.
The profiling API expects:
- A TFLite file.
- A reference model (which model is closest to your architecture) - you can choose between gestures-large-f32, gestures-large-i8, image-32-32-mobilenet-f32, image-32-32-mobilenet-i8, image-96-96-mobilenet-f32, image-96-96-mobilenet-i8, image-320-320-mobilenet-ssd-f32, keywords-2d-f32 and keywords-2d-i8. Make sure to use the i8 models if you have quantized your model.
- A reference device (for latency calculation) - you can get a list of all devices via the Edge Impulse API (see the latencyDevices object in the response).
Here's how you invoke the API from Python:
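A rough sketch using the requests library; the endpoint path, body fields and response format shown here are assumptions, so check the Edge Impulse API reference for the authoritative version:

```python
import base64
import requests

API_KEY = 'ei_...'    # your project API key
PROJECT_ID = 12345    # your project ID

with open('model_quantized_int8_io.tflite', 'rb') as f:
    tflite_b64 = base64.b64encode(f.read()).decode('utf-8')

# NOTE: endpoint path and body field names are assumptions - verify them
# against the Edge Impulse API reference before using this in production.
res = requests.post(
    f'https://studio.edgeimpulse.com/v1/api/{PROJECT_ID}/jobs/profile-tflite',
    headers={'x-api-key': API_KEY},
    json={
        'tfliteFileBase64': tflite_b64,
        'device': 'cortex-m4f-80mhz',        # reference device for latency
        'referenceModel': 'keywords-2d-i8',  # reference model closest to your architecture
    })
res.raise_for_status()
print(res.json())
```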
Additionally, you can specify custom arguments (like the learning rate, or whether to use data augmentation) by adding a parameters.json file to your block. This file describes all arguments for your training pipeline, and is used to render custom UI elements for each parameter.
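For example, a parameters file with two learning-rate arguments could look roughly like this (the authoritative schema is described in the parameters.json documentation; the field values here are illustrative):

```json
[
    {
        "name": "Learning rate #1",
        "value": "0.01",
        "type": "float",
        "help": "Learning rate during the first phase of training",
        "param": "learning-rate-1"
    },
    {
        "name": "Learning rate #2",
        "value": "0.001",
        "type": "float",
        "help": "Learning rate during the rest of training",
        "param": "learning-rate-2"
    }
]
```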
This renders two corresponding elements in the UI, and passes --learning-rate-1 0.01 --learning-rate-2 0.001 to your script. For more information and all options, see the documentation on parameters.json.
To get new data for your project, run the block runner from the Edge Impulse CLI (requires CLI version 1.16 or higher); this regenerates features (if necessary) and then downloads the updated dataset.
In Keras you can do this by adding a Lambda layer that reverses the channel order (see the sketch below). For PyTorch you do this by first converting the trained model to ONNX, then transposing the channels in the ONNX model.
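A sketch of the Keras approach; the Lambda layer simply reverses the channel order of the incoming RGB tensor before it reaches the BGR-trained backbone (model and input shape are placeholders):

```python
import tensorflow as tf

def add_rgb_to_bgr_input(bgr_model, input_shape=(96, 96, 3)):
    """Wrap a BGR-trained model so it accepts the RGB input that Edge Impulse provides."""
    inputs = tf.keras.Input(shape=input_shape)
    # Reverse the last axis: RGB -> BGR
    bgr = tf.keras.layers.Lambda(lambda x: x[..., ::-1])(inputs)
    outputs = bgr_model(bgr)
    return tf.keras.Model(inputs, outputs)
```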
If you have a training pipeline that cannot output TFLite files by default (e.g. scikit-learn), you can use jax to implement the inference function, and compile that to TFLite - see the scikit-learn example repository listed above. If there are any TFLite ops in your final model that are not supported by the EON Compiler (so you cannot run on device), then please let us know on the forums.
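As a rough illustration of the jax route, assuming TensorFlow's experimental jax converter (tf.lite.TFLiteConverter.experimental_from_jax) and a trained scikit-learn logistic regression whose coefficients are re-implemented as a jax function (the data here is a random placeholder):

```python
import jax
import jax.numpy as jnp
import numpy as np
import tensorflow as tf
from sklearn.linear_model import LogisticRegression

# Train a scikit-learn model as usual (placeholder data for the sketch)
X_train = np.random.rand(100, 33).astype(np.float32)
y_train = np.random.randint(0, 3, 100)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Re-implement the inference function in jax using the trained coefficients
W = jnp.array(clf.coef_.T, dtype=jnp.float32)
b = jnp.array(clf.intercept_, dtype=jnp.float32)

def predict(x):
    return jax.nn.softmax(jnp.dot(x, W) + b)

# Compile the jax function to a TFLite file
example_input = np.zeros((1, X_train.shape[1]), dtype=np.float32)
converter = tf.lite.TFLiteConverter.experimental_from_jax(
    [predict], [[('x', example_input)]])
with open('model.tflite', 'wb') as f:
    f.write(converter.convert())
```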
Host your block directly within Edge Impulse by pushing it with the Edge Impulse CLI (edge-impulse-blocks push).
When training locally you can use the profiling API to get latency, RAM and ROM estimates. This is very useful as you can immediately see whether your model will fit on device. Additionally, you can use this API as part of your experiment tracking (e.g. in Weights & Biases or MLflow) to weed out models that won't fit your latency or memory constraints.