1 of 1

Custom learning blocks

Want to use a novel ML architecture, or load your own transfer learning models into Edge Impulse? Create a custom learning block! It's easy to bring in any training pipeline into Studio, as long as you can output TFLite, SavedModel, or ONNX files - or if you have a scikit-learn model, a pickle file. We have end-to-end examples of doing this in Keras, PyTorch, and scikit-learn.

If you only want to modify the neural network architecture or loss function, you can also use expert mode directly in the Studio, without having to bring your own model. Go to any ML block, select three dots, and select Switch to Keras (expert) mode.

This page describes the input and output formats if you want to bring your own architecture, but a good way to start building a custom learning block is by modifying one of the following example repositories:

YOLOv5 - wraps the Ultralytics YOLOv5 repository (trained with PyTorch) to train a custom transfer learning model.
EfficientNet - a Keras implementation of transfer learning with EfficientNet B0.
Keras - a basic multi-layer perceptron in Keras and TensorFlow.
PyTorch - a basic multi-layer perceptron in PyTorch.
Scikit-learn - trains a logistic regression model using scikit-learn.

In this tutorial, we will explain how to set up a learning block, push it to an Edge Impulse organization, and use it in a project.

Understanding the learning blocks

A learning block consists of a Docker image that contains one or more scripts. The Docker image is encapsulated in a learning block with additional parameters.

Here is a diagram of how a minimal configuration for a learning block works:

Getting started

We will walk through creating a custom learning block, pushing it to our organization (enterprise accounts only), and running it in a project. To perform this, we will use the example learning block found in this repository.

Directory structure

To start, create a directory somewhere on your computer. I'll name mine my-custom-learning-block/.

mkdir my-custom-learning-block
cd my-custom-learning-block

We will be working in this directory to create our custom learning block. It will also hold data for testing locally. After working through this tutorial, you should have a directory structure like the following:

my-custom-learning-block/
├── data/
│   ├── X_split_test.npy
│   ├── X_split_train.npy
│   ├── Y_split_test.npy
│   └── Y_split_train.npy
├── out/
│   ├── model.tflite
│   ├── model_quantized_int8_io.tflite
│   ├── saved_model/
│   │   └── ...
│   └── saved_model.zip
├── Dockerfile
├── conversion.py
├── parameters.json
├── requirements.txt
└── train.py

We will explain what each of these files does in the rest of this getting started section.

Initializing the block

To initialize your block, the easiest method is to use the Edge Impulse CLI blocks command from within the my-custom-learning-block/ directory: edge-impulse-blocks init. Follow the on-screen prompts to log in to your account, select your organization, and configure your block:

$ edge-impulse-blocks init

Edge Impulse Blocks v1.21.1
? In which organization do you want to create this block?
❯ Developer Relations
  Medical Laboratories inc.
  Demo Team
Attaching block to organization 'Developer Relations'
? Choose a type of block
  Transformation block
  Deployment block
  DSP block
❯ Machine learning block
? Choose an option:
? Enter the name of your block: My learning block
? Enter the description of your block: Simple dense neural network classifier
? What type of data does this model operate on?
  Object Detection
  Image classification
  Audio classification
❯ Classification
  Regression
? Where can your model train? (Use arrow keys)
❯ Both CPU or GPU (default)
  Only on GPU (GPUs are only available for enterprise projects)

This creates a parameters.json file that describes your block:

{
    "version": 1,
    "type": "machine-learning",
    "info": {
        "name": "My learning block",
        "description": "Simple dense neural network classifier",
        "operatesOn": "other",
        "indRequiresGpu": false
    },
    "parameters": []
}

Download dataset

Learn blocks operate on data that has already been processed by an input and a processing block (e.g. if you have an audio dataset it's already split into individual samples, and then was turned into a spectrogram); and that is already split into a train and validation set. You can download this preprocessed data from any Edge Impulse project via the CLI. For this tutorial, first clone the Tutorial: Continuous motion recognition; then run:

$ edge-impulse-blocks runner --download-data data/

Edge Impulse Blocks v1.28.3
? Select a project to download training files and labels
    > Developer Relations / Tutorial: Continuous motion recognition
? Select a learn block to download files and labels
    > Neural Network (Keras)
# ... wait a little bit to generate features

This has created a data/ folder with features and labels (in NumPy format) for both your train and validation set:

$ ls -al data/

X_split_test.npy
X_split_train.npy
Y_split_test.npy
Y_split_train.npy
row_to_sample_id.json

If you add new data to your Edge Impulse project, just re-run the CLI command to update your local dataset.

Create learning script

Download the following Python scripts and requirements file:

You can also easily download these files with the following commands:

wget https://raw.githubusercontent.com/edgeimpulse/example-custom-ml-block-keras/8157a20395fef13eea1863371a880e2e66cda80a/conversion.py
wget https://raw.githubusercontent.com/edgeimpulse/example-custom-ml-block-keras/8157a20395fef13eea1863371a880e2e66cda80a/train.py
wget https://raw.githubusercontent.com/edgeimpulse/example-custom-ml-block-keras/8157a20395fef13eea1863371a880e2e66cda80a/requirements.txt

Feel free to look through these scripts to see how Keras is used to construct and train a simple dense neural network. Also, note that you are not required to use Python! You are welcome to use any language or system you wish, so long as it will run in a Docker container.

Important! Pay attention to the inputs (features) and outputs (trained model file) of your script. They must match the expected inputs and outputs of the block. See the Input format and Output format sections for more information.

Create Dockerfile

Next, we need to wrap our training script in a Docker image. To do that, we write a Dockerfile. If you are not familiar with Docker, we recommend working through Docker's getting started guide. See here to learn more about the required Dockerfile components in learning blocks.

Create a new file named Dockerfile (no extension) and copy in the following code:

FROM python:3.10.15

WORKDIR /app

# Copy Python requirements in and install them
COPY requirements.txt ./
RUN pip3 install -r requirements.txt

# Copy the rest of your training scripts in
COPY . ./

# And tell us where to run the pipeline
ENTRYPOINT ["python3", "-u", "train.py"]

Note: we are not installing CUDA for this simple example. If you wish to install CUDA in your image to enable GPU-accelerated training (which includes training inside your Edge Impulse project), please refer to the full example here.

Run learning block locally

Make sure you have Docker installed and running on your computer. Execute the following commands to build and run your image:

docker build -t custom-learning-block .
docker run --rm -v $PWD:/app custom-learning-block --data-directory /app/data/ --epochs 30 --learning-rate 0.01 --out-directory out/

You should see your model train for 30 epochs and then be converted to a .tflite file for inference. Your out/ directory should have the following files/folders:

model.tflite
model_quantized_int8_io.tflite
saved_model/
saved_model.zip

The saved_model.zip file is an archive of the saved_model/ directory, which contains your model stored in the TensorFlow SavedModel format. The model.tflite file is the float32 version of the model and converted to the LiteRT (previously Tensorflow Lite) format. The model_quantized_int8_io.tflite file is the same TFLite model, but with weights quantized to 8 bits.

Block parameters

You can expose your block's parameters to the Studio GUI by defining JSON settings in the parameters.json file. Open the parameters.json file and replace "parameters": [] with:

    "parameters": [{
        "name": "Number of training cycles",
        "value": "30",
        "type": "int",
        "help": "Number of epochs to train the neural network on.",
        "param": "epochs"
    }, {
        "name": "Learning rate",
        "value": "0.001",
        "type": "float",
        "help": "How fast the neural network learns, if the network overfits quickly, then lower the learning rate.",
        "param": "learning-rate"
    }]

This will expose the epochs and learning-rate parameters to the Studio interface so that users can make changes in the project. You can learn more about arguments in this section.

Push block to organization

Once you have verified operation of your block and configured the parameters, you will want to push it to your Edge Impulse organization. From your project directory, run the following command:

edge-impulse-blocks push

Once that command completes, head to your Organization in the Edge Impulse Studio. Click on Machine learning under Custom blocks. You should find your custom learning block listed there.

You can click on the three dots and select Edit block to view the configuration settings for your block.

Use your learning block in a project

To use your learn block go back to your clone of Tutorial: Continuous motion recognition.

Go to Create impulse, remove any existing learn blocks, then click on Add a learning block. Assuming your project is in your organization, you should see your custom learning block as one of the available blocks. Click add to use your custom learning block in your project.

Go to the My learning block page, where you should see the custom parameters you set (number of training cycles and learning rate). Feel free to change those, and select "Start training." When that is finished, you should have a trained model in your project created by your custom learning block!

You can now continue to model testing and deployment, as you would with any project.

Learning block concepts

Editing built-in blocks

Most built-in block in the Edge Impulse Studio (e.g. classifiers, regression models or FOMO blocks) can be edited locally, and then pushed back as a custom block. This is great if you want to make heavy modifications to these training pipelines, for example to do custom data augmentation. To download a block, go to any ML block in your project, click the three dots, select Edit block locally, and follow the instructions in the README.

Dockerfiles

Training pipelines in Edge Impulse are built on top of Docker containers, a virtualization technique which lets developers package up an application with all dependencies in a single package. To train your own model you'll need to wrap all the required packages, your scripts, and (if you use transfer learning) your pre-trained weights into this container. When running in Edge Impulse the container does not have network access, so make sure you don't download dependencies while running (fine when building the container).

Important: ENTRYPOINT

It's important to create an ENTRYPOINT at the end of the Dockerfile to specify which file to run.

GPU Support

If you want to have GPU support (only for enterprise customers), you'll need cuda packages installed. If you export a learn block from the Studio these will already have the right base packages, so use that Dockerfile as a starting point.

Arguments to your script

The entrypoint (see above in the Dockerfile) will be called with these parameters:

--data-directory - where you can find the data (see below for the input/output formats).
--out-directory - where to write the TFLite or ONNX files (see below for the input/output formats).

Additionally, you can specify custom arguments (like the learning rate, or whether to use data augmentation) through the parameters section in the parameters.json file in your block. This file describes all arguments for your training pipeline, and is used to render custom UI elements for each parameter. For example, this parameters file:

{
    "version": 1,
    "type": "machine-learning",
    "info": {
        "name": "My learning block",
        "description": "Simple dense neural network classifier",
        "operatesOn": "other",
        "indRequiresGpu": false
    },
    "parameters": [{
        "name": "Learning rate #1",
        "value": 0.01,
        "type": "float",
        "help": "Learning rate for the first 10 epochs",
        "param": "learning-rate-1"
    }, {
        "name": "Learning rate #2",
        "value": 0.001,
        "type": "float",
        "help": "Fine-tuning learning rate",
        "param": "learning-rate-2"
    }]
}

Will be displayed as:

And passes in --learning-rate-1 0.01 --learning-rate-2 0.001 to your script. For more information, and all options see Adding parameters to custom blocks.

Parameters.json format

This is the specification for the parameters.json type:

type MachineLearningBlockParametersJson = {
    version: 1,
    type: 'machine-learning',
    info: {
        name: string,
        description: string,
        // API model: OrganizationTransferLearningOperatesOn
        operatesOn?: 'object_detection' | 'audio' | 'image' | 'regression' | 'other',
        // API model: ObjectDetectionLastLayer
        objectDetectionLastLayer?: 'mobilenet-ssd' | 'fomo' | 'yolov2-akida' | 'yolov5' | 'yolov5v5-drpai' | 'yolox' | 'yolov7' | 'tao-retinanet' | 'tao-ssd' | 'tao-yolov3' | 'tao-yolov4',
        // API model: ImageInputScaling
        imageInputScaling?: '0..1' | '-1..1' | '-128..127' | '0..255' | 'torch' | 'bgr-subtract-imagenet-mean',
        indRequiresGpu?: boolean,
        repositoryUrl?: string,
        // API model: OrganizationTransferLearningBlockCustomVariant
        customModelVariants?: {
            key: string,
            name: string,
            modelFiles: {
                id: string,
                name: string,
                type: 'binary' | 'json' | 'text',
                description: string,
            }[],
        }[];
        // API model: BlockDisplayCategory
        displayCategory?: 'classical' | 'tao',
    },
    // see spec in https://docs.edgeimpulse.com/docs/tips-and-tricks/adding-parameters-to-custom-blocks
    parameters: DSPParameterItem[];
};

Input format

The data directory contains your dataset, after running any DSP blocks, and already split in a train/validation set:

X_split_train.npy
Y_split_train.npy
X_split_test.npy
Y_split_train.npy

The X_*.npy files are float32 Numpy arrays, already in the right shape (e.g. if you're training on 96x96 RGB images this will be of shape (n, 96, 96, 3)). You can typically load these without any modification into your training pipeline (see the notes after this section for caveats).

The Y_*.npy files are either:

int32 Numpy arrays, with four columns (label_index, sample_id, sample_slice_start_ms, sample_slice_end_ms).
A JSON array in the form of: [{ "sampleId": 234731, "boundingBoxes": [{ "label": 1, "x": 260, "y": 313, "w": 234, "h": 261 }] } ]

2) is sent if your dataset has bounding boxes, in all other cases 1) is sent.

Data format for image projects

For image projects, we automatically normalize data before passing the data to the ML block. The X_*.npy values may then be rescaled based on the selected input scaling when building the custom ML block (details in the next section).

To get new data for your project, just run (requires Edge Impulse CLI v1.16 or higher):

edge-impulse-blocks runner --download-data data/

This regenerates features (if necessary) and then downloads the updated dataset.

Input features for vision models

The input features for vision models are a 4D vector of shape (n, WIDTH, HEIGHT, CHANNELS), where the channel data is in RGB format. We support three ways of scaling the input:

Pixels ranging 0..1 - just the raw pixels, without any normalization. Data coming from the Image DSP block is unchanged.
Pixels ranging 0..255 - just the raw pixels, without any normalization. Data coming from the Image DSP block is multiplied by 255.
PyTorch - the default way that inputs are scaled in most torchvision models, first it takes the raw pixels 0..1 then normalizes per-channel using the ImageNet mean and standard deviation:\
```
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
```

The input scaling is applied:

In the input features vector; so the inputs are already scaled correctly, no need to re-scale yourself. If you're converting the input features vector into images before training, as your training pipeline requires this, then make sure to un-normalize first.
When running inference, both in the Studio and on-device. So also, no need to re-scale yourself.

You can control the image input scaling when you create the block in the CLI (1.19.1 or higher), or by editing the block in the UI.

NCHW vs. NHWC

If you need data in channels-first (NCHW) mode, then you'll need to transpose the input feature vector yourself before training. You can still just write out an NCHW model, Edge Impulse supports both NHWC and NCHW models.

RGB vs BGR

Edge Impulse only supports RGB models. If you have a model that requires BGR input, rather than RGB input (e.g. Resnet50) you'll need to transpose the first and last channels.

In Keras you do this by adding a lambda layer. Example using Resnet50.
For PyTorch you do this by first converting the trained model to ONNX, then transposing using scc4onnx.

Output format

The training pipeline can output either TFLite files, ONNX files or pickled scikit-learn files:

If you output TFLite files

model.tflite - a TFLite file with float32 inputs and outputs.
model_quantized_int8_io.tflite - a quantized TFLite file with int8 inputs and outputs.
saved_model.zip - a TensorFlow saved model (optional).

At least one of the TFLite files is required.

If you output ONNX files

model.onnx - An ONNX file with float16 or float32 inputs and outputs.

We automatically convert this file to both unquantized and quantized TFLite files after training.

If you use scikit-learn

model.pkl - Pickled instance of the scikit-learn model. E.g.:

# instantiate model
clf = RandomForestClassifier()

# save to model.pkl
with open(os.path.join(args.out_directory, 'model.pkl'),'wb') as f:
    pickle.dump(clf, f)

Internally we use scikit-learn==1.3.2 for conversion, so pin to this scikit-learn version for best results. We also support LightGBM (3.3.5) and XGBOOST (1.7.6) models.

Hosting your custom block

Host your block directly within Edge Impulse with the Edge Impulse CLI:

$ edge-impulse-blocks init
$ edge-impulse-blocks push

To edit the block, go to:

Enterprise: go to your organization, Custom blocks > Machine learning.
Developers: click on your photo on the top right corner, select Custom blocks > Machine learning.

The block is now available from inside any of your Edge Impulse projects. Add it via Create impulse > Add a learning block.

Object detection output layers

Unfortunately object detection models typically don't have a standard way to go from neural network output layer to bounding boxes. Currently we support the following types of output layers (this list might be outdated, the ObjectDetectionLastLayer type in the API contains the latest):

MobileNet SSD
FOMO
YOLOv2 for BrainChip Akida
YOLOv5 (coordinates scaled 0..1)
YOLOv5 (coordinates in absolute values)
YOLOX
YOLOv7
NVIDIA TAO RetinaNet
NVIDIA TAO YOLOV3
NVIDIA TAO YOLOV4
NVIDIA TAO SSD

If you have an object detection model with a different output layer then please contact your user success engineer (enterprise) or let us know on the forums (free users) with an example on how to interpret the output, and we can add it.

Getting latency/memory information

When training locally you can use the profiling API to get latency, RAM and ROM estimates. This is very useful as you can immediately see whether your model will fit on device. Additionally, you can use this API as part your experiment tracking (f.e. in Weights & Biases or MLFlow) to wield out models that won't fit your latency or memory constraints.

The profiling API expects:

A TFLite file.
A reference device (for latency calculation) - you can get a list of all devices via getProjectInfo in the latencyDevices object.
A reference model (which model is closest to your architecture) - you can choose between gestures-large-f32, gestures-large-i8, image-32-32-mobilenet-f32, image-32-32-mobilenet-i8, image-96-96-mobilenet-f32, image-96-96-mobilenet-i8, image-320-320-mobilenet-ssd-f32, keywords-2d-f32, keywords-2d-i8. Make sure to use i8 models if you have quantized your model.

You can also use the Python SDK to profile your model easily. See here for an example on how to profile a model created in Keras.