
Deployment

After training and validating your model, you can deploy it to any device. Running the model on the device means it works without an internet connection, with minimal latency and minimal power consumption.

The Deployment page offers a variety of deploy options to choose from, depending on your target device. Whether or not you are using a fully supported development board, Edge Impulse provides a C++ library deploy option that you can use to run your model on any target (as long as the target has enough compute to handle the task).

The following are the 5 main categories of deploy options currently supported by Edge Impulse:

Deploy as a customizable library
Deploy as a pre-built firmware - for fully supported development boards
Run directly on your phone or computer
Use Edge Impulse for Linux for Linux targets
Create a custom deployment block (Enterprise feature)

Deploying as a customizable library

This deploy option turns your impulse into fully optimized source code that can be further customized and integrated with your application. This option supports the following libraries:

Arduino Library

You can run your impulse locally as an Arduino library. This packages all of your signal processing blocks, configuration and learning blocks up into a single package.

To deploy as an Arduino library, select Arduino library on the Deployment page and click Build to create the library. Download the .ZIP file, import it into the Arduino IDE as a library, then run your application.

For a full tutorial on how to run your impulse locally as an Arduino library, have a look at Running your impulse locally - Arduino.

C++ Library

You can run your Impulse as a C++ library. This packages all of your signal processing blocks, configuration and learning blocks up into a single package that can be easily ported to your custom applications.

Visit Running your impulse locally for a deep dive on how to deploy your impulse as a C++ library.

Cube.MX CMSIS-PACK library

If you want to deploy your impulse to an STM32 MCU, you can use the Cube.MX CMSIS-PACK. This packages all your signal processing blocks, configuration and learning blocks up into a single package. You can include this package in any STM32 project with a single function call.

Have a look at Running your impulse locally - using CubeAI for a deep dive on how to deploy your impulse on STM32 based targets using the Cube.MX CMSIS-PACK.

WebAssembly Library

When you want to deploy your impulse to a web app, you can use the WebAssembly library. This packages all your signal processing blocks, configuration and learning blocks up into a single package that can run without any compilation.

Have a look at Running your impulse locally - through WebAssembly (Browser) for a deep dive on how you can run your impulse to classify sensor data in your Node.js application.

Deploy as a pre-built firmware

For this option, you can use a ready-to-go binary for your development board that bundles signal processing blocks, configuration and learning blocks up into a single package. This option is currently only available for fully supported development boards as shown in the image below:

To deploy your model using a ready-to-go binary, select your target device and click Build. Flash the downloaded firmware to your device, then run the following command:

edge-impulse-run-impulse

The impulse runner shows the results of your impulse running on your development board. This only applies to ready-to-go binaries built from the studio.

Edge Impulse for Linux

If you are developing for Linux based devices, you can use Edge Impulse for Linux for deployment. It contains tools which let you collect data from any microphone or camera, can be used with the Node.js, Python, Go and C++ SDKs to collect new data from any sensor, and can run impulses with full hardware acceleration - with easy integration points to write your own applications.

For a deep dive on how to deploy your impulse to Linux targets using Edge Impulse for Linux, you can visit the Edge Impulse for Linux tutorial.

Deploy to your mobile phone/computer

You can run your impulse directly on your computer or mobile phone without installing an additional app. To run on your computer, select 'Computer' then click 'Switch to classification mode'. To run on your mobile phone, select 'Mobile Phone', scan the QR code, and click 'Switch to classification mode'.

Optimizations

Enabling EON Compiler

When building your impulse for deployment, Edge Impulse gives you the option of adding another layer of optimization to your impulse using the EON compiler. The EON Compiler lets you run neural networks in 25-55% less RAM, and up to 35% less flash, while retaining the same accuracy, compared to TensorFlow Lite for Microcontrollers.
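To get a feel for what these percentages mean for a given model, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the `eon_estimate` helper is hypothetical, it simply applies the 25-55% RAM and up-to-35% flash figures quoted above to a baseline you supply, and actual savings depend on your model. The Studio's own estimates remain the reference.

```python
def eon_estimate(baseline_ram_bytes, baseline_flash_bytes):
    """Apply the quoted EON savings to baseline (TFLite Micro) numbers.

    Hypothetical helper for illustration only; integer math avoids
    floating point rounding noise.
    """
    return {
        # 55% less RAM (best case) .. 25% less RAM (worst case)
        'ram_bytes_range': (baseline_ram_bytes * 45 // 100,
                            baseline_ram_bytes * 75 // 100),
        # "up to 35% less flash": lower bound on the flash footprint
        'flash_bytes_at_best': baseline_flash_bytes * 65 // 100,
    }
```

For example, a baseline of 100,000 bytes of RAM would land somewhere between 45,000 and 75,000 bytes under these figures.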

To activate the EON Compiler, select your preferred deployment option, enable the Enable EON™ Compiler setting, and click 'Build' to build your impulse for deployment.

To preview how your impulse will use the compute resources of your target device, Edge Impulse also estimates the latency, flash, and RAM your impulse will consume, even before you deploy it locally. This can save a lot of engineering time otherwise spent on repeated iterations and experiments.

You can also select whether to run the unquantized float32 or the quantized int8 models as shown in the image below.

The above confusion matrix is based only on the test data, to help you know how your model performs on unseen real world data. It can also show whether your model has overfit on your training data, which is a common occurrence.

Building deployment blocks

One of the most powerful features in Edge Impulse is the set of built-in deployment targets (under Deployment in the Studio), which let you create ready-to-go binaries for development boards, or custom libraries for a wide variety of targets that incorporate your trained impulse. You can also create custom deployment blocks for your organization. This lets developers quickly iterate on products without getting your embedded engineers involved, lets your customers build personalized firmware using their own data, or lets you create custom libraries.

In this tutorial you'll learn how to use custom deployment blocks to create a new deployment target, and how to make this target available in the Studio for all users in the organization.

Only available for enterprise customers

Organizational features are only available for enterprise customers. View our pricing for more information.

Prerequisites

You'll need:

The Edge Impulse CLI.
- If you receive any warnings that's fine. Run edge-impulse-blocks afterwards to verify that the CLI was installed correctly.

Deployment blocks use Docker containers, a virtualization technique which lets developers package up an application with all dependencies in a single package. If you want to test your blocks locally you'll also need (this is not a requirement):

Docker desktop installed on your machine.

Then, create a new folder on your computer named custom-deploy-block.

1. Getting basic deployment info

When a user deploys with a custom deployment block two things happen:

A package is created that contains information about the deployment (like the sensors used, frequency of the data, etc.), any trained neural network in .tflite and SavedModel formats, the Edge Impulse SDK, and all DSP and ML blocks as C++ code.
This package is then consumed by the custom deployment block, which can incorporate it with a base firmware, or repackage it into a new library.

To obtain this package go to your project's Dashboard, look for Administrative zone, enable Custom deploys, and click Save.

If you now go to the Deployment page, a new option appears under 'Create library':

Once you click Build you'll receive a ZIP file containing six items:

deployment-metadata.json - this contains all information about the deployment, like the names of all classes, the frequency of the data, full impulse configuration, and quantization parameters. A specification can be found here: Deployment metadata spec.
trained.tflite - if you have a neural network in the project this contains the neural network in .tflite format. This network is already fully quantized if you choose the int8 optimization, otherwise this is the float32 model.
trained.savedmodel.zip - if you have a neural network in the project this contains the full TensorFlow SavedModel. Note that we might update the TensorFlow version used to train these networks at any time, so rely on the compiled model or the TFLite file where possible.
edge-impulse-sdk - a copy of the latest Inferencing SDK.
model-parameters - impulse and block configuration in C++ format. Can be used by the SDK to quickly run your impulse.
tflite-model - neural network as source code in a way that can be used by the SDK to quickly run your impulse.

Store the unzipped contents under custom-deploy-block/input.
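As a quick sanity check of the downloaded package, the metadata can be inspected with a few lines of Python. This is a minimal sketch: the field names (classes, sensor, frequency, samplesPerInference, axesCount) come from the Deployment metadata spec, while the `summarize_metadata` and `load_metadata` helpers and the default path are hypothetical and only assume the folder layout used in this tutorial.

```python
import json


def summarize_metadata(metadata):
    """Return a short summary of a parsed deployment-metadata.json dict."""
    samples = metadata['samplesPerInference']
    axes = metadata['axesCount']
    frequency = metadata['frequency']
    return {
        'classes': metadata['classes'],
        'sensor': metadata.get('sensor'),
        # window length in seconds = samples per inference / sampling frequency
        'window_seconds': samples / frequency if frequency else None,
        # total raw values per inference = samples x axes
        'raw_values_per_inference': samples * axes,
    }


def load_metadata(path='input/deployment-metadata.json'):
    """Load the metadata file (the default path is an assumption)."""
    with open(path) as f:
        return json.load(f)
```

Calling `summarize_metadata(load_metadata())` from the custom-deploy-block folder should then print the classes and window size of your own project. For the 100Hz, 3-axis, 2-second example from the spec this gives a 2.0 second window and 600 raw values.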

2. Building a new binary

With the basic information in place we can create a new deployment block. Here we'll build a standalone application that runs our impulse on Linux, very useful when running your impulse on a gateway or desktop computer. First, open a command prompt or terminal window, navigate to the custom-deploy-block folder (that you created under 1.), and run:

$ edge-impulse-blocks init

This will prompt you to log in, and enter the details for your block.

Next, we'll add the application. The base application can be found at edgeimpulse/example-standalone-inferencing.

Download the base application.
Unzip under custom-deploy-block/app.

To build this application we need to combine the application with the edge-impulse-sdk, model-parameters and tflite-model folder, and invoke the (already included) Makefile.

2.1 Creating a build script

To build the application we use Docker, a virtualization technique which lets developers package up an application with all dependencies in a single package. In this container we'll place the build tools required for this application, and scripts to combine the trained impulse with the base application.

First, let's create a small build script. As a parameter you'll receive --metadata which points to the deployment information. In here you'll also get information on the input and output folders where you need to read from and write to.

Create a new file called custom-deploy-block/build.py and add:

build.py

import argparse, json, os, shutil, zipfile, threading

# parse arguments (--metadata FILE is passed in)
parser = argparse.ArgumentParser(description='Custom deploy block demo')
parser.add_argument('--metadata', type=str)
args = parser.parse_args()

# load the metadata.json file
with open(args.metadata) as f:
    metadata = json.load(f)

# now we have two folders 'metadata.folders.input' - this is where all the SDKs etc are,
# and 'metadata.folders.output' - this is where we need to write our output
input_dir = metadata['folders']['input']
app_dir = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'app')
output_dir = metadata['folders']['output']

print('Copying files to build directory...')

is_copying = True
def print_copy_progress():
    if (is_copying):
        threading.Timer(2.0, print_copy_progress).start()
        print("Still copying...")
print_copy_progress()

# create a build directory, the input / output folders are on network storage so might be very slow
build_dir = '/tmp/build'
if os.path.exists(build_dir):
    shutil.rmtree(build_dir)
os.makedirs(build_dir)

# copy in the data from both 'input' and 'app' folders
os.system('cp -r ' + input_dir + '/* ' + build_dir)
os.system('cp -r ' + app_dir + '/* ' + build_dir)

is_copying = False

print('Copying files to build directory OK')
print('')

print('Compiling application...')

is_compiling = True
def print_compile_progress():
    if (is_compiling):
        threading.Timer(2.0, print_compile_progress).start()
        print("Still compiling...")
print_compile_progress()

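# then invoke Make, and stop if compilation fails
os.chdir(build_dir)
exit_code = os.system('make -f Makefile.tflite')

is_compiling = False

if exit_code != 0:
    print('Compiling application failed')
    raise SystemExit(1)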

print('Compiling application OK')

# ZIP the build folder up, and copy to output dir
if not os.path.exists(output_dir):
    os.makedirs(output_dir)
shutil.make_archive(os.path.join(output_dir, 'deploy'), 'zip', os.path.join(build_dir, 'build'))

Next, we need to create a Dockerfile, which contains all dependencies for the build. These include GNU Make, a compiler, and both the build script and the base application.

Create a new file called custom-deploy-block/Dockerfile and add:

Dockerfile

FROM ubuntu:18.04

WORKDIR /ei

# Install base dependencies
RUN apt update && apt install -y build-essential software-properties-common wget

# Install LLVM 9
RUN wget https://apt.llvm.org/llvm.sh && chmod +x llvm.sh && ./llvm.sh 9
RUN rm /usr/bin/gcc && rm /usr/bin/g++ && ln -s $(which clang-9) /usr/bin/gcc && ln -s $(which clang++-9) /usr/bin/g++

# Install Python 3.7
RUN apt install -y python3.7

# Copy the base application in
COPY app ./app

# Copy any scripts in that we have
COPY *.py ./

# This is the script our application should run (-u to disable buffering)
ENTRYPOINT [ "python3", "-u", "build.py" ]

2.2 Testing the build script with Docker

To test the build script we first build the container, then invoke it with the files from the input directory. Open a command prompt or terminal, navigate to the custom-deploy-block folder and:

Build the container:

$ docker build -t cdb-demo .

Invoke the build script - this mounts the current directory in the container under /home, and then passes the downloaded metadata file to the container:

$ docker run --rm -it -v $PWD:/home cdb-demo --metadata /home/input/deployment-metadata.json

Voila. You now have an output folder which contains a ZIP file. Unzip output/deploy.zip and now you have a standalone application which runs your impulse. If you run Linux you can invoke this application directly (grab some data from 'Live classification' for the features, see Running your impulse locally):

$ ./output/edge-impulse-standalone "RAW FEATURES HERE"

Or if you run Windows or macOS, you can use Docker to run this application:

$ docker run --rm -v $PWD/output:/home ubuntu:18.04 /home/edge-impulse-standalone "RAW FEATURES HERE"
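If you want to call the application from a script rather than by hand, something along these lines may help. It is a sketch under assumptions: it assumes the raw features are the comma-separated list copied from 'Live classification' and that the binary takes them as a single argument, as in the commands above. The helper names `format_features` and `run_standalone` are hypothetical, and the format of the application's output is not assumed here.

```python
import subprocess


def format_features(values):
    """Join raw feature values into one comma-separated string."""
    return ', '.join(str(v) for v in values)


def run_standalone(binary_path, values):
    """Run the standalone binary with the features as a single argument.

    Returns whatever the application prints to stdout; raises
    subprocess.CalledProcessError if it exits with a non-zero status.
    """
    result = subprocess.run(
        [binary_path, format_features(values)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On Linux you could then call `run_standalone('./output/edge-impulse-standalone', features)` with your own list of feature values.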

3. Uploading the deployment block to Edge Impulse

With the deployment block ready you can make it available in Edge Impulse. Open a command prompt or terminal window, navigate to the folder you created earlier, and run:

$ edge-impulse-blocks push

This packages up your folder and sends it to Edge Impulse, where it is built and then added to your organization. The deployment block is now available in Edge Impulse under Deployment blocks. You can go here to set the logo, update the description, and set extra command line parameters.

Privileged mode

Deployment blocks do not have access to the internet by default. If you need this, or if you need to pull additional information from the project (e.g. access to DSP blocks) you can set the 'privileged' flag on a deployment block. This will enable outside internet access, and will pass in the project.apiKey parameter in the metadata (if a development API key is set) that you can use to authenticate with the Edge Impulse API.
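Inside a privileged block, the build script could pick the key up from the metadata roughly like this. This is a sketch, not a definitive implementation: the `project.apiKey` field is taken from the metadata spec, but the helper name is hypothetical and the use of an `x-api-key` request header is an assumption you should verify against the Edge Impulse API documentation.

```python
def api_headers_from_metadata(metadata):
    """Build request headers from the metadata, or return None without a key."""
    api_key = metadata.get('project', {}).get('apiKey')
    if not api_key:
        # Block is not privileged, or no development API key is set
        return None
    # Assumption: the API accepts the key in an 'x-api-key' header
    return {'x-api-key': api_key, 'Accept': 'application/json'}
```

Returning None when the key is missing lets the same build script run for both privileged and non-privileged blocks.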

4. Using the deployment block

The deployment block is automatically available for all organizational projects. Go to the Deployment page on a project, and you'll find a new section 'Custom targets'. Select your new deployment target and click Build.

And now you'll have a freshly built binary from your own deployment block!

5. Conclusion

Custom deployment blocks are a powerful tool for your organization. They let you build binaries for unreleased products, let you package up impulses as custom libraries, or let your customers deploy to private targets (if you add an external collaborator to a project they'll have access to the blocks as well). Because the deployment blocks are integrated with your project and hosted by Edge Impulse, everyone, from FAE to R&D developer, can iterate on on-device models without getting your embedded engineers involved.

You can also use custom deployment blocks with the other organizational features, and can use this to set up powerful pipelines automating data ingestion from your cloud services, transforming raw data into ML-suitable data, training new impulses and then deploying back to your device - either through the UI, or via the API. If you're interested in deployment blocks or any of the other enterprise features, let us know!

Deployment metadata spec

This is the specification for the deployment-metadata.json file from Building deployment blocks.

export interface DeploymentMetadataV1 {
    version: 1;
    // Global deployment counter
    deployCounter: number;
    // The output classes (for classification)
    classes: string[];
    // The number of samples to be taken per inference (e.g. 100Hz data, 3 axis, 2 seconds => 200)
    samplesPerInference: number;
    // Number of axes (e.g. 100Hz data, 3 axis, 2 seconds => 3)
    axesCount: number;
    // Frequency of the data
    frequency: number;
    // TFLite models (already converted and quantized)
    tfliteModels: {
        // Information about the model type, e.g. quantization parameters
        details: KerasModelIODetails;
        // Name of the input tensor
        inputTensor: string;
        // Name of the output tensor
        outputTensor: string;
        // Path of the model on disk
        modelPath: string;
        // Calculated arena size when running TFLite in interpreter mode
        arenaSize: number;
        // Number of values to be passed into the model
        inputFrameSize: number;
    }[];
    // Project information
    project: {
        name: string;
        // API key, only set for deploy blocks with privileged flag and development keys set
        apiKey: string | undefined;
    };
    // Impulse information
    impulse: DeploymentMetadataImpulse;
    // Sensor guess based on the input
    sensor: 'camera' | 'microphone' | 'accelerometer' | undefined;
    // Folder locations
    folders: {
        // Input files are here, the input folder contains 'edge-impulse-sdk', 'model-parameters', 'tflite-model'
        input: string;
        // Write your output file here
        output: string;
    };
}

export type ResizeEnum = 'squash' | 'fit-short' | 'fit-long' | 'crop';
export type CropAnchorEnum = 'top-left' | 'top-center' | 'top-right' |
                             'middle-left' | 'middle-center' | 'middle-right' |
                             'bottom-left' | 'bottom-center' | 'bottom-right';

export interface CreateImpulseStateInput {
    id: number;
    type: 'time-series' | 'image';
    name: string;
    title: string;
    windowSizeMs?: number;
    windowIncreaseMs?: number;
    imageWidth?: number;
    imageHeight?: number;
    resizeMode?: ResizeEnum;
    cropAnchor?: CropAnchorEnum;
}

export interface CreateImpulseStateDsp {
    id: number;
    type: string | 'custom';
    name: string;
    axes: string[];
    title: string;
    customUrl?: string;
}

export interface CreateImpulseStateLearning {
    id: number;
    type: string;
    name: string;
    dsp: number[];
    title: string;
}

export interface CreateImpulseState {
    inputBlocks: CreateImpulseStateInput[];
    dspBlocks: CreateImpulseStateDsp[];
    learnBlocks: CreateImpulseStateLearning[];
}

export interface DSPConfig {
    options: { [k: string ]: string | number | boolean };
}

export type DSPFeatureMetadataOutput = {
    type: 'image',
    shape: { width: number, height: number, channels: number }
} | {
    type: 'spectrogram',
    shape: { width: number, height: number }
} | {
    type: 'flat',
    shape: { width: number }
};

export interface DSPFeatureMetadata {
    created: Date;
    dspConfig: DSPConfig;
    labels: string[];   // the training labels
    featureLabels: string[];
    valuesPerAxis: number;
    windowCount: number;
    windowSizeMs: number;
    windowIncreaseMs: number;
    frequency: number;
    includedSamples: { id: number, windowCount: number }[];
    outputConfig: DSPFeatureMetadataOutput;
}

/**
 * Information necessary to quantize or dequantize the contents of a tensor
 */
export type KerasModelTensorDetails = {
    dataType: 'float32'
} | {
    dataType: 'int8';
    // Scale and zero point are used only for quantized tensors
    quantizationScale?: number;
    quantizationZeroPoint?: number;
};

export type KerasModelTypeEnum = 'int8' | 'float32' | 'requiresRetrain';

/**
 * Information required to process a model's input and output data
 */
export interface KerasModelIODetails {
    modelType: KerasModelTypeEnum;
    inputs: KerasModelTensorDetails[];
    outputs: KerasModelTensorDetails[];
}

export interface DeploymentMetadataImpulse {
    inputBlocks: CreateImpulseStateInput[];
    dspBlocks: (CreateImpulseStateDsp & { metadata: DSPFeatureMetadata | undefined })[];
    learnBlocks: CreateImpulseStateLearning[];
}
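The quantizationScale and quantizationZeroPoint fields in KerasModelTensorDetails are typically used to convert between float values and int8 tensor values. The Python sketch below shows the usual affine quantization scheme (q = round(x / scale) + zero_point, clamped to the int8 range). It is an illustration under that assumption rather than the SDK's own code, and the function names are hypothetical.

```python
def quantize(value, scale, zero_point):
    """Convert a float to an int8 value using scale and zero point."""
    # Note: Python's round() rounds ties to even, which may differ
    # from other runtimes on exact .5 values.
    q = round(value / scale) + zero_point
    return max(-128, min(127, q))


def dequantize(q, scale, zero_point):
    """Convert an int8 tensor value back to a float."""
    return (q - zero_point) * scale


def quantize_for_tensor(values, tensor):
    """Quantize values for one KerasModelTensorDetails entry (parsed JSON dict).

    float32 tensors are passed through unchanged.
    """
    if tensor['dataType'] != 'int8':
        return list(values)
    scale = tensor['quantizationScale']
    zero_point = tensor['quantizationZeroPoint']
    return [quantize(v, scale, zero_point) for v in values]
```

With a scale of 0.25 and a zero point of -10, for instance, the float 0.5 maps to -8, and dequantizing -8 gives 0.5 back.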