Docker container

Impulses can be deployed as a Docker container. This packages all your signal processing blocks, configuration, and learning blocks into a single container and exposes an HTTP inference server. This works great if you have a gateway or cloud runtime that supports containerized workloads. The Docker container is built on top of the Linux EIM executable deployment option, and supports full hardware acceleration on most Linux targets.

Deploying your impulse as a Docker container

To deploy your impulse, head over to your trained Edge Impulse project and go to Deployment. Here, find "Docker container":

How you run this container depends on your gateway provider or cloud vendor, but typically the container image, arguments, and ports to expose are all you need. If you have questions, contact your solutions engineer (enterprise) or ask a question on the forum (community).
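If you just want the inference server to stay up on a Linux gateway you control, one minimal sketch is to run the container detached with a restart policy. The API key below is a placeholder; copy the full command from your project's Deployment page:

# Run the inference server in the background and restart it automatically after reboots or crashes
# (--api-key value is a placeholder - use the key shown on your Deployment page)
docker run -d --restart unless-stopped \
    -p 1337:1337 \
    public.ecr.aws/g7a8t7v6/inference-container:c94e7ccaca5d3e76e7ed6b046d7a5108b8762707 \
    --api-key <your-ei-api-key> \
    --run-http-server 1337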

To test this out locally on macOS or Linux, copy the text under "in a one-liner locally", open a terminal, and paste the command in:

$ docker run --rm -it \
>     -p 1337:1337 \
>     public.ecr.aws/g7a8t7v6/inference-container:c94e7ccaca5d3e76e7ed6b046d7a5108b8762707 \
>         --api-key ei_0d... \
>         --run-http-server 1337
Unable to find image 'public.ecr.aws/g7a8t7v6/inference-container:c94e7ccaca5d3e76e7ed6b046d7a5108b8762707' locally
c94e7ccaca5d3e76e7ed6b046d7a5108b8762707: Pulling from g7a8t7v6/inference-container
82d728d38b98: Already exists
59f33b6794af: Pull complete
...

Edge Impulse Linux runner v1.5.1

[RUN] Downloading model...
[BLD] Created build job with ID 15195010
...
[BLD] Building binary OK
[RUN] Downloading model OK
[RUN] Stored model version in /root/.ei-linux-runner/models/1/v231/model.eim
[RUN] Starting HTTP server for Edge Impulse Inc. / Continuous gestures demo (v231) on port 1337
[RUN] Parameters freq 62.5Hz window length 2000ms. classes [ 'drink', 'fistbump', 'idle', 'snake', 'updown', 'wave' ]
[RUN]
[RUN] HTTP Server now running at http://localhost:1337

This downloads the latest version of the Docker base image, builds your impulse for your current architecture, and then exposes the inference HTTP server. To view the inference server, go to http://localhost:1337.

The inference server exposes the following routes:

GET http://localhost:1337/api/info - returns a JSON object with information about the model and the inputs / outputs it expects.

POST http://localhost:1337/api/features - runs inference on raw sensor data. Expects a request with a JSON body containing a features array. You can find raw features on the Live classification page. Example call:

curl -v -X POST -H "Content-Type: application/json" -d '{"features": [5, 10, 15, 20]}' http://localhost:1337/api/features

POST http://localhost:1337/api/image - runs inference on an image. Only available for impulses that use an image input block. Expects a multipart/form-data request with a file object that contains a JPG or PNG image. Images that do not match the size expected by your impulse are resized using resize mode contain. Example call:

curl -v -X POST -F 'file=@path-to-an-image.jpg' http://localhost:1337/api/image
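The GET route needs no body; it's a quick way to check that the container is up and to inspect what the model expects (a minimal call, assuming the server runs on the default port):

# Fetch model information (inputs / outputs the impulse expects) from the running server
curl -s http://localhost:1337/api/info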

The result of the inference request depends on your model type. You can always see the raw output by using "Try out inferencing" in the inference server UI.

Classification / anomaly detection

Both anomaly and classification are optional, depending on the blocks included in your impulse.

{
    "result": {
        "anomaly": -0.18470126390457153,
        "classification": {
            "drink": 0.007849072106182575,
            "fistbump": 0.0008145281462930143,
            "idle": 0.00002064668842649553,
            "snake": 0.0002238723391201347,
            "updown": 0.0015580836916342378,
            "wave": 0.9895338416099548
        }
    },
    "timing": {
        "anomaly": 0,
        "classification": 0,
        "dsp": 0,
        "json": 0,
        "stdin": 0
    }
}
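If you're scripting against the server, one way to pull out the top class from this response is with jq (a sketch, assuming jq is installed):

# POST raw features and print the highest-scoring class with its probability
curl -s -X POST -H "Content-Type: application/json" \
    -d '{"features": [5, 10, 15, 20]}' \
    http://localhost:1337/api/features \
    | jq -r '.result.classification | to_entries | max_by(.value) | "\(.key): \(.value)"'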

Object detection

{
    "result": {
        "bounding_boxes": [
            {
                "height": 8,
                "label": "face",
                "value": 0.6704540252685547,
                "width": 8,
                "x": 48,
                "y": 40
            }
        ]
    },
    "timing": {
        "anomaly": 0,
        "classification": 1,
        "dsp": 0,
        "json": 1,
        "stdin": 1
    }
}
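Similarly, you could filter object detection results by confidence on the command line (again a sketch assuming jq is installed; the 0.5 threshold is arbitrary and something you'd tune for your application):

# POST an image and keep only bounding boxes with a confidence above 0.5
curl -s -X POST -F 'file=@path-to-an-image.jpg' http://localhost:1337/api/image \
    | jq '.result.bounding_boxes[] | select(.value > 0.5)'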

Running offline

When you run the container it'll use the Edge Impulse API to build and fetch your latest model version, so it requires internet access. Alternatively, you can download the EIM file (which contains your complete model) and mount it into the container instead - this removes the need for any internet access.

First, use the container to download the EIM file (here to a file called my-model.eim in your current working directory):

docker run --rm -it \
    -v $PWD:/data \
    public.ecr.aws/g7a8t7v6/inference-container:c94e7ccaca5d3e76e7ed6b046d7a5108b8762707 \
    --api-key ei_0de... \
    --download /data/my-model.eim

Note that the .eim file is hardware-specific, so if you run the download command on an Arm machine (like a MacBook M1) you cannot run that .eim file on an x86 gateway. To build for another architecture, run with --list-targets and follow the instructions.
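On Linux targets the downloaded .eim is a native executable, so if you're unsure which architecture a file was built for, a quick sanity check (assuming the standard file utility is available) is:

# Inspect the model binary; the output includes the target architecture,
# e.g. an x86-64 or ARM aarch64 ELF executable
file my-model.eim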

Then, when you run the container next, mount the .eim file back in (you can omit the API key now; it's no longer needed):

docker run --rm -it \
    -v $PWD:/data \
    -p 1337:1337 \
    public.ecr.aws/g7a8t7v6/inference-container:c94e7ccaca5d3e76e7ed6b046d7a5108b8762707 \
    --model-file /data/my-model.eim \
    --run-http-server 1337

Hardware acceleration

The Docker container is supported on x86 and aarch64 (64-bit Arm). When you run a model, we automatically detect your hardware architecture and compile in hardware-specific optimizations so the model runs as fast as possible on the CPU.

If your device has a GPU or NPU, we cannot automatically detect that from inside the container, so you'll need to manually override the target. To see a list of all available targets, add --list-targets when you run the container. It'll return something like:

Listing all available targets
-----------------------------
target: runner-linux-aarch64, name: Linux (AARCH64), supported engines: [tflite]
target: runner-linux-armv7, name: Linux (ARMv7), supported engines: [tflite]
target: runner-linux-x86_64, name: Linux (x86), supported engines: [tflite]
target: runner-linux-aarch64-akd1000, name: Linux (AARCH64 with AKD1000 MINI PCIe), supported engines: [akida]
# ...

You can force a target via "edge-impulse-linux-runner --force-target <target> [--force-engine <engine>]"

To then override the target, add --force-target <target>.

Note that you also need to forward the NPU or GPU to the Docker container to make this work - and this is not always supported. For example, for GPUs (like on an NVIDIA Jetson Nano development board):

docker run --gpus all \
    # rest of the command
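Putting this together, a complete invocation for a GPU-equipped board might look like the sketch below; <target> is a placeholder you'd replace with a value from the --list-targets output for your hardware, and the API key is a placeholder as well:

# Sketch: forward the GPU into the container and force a hardware-specific target
# (<target> and the API key are placeholders)
docker run --rm -it --gpus all \
    -p 1337:1337 \
    public.ecr.aws/g7a8t7v6/inference-container:c94e7ccaca5d3e76e7ed6b046d7a5108b8762707 \
    --api-key <your-ei-api-key> \
    --run-http-server 1337 \
    --force-target <target>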
