Docker container
Impulses can be deployed as a Docker container. This packages all your signal processing blocks, configuration and learning blocks into a container, and exposes an HTTP inference server. This works great if you have a gateway or cloud runtime that supports containerized workloads. The Docker container is built on top of the Linux EIM executable deployment option, and supports full hardware acceleration on most Linux targets.
To deploy your impulse, head over to your trained Edge Impulse project, and go to Deployment. Here find "Docker container":
How you run this container depends on your gateway provider or cloud vendor, but typically knowing the container, arguments and ports to expose should be enough. If you have questions, contact your solutions engineer (enterprise) or drop a question on the forum (community).
To test this out locally on macOS or Linux, copy the text under "in a one-liner locally", open a terminal, and paste the command in:
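The exact command is specific to your project (it contains your project API key and the current container image tag), so copy it from the Deployment page rather than retyping it. As a rough sketch, where the image name, API key and flags below are placeholders for the values the Studio gives you:

```bash
# Placeholder image name, API key and flags - copy the real one-liner from the Deployment page.
docker run --rm -it \
    -p 1337:1337 \
    <inference-container-image> \
        --api-key <your-project-api-key> \
        --run-http-server 1337
```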
This downloads the latest version of the Docker base image, builds your impulse for your current architecture, and then exposes the inference HTTP server. To view the inference server, go to http://localhost:1337.
The inference server exposes the following routes:
GET http://localhost:1337/api/info - returns a JSON object with information about the model, and the inputs / outputs it expects.
POST http://localhost:1337/api/features - run inference on raw sensor data. Expects a request with a JSON body containing a features array. You can find raw features on Live classification. Example call:
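For example, with curl (the features array below is a dummy placeholder; paste the raw features copied from Live classification, and make sure the number of values matches what your impulse expects):

```bash
# Dummy feature values - replace with the raw features copied from Live classification.
curl -X POST http://localhost:1337/api/features \
    -H "Content-Type: application/json" \
    -d '{"features": [2.1, -0.4, 1.3, 0.9]}'
```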
POST http://localhost:1337/api/image - run inference on an image. Only available for impulses that use an image as input. Expects a multipart/form-data request with a file object that contains a JPG or PNG image. Images that don't match the size your impulse expects are resized using resize mode "contain". Example call:
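For example, with curl (test-image.jpg is a hypothetical local file):

```bash
# 'file' is the multipart field name the server expects; test-image.jpg is any local JPG or PNG.
curl -X POST http://localhost:1337/api/image \
    -F 'file=@test-image.jpg'
```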
The result of the inference request depends on your model type. You can always see the raw output by using "Try out inferencing" in the inference server UI.
Both anomaly and classification are optional, depending on the blocks included in your impulse.
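As an illustration only, a response for a classification model with an anomaly block might be shaped roughly like this (the label names, values and exact fields here are made up; use "Try out inferencing" to see your model's actual output):

```json
{
    "result": {
        "classification": {
            "idle": 0.02,
            "wave": 0.98
        },
        "anomaly": 0.12
    },
    "timing": {
        "dsp": 2,
        "classification": 5,
        "anomaly": 1
    }
}
```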
When you run the container, it uses the Edge Impulse API to build and fetch your latest model version, so it requires internet access. Alternatively, you can download the EIM file (containing your complete model) and mount it into the container instead; this removes the need for any internet access.
First, use the container to download the EIM file (here to a file called my-model.eim in your current working directory):
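A sketch of what this can look like, assuming a --download flag and mounting your working directory into the container (the image name, API key and flag are assumptions; use the exact command from the Deployment page):

```bash
# Mount the current directory so the downloaded .eim file ends up on the host.
# Image name, API key and the --download flag are assumptions - copy the
# authoritative command from the Deployment page.
docker run --rm -it \
    -v "$PWD":/data \
    <inference-container-image> \
        --api-key <your-project-api-key> \
        --download /data/my-model.eim
```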
Note that the .eim file is hardware specific, so if you run the download command on an Arm machine (like a MacBook with an M1 chip) you cannot run the EIM file on an x86 gateway. To build for another architecture, run with --list-targets and follow the instructions.
Then, the next time you run the container, mount the EIM file back in (you can omit the API key; it's no longer needed):
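A sketch under the same assumptions (placeholder image name; a --model-file style flag pointing at the mounted model):

```bash
# Serve the previously downloaded my-model.eim without an API key.
# The --model-file flag is an assumption - check the Deployment page for the exact flag.
docker run --rm -it \
    -p 1337:1337 \
    -v "$PWD"/my-model.eim:/model.eim \
    <inference-container-image> \
        --model-file /model.eim \
        --run-http-server 1337
```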
The Docker container is supported on x86 and aarch64 (64-bits Arm). When you run a model we automatically detect your hardware architecture and compile in hardware-specific optimizations so the model runs as fast as possible on the CPU.
If your device has a GPU or NPU, we cannot automatically detect that from inside the container, so you'll need to manually override the target. To see a list of all available targets, add --list-targets when you run the container; it will print the targets available for your project.
To then override the target, add --force-target <target>.
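For example, appended to the run command from earlier (<target> is a placeholder for one of the names printed by --list-targets; the image name and API key are placeholders as before):

```bash
# Replace <target> with one of the names printed by --list-targets.
docker run --rm -it \
    -p 1337:1337 \
    <inference-container-image> \
        --api-key <your-project-api-key> \
        --run-http-server 1337 \
        --force-target <target>
```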
Note that you also need to forward the NPU or GPU to the Docker container to make this work, and this is not always supported. For example, for GPUs (like on an NVIDIA Jetson Nano development board):
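A sketch, assuming the NVIDIA container runtime is installed on the host and reusing the placeholder image name from before:

```bash
# --runtime nvidia forwards the Jetson's GPU into the container
# (requires the NVIDIA container runtime on the host).
docker run --rm -it \
    --runtime nvidia \
    -p 1337:1337 \
    <inference-container-image> \
        --api-key <your-project-api-key> \
        --run-http-server 1337 \
        --force-target <gpu-target>
```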