Edge Impulse Python SDK
The Edge Impulse Python SDK is a library to help you develop machine learning (ML) applications for edge and Internet of Things (IoT) devices. While the Edge Impulse Studio is a great interface for guiding you through the process of collecting data and training a model, the Python SDK allows you to programmatically Bring Your Own Model (BYOM), developed and trained on any platform.
Important documentation for the Python SDK:
Fastest way to get started:
See the Python SDK in action with
Use the new Python SDK to upload a trained model and utilize profiling and deployment processes. Get RAM, ROM, and inference time estimates on edge hardware, and then convert to embedded code (e.g. C++, libraries, full binaries). You can also now upload your trained models to Studio if you still prefer to work in our Studio environment. If a model or operation is not supported, the Studio or Python SDK will gracefully let you know.
Keep reading to try out BYOM yourself with the Python SDK!
The Python SDK consists of two main libraries:
The Python SDK, on the other hand, offers an easy-to-use interface to perform several common functions. For instance, you can use the SDK to profile your model, which estimates the RAM, ROM, and inference time when using your model on one of several hardware platforms. The SDK also lets you deploy your model easily, converting it from one of several formats to a C++ library (or other supported deployment format).
Install the Python SDK with:
To use the Python SDK, you need to first create a project in Edge Impulse and copy the API key. Once you have created the project, open it, navigate to Dashboard and click on the Keys tab to view your API keys. Double-click on the API key to highlight it, right-click, and select Copy.
Note that you do not need to use the project in Edge Impulse Studio itself. Only its API key is required.
From there, import the package and set the API key:
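A minimal sketch of this step, assuming the key is kept in an environment variable rather than pasted into source code. The variable name `EI_API_KEY` and the helper names `load_api_key` and `configure_sdk` are illustrative choices, not something the text above prescribes. `ei.API_KEY` is assumed to be the module attribute the SDK reads.

```python
import os


def load_api_key(env_var="EI_API_KEY"):
    """Read the project API key from the environment.

    The variable name is an assumption made for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set {env_var} to the API key copied from the Keys tab"
        )
    return key


def configure_sdk(api_key):
    """Import the SDK and hand it the key (requires the edgeimpulse package)."""
    # Imported lazily so that load_api_key() can be used without the SDK.
    import edgeimpulse as ei

    ei.API_KEY = api_key
    return ei
```

With the package installed, `ei = configure_sdk(load_api_key())` would give you a configured module to use in the later steps.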
The functions in the Python SDK can be used in your MLOps pipelines to help you develop edge ML models as well as automatically deploy your model to your target hardware.
The following input formats are supported:
You can pass a model (in one of the supported input formats) along with one of several possible hardware targets to the profile() function. This will send the model to your Edge Impulse project, where the RAM, ROM, and inference time will be estimated based on the target hardware.
To get the available hardware targets for profiling, run the following:
You should see a list printed such as:
A common option is 'cortex-m4f-80mhz', as this is a relatively low-power microcontroller family. From there, you can use the Edge Impulse Python SDK to generate a profile for your model to ensure it fits on your target hardware and meets your timing requirements.
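The steps above can be sketched as follows. The helper names `choose_device` and `profile_model` are hypothetical. The sketch assumes `ei.model.list_profile_devices()` returns a list of target name strings and that `ei.model.profile()` accepts `model` and `device` keyword arguments. Treat both as assumptions to check against the API reference.

```python
def choose_device(available, preferred="cortex-m4f-80mhz"):
    """Return the preferred target if it is offered, otherwise fail loudly."""
    if preferred not in available:
        raise ValueError(
            f"{preferred!r} is not offered; choose one of {sorted(available)}"
        )
    return preferred


def profile_model(model, preferred="cortex-m4f-80mhz"):
    """Profile `model` on the preferred target.

    Requires the edgeimpulse package and assumes ei.API_KEY is already set.
    """
    import edgeimpulse as ei  # lazy import keeps choose_device usable offline

    # Assumed to return a list of target name strings.
    device = choose_device(ei.model.list_profile_devices(), preferred)
    return ei.model.profile(model=model, device=device)
```

Checking the target name first gives a readable local error instead of a failed request when the name is mistyped.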
This will produce an output such as the following:
You can then parse the output from the response variable (resp) in your MLOps pipeline to determine if your model will fit within your hardware constraints. For example, the following will print out the RAM and ROM requirements along with the estimated inference time (ms) for the cortex-m4f-80mhz target (assuming you are using the float32 version of the model):
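One way to sketch this check in a pipeline is shown here. The attribute path `resp.model.profile_info.float32` (with `memory.tflite.ram`, `memory.tflite.rom`, and `time_per_inference_ms` beneath it) is an assumption about the response layout, so verify it against the response you actually receive. The helper names and the budget parameters are hypothetical.

```python
def float32_estimates(resp):
    """Pull RAM, ROM, and inference time for the float32 model out of `resp`.

    The attribute path used here is an assumed layout of the profile response.
    """
    info = resp.model.profile_info.float32
    return {
        "ram": info.memory.tflite.ram,
        "rom": info.memory.tflite.rom,
        "inference_ms": info.time_per_inference_ms,
    }


def fits_budget(estimates, max_ram, max_rom, max_inference_ms):
    """Return True only if every estimate is within its limit."""
    return (
        estimates["ram"] <= max_ram
        and estimates["rom"] <= max_rom
        and estimates["inference_ms"] <= max_inference_ms
    )
```

A pipeline step could call `fits_budget(float32_estimates(resp), ...)` with the limits of your target board, expressed in the same units the response uses, and stop the build when it returns `False`.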
The default option downloads a .zip file that holds a C++ library with the optimized inference runtime and your trained model. As long as you have a C++ compiler for your target hardware (and enough RAM and ROM), you can run inference!
The following will convert "my_model" (which might be a SavedModel directory) to a C++ library. Note that you need to specify the model type (Classification, in this case).
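A minimal sketch of such a call, under stated assumptions: the wrapper name `deploy_cpp_library` is hypothetical, and the imported `edgeimpulse` module is passed in as `ei` so the function can be exercised without network access. The keyword arguments `model`, `model_output_type`, `deploy_target`, and `output_directory` are the ones named on this page; check their exact behaviour in the API reference.

```python
def deploy_cpp_library(ei, model, output_directory="."):
    """Convert `model` to a C++ library packaged as a .zip file.

    `ei` is the imported edgeimpulse module with ei.API_KEY already set.
    """
    return ei.model.deploy(
        model=model,
        # The model type must be stated explicitly; classification here.
        model_output_type=ei.model.output_type.Classification(),
        deploy_target="zip",
        output_directory=output_directory,
    )
```

Typical use would be `import edgeimpulse as ei`, set `ei.API_KEY`, then `deploy_cpp_library(ei, "my_model")`.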
To get the full list of available hardware targets for deployment, run the following:
You should see a list printed such as:
You can pass your desired target into ei.model.deploy() using the deploy_target argument, for example deploy_target='zip'.
You can optionally quantize a model during deployment. A quantized model will use an internal int8 numeric representation rather than float32, which can result in reduced memory usage and faster computation on many targets.
You can pass the representative data sample via the representative_data_for_quantization argument:
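As an illustration of preparing such a sample, the sketch below draws a random subset of the training inputs and always keeps the rows that contain the global minimum and maximum, so the sample spans the full value range. It assumes the sample is a NumPy array and that casting to float32 suits your model; the function name `representative_sample` and its parameters are hypothetical.

```python
import numpy as np


def representative_sample(train_inputs, max_items=100, seed=0):
    """Pick a subset of training inputs that still spans their value range.

    Returns at most max_items + 2 rows, each with the shape of one model input.
    """
    data = np.asarray(train_inputs, dtype=np.float32)
    if len(data) == 0:
        raise ValueError("train_inputs must contain at least one example")

    rng = np.random.default_rng(seed)
    count = min(max_items, len(data))
    picked = rng.choice(len(data), size=count, replace=False)

    # Locate the rows holding the smallest and largest values overall.
    flat = data.reshape(len(data), -1)
    lowest = flat.min(axis=1).argmin()
    highest = flat.max(axis=1).argmax()

    # Merge the random picks with the two extreme rows, without duplicates.
    indices = np.unique(np.concatenate([picked, [lowest, highest]]))
    return data[indices]
```

The resulting array could then be passed as `representative_data_for_quantization=sample` in the `ei.model.deploy()` call, or written to disk with `np.save` if you prefer to pass a file path.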
Note that quantization is a form of lossy compression and may result in a reduction in model performance. It's important to evaluate your model after quantization to ensure it still performs well enough for your use case.
We offer the following tutorials to help you use the Edge Impulse Python SDK with a number of other machine-learning platforms:
You can upload models trained in a variety of frameworks, such as TensorFlow, PyTorch, or MATLAB, to the Studio or use our to script the profiling and deployment processes.
- Classes and functions built using the Python API bindings to make the process of profiling and deploying your models easier. You can view the API reference guide for the .
- Python wrappers for the that allow you to interact with projects programmatically (i.e. without needing to use the Studio graphical interface). You can view the API reference guide for the .
You can use the Python API bindings to control your account settings, create projects, add data, train models, deploy models, and so on. These functions offer the granular control found within the web API without needing to construct HTTP requests manually. You can read more about how to use the .
The Python SDK package is known as edgeimpulse and is . Note that when you install the edgeimpulse Python SDK package (e.g. with pip), the (known as the edgeimpulse_api package) will automatically be installed as a dependency.
(directory location or .zip of SavedModel directory)
(.lite or .tflite, or directly from memory)
(use to export a PyTorch model to ONNX)
You can check the limitations in the section.
You can even set up experiments (for example, see ) to see how changing the model architecture and adjusting hyperparameters affects the predicted memory and timing requirements.
Once you are ready to deploy your model, you can call the deploy() function to convert your model from one of the available input formats to one of the Edge Impulse supported outputs. Edge Impulse can output a number of possible for a wide variety of target hardware.
Your C++ library can be found in a .zip file in the current directory. If you do not specify output_directory, the file(s) will not be downloaded. Instead, you can use the return value of ei.model.deploy(), which is the file as a raw set of bytes. You can then write those bytes to a file of your choosing. See for a demonstration.
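Writing the returned bytes to disk might look like the sketch below. The helper name `save_deployment` and the default file name are hypothetical. Because the exact return type is not spelled out here, the helper accepts either a bytes-like value or an in-memory buffer that exposes `getvalue()` (such as `io.BytesIO`).

```python
from pathlib import Path


def save_deployment(payload, path="my_model_cpp.zip"):
    """Write the value returned by ei.model.deploy() to `path`.

    Accepts bytes-like data or a buffer object with a getvalue() method.
    """
    if hasattr(payload, "getvalue"):
        data = payload.getvalue()
    else:
        data = bytes(payload)

    target = Path(path)
    # Create missing parent directories so any destination can be used.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target
```

For example, `save_deployment(ei.model.deploy(...), "build/model.zip")` would store the archive under a `build` directory.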
You can read more about using the C++ library for inference .
Important! The deployment targets list will change depending on the values provided for model, model_output_type, and model_input_type in the next part. For example, you will not see openmv listed once you upload a model (e.g. using .profile() or .deploy()) if model_input_type is not set to ei.model.input_type.ImageInput(). If you attempt to deploy to an unavailable target, you will receive the error "Could not deploy: deploy_target: ...". If model_input_type is not provided, it will default to . See for more information about input types.
Quantization requires a sample of data that is representative of the range (maximum and minimum) of values in your training data. It should either be an in-memory , or the path to a . Each element of the array must have the same shape as your model's input.