Welcome to Edge Impulse! When we started Edge Impulse, we initially focused on developing a suite of engineering tools designed to empower embedded engineers to harness the power of machine learning on edge devices. As we grew, we also started to develop advanced tools for ML practitioners to ease the collaboration between teams in organizations.
In this getting started guide, we'll walk you through the essential steps to dive into Edge Impulse and leverage it for your embedded projects.
Embedded systems are becoming increasingly intelligent, and Edge Impulse is here to streamline the integration of machine learning into your hardware projects. Here's why embedded engineers are turning to Edge Impulse:
Extend hardware capabilities: Edge Impulse can extend hardware capabilities by enabling the integration of machine learning models, allowing edge devices to process complex tasks, recognize patterns, and make intelligent decisions that are complex to develop using rule-based algorithms.
Open-source export formats: Exported models and libraries contain both digital signal processing code and machine learning models, giving you full explainability of the code.
Powerful integrations: Edge Impulse provides complete and documented integrations with various hardware platforms, allowing you to focus on the application logic rather than the intricacies of machine learning.
Support for diverse sensors: Whether you're working with accelerometers, microphones, cameras, or custom sensors, Edge Impulse accommodates a wide range of data sources for your projects.
Predict on-device performance: Models trained in Edge Impulse run directly on your edge devices, ensuring real-time decision-making with minimal latency. We provide tools to ensure the DSP and models developed with Edge Impulse can fit your device constraints.
Device-aware optimization: You have full control over model optimization, enabling you to tailor your machine-learning models to the specific requirements and constraints of your embedded systems. Our EON tuner can help you select the best model by training many different variants of models only from an existing dataset and your device constraints!
Ready to embark on your journey with Edge Impulse? Follow these essential steps to get started:
Start by creating your Edge Impulse account. Registration is straightforward, granting you immediate access to the comprehensive suite of tools and resources.
Upon logging in, create your first project. Select a name that reflects your project's objectives. If you already know which hardware target or system architecture you will be using, you can set it up directly in the dashboard's project info section. This helps you make sure your model fits your device constraints.
We offer various methods to collect data from your sensors or to import datasets (see Data acquisition for all methods). For the officially supported hardware targets, we provide binaries or simple steps to attach your device to Edge Impulse Studio and collect data from the Studio. However, as an embedded engineer, you might want to collect data from sensors that are not necessarily available on these devices. To do so, you can use the Data forwarder and print out your sensor values over serial (up to 8kHz) or use our C Ingestion SDK, a portable header-only library (designed to reliably store sampled data from sensors at a high frequency in very little memory).
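As a rough illustration of the "print your sensor values over serial" approach, the sketch below formats one multi-axis reading as a single text line and computes the delay needed to hold a constant sampling rate. The comma-separated, one-reading-per-line layout is an assumption here, and `format_sample` and `sample_interval_s` are hypothetical helper names; check the Data forwarder documentation for the exact format it expects.

```python
def format_sample(values, precision=4):
    """Format one multi-axis sensor reading as a single serial line.

    Assumption: one reading per line, axis values separated by commas.
    """
    return ",".join(f"{v:.{precision}f}" for v in values)


def sample_interval_s(frequency_hz):
    """Seconds to wait between lines to hold a constant sampling rate."""
    if frequency_hz <= 0:
        raise ValueError("frequency_hz must be positive")
    return 1.0 / frequency_hz


# Hypothetical accelerometer reading (x, y, z) in m/s^2
line = format_sample([-0.12, 9.81, 0.03], precision=2)
interval = sample_interval_s(100)  # 100 Hz -> 10 ms between lines
print(line)
```

On a real device the same idea applies in firmware: print one line per reading, then wait for the sampling interval before printing the next one.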
Edge Impulse offers an intuitive model training process through processing blocks and learning blocks. You don't need to write Python code to train your model; the platform guides you through feature extraction, model creation, and training. Customize and fine-tune your blocks for optimal performance on your hardware. Each block will provide on-device performance information showing you the estimated RAM, flash, and latency.
This is where the fun starts: you can easily export your model as ready-to-flash binaries for all the officially supported hardware targets. This method lets you test your model on real hardware very quickly.
In addition, we also provide a wide variety of export methods to easily integrate your model with your application logic. See C++ library to run your model on any device that supports C++ or our guides for Arduino library, Cube.MX CMSIS-PACK, DRP-AI library, OpenMV library, Ethos-U library, Meta TF model, Simplicity Studio Component, Tensai Flow library, TensorRT library, TIDL-RT library, etc...
The C++ inferencing library is a portable library for digital signal processing and machine learning inferencing, and it contains native implementations for both processing and learning blocks in Edge Impulse. It is written in C++11 with all dependencies bundled and can be built on both desktop systems and microcontrollers. See Inferencing SDK documentation.
Building Edge AI solutions is an iterative process. Feel free to try our organization hub to automate your machine-learning pipelines, collaborate with your colleagues, and create custom blocks.
If you want to get familiar with the full end-to-end flow, please have a look at our end-to-end tutorials on continuous motion recognition, responding to your voice, recognizing sounds from audio, adding sight to your sensors, or object detection.
In the advanced inferencing tutorials section, you will discover useful techniques to leverage our inferencing libraries or how you can use the inference results in your application logic:
Edge Impulse offers a thriving community of embedded engineers, developers, and experts. Connect with like-minded professionals, share your knowledge, and collaborate to enhance your embedded machine-learning projects.
Now that you have a roadmap, it's time to explore Edge Impulse and discover the exciting possibilities of embedded machine learning. Let's get started!
The enterprise version of Edge Impulse offers team collaboration in organizations. Try it out with our enterprise free trial. To collaborate on your projects, go to Dashboard, find the Collaborators section, and click the '+' icon.
You can also create a public version of your Edge Impulse project. This makes your project available to the whole world - including your data, your impulse design, your models, and all intermediate information - and can easily be cloned by anyone in the community. To do so, go to Dashboard, and click Make this project public.
The minimum hardware requirements for the embedded device depend on the use case: anything from a Cortex-M0+ for vibration analysis to a Cortex-M4F for audio, a Cortex-M7 for image classification, or a Cortex-A for object detection in video. View our inference performance metrics for more details.
We use a wide variety of tools, depending on the machine learning model. For neural networks we typically use TensorFlow and Keras, for object detection models we use TensorFlow with Google's Object Detection API, and for 'classic' non-neural network machine learning algorithms we mainly use sklearn. For neural networks you can see (and modify) the Keras code by clicking ⋮, and selecting Switch to expert mode.
Another big part of Edge Impulse is the processing blocks, as they clean up the data and extract important features from it before passing it to a machine learning model. The source code for these processing blocks can be found on GitHub: edgeimpulse/processing-blocks (and you can build your own processing blocks as well).
It depends on the hardware.
For general-purpose MCUs we typically use EON Compiler with TFLite Micro kernels (including hardware optimization, e.g. via CMSIS-NN, ESP-NN).
On Linux, if you run the Impulse on CPU, we use TensorFlow Lite.
For accelerators we use a wide variety of other runtimes, e.g. hardcoded network in silicon for Syntiant, custom SNN-based inference engine for Brainchip Akida, DRP-AI for Renesas RZV2L, etc...
The EON Compiler compiles your neural networks to C++ source code, which then gets compiled into your application. This is great if you need the lowest RAM and ROM possible (EON typically uses 30-50% less memory than TensorFlow Lite) but you also lose some flexibility to update your neural networks in the field - as it is now part of your firmware.
By disabling EON we place the full neural network (architecture and weights) into ROM, and load it on demand. This increases memory usage, but you could just update this section of the ROM (or place the neural network in external flash, or on an SD card) to make it easier to update.
Yes you can! Check out our documentation on Bringing your own model (BYOM) into your Edge Impulse project, and using the Edge Impulse Python SDK!
Edge Impulse uses UMAP (a dimensionality reduction algorithm) to project high dimensionality input data into a 3 dimensional space. This even works for extremely high dimensionality data such as images.
Yes. The enterprise version of Edge Impulse can integrate directly with your cloud service to access and transform data.
Simple answer: To get an indication of time per inference we show performance metrics in every DSP and ML block in the Studio. Multiply this by the active power consumption of your MCU to get an indication of power cost per inference.
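The multiplication described above can be sketched as follows. The latency and power figures are hypothetical placeholders, not measurements; substitute the estimates shown for your DSP and ML blocks and the active power from your MCU's datasheet.

```python
def energy_per_inference_mj(latency_ms, active_power_mw):
    """Energy in millijoules = power (mW) x time (s)."""
    if latency_ms < 0 or active_power_mw < 0:
        raise ValueError("latency and power must be non-negative")
    return active_power_mw * (latency_ms / 1000.0)


# Hypothetical figures: 20 ms DSP + 5 ms neural network,
# with the MCU drawing 30 mW while active.
total_latency_ms = 20 + 5
energy_mj = energy_per_inference_mj(total_latency_ms, 30)
print(f"{energy_mj:.2f} mJ per inference")
```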
More complicated answer: It depends. Normal techniques to conserve power still apply to ML, so try to do as little as possible (do you need to classify every second, or can you do it once a minute?), be smart about when to run inference (can there be an external trigger like a motion sensor before you run inference on a camera?), and collect data in a lower power mode (don't run at full speed when sampling low-resolution data, and see if your sensor can use an interrupt to wake your MCU - rather than polling).
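To get a feel for how much "classify once a minute instead of every second" can save, here is a minimal duty-cycle sketch. All numbers are hypothetical, and the model ignores wake-up time, sensor power, and radio activity, so treat the result as a rough indication only.

```python
def average_power_mw(active_mw, sleep_mw, active_ms, period_ms):
    """Average power of a device that is active for active_ms out of
    every period_ms and sleeps for the rest of the period."""
    if not 0 <= active_ms <= period_ms:
        raise ValueError("active_ms must be between 0 and period_ms")
    duty = active_ms / period_ms
    return active_mw * duty + sleep_mw * (1 - duty)


# Hypothetical MCU: 30 mW active, 0.01 mW asleep, 25 ms per inference.
every_second = average_power_mw(30, 0.01, 25, 1_000)
every_minute = average_power_mw(30, 0.01, 25, 60_000)
print(f"once per second: {every_second:.4f} mW")
print(f"once per minute: {every_minute:.4f} mW")
```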
Also see Analyse Power Consumption in Embedded ML Solutions.
See .eim models? on the Edge Impulse for Linux pages.
Using the Edge Impulse Studio data acquisition tools (like the serial daemon or data forwarder), you can collect data samples manually with a pre-defined label. If you have a dataset that was collected outside of Edge Impulse, you can upload your dataset using the Edge Impulse CLI, data ingestion API, web uploader, enterprise data storage bucket tools or enterprise upload portals. You can then utilize the Edge Impulse Studio to split up your data into labeled chunks, crop your data samples, and more to create high quality machine learning datasets.
Yes! A "supported board" simply means that there is an official or community-supported firmware that has been developed specifically for that board that helps you collect data and run impulses. Edge Impulse is designed to be extensible to computers, smartphones, and a nearly endless array of microcontroller build systems.
You can collect data and upload it to Edge Impulse in a variety of ways. For example:
Transmitting data to the Data forwarder
Using the Edge Impulse for Linux SDK
By uploading files directly (e.g. CBOR, JSON, CSV, WAV, JPG, PNG)
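If you log data on your own hardware, you can convert it to a file for upload. The sketch below writes time-series readings to CSV text using only the Python standard library. The column layout (a header row with a timestamp column in milliseconds followed by one column per axis) and the axis names are assumptions for illustration; check the uploader documentation for the layouts it accepts.

```python
import csv
import io


def samples_to_csv(samples, interval_ms, axes=("accX", "accY", "accZ")):
    """Serialize evenly spaced readings as CSV text.

    Assumed layout: header row, then one row per reading with a
    timestamp in milliseconds followed by one value per axis.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["timestamp", *axes])
    for i, row in enumerate(samples):
        if len(row) != len(axes):
            raise ValueError("each reading needs one value per axis")
        writer.writerow([i * interval_ms, *row])
    return buf.getvalue()


# Two hypothetical accelerometer readings, 16 ms apart
csv_text = samples_to_csv([(0.1, 9.8, 0.0), (0.2, 9.7, 0.1)], interval_ms=16)
print(csv_text)
```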
Your trained model can be deployed as part of a C++ library. It requires some effort, but most build systems will work with our C++ library, as long as that build system has a C++ compiler and there is enough flash/RAM on your device to run the library (which includes the DSP block and model).
Welcome to Edge Impulse! If you're new to the world of edge machine learning, you've come to the right place. This guide will walk you through the essential steps to get started with Edge Impulse, a suite of engineering tools for building, training, and deploying machine learning models on edge devices.
Check out our Introduction to Edge AI course to learn more about edge computing, machine learning, and edge MLOps.
Edge Impulse empowers you to bring intelligence to your embedded projects by enabling devices to understand and respond to their environment. Whether you want to recognize sounds, identify objects, or detect motion, Edge Impulse makes it accessible and straightforward. Here's why beginners like you are diving into Edge Impulse:
No Coding Required: You don't need to be a coding expert to use Edge Impulse. Our platform provides a user-friendly interface that guides you through the process - this includes many optimized preprocessing and learning blocks, various neural network architectures, and pre-trained models and can generate ready-to-flash binaries to test your models on real devices.
Edge Computing: Your machine learning models are optimized to run directly on your edge devices, ensuring low latency and real-time processing.
Support for Various Sensors: Edge Impulse supports a wide range of sensors, from accelerometers and microphones to cameras, making it versatile for different projects.
Community and Resources: You're not alone on this journey. Edge Impulse offers a supportive community and extensive documentation to help you succeed.
Ready to begin? Follow these simple steps to embark on your Edge Impulse journey:
Start by creating an Edge Impulse account. It's free to get started, and you'll gain access to all the tools and resources you need.
Once you're logged in, create your first project. Give it a name that reflects your project's goal, whether it's recognizing sounds, detecting objects, or something entirely unique.
To teach your device, you need data. Edge Impulse provides user-friendly tools for collecting data from your sensors, such as recording audio, capturing images, or reading sensor values. We recommend using a hardware target from this list or your smartphone to start collecting data when you begin with Edge Impulse.
You can also import existing datasets or clone a public project to get familiar with the platform.
Organize your data by labeling it. For example, if you're working on sound recognition, label audio clips with descriptions like "dog barking" or "car horn." You can label your data as you collect it or add labels later. Our data explorer is also particularly useful for understanding your data.
This is where the magic happens. Edge Impulse offers an intuitive model training process through processing blocks and learning blocks. You don't need to write complex code; the platform guides you through feature extraction, model creation, and training.
After training your model, you can easily export your model to run in a web browser or on your smartphone, but you can also run it on a wide variety of edge devices, whether it's a Raspberry Pi, Arduino, or other compatible hardware. We also provide ready-to-flash binaries for all the officially supported hardware targets. You don't even need to write embedded code to test your model on real devices!
If you have a device that is not supported, no problem, you can export your model as a C++ library that runs on any embedded device. See Running your impulse locally for more information.
Building Edge AI solutions is an iterative process. Feel free to try our organization hub to automate your machine-learning pipelines, collaborate with your colleagues, and create custom blocks.
The end-to-end tutorials are perfect for learning how to use Edge Impulse Studio. Try the tutorials:
These will let you build machine-learning models that detect things in your home or office.
Remember, you're not alone on your journey. Join the Edge Impulse community to connect with other beginners, experts, and enthusiasts. Share your experiences, ask questions, and learn from others who are passionate about embedded machine learning.
Now that you have a roadmap, it's time to explore Edge Impulse and discover the exciting possibilities of embedded machine learning. Let's get started!
In this tutorial, you'll use machine learning to build a gesture recognition system that runs on a microcontroller. This is a hard task to solve using rule-based programming, as people don't perform gestures in the exact same way every time. But machine learning can handle these variations with ease. You'll learn how to collect high-frequency data from real sensors, use signal processing to clean up data, build a neural network classifier, and how to deploy your model back to a device. At the end of this tutorial, you'll have a firm understanding of applying machine learning in embedded devices using Edge Impulse.
There is also a video version of this tutorial:
You can view the finished project, including all data, signal processing and machine learning blocks here: .
If your device is connected (green dot) under Devices in the studio you can proceed:
Data ingestion
With your device connected, we can collect some data. In the studio go to the Data acquisition tab. This is the place where all your raw data is stored, and - if your device is connected to the remote management API - where you can start sampling new data.
Under Record new data, select your device, set the label to updown, the sample length to 10000, the sensor to Built-in accelerometer and the frequency to 62.5Hz. This indicates that you want to record data for 10 seconds, and label the recorded data as updown. You can later edit these labels if needed.
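As a quick sanity check on these settings, the arithmetic below works out how many readings a sample should contain. It assumes a three-axis accelerometer, as on the development board used here; `expected_samples` is a hypothetical helper name.

```python
def expected_samples(sample_length_ms, frequency_hz, axes=3):
    """Return (readings per axis, total values) for one recording."""
    readings = int(sample_length_ms / 1000 * frequency_hz)
    return readings, readings * axes


# 10000 ms at 62.5 Hz on a three-axis accelerometer
readings, values = expected_samples(10_000, 62.5)
print(readings, values)
```

With the settings above this gives 625 readings per axis, or 1875 values in total.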
After you click Start sampling move your device up and down in a continuous motion. In about twelve seconds the device should complete sampling and upload the file back to Edge Impulse. You see a new line appear under 'Collected data' in the studio. When you click it you now see the raw data graphed out. As the accelerometer on the development board has three axes you'll notice three different lines, one for each axis.
Continuous movement
It's important to do continuous movements as we'll later slice up the data in smaller windows.
Machine learning works best with lots of data, so a single sample won't cut it. Now is the time to start building your own dataset. For example, use the following four classes, and record around 3 minutes of data per class:
Idle - just sitting on your desk while you're working.
Snake - moving the device over your desk as a snake.
Wave - waving the device from left to right.
Updown - moving the device up and down.
Variations
Make sure to perform variations on the motions. E.g. do both slow and fast movements and vary the orientation of the board. You'll never know how your user will use the device. It's best to collect samples of ~10 seconds each.
Prebuilt dataset
With the training set in place, you can design an impulse. An impulse takes the raw data, slices it up in smaller windows, uses signal processing blocks to extract features, and then uses a learning block to classify new data. Signal processing blocks always return the same values for the same input and are used to make raw data easier to process, while learning blocks learn from past experiences.
For this tutorial we'll use the 'Spectral analysis' signal processing block. This block applies a filter, performs spectral analysis on the signal, and extracts frequency and spectral power data. Then we'll use a 'Neural Network' learning block, that takes these spectral features and learns to distinguish between the four (idle, snake, wave, updown) classes.
In the studio go to Create impulse, set the window size to 2000 (you can click on the 2000 ms. text to enter an exact value), the window increase to 80, and add the 'Spectral Analysis' and 'Classification (Keras)' blocks. Then click Save impulse.
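To see what the window size and window increase mean in practice, the sketch below lists where each window would start inside a single recording. It assumes that a partial window at the end of the recording is dropped, which may differ from how the Studio handles edge cases; `window_starts` is a hypothetical helper name.

```python
def window_starts(sample_length_ms, window_size_ms, window_increase_ms):
    """Start offsets (in ms) of every full window in one recording.

    Assumption: windows that would run past the end are discarded.
    """
    if window_increase_ms <= 0:
        raise ValueError("window_increase_ms must be positive")
    if window_size_ms > sample_length_ms:
        return []
    last_start = sample_length_ms - window_size_ms
    return list(range(0, last_start + 1, window_increase_ms))


# A 10 second recording, 2000 ms windows, moved forward 80 ms at a time
starts = window_starts(10_000, 2_000, 80)
print(len(starts), starts[:3], starts[-1])
```

Under these assumptions a single 10 second recording yields 101 overlapping windows, which is why continuous movement during recording matters.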
To configure your signal processing block, click Spectral features in the menu on the left. This will show you the raw data on top of the screen (you can select other files via the drop down menu), and the results of the signal processing through graphs on the right. For the spectral features block you'll see the following graphs:
Filter response - If you have chosen a filter (with non zero order), this will show you the response across frequencies. That is, it will show you how much each frequency will be attenuated.
After filter - the signal after applying the filter. This will remove noise.
Spectral power - the frequencies at which the signal is repeating (e.g. making one wave movement per second will show a peak at 1 Hz).
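To illustrate the 1 Hz example, here is a naive discrete Fourier transform in plain Python applied to a synthetic "one wave per second" signal. This is a teaching sketch, not the implementation used by the spectral features block, and the signal is generated rather than recorded.

```python
import cmath
import math


def power_spectrum(signal, sample_rate_hz):
    """Return (frequency_hz, power) pairs using a naive DFT."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(signal)
        )
        spectrum.append((k * sample_rate_hz / n, abs(coeff) ** 2 / n))
    return spectrum


# Synthetic 2 second window at 62.5 Hz containing a 1 Hz movement
sample_rate_hz = 62.5
n = 125
wave = [math.sin(2 * math.pi * 1.0 * i / sample_rate_hz) for i in range(n)]

peak_hz, peak_power = max(power_spectrum(wave, sample_rate_hz), key=lambda p: p[1])
print(f"strongest frequency: {peak_hz} Hz")
```

A 2 second window at 62.5 Hz gives a frequency resolution of 0.5 Hz, so the 1 Hz movement lands exactly on one of the frequency bins.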
A good signal processing block will yield similar results for similar data. If you move the sliding window (on the raw data graph) around, the graphs should remain similar. Also, when you switch to another file with the same label, you should see similar graphs, even if the orientation of the device was different.
Bonus exercise: filters
Try to reason about the filter parameters. What does the cut-off frequency control? And what do you see if you switch from a low-pass to a high-pass filter?
Set the filter to low pass with the following parameters:
Once you're happy with the result, click Save parameters. This will send you to the 'Feature generation' screen. In here you'll:
Split all raw data up in windows (based on the window size and the window increase).
Apply the spectral features block on all these windows.
Calculate feature importance. We will use this later to set up the anomaly detection.
Click Generate features to start the process.
Afterward the 'Feature explorer' will load. This is a plot of all the extracted features against all the generated windows. You can use this graph to compare your complete data set. A good rule of thumb is that if you can visually identify some clusters by classes, then the machine learning model will be able to do so as well.
With all data processed it's time to start training a neural network. Neural networks are a set of algorithms, modeled loosely after the human brain, that are designed to recognize patterns. The network that we're training here will take the signal processing data as an input, and try to map this to one of the four classes.
So how does a neural network know what to predict? A neural network consists of layers of neurons, all interconnected, and each connection has a weight. One such neuron in the input layer would be the height of the first peak of the X-axis (from the signal processing block); and one such neuron in the output layer would be wave (one of the classes). When defining the neural network all these connections are initialized randomly, and thus the neural network will make random predictions. During training, we then take all the raw data, ask the network to make a prediction, and then make tiny alterations to the weights depending on the outcome (this is why labeling raw data is important).
This way, after a lot of iterations, the neural network learns; and will eventually become much better at predicting new data. Let's try this out by clicking on NN Classifier in the menu.
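The idea of "start with random weights, predict, then nudge the weights" can be sketched with a single artificial neuron in plain Python. This is far simpler than the Keras network the Studio trains: it only separates two classes, and the features, labels, and learning rate below are made up for illustration.

```python
import math
import random


def train(features, labels, cycles, learning_rate=0.5, seed=0):
    """Train one sigmoid neuron; return weights, bias, and loss per cycle."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1, 1) for _ in features[0]]  # random start
    bias = 0.0
    losses = []
    for _ in range(cycles):
        total = 0.0
        for x, y in zip(features, labels):
            z = sum(w * v for w, v in zip(weights, x)) + bias
            p = 1 / (1 + math.exp(-z))  # predicted probability of class 1
            total += -(y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12))
            error = p - y
            # Tiny alteration of each weight, based on the outcome
            weights = [w - learning_rate * error * v for w, v in zip(weights, x)]
            bias -= learning_rate * error
        losses.append(total / len(features))
    return weights, bias, losses


# Hypothetical two-feature windows: class 1 has a larger first feature
features = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.2], [0.8, 0.1]]
labels = [0, 0, 1, 1]
_, _, losses = train(features, labels, cycles=30)
print(f"loss after first cycle: {losses[0]:.3f}, after last: {losses[-1]:.3f}")
```

The loss recorded for each training cycle shrinks as the weights settle, which is the same effect you see when you raise the number of training cycles in the Studio.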
Set 'Number of training cycles' to 1. This will limit training to a single iteration. And then click Start training.
Now change 'Number of training cycles' to 2 and you'll see performance go up. Finally, change 'Number of training cycles' to 30 and let the training finish.
You've just trained your first neural network!
100% accuracy
You might end up with 100% accuracy after training for 100 training cycles. This is not necessarily a good thing, as it might be a sign that the neural network is too tuned to the specific training data and might perform poorly on new data (overfitting). The best way to reduce this is by adding more data or reducing the learning rate.
From the statistics in the previous step we know that the model works against our training data, but how well would the network perform on new data? Click on Live classification in the menu to find out. Your device should (just like in step 2) show as online under 'Classify new data'. Set the 'Sample length' to 10000 (10 seconds), click Start sampling and start doing movements. Afterward, you'll get a full report on what the network thought you did.
If the network performed great, fantastic! But what if it performed poorly? There could be a variety of reasons, but the most common ones are:
There is not enough data. Neural networks need to learn patterns in data sets, and the more data the better.
The data does not look like other data the network has seen before. This is common when someone uses the device in a way that you didn't add to the training set. You can add the current file to the training set by clicking ⋮, then selecting Move to training set. Make sure to update the label under 'Data acquisition' before training.
The model has not been trained enough. Up the number of epochs to 200 and see if performance increases (the classified file is stored, and you can load it through 'Classify existing validation sample').
The model is overfitting and thus performs poorly on new data. Try reducing the learning rate or add more data.
The neural network architecture is not a great fit for your data. Play with the number of layers and neurons and see if performance improves.
As you see there is still a lot of trial and error when building neural networks, but we hope the visualizations help a lot. You can also run the network against the complete validation set through 'Model validation'. Think of the model validation page as a set of unit tests for your model!
With a working model in place, we can look at places where our current impulse performs poorly.
Neural networks are great, but they have one big flaw. They're terrible at dealing with data they have never seen before (like a new gesture). Neural networks cannot judge this, as they are only aware of the training data. If you give it something unlike anything it has seen before it'll still classify as one of the four classes.
Let's look at how this works in practice. Go to 'Live classification' and record some new data, but now vividly shake your device. Take a look and see how the network will predict something regardless.
So, how can we do better? If you look at the feature explorer, you should be able to visually separate the classified data from the training data. We can use this to our advantage by training a new (second) network that creates clusters around data that we have seen before, and compares incoming data against these clusters. If the distance from a cluster is too large you can flag the sample as an anomaly, and not trust the neural network.
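The distance check described above can be sketched in a few lines. The cluster centres, samples, and threshold here are invented for illustration, and the Studio's anomaly detection block may score samples differently (for example by taking cluster size into account); this only shows the "too far from every known cluster" idea.

```python
import math


def nearest_cluster_distance(sample, centroids):
    """Euclidean distance from a sample to its closest cluster centre."""
    return min(math.dist(sample, c) for c in centroids)


def is_anomaly(sample, centroids, threshold):
    """Flag a sample that is further than threshold from every cluster."""
    return nearest_cluster_distance(sample, centroids) > threshold


# Hypothetical cluster centres in a three-feature space
centroids = [(0.1, 0.1, 0.1), (2.0, 0.5, 0.3), (0.4, 3.0, 0.2)]

known_gesture = (0.15, 0.12, 0.08)  # close to the first cluster
hard_shake = (9.0, 8.5, 7.0)        # far away from everything seen before

print(is_anomaly(known_gesture, centroids, threshold=1.0))
print(is_anomaly(hard_shake, centroids, threshold=1.0))
```

When a sample is flagged this way, the application can choose to ignore the neural network's class prediction for that window.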
To add this block go to Create impulse, click Add learning block, and select 'Anomaly Detection (K-Means)'. Then click Save impulse.
To configure the clustering model click on Anomaly detection in the menu. Here we need to specify:
The number of clusters. Here use 32.
Click Start training to generate the clusters. You can load existing validation samples into the anomaly explorer with the dropdown menu.
Axes
The anomaly explorer only plots two axes at the same time. Under 'average axis distance' you see how far away from each axis the validation sample is. Use the dropdown menus to change axes.
If you now go back to 'Live classification' and load your last sample, it should now have tagged everything as anomaly. This is a great example where signal processing (to extract features), neural networks (for classification) and clustering algorithms (for anomaly detection) can work together.
With the impulse designed, trained and verified you can deploy this model back to your device. This makes the model run without an internet connection, minimizes latency, and runs with minimum power consumption. Edge Impulse can package the complete impulse - including the signal processing code, neural network weights, and classification code - into a single C++ library that you can include in your embedded software.
Mobile phone
To export your model, click on Deployment in the menu. Then under 'Build firmware' select your development board, and click Build. This will export the impulse, and build a binary that will run on your development board in a single step. After building is completed you'll get prompted to download a binary. Save this on your computer.
When you click the Build button, you'll see a pop-up with text and video instructions on how to deploy the binary to your particular device. Follow these instructions. Once you are done, we are ready to test your impulse out.
We can connect to the board's newly flashed firmware over serial. Open a terminal and run:
Serial daemon
If the device is not connected over WiFi, but instead connected via the Edge Impulse serial daemon, you'll need to stop the daemon. Only one application can connect to the development board at a time.
This will sample data from the sensor, run the signal processing code, and then classify the data:
Continuous movement
We trained a model to detect continuous movement in 2 second intervals. Thus, changing your movement while sampling will yield incorrect results. Make sure you've started your movement when 'Sampling...' gets printed. In between sampling, you have two seconds to switch movements.
To run the continuous sampling, run the following command:
Victory! You've now built your first on-device machine learning model.
We can't wait to see what you'll build! 🚀
For this tutorial, you'll need a .
Alternatively, use the either or SDK to collect data from any other development board, or your .
Edge Impulse can ingest data from many sources and any device - including embedded devices that you already have in production. See the documentation for the for more information.
Alternatively, you can load an example test set that has about ten minutes of data in these classes (but how much fun is that?). See the for more information.
See the dedicated page for the pre-processing block.
See the dedicated page for the learning block.
The axes that we want to select during clustering. Click on the Select suggested axes button to harness the results of the output. Alternatively, because the data separates well on the accX RMS, accY RMS and accZ RMS axes, you can also include these axes.
See the dedicated page for the learning block. We also provide the learning block that is compatible with this tutorial.
Your mobile phone can build and download the compiled impulse directly from the mobile client. See 'Deploying back to device' on the page.
Congratulations! You have used Edge Impulse to train a machine learning model capable of recognizing your gestures, and you now understand how you can build models that classify sensor data or find anomalies. Now that you've trained your model you can integrate your impulse in the firmware of your own embedded device, see . There are examples for Mbed OS, Arduino, STM32CubeIDE, and any other target that supports a C++ compiler.
Or if you're interested in more, see our tutorials on or . If you have a great idea for a different project, that's fine too. Edge Impulse lets you capture data from any sensor, build to extract features, and you have full flexibility in your Machine Learning pipeline with the learning blocks.
This section provides detailed end-to-end tutorials to help you get started with Edge Impulse:
Detect objects using MobileNet SSD (bounding boxes)
Object detection using FOMO (centroids)
In this tutorial, you'll use machine learning to build a system that can recognize objects in your house through a camera - a task known as image classification - connected to a microcontroller. Adding sight to your embedded devices can make them see the difference between poachers and elephants, do quality control on factory lines, or let your RC cars drive themselves. In this tutorial you'll learn how to collect images for a well-balanced dataset, how to apply transfer learning to train a neural network, and deploy the system to an embedded device.
At the end of this tutorial, you'll have a firm understanding of how to classify images using Edge Impulse.
There is also a video version of this tutorial:
You can view the finished project, including all data, signal processing and machine learning blocks here: Tutorial: adding sight to your sensors.
For this tutorial, you'll need a supported device.
If you don't have any of these devices, you can also upload an existing dataset through the Uploader. After this tutorial you can then deploy your trained machine learning model as a C++ library and run it on your device.
In this tutorial we'll build a model that can distinguish between two objects in your house - we've used a plant and a lamp, but feel free to pick two other objects. To make your machine learning model see, it's important that you capture a lot of example images of these objects. When training the model, these example images are used to let the model distinguish between them. Because there are (hopefully) a lot more objects in your house than just lamps or plants, you also need to capture images that are neither a lamp nor a plant to make the model work well.
Capture the following amount of data - make sure you capture a wide variety of angles and zoom levels:
50 images of a lamp.
50 images of a plant.
50 images of neither a plant nor a lamp - make sure to capture a wide variation of random objects in the same room as your lamp or plant.
You can collect data from the following devices:
Collecting image data from the Studio - for all other officially supported boards with camera sensors.
Or you can capture your images using another camera, and then upload them by going to Data acquisition and clicking the 'Upload' icon.
Afterwards you should have a well-balanced dataset listed under Data acquisition in your Edge Impulse project. You can switch between your training and testing data with the two buttons above the 'Data collected' widget.
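As a quick sanity check on balance, the per-class counts can be compared before training. The following is a minimal sketch, assuming the labels are available as a plain list of strings; the function name class_balance and the 20% tolerance are illustrative assumptions and not part of Edge Impulse.

```python
from collections import Counter


def class_balance(labels, tolerance=0.2):
    """Return per-class counts and whether every class is within
    `tolerance` (as a fraction) of the mean class size."""
    counts = Counter(labels)
    mean = sum(counts.values()) / len(counts)
    balanced = all(abs(n - mean) <= tolerance * mean for n in counts.values())
    return dict(counts), balanced


# Hypothetical label list matching the collection targets above.
labels = ["lamp"] * 50 + ["plant"] * 50 + ["unknown"] * 50
counts, balanced = class_balance(labels)
print(counts, balanced)
```

A dataset dominated by one class would fail the check, which is a hint to collect more images for the other classes.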
With the training set in place you can design an impulse. An impulse takes the raw data, adjusts the image size, uses a preprocessing block to manipulate the image, and then uses a learning block to classify new data. Preprocessing blocks always return the same values for the same input (e.g. convert a color image into a grayscale one), while learning blocks learn from past experiences.
For this tutorial we'll use the 'Images' preprocessing block. This block takes in the color image, optionally makes the image grayscale, and then turns the data into a features array. If you want to do more interesting preprocessing steps, like finding faces in a photo before feeding the image into the network, see the Building custom processing blocks tutorial. Then we'll use a 'Transfer Learning' learning block, which takes all the images in and learns to distinguish between the three ('plant', 'lamp', 'unknown') classes.
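To illustrate what a deterministic preprocessing step looks like, here is a minimal sketch that optionally converts pixels to grayscale and flattens them into a features list. The ITU-R BT.601 luma weights and the scaling to the 0..1 range are assumptions chosen for illustration; the actual 'Images' block may use a different conversion. The names to_grayscale and to_features are hypothetical.

```python
def to_grayscale(pixels):
    """Convert a list of (r, g, b) tuples (0-255) to grayscale values.

    Uses the ITU-R BT.601 luma weights; this choice is an assumption
    for illustration and may differ from the 'Images' block.
    """
    return [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in pixels]


def to_features(pixels, grayscale=False):
    """Flatten pixels into a single features list, scaled to 0..1."""
    if grayscale:
        return [v / 255 for v in to_grayscale(pixels)]
    return [channel / 255 for pixel in pixels for channel in pixel]


# Three hypothetical pixels: white, black and pure red.
image = [(255, 255, 255), (0, 0, 0), (255, 0, 0)]
print(to_grayscale(image))
print(len(to_features(image)), len(to_features(image, grayscale=True)))
```

The same input always yields the same output, which is the property that separates a preprocessing block from a learning block.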
In the studio go to Create impulse, set the image width and image height to 96, and add the 'Images' and 'Transfer Learning (Images)' blocks. Then click Save impulse.
To configure your processing block, click Images in the menu on the left. This will show you the raw data on top of the screen (you can select other files via the drop down menu), and the results of the processing step on the right. You can use the options to switch between 'RGB' and 'Grayscale' mode, but for now leave the color depth on 'RGB' and click Save parameters.
This will send you to the 'Feature generation' screen. In here you'll:
Resize all the data.
Apply the processing block on all this data.
Create a 3D visualization of your complete dataset.
Click Generate features to start the process.
Afterwards the 'Feature explorer' will load. This is a plot of all the data in your dataset. Because images have a lot of dimensions (here: 96x96x3=27,648 features) we run a process called 'dimensionality reduction' on the dataset before visualizing this. Here the 27,648 features are compressed down to just 3, and then clustered based on similarity. Even though we have little data you can already see some clusters forming (lamp images are all on the right), and can click on the dots to see which image belongs to which dot.
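The feature count quoted above follows directly from the image dimensions. A minimal sketch of the arithmetic, where the helper name feature_count is hypothetical:

```python
def feature_count(width, height, channels):
    """Number of raw features in an image with the given shape."""
    return width * height * channels


rgb = feature_count(96, 96, 3)    # color image, as used in this tutorial
gray = feature_count(96, 96, 1)   # the same image in grayscale
print(rgb, gray)
```

Switching the color depth to grayscale would cut the number of raw features to a third before any dimensionality reduction is applied.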
With all data processed it's time to start training a neural network. Neural networks are a set of algorithms, modeled loosely after the human brain, that are designed to recognize patterns. The network that we're training here will take the image data as an input, and try to map this to one of the three classes.
It's very hard to build a good working computer vision model from scratch, as you need a wide variety of input data to make the model generalize well, and training such models can take days on a GPU. To make this easier and faster we are using transfer learning. This lets you piggyback on a well-trained model, only retraining the upper layers of a neural network, leading to much more reliable models that train in a fraction of the time and work with substantially smaller datasets.
To configure the transfer learning model, click Transfer learning in the menu on the left. Here you can select the base model (the one selected by default will work, but you can change this based on your size requirements), optionally enable data augmentation (images are randomly manipulated to make the model perform better in the real world), and the rate at which the network learns.
Set:
Number of training cycles to 20.
Learning rate to 0.0005.
Data augmentation: enabled.
Minimum confidence rating: 0.7.
Important: If you're using a development board with less memory, like the Arduino Nano 33 BLE Sense, click Choose a different model and select MobileNetV1 96x96 0.25. This is a smaller transfer learning model.
And click Start training. After the model is done you'll see accuracy numbers, a confusion matrix and some predicted on-device performance on the bottom. You have now trained your model!
With the model trained let's try it out on some test data. When collecting the data we split the data up between a training and a testing dataset. The model was trained only on the training data, and thus we can use the data in the testing dataset to validate how well the model will work in the real world. This will help us ensure the model has not learned to overfit the training data, which is a common occurrence.
To validate your model, go to Model testing, select the checkbox next to 'Sample name' and click Classify selected. Here we hit 89% accuracy, which is great for a model with so little data.
To see a classification in detail, click the three dots next to an item, and select Show classification. This brings you to the Live classification screen with much more details on the file (if you collected data with your mobile phone you can also capture new testing data directly from here). This screen can help you determine why items were misclassified.
With the impulse designed, trained and verified you can deploy this model back to your device. This makes the model run without an internet connection, minimizes latency, and runs with minimum power consumption. Edge Impulse can package up the complete impulse - including the preprocessing steps, neural network weights, and classification code - in a single C++ library that you can include in your embedded software.
To run your impulse on either the OpenMV camera or your phone, follow these steps:
OpenMV Cam H7 Plus: Running your impulse on your OpenMV camera
Mobile phone: just click Switch to classification mode at the bottom of your phone screen.
For other boards: click on Deployment in the menu. Then under 'Build firmware' select your development board, and click Build. This will export the impulse, and build a binary that will run on your development board in a single step. After building is completed you'll get prompted to download a binary. Save this on your computer.
When you click the Build button, you'll see a pop-up with text and video instructions on how to deploy the binary to your particular device. Follow these instructions. Once you are done, we are ready to test your impulse out.
We can connect to the board's newly flashed firmware over serial. Open a terminal and run:
To also see a preview of the camera, run:
To run continuously (without a pause every 2 seconds), but without the preview, run:
Congratulations! You've added sight to your sensors. Now that you've trained your model you can integrate your impulse in the firmware of your own embedded device, see Running your impulse locally. There are examples for Mbed OS, Arduino, STM32CubeIDE, and any other target that supports a C++ compiler. Note that the model we trained in this tutorial is relatively big, but you can choose a smaller transfer learning model.
Or if you're interested in more, see our tutorials on Continuous motion recognition or Recognize sounds from audio. If you have a great idea for a different project, that's fine too. Edge Impulse lets you capture data from any sensor, build custom processing blocks to extract features, and you have full flexibility in your Machine Learning pipeline with the learning blocks.
We can't wait to see what you'll build! 🚀
Welcome to Edge Impulse! We enable professional developers and researchers to create the next generation of intelligent products with Edge AI. In this documentation, you'll find user guides, tutorials, and API documentation. If at any point you have questions, visit our forum.
If you are a beginner, an advanced embedded engineer, an ML engineer, or a data scientist, you may want to use Edge Impulse differently. We have tailored Edge Impulse to suit your needs. Check out the following getting-started guides for a smooth start:
If you're new to the idea of embedded machine learning, or machine learning in general, you may enjoy our quick articles: What is embedded ML, anyway? and What is edge machine learning?
For startups and enterprises looking to scale edge ML algorithm development from prototype to production, we offer an enterprise-grade version of our platform. This includes all of the tools needed to go from data collection to model deployment, such as a robust dataset builder to future-proof your data, integrations with all major cloud vendors, dedicated technical support, custom DSP and ML capabilities, and full access to the Edge Impulse APIs to automate your algorithm development.
Sign up for a FREE Enterprise Trial today!
For professionals who want additional compute time, more private projects, and more flexibility in usage, we also offer a professional tier version of our platform.
Try our Professional Plan today!
We have some great tutorials, but you have full freedom in the models that you design in Edge Impulse. You can plug in new signal processing blocks, and completely new neural networks. See Building custom processing blocks and Bring your own model.
You can access any feature in the Edge Impulse Studio through the Edge Impulse API. We also have the Ingestion service if you want to send data directly, and we have an open Remote management protocol to control devices from the Studio.
Edge Impulse offers a thriving community of engineers, developers, researchers, and machine learning experts. Connect with like-minded professionals, share your knowledge, and collaborate to enhance your embedded machine-learning projects. Head to the forum to ask questions or share your awesome ideas!
Think your model is awesome, and want to share it with the world? Go to Dashboard and click Make this project public. This will make your whole project - including all data, machine learning models, and visualizations - available to be viewed and cloned by anyone with the URL.
We reference all the public projects here: https://edgeimpulse.com/projects/overview. If you need some inspiration, just clone a project and fine-tune it to your needs!
Welcome to Edge Impulse! Whether you are a machine learning engineer, MLOps engineer, data scientist, or researcher, we have developed professional tools to help you build and optimize models to run efficiently on any edge device.
In this guide, we'll explore how Edge Impulse empowers you to bring your expertise and your own models to the world of edge AI using either the Edge Impulse Studio, our visual interface, or the Edge Impulse Python SDK, available as a pip package.
Flexibility: You can choose to work with the tools you are already familiar with and import your models, architecture, and feature processing algorithms into the platform. This means that you can leverage your existing knowledge and workflows seamlessly. Or, for those who prefer an all-in-one solution, Edge Impulse provides enterprise-grade tools for your entire machine-learning pipeline.
Optimized for edge devices: Edge Impulse is designed specifically for deploying machine learning models on edge devices, which are typically resource-constrained, from low-power MCUs up to powerful edge GPUs. We provide tools to optimize your models for edge deployment, ensuring efficient resource usage and peak performance. Focus on developing the best models; we will provide feedback on whether they can run on your hardware target!
Data pipelines: We developed a strong expertise in complex data pipelines (including clinical data) while working with our customers. We support data coming from multiple sources, in any format, and provide tools to perform data alignment and validation checks. All of this in customizable multi-stage pipelines. This means you can build gold-standard labeled datasets that can then be imported into your project to train your models.
In this getting started guide, we'll walk you through the two different approaches to bringing your expertise to edge devices: starting either from your dataset or from an existing model.
First, start by creating your Edge Impulse account.
Start with existing data
You can import data using Studio Uploader, CLI Uploader, or our Ingestion API. These allow you to easily upload and manage your existing data samples and datasets to Edge Impulse Studio.
We currently accept various file types, including .cbor, .json, .csv, .wav, .jpg, .png, .mp4, and .avi.
If you are working with image datasets, the Studio uploader and the CLI uploader currently handle these types of dataset annotation formats: Edge Impulse object detection, COCO JSON, Open Images CSV, Pascal VOC XML, Plain CSV, and YOLO TXT.
Organization data
Since the creation of Edge Impulse, we have been helping our customers deal with complex data pipelines, complex data transformation methods and complex clinical validation studies.
The organizational data gives you tools to centralize, validate and transform datasets so they can be easily imported into your projects.
See the Organization data documentation.
To visualize how your labeled data items are clustered, use the Data explorer feature available for most dataset types, where we apply dimensionality reduction techniques (t-SNE or PCA) on your embeddings.
To extract features from your data items, either choose an available processing block (MFE, MFCC, spectral analysis using FFT or Wavelets, etc.) or create your own from your expertise. These can be written in any language.
Similarly, to train your machine learning model, you can choose from different learning blocks (Classification, Anomaly Detection, Regression, Image or Audio Transfer Learning, Object Detection). In most of these blocks, we expose the Keras API in an expert mode. You can also bring your own architecture/training pipeline as a custom learning block.
Each block will provide on-device performance information showing you the estimated RAM, flash, and latency.
Start with an existing model
If you have already been working on different models for your Edge AI applications, Edge Impulse offers an easy way to upload your models and profile them. This way, in just a few minutes, you will know whether your model can run on real devices and what its on-device performance (RAM, flash usage, and latency) will be.
You can do this directly from the Studio BYOM feature or using Edge Impulse Python SDK.
The Edge Impulse Python SDK is available as a pip package:
From there, you can profile your existing models:
And then directly generate a customizable library or any other supported deployment type
You can easily export your model in a .eim format, a Linux executable that contains your signal processing and ML code, compiled with optimizations for your processor or GPU. This executable can then be called with our Linux inferencing libraries. We have inferencing libraries and examples for Python, Node.js, C++, and Go.
If you target MCU-based devices, you can generate ready-to-flash binaries for all the officially supported hardware targets. This method will let you test your model on real hardware very quickly.
In both cases, we will provide profiling information about your models so you can make sure your model will fit your edge device constraints.
If you want to get familiar with the full end-to-end flow using Edge Impulse Studio, please have a look at our end-to-end tutorials on continuous motion recognition, responding to your voice, recognizing sounds from audio, adding sight to your sensors, or object detection.
To understand the full potential of Edge Impulse, see our health reference design that describes an end-to-end ML workflow for building a wearable health product using Edge Impulse. It handles data coming from multiple sources, data alignment, and a multi-stage pipeline before the data is imported into an Edge Impulse project.
While the Edge Impulse Studio is a great interface for guiding you through the process of collecting data and training a model, the edgeimpulse Python SDK allows you to programmatically Bring Your Own Model (BYOM), developed and trained on any platform:
Expert mode (access Keras API in the studio)
NVIDIA TAO Toolkit (access state-of-the-art pre-trained models)
In this tutorial, you'll use machine learning to build a system that can recognize audible events, particularly your voice through audio classification. The system you create will work similarly to "Hey Siri" or "OK, Google" and is able to recognize keywords or other audible events, even in the presence of other background noise or background chatter.
You'll learn how to collect audio data from microphones, use signal processing to extract the most important information, and train a deep neural network that can tell you whether your keyword was heard in a given clip of audio. Finally, you'll deploy the system to an embedded device and evaluate how well it works.
At the end of this tutorial, you'll have a firm understanding of how to classify audio using Edge Impulse.
There is also a video version of this tutorial:
You can view the finished project, including all data, signal processing and machine learning blocks here: Tutorial: responding to your voice.
Detect non-voice audio?
We have a tutorial for that too! See Recognize sounds from audio.
For this tutorial, you'll need a supported device.
If your device is connected under Devices in the studio you can proceed:
Device compatibility
Edge Impulse can ingest data from any device - including embedded devices that you already have in production. See the documentation for the Ingestion service for more information.
In this tutorial we want to build a system that recognizes keywords, so your first job is to think of a great one. It can be your name, an action, or even a growl - it's your party. Do keep in mind that some keywords are harder to distinguish from others, and especially keywords with only one syllable (like 'One') might lead to false-positives (e.g. when you say 'Gone'). This is the reason that Apple, Google and Amazon all use at least three-syllable keywords ('Hey Siri', 'OK, Google', 'Alexa'). A good one would be "Hello world".
To collect your first data, go to Data acquisition, set your keyword as the label, set your sample length to 10 seconds, your sensor to 'microphone' and your frequency to 16 kHz. Then click Start sampling and start saying your keyword over and over again (with a short pause in between).
Note: Data collection from a development board might be slow, you can use your Mobile phone as a sensor to make this much faster.
Afterwards you have a file like this, clearly showing your keywords, separated by some noise.
This data is not suitable for Machine Learning yet though. You will need to cut out the parts where you say your keyword. This is important because you only want the actual keyword to be labeled as such, and not accidentally label noise, or incomplete sentences (e.g. only "Hello"). Fortunately the Edge Impulse Studio can do this for you. Click ⋮ next to your sample, and select Split sample.
If you have a short keyword, enable Shift samples to randomly shift the sample around in the window, and then click Split. You now have individual 1-second-long samples in your dataset. Perfect!
Now that you know how to collect data we can consider other data we need to collect. In addition to your keyword we'll also need audio that is not your keyword, like background noise or the TV playing ('noise' class), and humans saying other words ('unknown' class). This is required because a machine learning model has no idea about right and wrong (unless those are your keywords), but only learns from the data you feed into it. The more varied your data is, the better your model will work.
For each of these three classes ('your keyword', 'noise', and 'unknown') you want to capture an even amount of data (balanced datasets work better) - and for a decent keyword spotting model you'll want at least 10 minutes in each class (but, the more the better).
Thus, collect 10 minutes of samples for your keyword - do this in the same manner as above. The fastest way is probably through your mobile phone, collecting 1 minute clips, then automatically splitting this data. Make sure to capture wide variations of the keyword: leverage your family and your colleagues to help you collect the data, make sure you cover high and low pitches, and slow and fast speakers.
For the noise and unknown datasets you can either collect this yourself, or make your life a bit easier by using the dataset of both 'noise' (all kinds of background noise) and 'unknown' (random words) data that we built for you here: Pre-built datasets > Keyword spotting.
To import this data, go to Data acquisition, click the Upload icon, select a number of 'noise' or 'unknown' samples (there are 25 minutes of each class, but you can select fewer files if you want), and click Begin upload. The data is automatically labeled and added to your project.
If you've collected all your training data through the 'Record new data' widget you'll have all your keywords in the 'Training' dataset. This is not great, because you want to keep 20% of your data separate to validate the machine learning model. To mitigate this you can go to Dashboard and select Perform train/test split. This will automatically split your data between a training set (80%) and a testing set (20%). Afterwards you should see something like this:
With the data set in place you can design an impulse. An impulse takes the raw data, slices it up in smaller windows, uses signal processing blocks to extract features, and then uses a learning block to classify new data. Signal processing blocks always return the same values for the same input and are used to make raw data easier to process, while learning blocks learn from past experiences.
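The windowing step described above can be sketched as follows. This is a minimal illustration, not the Studio's implementation; the function name sliding_windows and the window and stride values are assumptions chosen to keep the example small.

```python
def sliding_windows(samples, window_size, stride):
    """Yield consecutive windows of `window_size` samples, advancing
    by `stride` samples each time. Trailing samples that do not fill
    a complete window are dropped."""
    for start in range(0, len(samples) - window_size + 1, stride):
        yield samples[start:start + window_size]


# Hypothetical example: 10 raw values, windows of 4, stride of 2.
raw = list(range(10))
windows = list(sliding_windows(raw, window_size=4, stride=2))
print(windows)
```

Each window would then be passed through the signal processing block and on to the learning block.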
For this tutorial we'll use the "MFCC" signal processing block. MFCC stands for Mel Frequency Cepstral Coefficients. This sounds scary, but it's basically just a way of turning raw audio—which contains a large amount of redundant information—into simplified form. Edge Impulse has many other processing blocks for audio, including "MFE" and the "Spectrogram" blocks for non-voice audio, but the "MFCC" block is great for dealing with human speech.
We'll then pass this simplified audio data into a Neural Network block, which will learn to distinguish between the three classes of audio.
In the Studio, go to the Create impulse tab, add a Time series data, an Audio (MFCC) and a Classification (Keras) block. Leave the window size at 1 second (as that's the length of our audio samples in the dataset) and click Save Impulse.
Now that we've assembled the building blocks of our Impulse, we can configure each individual part. Click on the MFCC tab in the left hand navigation menu. You'll see a page that looks like this:
This page allows you to configure the MFCC block, and lets you preview how the data will be transformed. The right of the page shows a visualization of the MFCC's output for a piece of audio, which is known as a spectrogram. An MFCC spectrogram is a specially tuned spectrogram which highlights frequencies which are common in human speech (Edge Impulse also has normal spectrograms if that's more your thing).
In the spectrogram the vertical axis represents the frequencies (the number of frequency bands is controlled by 'Number of coefficients' parameter, try it out!), and the horizontal axis represents time (controlled by 'frame stride' and 'frame length'). The patterns visible in a spectrogram contain information about what type of sound it represents. For example, the spectrogram in this image shows "Hello world":
And the spectrogram in this image shows "On":
These differences are not necessarily easy for a person to describe, but fortunately they are enough for a neural network to learn to identify.
It's interesting to explore your data and look at the types of spectrograms it results in. You can use the dropdown box near the top right of the page to choose between different audio samples to visualize, or play with the parameters to see how the spectrogram changes.
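To get a feel for how 'frame length' and 'frame stride' shape the time axis of the spectrogram, the number of frames can be estimated with a little arithmetic. This is a minimal sketch: the 20 ms frame length and stride and the 13 coefficients are assumed values for illustration and are not necessarily the block's defaults, and frame_count is a hypothetical helper.

```python
def frame_count(duration_s, sample_rate_hz, frame_length_s, frame_stride_s):
    """Number of complete frames that fit in a clip."""
    n_samples = round(duration_s * sample_rate_hz)
    frame_len = round(frame_length_s * sample_rate_hz)
    stride = round(frame_stride_s * sample_rate_hz)
    if n_samples < frame_len:
        return 0
    return (n_samples - frame_len) // stride + 1


# 1 second of 16 kHz audio; frame length and stride are assumed values.
frames = frame_count(1.0, 16000, 0.02, 0.02)
coefficients = 13  # assumed 'Number of coefficients'
print(frames, frames * coefficients)
```

A shorter stride gives more frames, and therefore a wider spectrogram and more features for the neural network to process.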
In addition, you can see the performance of the MFCC block on your microcontroller below the spectrogram. This is the complete time that it takes on a low-power microcontroller (Cortex-M4F @ 80MHz) to analyze 1 second of data.
You might think based on this number that we can only classify 2 or 3 windows per second, but we continuously build up the spectrogram (as it has a time component), which takes less time, and we can thus continuously listen for events 5-6x a second, even on a 40MHz processor. This is already implemented on all fully supported development boards, and easy to implement on your own device.
The spectrograms generated by the MFCC block will be passed into a neural network architecture that is particularly good at learning to recognize patterns in this type of tabular data. Before training our neural network, we'll need to generate MFCC features for all of our windows of audio. To do this, click the Generate features button at the top of the page, then click the green Generate features button. This will take a minute or so to complete.
Afterwards you're presented with one of the most useful features in Edge Impulse: the feature explorer. This is a 3D representation showing your complete dataset, with each data-item color-coded to its respective label. You can zoom in to every item, find anomalies (an item that's in a wrong cluster), and click on items to listen to the sample. This is a great way to check whether your dataset contains wrong items, and to validate whether your dataset is suitable for ML (it should separate nicely).
With all data processed it's time to start training a neural network. Neural networks are algorithms, modeled loosely after the human brain, that can learn to recognize patterns that appear in their training data. The network that we're training here will take the MFCC as an input, and try to map this to one of three classes—your keyword, noise or unknown.
Click on NN Classifier in the left hand menu. You'll see the following page:
A neural network is composed of layers of virtual "neurons", which you can see represented on the left hand side of the NN Classifier page. An input (in our case, an MFCC spectrogram) is fed into the first layer of neurons, which filters and transforms it based on each neuron's unique internal state. The first layer's output is then fed into the second layer, and so on, gradually transforming the original input into something radically different. In this case, the spectrogram input is transformed over four intermediate layers into just three numbers: the probabilities that the input represents your keyword, 'noise', or 'unknown'.
During training, the internal state of the neurons is gradually tweaked and refined so that the network transforms its input in just the right ways to produce the correct output. This is done by feeding in a sample of training data, checking how far the network's output is from the correct answer, and adjusting the neurons' internal state to make it more likely that a correct answer is produced next time. When done thousands of times, this results in a trained network.
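The feed, compare and adjust loop described above can be sketched with a single neuron. This is a toy illustration using plain stochastic gradient descent on the log loss, not the training code Edge Impulse runs; the data, learning rate and number of epochs are assumptions.

```python
import math


def predict(weights, bias, x):
    """Sigmoid output of a single neuron."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=1000, learning_rate=0.5):
    """Repeatedly feed in samples and nudge the neuron's state."""
    weights, bias = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            # How far the output is from the correct answer.
            error = predict(weights, bias, x) - y
            # Adjust the internal state to reduce that error next time.
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


# Toy data: learn the logical OR of two inputs.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 1]
weights, bias = train(samples, labels)
predictions = [round(predict(weights, bias, x)) for x in samples]
print(predictions)
```

A real network repeats the same idea across many neurons and layers, with the adjustments computed by backpropagation.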
A particular arrangement of layers is referred to as an architecture, and different architectures are useful for different tasks. The default neural network architecture provided by Edge Impulse will work well for our current project, but you can also define your own architectures. You can even import custom neural network code from tools used by data scientists, such as TensorFlow and Keras (click the three dots at the top of the page).
Before you begin training, you should change some values in the configuration. Change the Minimum confidence rating to 0.6. This means that when the neural network makes a prediction (for example, that there is 0.8 probability that some audio contains "hello world") Edge Impulse will disregard it unless it is above the threshold of 0.6.
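A minimal sketch of how such a threshold could be applied to the network's output. The function name apply_threshold, the 'uncertain' label and the treatment of a score exactly at the threshold are assumptions for illustration.

```python
def apply_threshold(probabilities, threshold=0.6):
    """Return the top label if its probability meets the threshold,
    otherwise 'uncertain'. `probabilities` maps label -> probability.
    Accepting a score exactly equal to the threshold is an assumption."""
    label, score = max(probabilities.items(), key=lambda item: item[1])
    return label if score >= threshold else "uncertain"


confident = apply_threshold({"helloworld": 0.8, "noise": 0.1, "unknown": 0.1})
doubtful = apply_threshold({"helloworld": 0.5, "noise": 0.3, "unknown": 0.2})
print(confident, doubtful)
```

Raising the threshold trades missed detections for fewer false positives, so the right value depends on the application.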
Next, enable 'Data augmentation'. When enabled your data is randomly mutated during training. For example, by adding noise, masking time or frequency bands, or warping your time axis. This is a very quick way to make your dataset work better in real life (with unpredictable sounds coming in), and prevents your neural network from overfitting (as the data samples are changed every training cycle).
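Two of the mutations mentioned above, adding noise and masking time bands, can be sketched as follows. This is an illustrative approximation only; the noise level, the mask width and the function name augment are assumptions, and the Studio's augmentation may work differently.

```python
import random


def augment(spectrogram, noise_std=0.05, max_mask_frames=2, rng=random):
    """Return a randomly mutated copy of a spectrogram, given as a
    list of time frames, each a list of coefficients."""
    # Add Gaussian noise to every value.
    noisy = [[v + rng.gauss(0.0, noise_std) for v in frame] for frame in spectrogram]
    # Mask a random run of consecutive time frames by zeroing it.
    width = rng.randint(0, min(max_mask_frames, len(noisy)))
    start = rng.randint(0, len(noisy) - width)
    for i in range(start, start + width):
        noisy[i] = [0.0] * len(noisy[i])
    return noisy


rng = random.Random(42)  # fixed seed so the sketch is repeatable
original = [[1.0, 2.0, 3.0] for _ in range(5)]
augmented = augment(original, rng=rng)
print(augmented)
```

Because a fresh mutation is drawn each time, the network never sees exactly the same sample twice, which is what makes overfitting harder.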
With everything in place, click Start training. You'll see a lot of text flying past in the Training output panel, which you can ignore for now. Training will take a few minutes. When it's complete, you'll see the Last training performance panel appear at the bottom of the page:
Congratulations, you've trained a neural network with Edge Impulse! But what do all these numbers mean?
At the start of training, 20% of the training data is set aside for validation. This means that instead of being used to train the model, it is used to evaluate how the model is performing. The Last training performance panel displays the results of this validation, providing some vital information about your model and how well it is working. Bear in mind that your exact numbers may differ from the ones in this tutorial.
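Setting aside a validation portion can be sketched like this. It is a minimal illustration that assumes a simple random split; the function name split_validation and the seed are hypothetical, and the Studio may split the data differently.

```python
import random


def split_validation(items, validation_fraction=0.2, seed=0):
    """Shuffle `items` and set aside a fraction for validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


# 100 hypothetical training windows, identified by index.
train_set, validation_set = split_validation(range(100))
print(len(train_set), len(validation_set))
```

The validation items never influence the weights, so the scores computed on them give an early indication of how the model handles data it was not trained on.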
On the left hand side of the panel, Accuracy refers to the percentage of windows of audio that were correctly classified. The higher the number the better, although an accuracy approaching 100% is unlikely, and is often a sign that your model has overfit the training data. You will find out whether this is true in the next stage, during model testing. For many applications, an accuracy above 85% can be considered very good.
The Confusion matrix is a table showing the balance of correctly versus incorrectly classified windows. To understand it, compare the values in each row. For example, in the above screenshot, 96 of the helloworld audio windows were classified as helloworld, while 10 of them were incorrectly classified as unknown or noise. This appears to be a great result.
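To make the relationship between the confusion matrix and the accuracy figure concrete, here is a minimal sketch that computes both from lists of actual and predicted labels. The six results are made up for illustration and the function names are hypothetical.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Rows are actual labels, columns are predicted labels."""
    pairs = Counter(zip(actual, predicted))
    return [[pairs[(a, p)] for p in labels] for a in labels]


def accuracy(actual, predicted):
    """Fraction of windows whose predicted label matches the actual one."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Hypothetical results for six windows.
labels = ["helloworld", "noise", "unknown"]
actual = ["helloworld", "helloworld", "noise", "noise", "unknown", "unknown"]
predicted = ["helloworld", "unknown", "noise", "noise", "unknown", "noise"]
matrix = confusion_matrix(actual, predicted, labels)
print(matrix, accuracy(actual, predicted))
```

The diagonal of the matrix holds the correctly classified windows, so the accuracy is the sum of the diagonal divided by the total number of windows.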
The On-device performance region shows statistics about how the model is likely to run on-device. Inferencing time is an estimate of how long the model will take to analyze one second of data on a typical microcontroller (an Arm Cortex-M4F running at 80MHz). Peak RAM usage gives an idea of how much RAM will be required to run the model on-device.
The performance numbers in the previous step show that our model is working well on its training data, but it's extremely important that we test the model on new, unseen data before deploying it in the real world. This will help us ensure the model has not learned to overfit the training data, which is a common occurrence.
Fortunately we've put aside 20% of our data already in the 'Test set' (see Data acquisition). This is data that the model has never seen before, and we can use this to validate whether our model actually works on unseen data. To run your model against the test set, head to Model testing, select all items and click Classify selected.
To drill down into a misclassified sample, click the three dots (⋮) next to a sample and select Show classification. You're then transported to the classification view, which lets you inspect the sample, and compare the sample to your training data. This way you can inspect whether this was actually a classification failure, or whether your data was incorrectly labeled. From here you can either update the label (when the label was wrong), or move the item to the training set to refine your model.
Misclassifications and uncertain results
It's inevitable that even a well-trained machine learning model will sometimes misclassify its inputs. When you integrate a model into your application, you should take into account that it will not always give you the correct answer.
For example, if you are classifying audio, you might want to classify several windows of data and average the results. This will give you better overall accuracy than assuming that every individual result is correct.
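As a rough sketch of that idea, the snippet below averages the per-class scores of several windows and picks the label with the highest average. The scores are invented for illustration, and average_scores is a hypothetical helper, not part of the Edge Impulse SDK.

```python
def average_scores(window_scores):
    """Average per-class scores over several windows; return (best label, averages)."""
    labels = window_scores[0].keys()
    n = len(window_scores)
    averages = {label: sum(w[label] for w in window_scores) / n for label in labels}
    best = max(averages, key=averages.get)
    return best, averages

# Hypothetical scores for three consecutive windows.
windows = [
    {"helloworld": 0.80, "noise": 0.10, "unknown": 0.10},
    {"helloworld": 0.30, "noise": 0.60, "unknown": 0.10},  # one outlier window
    {"helloworld": 0.70, "noise": 0.20, "unknown": 0.10},
]
label, averages = average_scores(windows)
print(label)  # helloworld
```

Taken on its own, the second window would have been reported as noise; averaged with its neighbours, the keyword still wins.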
With the impulse designed, trained and verified you can deploy this model back to your device. This makes the model run without an internet connection, minimizes latency, and runs with minimum power consumption. Edge Impulse can package up the complete impulse - including the MFCC algorithm, neural network weights, and classification code - in a single C++ library that you can include in your embedded software.
Mobile phone
Your mobile phone can build and download the compiled impulse directly from the mobile client. See 'Deploying back to device' on the Using your mobile phone page.
To export your model, click on Deployment in the menu. Then under 'Build firmware' select your development board, and click Build. This will export the impulse, and build a binary that will run on your development board in a single step. After building is completed you'll get prompted to download a binary. Save this on your computer.
When you click the Build button, you'll see a pop-up with text and video instructions on how to deploy the binary to your particular device. Follow these instructions. Once you are done, we are ready to test your impulse out.
We can connect to the board's newly flashed firmware over serial. Open a terminal and run:
Serial daemon
If the device is not connected over WiFi, but instead connected via the Edge Impulse serial daemon, you'll need to stop the daemon. Only one application can connect to the development board at a time.
This will capture audio from the microphone, run the MFCC code, and then classify the spectrogram:
Great work! You've captured data, trained a model, and deployed it to an embedded device. You can now control LEDs, activate actuators, or send a message to the cloud whenever you say a keyword!
Is your model working properly in the Studio, but does not recognize your keyword when running in continuous mode on your device? Then this is probably due to dataset imbalance (a lot more unknown / noise data compared to your keyword) in combination with our moving average code to reduce false positives.
When running in continuous mode, we run a moving average over the predictions to prevent false positives. For example, if we do 3 classifications per second, you'll see your keyword potentially classified three times (once at the start of the audio file, once in the middle, once at the end). However, if your dataset is unbalanced (there's a lot more noise / unknown data than keyword data in your dataset), the ML model typically manages to find your keyword only in the 'center' window, and thus we filter it out as a false positive.
You can fix this by either:
Adding more data :-)
Or, by disabling the moving average filter by going into ei_run_classifier.h (in the edge-impulse-sdk directory) and removing:
Note that this might increase the number of false positives the model detects.
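The interaction between the moving average and a keyword that is only found in one window can be illustrated with a small simulation. This is a simplified sketch and not the filter implemented in ei_run_classifier.h: the window size of 3, the threshold of 0.6, and the score sequences are all assumptions.

```python
from collections import deque

class MovingAverage:
    """Mean of the most recent `size` keyword scores."""
    def __init__(self, size):
        self.values = deque(maxlen=size)

    def update(self, score):
        self.values.append(score)
        return sum(self.values) / len(self.values)

def detections(scores, size=3, threshold=0.6):
    """True wherever the smoothed keyword score reaches the threshold."""
    smoother = MovingAverage(size)
    return [smoother.update(s) >= threshold for s in scores]

# Hypothetical keyword scores for five consecutive classifications.
balanced = [0.05, 0.90, 0.95, 0.90, 0.05]    # keyword found in three windows
imbalanced = [0.05, 0.10, 0.95, 0.10, 0.05]  # keyword found only in the centre window

print(any(detections(balanced)))            # True
print(any(detections(imbalanced)))          # False: smoothed away
print(any(detections(imbalanced, size=1)))  # True: no averaging
```

With a size of 1 the filter is effectively disabled, so the single strong window fires, but so would any single spurious window.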
Congratulations! You've used Edge Impulse to train a neural network model capable of recognizing audible events. There are endless applications for this type of model, from monitoring industrial machinery to recognizing voice commands. Now that you've trained your model, you can integrate your impulse into the firmware of your own embedded device; see Running your impulse locally. There are examples for Mbed OS, Arduino, STM32CubeIDE, Zephyr, and any other target that supports a C++ compiler.
Or if you're interested in more, see our tutorials on Continuous motion recognition or Adding sight to your sensors. If you have a great idea for a different project, that's fine too. Edge Impulse lets you capture data from any sensor, build custom processing blocks to extract features, and you have full flexibility in your Machine Learning pipeline with the learning blocks.
We can't wait to see what you'll build! 🚀
This page is part of and describes how you can use your mobile phone to import image data into Edge Impulse.
To add your phone to your project, go to the Devices page, select Connect a new device and select Use your mobile phone. A QR code will pop up. Scan this code with your phone and your phone will pop up on the devices screen.
With your phone connected to your project, it's time to start capturing some images and build our dataset. There is a special UI for collecting images quickly: on your phone, choose Collecting images?
On your phone a permission prompt will show up, and then the viewfinder will be displayed. Set the label (in the top corner) to 'lamp', point your camera at your lamp and press Capture.
Afterwards the photo shows up in the studio on the Data acquisition page.
Do this until you have captured 30 images per class from a variety of angles. Also make sure to vary the things you capture for the unknown class.
In this tutorial, you'll use machine learning to build a system that can recognize when a particular sound is happening—a task known as audio classification. The system you create will be able to recognize the sound of water running from a faucet, even in the presence of other background noise.
You'll learn how to collect audio data from microphones, use signal processing to extract the most important information, and train a deep neural network that can tell you whether the sound of running water can be heard in a given clip of audio. Finally, you'll deploy the system to an embedded device and evaluate how well it works.
At the end of this tutorial, you'll have a firm understanding of how to classify audio using Edge Impulse.
There is also a video version of this tutorial:
Detecting human speech?
If your device is connected under Devices in the studio you can proceed:
Device compatibility
To build this project, you'll need to collect some audio data that will be used to train the machine learning model. Since the goal is to detect the sound of a running faucet, you'll need to collect some examples of that. You'll also need some examples of typical background noise that doesn't contain the sound of a faucet, so the model can learn to discriminate between the two. These two types of examples represent the two classes we'll be training our model to detect: background noise, or running faucet.
You can use your device to collect some data. In the studio, go to the Data acquisition tab. This is the place where all your raw data is stored, and - if your device is connected to the remote management API - where you can start sampling new data.
Let's start by recording an example of background noise that doesn't contain the sound of a running faucet. Under Record new data, select your device, set the label to noise, the sample length to 1000, and the sensor to Built-in microphone. This indicates that you want to record 1 second of audio, and label the recorded data as noise. You can later edit these labels if needed.
After you click Start sampling, the device will capture a second of audio and transmit it to Edge Impulse. The LED will light while recording is in progress, then light again during transmission.
When the data has been uploaded, you will see a new line appear under 'Collected data'. You will also see the waveform of the audio in the 'RAW DATA' box. You can use the controls underneath to listen to the audio that was captured.
Since you now know how to capture audio with Edge Impulse, it's time to start building a dataset. For a simple audio classification model like this one, we should aim to capture around 10 minutes of data. We have two classes, and it's ideal if our data is balanced equally between each of them. This means we should aim to capture the following data:
5 minutes of background noise, with the label "noise"
5 minutes of running faucet noise, with the label "faucet"
Real world data
In the real world, there are usually additional sounds present alongside the sounds we care about. For example, a running faucet is often accompanied by the sound of dishes being washed, teeth being brushed, or a conversation in the kitchen. Background noise might also include the sounds of television, kids playing, or cars driving past outside.
It's important that your training data contains these types of real world sounds. If your model is not exposed to them during training, it will not learn to take them into account, and it will not perform well during real-world usage.
For this tutorial, you should try to capture the following:
Background noise
2 minutes of background noise without much additional activity
1 minute of background noise with a TV or music playing
1 minute of background noise featuring occasional talking or conversation
1 minute of background noise with the sounds of housework
Running faucet noise
1 minute of a faucet running
1 minute of a different faucet running
1 minute of a faucet running with a TV or music playing
1 minute of a faucet running with occasional talking or conversation
1 minute of a faucet running with the sounds of housework
It's okay if you can't get all of these, as long as you still obtain 5 minutes of data for each class. However, your model will perform better in the real world if it was trained on a representative dataset.
Dataset diversity
There's no guarantee your model will perform well in the presence of sounds that were not included in its training set, so it's important to make your dataset as diverse and representative of real-world conditions as possible.
Data capture and transmission
The amount of audio that can be captured in one go varies depending on a device's memory. The ST B-L475E-IOT01A developer board has enough memory to capture 60 seconds of audio at a time, and the Arduino Nano 33 BLE Sense has enough memory for 16 seconds. To capture 60 seconds of audio, set the sample length to 60000. Because the board transmits data quite slowly, it will take around 7 minutes before a 60 second sample appears in Edge Impulse.
Once you've captured around 10 minutes of data, it's time to start designing an Impulse.
Prebuilt dataset
With the training set in place you can design an impulse. An impulse takes the raw data, slices it up in smaller windows, uses signal processing blocks to extract features, and then uses a learning block to classify new data. Signal processing blocks always return the same values for the same input and are used to make raw data easier to process, while learning blocks learn from past experiences.
For this tutorial we'll use the "MFE" signal processing block. MFE stands for Mel Frequency Energy. This sounds scary, but it's basically just a way of turning raw audio (which contains a large amount of redundant information) into a simplified form.
Spectrogram block
Edge Impulse supports three different blocks for audio classification: MFCC, MFE and spectrogram blocks. If your accuracy is not great using the MFE block you can switch to the spectrogram block, which is not tuned to frequencies for the human ear.
We'll then pass this simplified audio data into a Neural Network block, which will learn to distinguish between the two classes of audio (faucet and noise).
In the studio, go to the Create impulse tab. You'll see a Raw data block, like this one.
As mentioned above, Edge Impulse slices up the raw samples into windows that are fed into the machine learning model during training. The Window size field controls how long, in milliseconds, each window of data should be. A one second audio sample will be enough to determine whether a faucet is running or not, so you should make sure Window size is set to 1000 ms. You can either drag the slider or type a new value directly.
Each raw sample is sliced into multiple windows, and the Window increase field controls the offset of each subsequent window from the first. For example, a Window increase value of 1000 ms would result in each window starting 1 second after the start of the previous one.
By setting a Window increase that is smaller than the Window size, we can create windows that overlap. This is actually a great idea. Although they may contain similar data, each overlapping window is still a unique example of audio that represents the sample's label. By using overlapping windows, we can make the most of our training data. For example, with a Window size of 1000 ms and a Window increase of 100 ms, we can extract 11 unique windows from only 2 seconds of data (starting at 0 ms, 100 ms, and so on up to 1000 ms).
Make sure the Window increase field is set to 300 ms. The Raw data block should match the screenshot above.
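To make the effect of these two settings concrete, here is a small sketch that lists where each window starts for a given sample length. It assumes that only full windows are kept (a trailing partial window is dropped rather than zero-padded); window_starts is a hypothetical helper, not an Edge Impulse function.

```python
def window_starts(sample_ms, window_ms, increase_ms):
    """Start offsets (in ms) of every full window that fits in the sample."""
    if sample_ms < window_ms:
        return []
    return list(range(0, sample_ms - window_ms + 1, increase_ms))

# A 5 second sample with a 1000 ms window:
print(len(window_starts(5000, 1000, 1000)))  # 5 windows, no overlap
print(len(window_starts(5000, 1000, 300)))   # 14 overlapping windows
print(window_starts(5000, 1000, 300)[:3])    # [0, 300, 600]
```

The smaller the Window increase, the more (and the more similar) windows you get out of the same recording.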
Next, click Add a processing block and choose the 'MFE' block. Once you're done with that, click Add a learning block and select 'Classification (Keras)'. Finally, click Save impulse. Your impulse should now look like this:
Now that we've assembled the building blocks of our Impulse, we can configure each individual part. Click on the MFE tab in the left hand navigation menu. You'll see a page that looks like this:
The MFE block transforms a window of audio into a table of data where each row represents a range of frequencies and each column represents a span of time. The value contained within each cell reflects the amplitude of its associated range of frequencies during that span of time. The spectrogram shows each cell as a colored block, the intensity of which varies depending on the amplitude.
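The frequency ranges behind the rows are not equally wide: on the mel scale, bands get wider as the frequency goes up, mirroring how human hearing resolves pitch. The sketch below computes such band edges using the commonly used HTK-style mel formula. It is a simplification under stated assumptions: real mel filterbanks use overlapping triangular filters, the 8 bands and the 0 to 8000 Hz range are arbitrary example values, and the exact formula and parameters Edge Impulse uses may differ.

```python
import math

def hz_to_mel(hz):
    """Common HTK-style conversion from Hz to mel."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges(n_bands, low_hz, high_hz):
    """Edges (in Hz) of n_bands contiguous bands spaced evenly on the mel scale."""
    low, high = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high - low) / n_bands
    return [mel_to_hz(low + i * step) for i in range(n_bands + 1)]

edges = mel_band_edges(n_bands=8, low_hz=0.0, high_hz=8000.0)  # assumed example values
widths = [b - a for a, b in zip(edges, edges[1:])]
for lo_edge, hi_edge in zip(edges, edges[1:]):
    print(f"{lo_edge:7.1f} Hz to {hi_edge:7.1f} Hz")
```

Each printed band is wider than the one before it, so low frequencies get finer resolution than high ones.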
The patterns visible in a spectrogram contain information about what type of sound it represents. For example, the spectrogram in this image shows a pattern typical of background noise:
You can tell that it is slightly different from the following spectrogram, which shows a pattern typical of a running faucet:
These differences are not necessarily easy for a person to describe, but fortunately they are enough for a neural network to learn to identify.
It's interesting to explore your data and look at the types of spectrograms it results in. You can use the dropdown box near the top right of the page to choose between different audio samples to visualize, and drag the white window on the audio waveform to select different windows of data:
There are a lot of different ways to configure the MFE block, as shown in the Parameters box:
Handily, Edge Impulse provides sensible defaults that will work well for many use cases, so we can leave these values unchanged. You can play around with the noise floor to quickly see the effect it has on the spectrogram.
The spectrograms generated by the MFE block will be passed into a neural network architecture that is particularly good at learning to recognize patterns in this type of tabular data. Before training our neural network, we'll need to generate MFE spectrograms for all of our windows of audio. To do this, click the Generate features button at the top of the page, then click the green Generate features button. If you have a full 10 minutes of data, the process will take a while to complete:
Next, we'll configure the neural network and begin training.
With all data processed it's time to start training a neural network. Neural networks are algorithms, modeled loosely after the human brain, that can learn to recognize patterns that appear in their training data. The network that we're training here will take the MFE as an input, and try to map this to one of two classes—noise, or faucet.
Click on NN Classifier in the left hand menu. You'll see the following page:
A neural network is composed of layers of virtual "neurons", which you can see represented on the left hand side of the NN Classifier page. An input—in our case, an MFE spectrogram—is fed into the first layer of neurons, which filters and transforms it based on each neuron's unique internal state. The first layer's output is then fed into the second layer, and so on, gradually transforming the original input into something radically different. In this case, the spectrogram input is transformed over four intermediate layers into just two numbers: the probability that the input represents noise, and the probability that the input represents a running faucet.
During training, the internal state of the neurons is gradually tweaked and refined so that the network transforms its input in just the right ways to produce the correct output. This is done by feeding in a sample of training data, checking how far the network's output is from the correct answer, and adjusting the neurons' internal state to make it more likely that a correct answer is produced next time. When done thousands of times, this results in a trained network.
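The feed in, compare, adjust loop described above can be sketched with a single artificial neuron. This is a deliberately tiny illustration, not the multi-layer network Edge Impulse trains: the two-feature samples, the learning rate, and the epoch count are all made up.

```python
import math

def predict(weights, bias, features):
    """Probability that the input belongs to class 1 ("faucet")."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, epochs=200, learning_rate=0.5):
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for features, target in samples:
            # How far is the output from the correct answer?
            error = predict(weights, bias, features) - target
            # Nudge each parameter in the direction that shrinks the error.
            weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias

# Hypothetical two-feature examples: label 1 = faucet, label 0 = noise.
samples = [([0.9, 0.8], 1), ([0.8, 0.9], 1), ([0.1, 0.2], 0), ([0.2, 0.1], 0)]
weights, bias = train(samples)

for features, target in samples:
    print(target, round(predict(weights, bias, features), 3))
```

After enough passes over the data, the neuron outputs a probability above 0.5 for the faucet examples and below 0.5 for the noise examples.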
A particular arrangement of layers is referred to as an architecture, and different architectures are useful for different tasks. The default neural network architecture provided by Edge Impulse will work well for our current project, but you can also define your own architectures. You can even import custom neural network code from tools used by data scientists, such as TensorFlow and Keras.
The default settings should work, and to begin training, click Start training. You'll see a lot of text flying past in the Training output panel, which you can ignore for now. Training will take a few minutes. When it's complete, you'll see the Model panel appear at the right side of the page:
Congratulations, you've trained a neural network with Edge Impulse! But what do all these numbers mean?
At the start of training, 20% of the training data is set aside for validation. This means that instead of being used to train the model, it is used to evaluate how the model is performing. The Last training performance panel displays the results of this validation, providing some vital information about your model and how well it is working. Bear in mind that your exact numbers may differ from the ones in this tutorial.
On the left hand side of the panel, Accuracy refers to the percentage of windows of audio that were correctly classified. The higher the number, the better, although an accuracy approaching 100% is unlikely, and is often a sign that your model has overfit the training data. You will find out whether this is true in the next stage, during model testing. For many applications, an accuracy above 80% can be considered very good.
The Confusion matrix is a table showing the balance of correctly versus incorrectly classified windows. To understand it, compare the values in each row. For example, in the above screenshot, all of the faucet audio windows were classified as faucet, but a few noise windows were misclassified. This appears to be a great result though.
The On-device performance region shows statistics about how the model is likely to run on-device. Inferencing time is an estimate of how long the model will take to analyze one second of data on a typical microcontroller (here: an Arm Cortex-M4F running at 80MHz). Peak memory usage gives an idea of how much RAM will be required to run the model on-device.
The performance numbers in the previous step show that our model is working well on its training data, but it's extremely important that we test the model on new, unseen data before deploying it in the real world. This will help us ensure the model has not learned to overfit the training data, which is a common occurrence.
Edge Impulse provides some helpful tools for testing our model, including a way to capture live data from your device and immediately attempt to classify it. To try it out, click on Live classification in the left hand menu. Your device should show up in the 'Classify new data' panel. Capture 5 seconds of background noise by clicking Start sampling:
The sample will be captured, uploaded, and classified. Once this has happened, you'll see a breakdown of the results:
Once the sample is uploaded, it is split into windows–in this case, a total of 41. These windows are then classified. As you can see, our model classified all 41 windows of the captured audio as noise. This is a great result! Our model has correctly identified that the audio was background noise, even though this is new data that was not part of its training set.
Of course, it's possible some of the windows may be classified incorrectly. Since our model was 99% accurate based on its validation data, you can expect that at least 1% of windows will be classified wrongly—and likely much more than this, since our validation data doesn't represent every possible type of background or faucet noise. If your model didn't perform perfectly, don't worry. We'll get to troubleshooting later.
Misclassifications and uncertain results
It's inevitable that even a well-trained machine learning model will sometimes misclassify its inputs. When you integrate a model into your application, you should take into account that it will not always give you the correct answer.
For example, if you are classifying audio, you might want to classify several windows of data and average the results. This will give you better overall accuracy than assuming that every individual result is correct.
Using the Live classification tab, you can easily try out your model and get an idea of how it performs. But to be really sure that it is working well, we need to do some more rigorous testing. That's where the Model testing tab comes in. If you open it up, you'll see the sample we just captured listed in the Test data panel:
In addition to its training data, every Edge Impulse project also has a test dataset. Samples captured in Live classification are automatically saved to the test dataset, and the Model testing tab lists all of the test data.
To use the sample we've just captured for testing, we should correctly set its expected outcome. Click the ⋮ icon and select Edit expected outcome, then enter noise. Now, select the sample using the checkbox to the left of the table and click Classify selected:
You'll see that the model's accuracy has been rated based on the test data. Right now, this doesn't give us much more information than just classifying the same sample in the Live classification tab. But if you build up a big, comprehensive set of test samples, you can use the Model testing tab to measure how your model is performing on real data.
Ideally, you'll want to collect a test set that contains at least 25% of the amount of data in your training set. So, if you've collected 10 minutes of training data, you should collect at least 2.5 minutes of test data. You should make sure this test data represents a wide range of possible conditions, so that it evaluates how the model performs with many different types of inputs. For example, collecting test audio for several different faucets is a good idea.
You can use the Data acquisition tab to manage your test data. Open the tab, and then click Test data at the top. Then, use the Record new data panel to capture a few minutes of test data, including audio for both background noise and faucet. Make sure the samples are labelled correctly. Once you're done, head back to the Model testing tab, select all the samples, and click Classify selected:
The screenshot shows classification results from a large number of test samples (there are more on the page than would fit in the screenshot). The panel shows that our model is performing at 85% accuracy, which is lower than its accuracy on validation data. It's normal for a model to perform less well on entirely fresh data, so this is a successful result. Our model is working well!
For each test sample, the panel shows a breakdown of its individual performance. For example, one of the samples was classified with only 62% accuracy. Samples that contain a lot of misclassifications are valuable, since they have examples of types of audio that our model does not currently fit. It's often worth adding these to your training data, which you can do by clicking the ⋮ icon and selecting Move to training set. If you do this, you should add some new test data to make up for the loss!
Testing your model helps confirm that it works in real life, and it's something you should do after every change. However, if you often make tweaks to your model to try to improve its performance on the test dataset, your model may gradually start to overfit to the test dataset, and it will lose its value as a metric. To avoid this, continually add fresh data to your test dataset.
Data hygiene
It's extremely important that data is never duplicated between your training and test datasets. Your model will naturally perform well on the data that it was trained on, so if there are duplicate samples then your test results will indicate better performance than your model will achieve in the real world.
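One simple way to check for such leaks before uploading is to compare content hashes of the files in both sets. The sketch below does this for in-memory byte strings; the file names and contents are invented, and find_duplicates is a hypothetical helper. Note that it only catches byte-identical copies, not re-encoded or overlapping recordings.

```python
import hashlib

def content_hash(data: bytes) -> str:
    """Fingerprint of a sample's raw bytes."""
    return hashlib.sha256(data).hexdigest()

def find_duplicates(training, testing):
    """Names of test samples whose bytes also appear in the training set."""
    seen = {content_hash(data) for data in training.values()}
    return sorted(name for name, data in testing.items() if content_hash(data) in seen)

# Hypothetical samples: name -> raw file contents.
training = {"faucet.01.wav": b"\x01\x02\x03", "noise.01.wav": b"\x04\x05\x06"}
testing = {"faucet.02.wav": b"\x07\x08\x09", "noise.copy.wav": b"\x04\x05\x06"}

duplicates = find_duplicates(training, testing)
print(duplicates)  # ['noise.copy.wav']
```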
If the network performed great, fantastic! But what if it performed poorly? There could be a variety of reasons, but the most common ones are:
The data does not look like other data the network has seen before. This is common when someone uses the device in a way that is not represented in the training set. You can add the current file to the training set by entering the correct label in the 'Expected outcome' field, clicking ⋮, then selecting Move to training set.
The model has not been trained enough. Increase the number of epochs to 200 and see if performance increases (the classified file is stored, and you can load it through 'Classify existing validation sample').
The model is overfitting and thus performs poorly on new data. Try reducing the number of epochs, reducing the learning rate, or adding more data.
The neural network architecture is not a great fit for your data. Play with the number of layers and neurons and see if performance improves.
As you see, there is still a lot of trial and error when building neural networks. Edge Impulse is continually adding features that will make it easier to train an effective model.
With the impulse designed, trained and verified you can deploy this model back to your device. This makes the model run without an internet connection, minimizes latency, and runs with minimum power consumption. Edge Impulse can package up the complete impulse - including the MFE algorithm, neural network weights, and classification code - in a single C++ library that you can include in your embedded software.
Mobile phone
To export your model, click on Deployment in the menu. Then under 'Build firmware' select your development board, and click Build. This will export the impulse, and build a binary that will run on your development board in a single step. After building is completed you'll get prompted to download a binary. Save this on your computer.
When you click the Build button, you'll see a pop-up with text and video instructions on how to deploy the binary to your particular device. Follow these instructions. Once you are done, we are ready to test your impulse out.
We can connect to the board's newly flashed firmware over serial. Open a terminal and run:
Serial daemon
If the device is not connected over WiFi, but instead connected via the Edge Impulse serial daemon, you'll need to stop the daemon. Only one application can connect to the development board at a time.
This will capture audio from the microphone, run the MFE code, and then classify the spectrogram:
Great work! You've captured data, trained a model, and deployed it to an embedded device. It's time to celebrate by pouring yourself a nice glass of water, and checking whether the sound is correctly classified by your model.
We can't wait to see what you'll build! 🚀
This page is part of and describes how you can use the OpenMV Cam H7 Plus to build a dataset, and import the data into Edge Impulse.
To set up your OpenMV camera, and collect some data:
Install the .
Follow the to clean the sensor and focus the lens.
Connect a micro-USB cable to the camera, and open the OpenMV IDE. The camera should automatically update to the latest firmware.
Verify that the camera can capture live images, by clicking on the Connect button in the bottom left corner, then pressing Play to run the application.
A live feed from your camera will be displayed in the top right corner of the IDE.
Once your camera is up and running, it's time to start capturing some images and build our dataset.
First, set up a new dataset via Tools > Dataset Editor, then select New Dataset.
This opens the 'Dataset editor' panel on the left side, and the 'dataset capture script' in the main panel of the IDE. Here, create three classes: "plant", "lamp" and "unknown". It's important to add an unknown class that contains random images which are neither lamps nor plants.
As we'll build a model that takes in square images, change the 'Dataset capture script' to read:
Now you can capture data for the three classes.
Click the Play icon to run the 'dataset capture script' on your OpenMV camera.
Select one of the classes by clicking on the folder name in the 'Dataset editor'.
Take a snap by clicking the Capture data (camera icon) button.
Do this until you have captured 30 images per class from a variety of angles. Also make sure to vary the things you capture for the unknown class.
To import the dataset into Edge Impulse go to Tools > Dataset Editor > Export > Upload to Edge Impulse project.
Then, choose the project name, and the split between training and testing data (recommended to keep this to 80/20).
A duplicate check runs when you upload new data, so you can upload your dataset multiple times (for example, when you've added new files) without adding the same data twice.
Training and testing data split
The split between training and testing data is based on the hash of the file in order to have a deterministic process. As a consequence you may not have a perfect 80/20 split between training and testing, but this process ensures samples are always placed in the same category.
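The idea of a deterministic, hash-based split can be sketched as follows. This is an illustration under assumptions, not the algorithm Edge Impulse uses: the choice of SHA-256, the bucketing into 100 slots, and the fake file contents are all hypothetical.

```python
import hashlib

def assign_category(file_bytes: bytes, test_percent: int = 20) -> str:
    """Place a file in 'testing' or 'training' based only on its contents."""
    digest = hashlib.sha256(file_bytes).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable value in 0..99
    return "testing" if bucket < test_percent else "training"

# Hypothetical file contents standing in for 1000 images.
files = [f"image-{i}".encode() for i in range(1000)]
categories = [assign_category(f) for f in files]
test_share = categories.count("testing") / len(files)

print(f"share of files in testing: {test_share:.1%}")
```

Running the assignment again on the same files always yields the same categories, while the share that lands in testing is only approximately 20%.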
Our dataset now appears under the Data acquisition section of our project.
FOMO is a brand-new approach to running object detection models on constrained devices. It is a ground-breaking algorithm that brings real-time object detection, tracking and counting to microcontrollers for the first time. FOMO is 30x faster than MobileNet SSD and can run in <200K of RAM.
In this tutorial, we will explain how to count cars to estimate parking occupancy using FOMO.
View the finished project, including all data, signal processing and machine learning blocks here: .
Limitations of FOMO
FOMO does not output bounding boxes but will give you the object's location using centroids. Hence the size of the object is not available.
FOMO works better if the objects have a similar size.
Objects shouldn’t be too close to each other, although this can be improved by increasing the image input resolution.
If you need the size of the objects for your project, head to the default . tutorial.
For this tutorial, you'll need a .
If you don't have any of these devices, you can also upload an existing dataset through the or use your to connect your device to Edge Impulse. After this tutorial, you can then deploy your trained machine learning model as a C++ library or as a WebAssembly package and run it on your device.
You can collect data from the following devices:
- for the Raspberry Pi 4 and the Jetson Nano.
Collecting image data from any of the that have a camera.
With the data collected, we need to label this data. Go to Data acquisition, verify that you see your data, then click on the 'Labeling queue' to start labeling.
Why use bounding box inputs?
To stay interoperable with other models, your training images are labeled with bounding boxes, even though the model outputs centroids at inference time. In the background, FOMO translates between bounding boxes and segmentation maps in various parts of the end-to-end flow. This includes comparing the bounding boxes with the segmentation maps to run profiling and scoring.
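To make the idea of translating bounding boxes into a coarse map more concrete, here is a minimal sketch. It marks the grid cell that contains each box's centre. The `Box` type, the function names, and the 8-pixel cell size are assumptions for illustration only; this is not FOMO's actual implementation.

```python
from typing import List, NamedTuple, Tuple


class Box(NamedTuple):
    """A bounding box label in pixels (hypothetical type for this sketch)."""
    x: int
    y: int
    width: int
    height: int


def box_to_cell(box: Box, cell_size: int = 8) -> Tuple[int, int]:
    """Return the (row, column) of the grid cell containing the box centre."""
    centre_x = box.x + box.width / 2
    centre_y = box.y + box.height / 2
    return int(centre_y // cell_size), int(centre_x // cell_size)


def boxes_to_mask(boxes: List[Box], image_size: int = 96,
                  cell_size: int = 8) -> List[List[int]]:
    """Build a coarse grid with 1 in every cell that holds a box centre."""
    cells = image_size // cell_size
    mask = [[0] * cells for _ in range(cells)]
    for box in boxes:
        row, col = box_to_cell(box, cell_size)
        # Clamp so a centre on the image border still maps to a valid cell.
        mask[min(row, cells - 1)][min(col, cells - 1)] = 1
    return mask


mask = boxes_to_mask([Box(10, 20, 12, 8), Box(80, 80, 16, 16)])
for row in mask:
    print("".join(str(cell) for cell in row))
```

Note that the box size is discarded in this translation, which matches the limitation that FOMO reports object locations but not object sizes.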
Using your own trained model - Useful when you already have a trained model with classes similar to your new task.
Using Object tracking - Useful when you have objects that are similar in size and common between images/frames.
For our case, since the 'car' object is part of the COCO dataset, we will use the pre-trained YOLOv5 model to accelerate this process. To enable this feature, first click the Label suggestions dropdown, then select “Classify using YOLOv5.”
As the image above shows, the YOLOv5 model can already annotate more than 90% of the cars for us, so we don't have to label them by hand.
To validate whether a model works well, set some data (typically 20%) aside and use it only to validate the model, not to build it. This is called the 'test set'. You can switch between your training and test sets with the two buttons above the 'Data collected' widget. If you've collected data on your development board there might be no data in the testing set yet. You can fix this by going to Dashboard > Perform train/test split.
To configure this, go to Create impulse, set the image width and image height to 96, the 'resize mode' to Fit shortest axis and add the 'Images' and 'Object Detection (Images)' blocks. Then click Save Impulse.
To configure your processing block, click Images in the menu on the left. This will show you the raw data on top of the screen (you can select other files via the drop-down menu), and the results of the processing step on the right. You can use the options to switch between RGB and Grayscale modes. Finally, click on Save parameters.
This will send you to the 'Feature generation' screen. In here you'll:
Resize all the data.
Apply the processing block on all this data.
Create a 3D visualization of your complete dataset.
Click Generate features to start the process.
Afterward, the Feature explorer will load. This is a plot of all the data in your dataset. Because images have a lot of dimensions (here: 96x96x1=9216 features for grayscale) we run a process called 'dimensionality reduction' on the dataset before visualizing this. Here the 9216 features are compressed down to 2, and then clustered based on similarity as shown in the feature explorer below.
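The feature count quoted above is simply width times height times the number of color channels. A small sketch of that arithmetic, using a hypothetical helper name:

```python
def raw_feature_count(width: int, height: int, channels: int) -> int:
    """Number of raw features in an image: one value per pixel per channel."""
    return width * height * channels


# 96x96 grayscale image (1 channel), as used in this tutorial.
print(raw_feature_count(96, 96, 1))  # 9216

# The same resolution in RGB (3 channels) has three times as many features.
print(raw_feature_count(96, 96, 3))  # 27648
```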
With all data processed it's time to start training our FOMO model. The model will take an image as input and output objects detected using centroids. For our case, it will show centroids of cars detected on the images.
FOMO is fully compatible with any MobileNetV2 model, and depending on where the model needs to run you can pick a model with a higher or lower alpha. Transfer learning also works (although you need to train your base models specifically with FOMO in mind). Another advantage of FOMO is that it has far fewer parameters to learn than normal SSD networks, making the network much smaller and faster to train. Together this lets FOMO scale from the smallest microcontrollers to full gateways or GPUs.
To configure FOMO, head over to the ‘Object detection’ section, and select 'Choose a different model' then select one of the FOMO models as shown in the image below.
Make sure to start with a learning rate of 0.001, then click Start training. After the model is done you'll see accuracy numbers below the training output. You have now trained your FOMO object detection model!
As you may have noticed from the training results above, FOMO uses F1 score as its base evaluation metric, whereas SSD MobileNetV2 uses mean Average Precision (mAP). Using mAP as the sole evaluation metric can sometimes give limited insight into the model’s performance. This is particularly true for datasets with imbalanced classes, as it only measures how accurate the predictions are without taking into account how good or bad the model is for each class. The combination of F1 score and a confusion matrix shows both the balance between precision and recall and how the model performs for each class.
With the model trained let's try it out on some test data. When collecting the data we split the data up between a training and a testing dataset. The model was trained only on the training data, and thus we can use the data in the testing dataset to validate how well the model will work in the real world. This will help us ensure the model has not learned to overfit the training data, which is a common occurrence. To validate our model, we will go to Model Testing and select Classify all.
Given how little training data we had and how few cycles we trained for, we got an accuracy of 84.62%, which can be improved further. To see the classification in detail, head to Live classification and select one image from the test sample. Click the three dots next to an item, and select Show classification. You can also capture new data directly from your development board from here.
Live Classification Result
In the test image above, our model detected 16 of the 18 cars actually present, which is good performance. The results are shown side by side by default, but you can also switch to overlay mode to see the model's predictions against the actual image content.
Overlay Mode for the Live Classification Result
A display option where the original image and the model's detections overlap, providing a clear juxtaposition of the model's predictions against the actual image content.
Summary Table
The summary table for a FOMO classification result provides a concise overview of the model's performance on a specific sample file, such as 'Parking_data_2283.png.2tk8c1on'. This table is organized as follows:
CATEGORY: Metric, object category, or class label, e.g., car.
COUNT: Shows detection accuracy or frequency, e.g., car detected 7 times.
INFO: Provides performance metrics definitions, including F1 Score, Precision, and Recall, which offer insights into the model's accuracy and efficacy in detection.
Table Metrics
F1 Score (77.78%): Balances precision and recall.
Precision (100.00%): Accuracy of correct predictions.
Recall (63.64%): Proportion of actual objects detected.
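To show how these three metrics relate, the sketch below recomputes them from detection counts. The counts used (7 correct detections, 0 false detections, 4 missed objects) are an assumption chosen because they are consistent with the percentages in the table; the summary table itself does not list them.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Compute precision, recall and F1 from detection counts.

    tp: objects detected correctly
    fp: detections with no matching object
    fn: objects the model missed
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Assumed counts: 7 cars detected, none wrongly, 4 cars missed.
p, r, f1 = precision_recall_f1(tp=7, fp=0, fn=4)
print(f"precision={p:.2%} recall={r:.2%} f1={f1:.2%}")
# precision=100.00% recall=63.64% f1=77.78%
```

The F1 score is the harmonic mean of precision and recall, so a perfect precision cannot fully compensate for the missed objects.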
Viewing Options
Bottom-right controls adjust the visibility of ground truth labels and model predictions, enhancing the analysis of the model's performance:
Prediction Controls: Customize the display of model predictions, including:
Show All: Show all detections and confidence scores.
Show Correct Only: Focus on accurate model predictions.
Show incorrect only: Pinpoint undetected objects in the ground truth.
Ground Truth Controls: Toggle the visibility of original labels for direct comparison with model predictions.
Show All: Display all ground truth labels.
Hide All: Conceal all ground truth labels.
Show detected only: Highlight ground truth labels detected by the model.
Show undetected only: Identify ground truth labels missed by the model.
With the impulse designed, trained and verified you can deploy this model back to your device. This makes the model run without an internet connection, minimizes latency, and runs with minimum power consumption. Edge Impulse can package up the complete impulse - including the preprocessing steps, neural network weights, and classification code - in a single C++ library or model file that you can include in your embedded software.
From the terminal just run edge-impulse-linux-runner. This will build and download your model, and then run it on your development board. If you're on the same network you can get a view of the camera, and the classification results directly from your dev board. You'll see a line like:
Open this URL in a browser to see your impulse running!
Go to the Deployment tab and, in the Build firmware section, select the firmware compatible with your board to download it.
Follow the instructions provided to flash the firmware to your board, then head over to your terminal and run the edge-impulse-run-impulse --debug command:
You'll also see a URL you can use to view the image stream and results in your browser:
We can't wait to see what you'll build! 🚀
Object detection tasks take an image and output information about the class, number, and position (and, optionally, size) of the objects in the image.
Edge Impulse provides, by default, two different model architectures to perform object detection: MobileNetV2 SSD FPN-Lite uses bounding boxes (object location and size) and FOMO uses centroids (object location only).
Want to compare the two models?
See
MobileNetV2 SSD FPN-Lite (bounding boxes): Can run on systems starting from Linux CPUs up to powerful GPUs
FOMO (centroids): Can run on high-end MCUs, Linux CPUs, and GPUs
This page is part of and describes how you can use development boards with an integrated camera to import image data into Edge Impulse.
First, make sure your device is connected on the Devices page in the Edge Impulse Studio. Then, head to Data acquisition, and under 'Record new data', set a label and select 'Camera' as a sensor (most devices have multiple resolutions). This shows you a nice preview of the camera. Then click Start sampling.
A few moments later - depending on the speed of the development board and the resolution - you'll now have an image collected!
Do this until you have captured 30 images per class from a variety of angles. Also make sure to vary the things you capture for the unknown class.
In this tutorial, you'll use machine learning to build a system that can recognize and track multiple objects in your house through a camera - a task known as object detection. Adding sight to your embedded devices can make them see the difference between poachers and elephants, count objects, find your lego bricks, and detect dangerous situations. In this tutorial, you'll learn how to collect images for a well-balanced dataset, how to apply transfer learning to train a neural network and deploy the system to an edge device.
At the end of this tutorial, you'll have a firm understanding of how to do object detection using Edge Impulse.
There is also a video version of this tutorial:
Running on a microcontroller?
In this tutorial we'll build a model that can distinguish between two objects on your desk - we've used a lamp and a coffee cup, but feel free to pick two other objects. To make your machine learning model see, it's important that you capture a lot of example images of these objects. When training the model these example images are used to let the model distinguish between them.
Capturing data
Capture the following amount of data - make sure you capture a wide variety of angles and zoom levels. It's fine if both objects are in the same frame. We'll be cropping the images later to be square, so make sure the objects are in the frame.
30 images of a lamp.
30 images of a coffee cup.
You can collect data from the following devices:
Or you can capture your images using another camera, and then upload them by going to Data acquisition and clicking the 'Upload' icon.
With the data collected we need to label this data. Go to Data acquisition, verify that you see your data, then click on the 'Labeling queue' to start labeling.
No labeling queue? Go to Dashboard, and under 'Project info > Labeling method' select 'Bounding boxes (object detection)'.
Labeling data
The labeling queue shows you all the unlabeled data in your dataset. Labeling your objects is as easy as dragging a box around the object, and entering a label. To make your life a bit easier we try to automate this process by running an object tracking algorithm in the background. If you have the same object in multiple photos we thus can move the boxes for you and you just need to confirm the new box. After dragging the boxes, click Save labels and repeat this until your whole dataset is labeled.
AI-Assisted Labeling
Afterwards you should have a well-balanced dataset listed under Data acquisition in your Edge Impulse project.
Rebalancing your dataset
To validate whether a model works well, set some data (typically 20%) aside and use it only to validate the model, not to build it. This is called the 'test set'. You can switch between your training and test sets with the two buttons above the 'Data collected' widget. If you've collected data on your development board there might be no data in the testing set yet. You can fix this by going to Dashboard > Perform train/test split.
With the training set in place you can design an impulse. An impulse takes the raw data, adjusts the image size, uses a preprocessing block to manipulate the image, and then uses a learning block to classify new data. Preprocessing blocks always return the same values for the same input (e.g. convert a color image into a grayscale one), while learning blocks learn from past experiences.
In the studio go to Create impulse, set the image width and image height to 320, the 'resize mode' to Fit shortest axis and add the 'Images' and 'Object Detection (Images)' blocks. Then click Save impulse.
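To get a feel for what resizing a non-square photo to a square input involves, here is a minimal sketch. It assumes that 'Fit shortest axis' scales the image so its shortest side matches the target size and then crops the longer side around the centre. That behaviour, and the function name, are assumptions for illustration; the text above only states that images are cropped to be square.

```python
def fit_shortest_axis(width: int, height: int, target: int):
    """Scale so the shortest side equals `target`, then centre-crop a square.

    Returns (scaled_width, scaled_height, crop_box), where crop_box is
    (left, top, right, bottom) in the scaled image. Assumed behaviour only.
    """
    shortest = min(width, height)
    scaled_width = round(width * target / shortest)
    scaled_height = round(height * target / shortest)
    left = (scaled_width - target) // 2
    top = (scaled_height - target) // 2
    return scaled_width, scaled_height, (left, top, left + target, top + target)


# A 640x480 photo resized for a 320x320 impulse input.
print(fit_shortest_axis(640, 480, 320))
# (427, 320, (53, 0, 373, 320))
```

Under this assumption, anything near the left and right edges of a landscape photo is cut off, which is why the objects should be kept well inside the frame when capturing.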
Configuring the processing block
To configure your processing block, click Images in the menu on the left. This will show you the raw data on top of the screen (you can select other files via the drop down menu), and the results of the processing step on the right. You can use the options to switch between 'RGB' and 'Grayscale' mode, but for now leave the color depth on 'RGB' and click Save parameters.
This will send you to the 'Feature generation' screen. In here you'll:
Resize all the data.
Apply the processing block on all this data.
Create a 3D visualization of your complete dataset.
Click Generate features to start the process.
Afterwards the 'Feature explorer' will load. This is a plot of all the data in your dataset. Because images have a lot of dimensions (here: 320x320x3=307,200 features) we run a process called 'dimensionality reduction' on the dataset before visualizing this. Here the 307,200 features are compressed down to just 3, and then clustered based on similarity. Even though we have little data you can already see the clusters forming (lamp images are all on the left, coffee all on the right), and can click on the dots to see which image belongs to which dot.
Configuring the transfer learning model
With all data processed it's time to start training a neural network. Neural networks are a set of algorithms, modeled loosely after the human brain, that are designed to recognize patterns. The network that we're training here will take the image data as an input, and try to map this to one of the two classes ('lamp' or 'coffee').
It's very hard to build a good working computer vision model from scratch, as you need a wide variety of input data to make the model generalize well, and training such models can take days on a GPU. To make this easier and faster we are using transfer learning. This lets you piggyback on a well-trained model, only retraining the upper layers of a neural network, leading to much more reliable models that train in a fraction of the time and work with substantially smaller datasets.
To configure the transfer learning model, click Object detection in the menu on the left. Here you can select the base model (the one selected by default will work, but you can change this based on your size requirements), and set the rate at which the network learns.
Leave all settings as-is, and click Start training. After the model is done you'll see accuracy numbers below the training output. You have now trained your model!
With the model trained let's try it out on some test data. When collecting the data we split the data up between a training and a testing dataset. The model was trained only on the training data, and thus we can use the data in the testing dataset to validate how well the model will work in the real world. This will help us ensure the model has not learned to overfit the training data, which is a common occurrence.
To validate your model, go to Model testing and select Classify all. Here we hit 92.31% precision, which is great for a model with so little data.
To see a classification in detail, click the three dots next to an item, and select Show classification. This brings you to the Live classification screen with much more details on the file (you can also capture new data directly from your development board from here). This screen can help you determine why items were misclassified.
Live Classification Result
This view is particularly useful for a direct comparison between the raw image and the model's interpretation. Each object detected in the image is highlighted with a bounding box. Alongside these boxes, you'll find labels and confidence scores, indicating what the model thinks each object is and how sure it is about its prediction. This mode is ideal for understanding the model's performance in terms of object localization and classification accuracy.
Overlay Mode for the Live Classification Result
In this view, bounding boxes are drawn around the detected objects, with labels and confidence scores displayed within the image context. This approach offers a clearer view of how the bounding boxes align with the objects in the image, making it easier to assess the precision of object localization. The overlay view is particularly useful for examining the model's ability to accurately detect and outline objects within a complex visual scene.
Summary Table
Name: This field displays the name of the sample file analyzed by the model. For instance, 'sample.jpg.22l74u4f' is the file name in this case.
CATEGORY: Lists the types of objects that the model has been trained to detect. In this example, two categories are shown: 'coffee' and 'lamp'.
COUNT: Indicates the number of times each category was detected in the sample file. In this case, both 'coffee' and 'lamp' have a count of 1, meaning each object was detected once in the sample.
INFO: This column provides additional information about the model's performance. It displays the 'Precision score', which, in this example, is 95.00%. The precision score represents the model's accuracy in making correct predictions over a range of Intersection over Union (IoU) values, known as the mean Average Precision (mAP).
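Intersection over Union (IoU) measures how well a predicted bounding box overlaps a labeled one: the area the two boxes share divided by the area they cover together. A minimal sketch, assuming boxes are given as (x, y, width, height) tuples in pixels (a representation chosen for this example):

```python
def intersection_over_union(box_a, box_b) -> float:
    """Return the IoU of two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping region (zero if the boxes are apart).
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union else 0.0


label = (0, 0, 10, 10)
prediction = (5, 5, 10, 10)
print(f"IoU = {intersection_over_union(label, prediction):.3f}")
# IoU = 0.143
```

An IoU of 1.0 means the prediction matches the label exactly, and 0.0 means the boxes do not overlap at all.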
With the impulse designed, trained and verified you can deploy this model back to your device. This makes the model run without an internet connection, minimizes latency, and runs with minimum power consumption. Edge Impulse can package up the complete impulse - including the preprocessing steps, neural network weights, and classification code - in a single C++ library or model file that you can include in your embedded software.
Running the impulse on your Raspberry Pi 4 or Jetson Nano
From the terminal just run edge-impulse-linux-runner. This will build and download your model, and then run it on your development board. If you're on the same network you can get a view of the camera, and the classification results directly from your dev board. You'll see a line like:
Open this URL in a browser to see your impulse running!
Running the impulse on your mobile phone
On your mobile phone just click Switch to classification mode at the bottom of your phone screen. Point it at an object and press 'Capture'.
Integrating the model in your own application
We can't wait to see what you'll build! 🚀
Alternatively, you can capture your dataset through a different app and then upload the data directly to Edge Impulse. You can do this either visually (click the 'Upload' icon on the data acquisition screen) or via the CLI. You can find instructions here: . In this case it's highly recommended that you use square images, as the transfer learning model expects these, and you probably want to resize the images before uploading them to make sure training remains fast.
You can view the finished project, including all data, signal processing and machine learning blocks here: .
Do you want a device that listens to your voice? We have a specific tutorial for that! See .
For this tutorial, you'll need a .
If you don't see your supported development board listed here, be sure to check the page for the appropriate tutorial.
Edge Impulse can ingest data from any device - including embedded devices that you already have in production. See the documentation for the for more information.
Alternatively, you can load an example test set that has about ten minutes of data in these classes (but how much fun is that?). See the for more information.
This page allows you to configure the MFE block, and lets you preview how the data will be transformed. The right of the page shows a visualization of the MFE's output for a piece of audio, which is known as a .
Once this process is complete the feature explorer shows a visualization of your dataset. Here dimensionality reduction is used to map your features onto a 3D space, and you can use the feature explorer to see if the different classes separate well, or find mislabeled data (if it shows in a different cluster). You can find more information in .
Your mobile phone can build and download the compiled impulse directly from the mobile client. See 'Deploying back to device' on the page.
Congratulations! You've used Edge Impulse to train a neural network model capable of recognizing a particular sound. There are endless applications for this type of model, from monitoring industrial machinery to recognizing voice commands. Now that you've trained your model you can integrate your impulse in the firmware of your own embedded device, see . There are examples for Mbed OS, Arduino, STM32CubeIDE, and any other target that supports a C++ compiler.
Or if you're interested in more, see our tutorials on or . If you have a great idea for a different project, that's fine too. Edge Impulse lets you capture data from any sensor, build to extract features, and you have full flexibility in your Machine Learning pipeline with the learning blocks.
You can now go back to the tutorial to build your machine learning model.
Alternatively, you can capture your images using another camera, and then upload them directly from the studio by going to Data acquisition and clicking the 'Upload' icon or using Edge Impulse CLI .
All our collected images will be staged for annotation in the 'Labeling queue'. Labeling your objects is as easy as dragging a box around the object, and entering a label. However, when you have a lot of images, this manual annotation method can become tiresome and time-consuming. To make this task even easier, Edge Impulse provides methods that can help you save time and energy. The AI-assisted labeling techniques include:
Using YOLOv5 - Useful when your objects are part of the common objects in the COCO dataset.
One of the beauties of FOMO is its fully convolutional nature, which means that just the ratio is set. Thus, it gives you more flexibility in its usage compared to the classical . method. For this tutorial, we have been using 96x96 images but it will accept other resolutions as long as the images are square.
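Because only the ratio between input and output is fixed, the size of the output grid follows directly from the input resolution. The sketch below illustrates this with a reduction factor of 8, which is an assumption for this example (as is the function name); the text above does not state the ratio.

```python
def output_grid_size(image_size: int, reduction: int = 8) -> tuple:
    """Return the (rows, columns) of the output grid for a square input.

    `reduction` is the assumed ratio between input and output resolution.
    """
    cells = image_size // reduction
    return cells, cells


# The same model layout applied to different square input resolutions.
for size in (96, 160, 320):
    rows, cols = output_grid_size(size)
    print(f"{size}x{size} input -> {rows}x{cols} output grid")
```

A larger input gives a finer grid, which is one way to read the earlier note that a higher input resolution helps when objects are close together.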
To run your model using an Arduino library, go to the Deployment tab in the studio and, in the Create Library section, select Arduino Library to download your custom Arduino library. In the Arduino IDE, click Sketch > Include Library > Add .ZIP Library and select your downloaded Arduino library. Make sure to follow the instructions provided on . Open Examples > Examples from custom library and select your library. Upload the 'Portenta_H7_camera' sketch to your Portenta, then open the serial monitor to view the results.
Congratulations! You've added object detection using FOMO to your sensors. Now that you've trained your model you can integrate your impulse in the firmware of your own edge device, see or the documentation for the Node.js, Python, Go and C++ SDKs that let you do this in a few lines of code and make this model run on any device.
when an object is seen.
Or if you're interested in more, see our tutorials on or . If you have a great idea for a different project, that's fine too. Edge Impulse lets you capture data from any sensor, build to extract features, and you have full flexibility in your Machine Learning pipeline with the learning blocks.
We recently released a brand-new approach to perform object detection tasks on microcontrollers, FOMO. If you are using a constrained device that does not have as much compute, RAM, and flash as Linux platforms, please head to this end-to-end tutorial:
Alternatively, if you only need to recognize a single object, you can follow our tutorial on - which performs image classification, hence, limits you to a single object but can also fit on microcontrollers.
If you don't have any of these devices, you can also upload an existing dataset through the - including . After this tutorial you can then deploy your trained machine learning model as a C++ library and run it on your device.
- for the Raspberry Pi 4 and the Jetson Nano.
Use AI-Assisted Labeling for your object detection project! For more information, .
For this tutorial we'll use the 'Images' preprocessing block. This block takes in the color image, optionally makes the image grayscale, and then turns the data into a features array. If you want to do more interesting preprocessing steps - like finding faces in a photo before feeding the image into the network - see the tutorial. Then we'll use a 'Transfer Learning' learning block, which takes all the images in and learns to distinguish between the two ('coffee', 'lamp') classes.
Congratulations! You've added object detection to your sensors. Now that you've trained your model you can integrate your impulse in the firmware of your own edge device, see the documentation for the Node.js, Python, Go and C++ SDKs that let you do this in a few lines of code and make this model run on any device. when an object is seen.
Or if you're interested in more, see our tutorials on or . If you have a great idea for a different project, that's fine too. Edge Impulse lets you capture data from any sensor, build to extract features, and you have full flexibility in your Machine Learning pipeline with the learning blocks.