If you are working on an object detection project, you will most likely see a "Labeling queue" bar on your Data acquisition page. The labeling queue shows all the data in your dataset that has not yet been labeled.
Can't see the labeling queue? Go to Dashboard, and under 'Project info > Labeling method' select 'Bounding boxes (object detection)'.
In object detection, labeling is the process of drawing a bounding box around specific objects in an image so that your machine learning model can learn from them. Edge Impulse Studio has a built-in data annotation tool with AI-assisted labeling to assist you in your labeling workflows, as we will see.
In Edge Impulse Studio, labeling your data is as easy as dragging a box around the object, entering a label, and saving, as shown below.
However, as simple as the manual labeling process looks, it can become tedious and time-consuming, especially when dealing with huge datasets. To make your life easier, Edge Impulse Studio has a built-in AI-assisted labeling feature to automatically assist you in your labeling workflows.
There are three ways to perform AI-assisted labeling in Edge Impulse Studio:
Using YOLOv5
Using your own model
Using object tracking
By utilizing an existing library of pre-trained object detection models from YOLOv5 (trained on the COCO dataset), you can identify and label common objects in your images in seconds, without needing to write any code!
To label your objects with YOLOv5 classification, click the Label suggestions dropdown and select “Classify using YOLOv5.” If your object is more specific than what is auto-labeled by YOLOv5, e.g. “coffee” instead of the generic “cup” class, you can modify the auto-labels to the left of your image. These modifications will automatically apply to future images in your labeling queue.
Click Save labels to move on to your next raw image, and see your fully labeled dataset ready for training in minutes!
You can also use your own trained model to predict and label your new images. From an existing (trained) Edge Impulse object detection project, upload new unlabeled images from the Data acquisition tab. Then, from the "Labeling queue", click the Label suggestions dropdown and select "Classify using <your project name>":
You can also upload a few samples to a new object detection project, train a model, then upload more samples to the Data acquisition tab and use the AI-assisted labeling feature for the rest of your dataset. Classifying with your own trained model is especially useful for objects that are not among YOLOv5's classes, such as industrial objects.
Click Save labels to move on to your next raw image, and see your fully labeled dataset ready for training in minutes using your own pre-trained model!
If you have objects that are a similar size or that recur across images, you can also track your objects between frames within the Edge Impulse labeling queue, reducing the time needed to re-label and re-draw bounding boxes over your entire dataset.
Draw your bounding boxes and label your images, then, after clicking Save labels, the objects will be tracked from frame to frame:
Now that your object detection project contains a fully labeled dataset, learn how to train and deploy your model to your edge device: check out our tutorial!
We are excited to see what you build with the AI-assisted labeling feature in Edge Impulse! Post your project on our forum or tag us on social media, @EdgeImpulse!
You can upload an existing dataset to your project directly through Edge Impulse Studio. The data should be in the Data Acquisition format (CBOR, JSON, CSV), or as WAV, JPG or PNG files.
To upload data using the uploader, go to the Data acquisition page and click on the uploader button as shown in the image below:
When uploading your data, you can choose the category your data should fall into (the training set or the testing set), or have the dataset split automatically between training and testing. You can also choose whether to infer labels from the file names or to enter a single label that all uploaded files will fall under.
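If you prefer the command line, the Edge Impulse CLI uploader supports the same options. A minimal sketch, assuming Node.js/npm is installed; the label "faucet" and the file names are placeholders:

```bash
# Install the Edge Impulse CLI tools (requires Node.js / npm)
npm install -g edge-impulse-cli

# Upload WAV files into the training set with an explicit label
edge-impulse-uploader --category training --label faucet faucet.*.wav

# Or let the Studio split the files between training and testing automatically
edge-impulse-uploader --category split *.wav
```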
All collected data for each project can be viewed on the Data acquisition tab. You can see how your data has been split between the training and testing sets, as well as the data distribution for each class in your dataset. You can also send new sensor data to your project by file upload, WebUSB, the Edge Impulse API, or the Edge Impulse CLI.
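As an example of the API route, you can POST a file straight to the ingestion service. A minimal sketch; the API key, label, and file name are placeholders:

```bash
# Send one file to the training set through the ingestion API.
# Find your API key under Dashboard > Keys; 'faucet.01.wav' is a placeholder.
curl -X POST \
  -H "x-api-key: ei_YOUR_API_KEY" \
  -H "x-label: faucet" \
  -F "data=@faucet.01.wav" \
  https://ingestion.edgeimpulse.com/api/training/files
```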
The panel on the right allows you to collect data directly from any fully supported platform:
Through WebUSB.
Using the Edge Impulse CLI daemon.
From the Edge Impulse for Linux CLI.
WebUSB and the Edge Impulse daemon work with any fully supported device once you flash the pre-built Edge Impulse firmware to your board. See the list of fully supported boards.
When using the Edge Impulse for Linux CLI, run edge-impulse-linux --clean to add your platform to the device list of your project. You will then be able to interact with it from the Record new data panel.
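For reference, the typical setup looks like this; it assumes Node.js/npm is available on your machine:

```bash
# WebUSB and the daemon ship with the Edge Impulse CLI tools
npm install -g edge-impulse-cli
edge-impulse-daemon             # connects a fully supported board to your project

# On a Linux platform (e.g. a Raspberry Pi), use the Linux CLI instead
npm install -g edge-impulse-linux
edge-impulse-linux --clean      # --clean lets you re-select the project to connect to
```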
There are also other ways to bring data into your project:
Import from S3 buckets (Enterprise feature).
Upload portals (Enterprise feature).
The train/test split is a technique for training and evaluating the performance of machine learning algorithms. It indicates how your data is split between training and testing samples. For example, an 80/20 split indicates that 80% of the dataset is used for model training while 20% is used for model testing.
This section also shows how the data samples in each class are distributed, helping you catch imbalanced datasets that might introduce bias during model training.
Manually navigating to some categories of data can be time-consuming, especially when dealing with a large dataset. The data acquisition filter enables you to filter data samples based on criteria of your choice, including:
Label - the class that a sample represents.
Sample name - the unique ID of a sample.
Signature validity.
Enabled and disabled samples.
Length of sample - the duration of a sample.
The filtered samples can then be manipulated in bulk by editing labels, deleting, or moving samples from the training set to the test set and vice versa, as shown in the image above.
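If you need the same kind of filtering programmatically, the Studio API exposes a list-samples endpoint. This is a hedged sketch: the project ID, API key, and the exact query parameters (category, labels) are assumptions based on the public API reference, so check it before relying on them:

```bash
# List training samples labeled "faucet"; project ID 12345 is a placeholder,
# and 'labels' is assumed to take a URL-encoded JSON array, i.e. ["faucet"].
curl -H "x-api-key: ei_YOUR_API_KEY" \
  "https://studio.edgeimpulse.com/v1/api/12345/raw-data?category=training&labels=%5B%22faucet%22%5D"
```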
The data manipulations above can also be applied at the level of an individual data sample: navigate to the sample, click ⋮, and select the action you want to perform on it. This can be renaming the sample, editing its label, disabling, cropping, splitting, downloading, or even deleting it when desired.
To crop a data sample, go to the sample you want to crop and click ⋮, then select Crop sample. You can specify a length, or drag the handles to resize the window, then move the window around to make your selection.
Made a wrong crop? No problem, just click Crop sample again and you can move your selection around. To undo the crop, just set the sample length to a high number, and the whole sample will be selected again.
Besides cropping, you can also split data automatically. Here you can perform one motion repeatedly, or say a keyword over and over again, and the events are detected and can be stored as individual samples. This makes it easy to very quickly build a high-quality dataset of discrete events. To do so, head to Data acquisition, record some new data, click ⋮, and select Split sample. You can set the window length, and all events are automatically detected. If you're splitting audio data, you can also listen to each event by clicking on its window; the audio player is automatically populated with that specific split.
Samples are automatically centered in the window, which might lead to problems with some models (the neural network could learn a shortcut where data in the middle of the window is always associated with a certain label), so you can select "Shift samples" to automatically move the data around a little.
Splitting data is - like cropping data - non-destructive. If you're not happy with a split just click Crop sample and you can move the selection around easily.
The labeling queue will only appear on your Data acquisition page if you are working on an object detection task. The labeling queue shows a list of images that have been staged for annotation in your project.
If you are not working on an object detection task, you can hide the labeling queue bar by going to Dashboard > Project info > Labeling method, clicking the dropdown, and selecting "One label per data item", as shown in the image below.
For more information about the labeling queue and how to perform data annotation using AI-assisted labeling in Edge Impulse, have a look at our documentation here.