The Synthetic data integration allows you to easily create and manage synthetic data, enhancing your datasets and improving model performance. Whether you need images, speech, or audio data, our new integrations make it simple and efficient.
There is also a video version demonstrating the Synthetic data workflow and features:
Only available with Edge Impulse Professional and Enterprise Plans
Try our Professional Plan or FREE Enterprise Trial today.
DALL-E Image Generation Block: Generate image datasets using the DALL-E model.
Whisper Keyword Spotting Generation Block: Generate keyword-spotting datasets using the Whisper model. Ideal for keyword spotting and speech recognition applications.
Eleven Labs Sound Generation Block: Generate sound datasets using the Eleven Labs model. Ideal for generating realistic sound effects for various applications.
To use these features, navigate to Data Sources, add new data source transformation blocks, set up actions, run a pipeline, and then go to Data Acquisition to view the output. If you want to make changes or refine your prompts, you have to delete the pipeline and start over.
Enhance Your Datasets: Easily augment your datasets with high-quality synthetic data.
Improve Model Accuracy: Synthetic data can help fill gaps in your dataset, leading to better model performance.
Save Time and Resources: Quickly generate the data you need without the hassle of manual data collection.
To access the Synthetic Data tab, follow these steps:
Navigate to Your Project: Open your project in Edge Impulse Studio.
Open Synthetic data Tab: Click on the "Synthetic Data" tab in the left-hand menu.
Create Realistic Images: Use DALL-E to generate realistic images for your datasets.
Customize Prompts: Tailor the prompts to generate specific types of images suited to your project needs.
Select Image Generation: Choose the GPT-4 (DALL-E) option.
Enter a Prompt: Describe the type of images you need (e.g., "A photo of a factory worker wearing a hard hat", or, for background data in a car object detection project, "aerial view images of deserted streets").
Generate and Save: Click "Generate" to create the images. Review and save the generated images to your dataset.
Human-like Speech Data: Utilize Whisper to generate human-like speech data.
Versatile Applications: Ideal for voice recognition, command-and-control systems, or any application requiring natural language processing.
Select Speech Generation: Choose the Whisper option.
Enter Text: Provide the text you want to be converted into speech (e.g., "Hello Edge!").
Generate and Save: Click "Generate" to create the speech data. Review and save the generated audio files.
Realistic Sound Effects: Use Eleven Labs to generate realistic sound effects for your projects.
Customize Sound Prompts: Define the type of sound you need (e.g., "Glass breaking" or "Car engine revving").
You can also create custom transformation blocks to generate synthetic data using your own models or APIs. This feature allows you to integrate your custom generative models into Edge Impulse Studio for data augmentation.
Follow our Custom Transformation Blocks guide to learn how to create and use custom transformation blocks in Edge Impulse Studio.
Data ingestion also accepts an optional x-synthetic-data-job-id header, which you can pass to indicate that the uploaded samples are synthetic data. Read on in the Custom Transformation Block section below for more details.
To handle the new synthetic data ingestion flag, your block needs to parse an extra argument, as shown in the DALL-E blocks example below:
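A minimal sketch of that extra argument (the parameter name is assumed to mirror the header; check the DALL-E block repository for the exact interface):

```python
# Sketch only: the --prompt parameter is a hypothetical example, not the
# exact DALL-E block interface.
import argparse

parser = argparse.ArgumentParser(description="Synthetic data generation block")
parser.add_argument("--prompt", type=str, required=True,
                    help="Hypothetical prompt parameter for the generation step")
parser.add_argument("--synthetic-data-job-id", type=int, default=None,
                    help="Job ID to forward to ingestion via the x-synthetic-data-job-id header")
args, _unknown = parser.parse_known_args()
```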
Then, pass the argument as a header to the ingestion API via the x-synthetic-data-job-id header field:
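A hedged sketch of the upload step, assuming the standard ingestion endpoint and a generated image on disk (the file name, label and job ID below are placeholders):

```python
import os
import requests

API_KEY = "ei_..."                      # your project API key
LABEL = "factory-worker"                # hypothetical label for the generated sample
SYNTHETIC_JOB_ID = 12345                # value received via --synthetic-data-job-id
IMAGE_PATH = "generated/worker_00.png"  # hypothetical output of the generation step

with open(IMAGE_PATH, "rb") as f:
    res = requests.post(
        "https://ingestion.edgeimpulse.com/api/training/files",
        headers={
            "x-api-key": API_KEY,
            "x-label": LABEL,
            # Flags this sample as synthetic data in Edge Impulse
            "x-synthetic-data-job-id": str(SYNTHETIC_JOB_ID),
        },
        files={"data": (os.path.basename(IMAGE_PATH), f, "image/png")},
    )
res.raise_for_status()
```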
Read on in our DALL-E 3 Image Generation Block guide and repo here.
To start using the Synthetic Data tab, log in to your Edge Impulse Enterprise account and open a project. Navigate to the "Synthetic Data" tab and explore the new features. If you don't have an account yet, sign up for free at Edge Impulse.
For further assistance, visit our forum or check out our tutorials.
Stay tuned for more updates on what we're doing with generative AI. Exciting times ahead!
The data explorer is a visual tool to explore your dataset, find outliers or mislabeled data, and to help label unlabeled data. The data explorer first tries to extract meaningful features from your data (through signal processing and neural network embeddings) and then uses a dimensionality reduction algorithm to map these features to a 2D space. This gives you a one-look overview of your complete dataset.
The Data explorer tab is available for audio classification, image classification and regression projects only.
To access the data explorer head to Data acquisition, click Data explorer, then select a way to generate the data explorer. Depending on your data you'll see three options:
Using a pre-trained model - here we use a large neural network trained on a varied dataset to generate the embeddings. This works very well if you don't have any labeled data yet, or want to look at new clusters of data. This option is available for keywords and for images.
Using your trained impulse - here we use the neural network block in your impulse to generate the embeddings. This typically creates even better visualizations, but will fail if you have completely new clusters of data as the neural network hasn't learned anything about them. This option is only available if you have a trained impulse.
Then click Generate data explorer to create the data explorer. If you want to make a different choice after creating the data explorer click ⋮ in the top right corner and select Clear data explorer.
Want to see examples of the same dataset visualized in different ways? Scroll down!
To view an item in your dataset just click on any of the dots (some basic information appears on hover). Information about the sample, and a preview of the data item appears at the bottom of the data explorer. You can click Set label (or l on your keyboard) to set a new label for the data item, or press Delete item (or d on your keyboard) to remove the data item. These changes are queued until you click Save labels (at the top of the data explorer).
The data explorer marks unlabeled data in gray (with an 'Unlabeled' label). To label this data, click on any gray dot, then set a label by clicking the Set label button (or by pressing l on your keyboard) and entering a label. Other unlabeled data in the vicinity of this item will automatically be labeled as well. This way you can quickly label clustered data.
To upload unlabeled data you can either:
Select the items in your dataset under Data acquisition, select all relevant items, click Edit labels and set the label to an empty string.
Or, if you want to start from scratch, click the three dots on top of the data explorer, and select Clear all labels.
The data explorer uses a three-stage process:
It runs your data through an input and a DSP block - like any impulse.
It passes the result of 1) through part of a neural network. This forces the neural network to compress the DSP output even further, but to features that are highly specialized to distinguish the exact type of data in your dataset (called 'embeddings').
The embeddings are passed through t-SNE, a dimensionality reduction algorithm.
33 input features (from the signal processing step)
A layer with 20 neurons
A layer with 10 neurons
A layer with 4 neurons (the number of different classes)
While training the neural network we try to find the mathematical formula that best maps the input to the output. We do this by tweaking each neuron (each neuron is a parameter in our formula). The interesting part is that each layer of the neural network will start acting like a feature extracting step - just like our signal processing step - but highly tuned for your specific data. For example, in the first layer, it'll learn what features are correlated, in the second it derives new features, and in the final layer, it learns how to distinguish between classes of motions.
In the data explorer we now cut off the final layer of the neural network, and thus we get the derived features back - these are called "embeddings". Contrary to features we extract using signal processing we don't really know what these features are - they're specific to your data. In essence, they provide a peek into the brain of the neural network. Thus, if you see data in the data explorer that you can't easily separate, the neural network probably can't either - and that's a great way to spot outliers - or if there's unlabeled data close to a labeled cluster they're probably very similar - great for labeling unknown data!
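To make this concrete, here is a conceptual sketch (not the Studio's internal implementation) that mirrors the network described above: take a Keras classifier, keep everything up to the layer before the output to obtain the embeddings, then project them to 2D with t-SNE. The random input data stands in for your DSP output.

```python
import numpy as np
import tensorflow as tf
from sklearn.manifold import TSNE

# Stand-in for the signal processing output: 500 windows x 33 features
X = np.random.rand(500, 33).astype(np.float32)

# A classifier matching the layer sizes described above (untrained here)
inputs = tf.keras.Input(shape=(33,))
x = tf.keras.layers.Dense(20, activation="relu")(inputs)
embedding = tf.keras.layers.Dense(10, activation="relu")(x)
outputs = tf.keras.layers.Dense(4, activation="softmax")(embedding)  # 4 classes

model = tf.keras.Model(inputs, outputs)              # full classifier
embedding_model = tf.keras.Model(inputs, embedding)  # classifier with the final layer cut off

# The 10-neuron output is the "embedding" for each window
embeddings = embedding_model.predict(X)

# t-SNE reduces the embeddings to 2D coordinates for plotting
coords_2d = TSNE(n_components=2, perplexity=30).fit_transform(embeddings)
print(coords_2d.shape)  # (500, 2)
```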
Here's an example of using the data explorer to visualize a very complex computer vision dataset (distinguishing between the four cats of one of our infrastructure engineers).
For less complex datasets, or lower-dimensional data you'll typically see more separation, even without custom models.
All collected data for each project can be viewed on the Data acquisition tab. You can see how your data has been split for train/test set as well as the data distribution for each class in your dataset. You can also send new sensor data to your project either by file upload, WebUSB, Edge Impulse API, or Edge Impulse CLI.
Organization data
Since the creation of Edge Impulse, we have been helping our customers deal with complex data pipelines, complex data transformation methods and complex clinical validation studies.
The organizational data gives you tools to centralize, validate and transform datasets so they can be easily imported into your projects.
The panel on the right allows you to collect data directly from any fully supported platform:
When using the Edge Impulse for Linux CLI, run edge-impulse-linux --clean and it will add your platform to the device list of your project. You will then be able to interact with it from the Collect data panel.
Need more?
Multi-label time-series data
For time-series data samples (including audio), you can visualize the time-series graphs on the right panel with a dark-blue background:
If you are dealing with multi-label data samples, here is the corresponding preview:
Preview the values of tabular non-time-series & pre-processed data samples:
Raw images can be directly visualized from the preview:
For object detection projects, we can overlay the corresponding bounding boxes:
Raw videos (.mp4) can be directly visualized from the preview. Please note that you will need to split the videos into frames as we do not support training on video files:
List view
Grid view
The train/test split is a technique for training and evaluating the performance of machine learning algorithms. It indicates how your data is split between training and testing samples. For example, an 80/20 split indicates that 80% of the dataset is used for model training purposes while 20% is used for model testing.
This section also shows how your data samples in each class are distributed to prevent imbalanced datasets which might introduce bias during model training.
Manually navigating to some categories of data can be time-consuming, especially when dealing with a large dataset. The data acquisition filter enables the user to filter data samples based on some criteria of choice. This can be based on:
Label - the class that a sample represents.
Sample name - unique ID representing a sample.
Signature validity
Enabled and disabled samples
Length of sample - duration of a sample.
The filtered samples can then be manipulated by editing labels, deleting, and moving from the training set to the testing set (and vice versa), as shown in the image above.
The data manipulations above can also be applied at the data sample level by simply navigating to the individual data sample by clicking on "⋮" and selecting the type of action you might want to perform on the specific sample. This might be renaming, editing its label, disabling, cropping, splitting, downloading, and even deleting the sample when desired.
Single label
Multi-label
To crop a data sample, go to the sample you want to crop and click ⋮, then select Crop sample. You can specify a length, or drag the handles to resize the window, then move the window around to make your selection.
Made a wrong crop? No problem, just click Crop sample again and you can move your selection around. To undo the crop, just set the sample length to a high number, and the whole sample will be selected again.
Besides cropping you can also split data automatically. Here you can perform one motion repeatedly, or say a keyword over and over again, and the events are detected and can be stored as individual samples. This makes it easy to very quickly build a high-quality dataset of discrete events. To do so head to Data acquisition, record some new data, click ⋮, and select Split sample. You can set the window length, and all events are automatically detected. If you're splitting audio data you can also listen to events by clicking on the window; the audio player is automatically populated with that specific split.
Samples are automatically centered in the window, which might lead to problems on some models (the neural network could learn a shortcut where data in the middle of the window is always associated with a certain label), so you can select "Shift samples" to automatically move the data a little bit around.
Splitting data is - like cropping data - non-destructive. If you're not happy with a split just click Crop sample and you can move the selection around easily.
If you are not dealing with an object detection task, you can simply change the Labeling method configuration by going to Dashboard > Project info > Labeling method and clicking the dropdown and selecting "one label per data item" as shown in the image below.
The data sources page is much more than just adding data from external sources. It lets you create complete automated data pipelines so you can work on your active learning strategies.
From there, you can import datasets from existing cloud storage buckets, automate and schedule the imports, and trigger actions such as exploring and labeling your new data, retraining your model, automatically building a new deployment task, and more.
Run transformation jobs directly from your projects
You can also trigger cloud jobs, known as transformation jobs. These are particularly useful if you want to generate synthetic datasets or automate tasks using the Edge Impulse API. We provide several pre-built transformation blocks available for organizations' projects:
This view, originally accessible from the main left menu, has been moved to the Data acquisition tab for better clarity. The screenshots have not yet been updated.
Click on + Add new data source and select where your data lives:
You can either use:
AWS S3 buckets
Google Cloud Storage
Any S3-compatible bucket
Don't import data (if you just need to create a pipeline)
Click on Next, provide credentials:
Click on Verify credentials:
Here, you have several options to automatically label your data:
In the example above, the structure of the folder is the following:
The labels will be picked from the folder name and the samples will be split between your training and testing sets using an 80/20 ratio.
The samples present in an unlabeled/ folder will be kept unlabeled in Edge Impulse Studio.
Alternatively, you can also organize your folder using the following structure to automatically split your dataset between training and testing sets:
When using this option, only the file name is taken into account. The part before the first . will be used to set the label. E.g. cars.01741.jpg will set the label to cars.
All the data samples will be unlabeled; you will need to label them manually before using them.
Finally, click on Next, post-sync actions.
From this view, you can automate several actions:
Recreate data explorer
Retrain model
If enabled, this will retrain your model with the same impulse, and you'll get an email with the new validation and test set accuracy.
Note: You will need to have trained your project at least once.
Create new version
Store all data, configuration, intermediate results and final models.
Create new deployment
Builds a new library or binary with your updated model. Requires 'Retrain model' to also be enabled.
Once your pipeline is set, you can run it directly from the UI, from external sources or by scheduling the task.
To run your pipeline from Edge Impulse Studio, click on the ⋮ button and select Run pipeline now.
To run your pipeline from code, click on the ⋮ button and select Run pipeline from code. This will display an overlay with curl, Node.js and Python code samples.
You will need to create an API key to run the pipeline from code.
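For reference, a minimal Python sketch of what the overlay's code sample boils down to: POST to the pipeline's run URL with your API key. The URL below is a placeholder; copy the exact one from the Run pipeline from code overlay.

```python
import requests

API_KEY = "ei_..."  # an Edge Impulse API key with access to this project
RUN_PIPELINE_URL = "<paste the URL shown in the 'Run pipeline from code' overlay>"

response = requests.post(RUN_PIPELINE_URL, headers={"x-api-key": API_KEY})
response.raise_for_status()
print(response.json())
```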
By default, your pipeline will run every day. To schedule your pipeline jobs, click on the ⋮ button and select Edit pipeline.
Free users can only run the pipeline every 4 hours. If you are an enterprise customer, you can run this pipeline up to every minute.
Once the pipeline has successfully finished, you will receive an email like the following:
Another useful feature is to create a webhook to call a URL when the pipeline has run. It will send a POST request containing the following information:
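As an illustration, here is a small hypothetical receiver that logs whatever JSON the webhook POSTs; the exact payload fields depend on the actions enabled in your pipeline.

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/edge-impulse-webhook", methods=["POST"])
def pipeline_webhook():
    # Log the payload sent when the pipeline finishes
    payload = request.get_json(silent=True) or {}
    print("Pipeline finished:", payload)
    return "", 200

if __name__ == "__main__":
    app.run(port=8080)
```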
As of today, if you want to update your pipeline, you need to edit the configuration JSON available in ⋮ -> Run pipeline from code.
Here is an example of what you can get if all the actions have been selected:
Free projects only have access to the builtinTransformationBlock listed above.
Select Copy as pipeline step and paste it to the configuration json file.
Edge Impulse has been a powerful platform for processing raw data like time-series and images, and now we’re taking it even further. With tabular data import, we’re empowering users by enabling seamless integration of pre-processed data, giving you more flexibility in how you work. Whether you process data externally or face restrictions with raw data, this update makes it even easier to leverage Edge Impulse for all your data handling and model training needs.
Check out the !
info.labels description file
The other way is to create an info.labels file, present in your dataset. Edge Impulse will automatically detect it when you upload your dataset and will use this file to set the labels.
In the Live classification tab, you can classify your tabular/pre-processed test samples:
The AI labeling feature is an extensible way of integrating existing AI models into your workflow and using them to automatically label your datasets. This can be achieved through leveraging ready-made blocks provided by Edge Impulse or developing custom ones to meet your specific needs. Whether you’re labeling images, bounding boxes, or audio samples, these AI labeling blocks are sure to save you time and improve your consistency.
To exit an AI labeling action configuration and return to the overview page, you can click on the < button found to the left of the block configuration title (AI Labeling - Step 1) or click the AI labeling tab.
You can create multiple AI labeling actions that contain one or more AI labeling blocks, each with different prompts, parameters and filters. From the AI labeling actions overview page you can add new actions, delete existing ones, access their configurations, or run them directly.
There are several AI labeling blocks that have been developed by Edge Impulse and are available for your use. These are listed below with links to their associated code in public GitHub repositories:
To begin, proceed to the Data acquisition view and ensure you have data samples in the Dataset tab. Then, continue to the AI labeling tab.
Click on an existing AI labeling action to enter the configuration view for that action. If you do not yet have an AI labeling action, you can create one using the + Add new label action button.
The first step is to select an AI labeling block that you would like to use. By default, blocks that are not compatible with your data modality or labeling objective are greyed out. Once you have selected an AI labeling block, the parameters specific to that block are presented.
Some blocks require an API key to interact with other providers, such as OpenAI or Hugging Face. You can set your API key directly in the AI labeling block configuration panel the first time you use the block. The key you enter will be stored in Secrets. Once created, the key value will no longer be visible anywhere in the platform.
To manage your secrets if you are an Enterprise customer, go to your organization and select the Secrets menu item. If you are not an Enterprise customer, secrets can be accessed through the settings in your developer profile. Click on your avatar and go to your Account settings -> Secrets:
You can chain several AI labeling blocks together to create an AI labeling action with multiple steps. For example, you can first use a zero-shot object detector to automatically detect high-level objects within an image then follow this with a step to re-label the bounding boxes with more precise labels or remove them entirely.
To add multiple AI labeling blocks, click on the button at the bottom of the block configuration panel to add an extra step.
Tip: If you want to change the number of data samples or the number of columns shown in the preview, click on the view settings icon. Changing the number of columns can be useful for object detection use cases where your objects are small and you want to see larger images.
Before running the AI labeling action on your entire dataset, we recommend previewing the label results on a small subset of your dataset. This will help you validate your prompt and parameters so that you can iterate faster.
When clicking on the Label preview data button, the changes are staged but not directly applied.
You can add metadata such as ai-labeled: true, labeling-source: GPT-4o or labeled-on: Nov 2024 that will be set after running the AI labeling action. This is particularly useful if you plan to add more data samples over time and need to filter out your already-labeled samples.
Once you are satisfied with your configuration, click on the Label all data button. This will run the AI labeling action and apply the labeling updates to your dataset.
A zero-shot object detector that uses OWL-ViT to label objects with bounding boxes. For complex objects, pair with "Bounding box re-labeling with GPT-4o" to refine labels.
OpenAI API key needed
Take existing bounding boxes (e.g. from a zero-shot object detector) and use GPT-4o to re-label or remove them as needed. This can be configured as a two step process in a single AI labeling action.
OpenAI API key needed
Use GPT-4o to apply a single label to images. Customize prompts to return a single label, for example “Is there a person in this picture? Answer with 'yes' or 'no'.”
Hugging Face API key needed
Only available with Edge Impulse Enterprise Plans
The parameters defined in your parameters.json file will be passed into your block as command line arguments. For example, a parameter named other-label will be passed to your block as --other-label <value>.
In addition to the parameters defined by you, the following arguments will be automatically passed to your AI labeling block.
One note is that secrets will be passed as environment variables. Additional required environment variables can be defined using requiredEnvVariables in the block info section of the parameters.json file. These can then be set within your organization by editing the block in the AI labeling section found under custom blocks for your organization.
<data-type> options: images_object_detection | images_single_label | audio | other
There are no required outputs from the AI labeling block. In general, all changes are applied inside the block itself using API calls.
Initialize and push the block:
Your AI labeling block is available in your organization. To run the block, open or create a project belonging to your organization and go to Data acquisition > AI labeling and create an AI labeling action that uses your block.
AI labeling blocks can run in 'preview' mode (triggered when you click Label preview data within an AI labeling action configuration). The changes are staged but not directly applied.
For preview mode, the --propose-actions <job-id> flag and argument are passed into your block. When you see this flag you should not apply changes directly to the data samples (e.g. via raw_data_api.set_sample_bounding_boxes or raw_data_api.set_sample_structured_labels) but rather use the raw_data_api.set_sample_proposed_changes API call.
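A minimal skeleton (a sketch, not the official block template) showing how a custom other-label parameter and the --propose-actions flag could be handled; the sample IDs and labeling function are placeholders, and the API calls are only named in comments because their signatures are not reproduced here.

```python
import argparse

parser = argparse.ArgumentParser(description="Custom AI labeling block (sketch)")
# Custom parameter defined as "other-label" in parameters.json
parser.add_argument("--other-label", type=str, default="other")
# Passed automatically in preview mode ("Label preview data")
parser.add_argument("--propose-actions", default=None)
args, _unknown = parser.parse_known_args()

def label_sample(sample_id):
    # Placeholder for your own model or external API call
    return args.other_label

for sample_id in [1, 2, 3]:  # hypothetical sample IDs; fetch real ones via the Edge Impulse API
    label = label_sample(sample_id)
    if args.propose_actions is not None:
        # Preview mode: stage the change (e.g. raw_data_api.set_sample_proposed_changes)
        print(f"[preview job {args.propose_actions}] would label sample {sample_id} as '{label}'")
    else:
        # Apply directly (e.g. raw_data_api.set_sample_structured_labels
        # or raw_data_api.set_sample_bounding_boxes)
        print(f"labeling sample {sample_id} as '{label}'")
```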
The multi-label feature brings considerable value by preserving the context of longer data samples, simplifying data preparation, and enabling more efficient and effective data analysis.
The first improvement is in the way you can analyze and process complex datasets, especially for applications where context and continuity are crucial. With this feature, you can maintain the integrity of longer-duration samples, such as hour-long exercise sessions or night-long sleep studies, without the need to segment these into smaller fragments every time there is a change in activity. This holistic view not only preserves the context but also provides a richer data set for analysis.
Then, the ability to select window sizes directly in Edge Impulse addresses a common pain point - data duplication. Without the multi-label feature, you need to pre-process data, either externally or using , creating multiple copies of the same data with different window sizes to determine the optimal configuration. This process is not only time-consuming but also prone to errors and inefficiencies. With multi-label samples, adjusting the window size becomes a simple parameter change in the "", streamlining the process significantly. This flexibility saves time, reduces the risk of errors, and allows for more dynamic experimentation with data, leading to potentially more accurate and insightful models.
For example:
info.labels description file
The other way is to create an info.labels file, present in your dataset. Edge Impulse will automatically detect it when you upload your dataset and will use this file to set the labels.
The info.labels file looks like the following:
Tip
You can export a public project dataset that uses the multi-label feature to understand how the info.labels file is structured.
Once you have your info.labels file available, to upload it, you can use:
The Studio Uploader will automatically detect the info.labels file:
structured_labels.labels description file
The structured_labels.labels format looks like the following:
Then you can run the following command:
Please note that you can also hide the sensors in the graph:
To edit the labels using the UI, click ⋮ -> Edit labels. The following modal will appear:
Please note that you will need to provide continuous and non-overlapping labels for the full length of your data sample.
The format looks like the following:
In the Live classification tab, you can classify your multi-label test samples:
Labeling UI is available but is only text-based.
Overlapping labels are not supported
The entire data sample needs to have a label, you cannot leave parts unlabeled.
Please leave us a note on the forum or send feedback using the "?" widget (bottom-right corner) if you see a need or an issue. This helps us prioritize the development or improvement of features.
In object detection ML projects, labeling is the process of defining regions of interest in the frame.
Manually labeling images can become tedious and time-consuming, especially when dealing with huge datasets. This is why Edge Impulse studio provides an AI-assisted labeling tool to help you in your labeling workflows.
To use the labeling queue, you will need to set your Edge Impulse project as an "object detection" project. The labeling queue will only display the images that have not been labeled.
Currently, it only works to define bounding boxes (ingestion format used to train both MobileNetv2 SSD and FOMO models).
Can't see the labeling queue?
Go to Dashboard, and under 'Project info > Labeling method' select 'Bounding boxes (object detection)'.
The labeling queue supports four different operation modes:
Using YOLOv5.
Using your current impulse.
Using any pretrained object detection model.
Using object tracking.
Already have a labeled dataset?
By utilizing an existing library of pre-trained object detection models from YOLOv5 (trained with the COCO dataset), common objects in your images can quickly be identified and labeled in seconds without needing to write any code!
To label your objects with YOLOv5 classification, click the Label suggestions dropdown and select “Classify using YOLOv5.” If your object is more specific than what is auto-labeled by YOLOv5, e.g. “coffee” instead of the generic “cup” class, you can modify the auto-labels to the left of your image. These modifications will automatically apply to future images in your labeling queue.
Click Save labels to move on to your next raw image, and see your fully labeled dataset ready for training in minutes!
You can also use your own trained model to predict and label your new images. From an existing (trained) Edge Impulse object detection project, upload new unlabeled images from the Data Acquisition tab.
Currently, this only works with models trained with MobileNet SSD transfer learning.
From the “Labeling queue”, click the Label suggestions dropdown and select “Classify using ”:
You can also upload a few samples to a new object detection project, train a model, then upload more samples to the Data Acquisition tab and use the AI-Assisted Labeling feature for the rest of your dataset. Classifying using your own trained model is especially useful for objects that are not in YOLOv5, such as industrial objects, etc.
Click Save labels to move on to your next raw image, and see your fully labeled dataset ready for training in minutes using your own pre-trained model!
This only works with object detection models outputting bounding boxes. Centroid-based models (such as FOMO) won't work.
To label using a pretrained object detection model:
Create a new (second) Edge Impulse project.
Choose Upload your model.
Select your model file (e.g. in ONNX or TFLite format), tell a bit about your model, and verify that the model gives correct suggestions via "Check model behavior".
Click Save model.
While still in this (second) project:
Go to Data acquisition and upload your unlabeled dataset.
Click Labeling queue, and under 'Label suggestions' choose "Classify using 'your project name'". You now get suggestions based on your uploaded model:
When you're done labeling, go to Data acquisition > Export data and export your (now labeled) dataset.
Import the labeled dataset into your original project.
If you have objects that are a similar size or common between images, you can also track your objects between frames within the Edge Impulse Labeling Queue, reducing the amount of time needed to re-label and re-draw bounding boxes over your entire dataset.
Draw your bounding boxes and label your images, then, after clicking Save labels, the objects will be tracked from frame to frame:
Now that your object detection project contains a fully labeled dataset, learn how to train and deploy your model to your edge device: check out our tutorial!
We are excited to see what you build with the AI-Assisted Labeling feature in Edge Impulse, please post your project on our forum or tag us on social media, @Edge Impulse!
The CSV Wizard allows users with larger or more complex datasets to easily upload their data without having to worry about converting it to the .
To access the CSV Wizard, navigate to the Data Acquisition tab of your Edge Impulse project and click on the CSV Wizard button:
We can take a look at some sample data from a Heart Rate Monitor (Polar H10). We can see there is a lot of extra information we don’t need:
Choose a CSV file to upload and select "Upload File". The file will be automatically analyzed and the results will be displayed in the next step. Here I have selected an export from an HR monitor. You can try it out yourself by downloading this file:
When processing your data, we will check for the following:
Does this data contain a label?
Is this data time series data?
Is this data raw sensor data or processed features?
Is this data separated by a standard delimiter?
Is this data separated by a non-standard delimiter?
If there are settings that need to be adjusted (for the start of your data you can select Skip first x lines or No header, and adjust the delimiter), you can do so before selecting "Looks good, next".
Here you can select the timestamp column (or row) and the frequency of the timestamps. If you do not have a timestamp column, you can select No timestamp column and add a timestamp later. If you do have a timestamp column, you can select the timestamp format (e.g. full timestamp) and the frequency of the timestamps. Overriding is also possible via Override timestamp difference: for example, selecting 20000 (ms) gives a detected frequency of 0.05 Hz.
Here you can select the label column or row. If you do not have a label column, you can select No (no worries, you can provide this when you upload data) and add a label later. If you do have a label column, you can select Yes, it's "Value". You can also select the columns that contain your values.
How long do you want your samples to be?
In this section, you can set a length limit to your sample size. For example, if your CSV contains 30 seconds of data, when setting a limit of 3000ms, it will create 10 distinct data samples of 3 seconds.
How should we deal with multiple labels in a sample?
◉ The sample should have multiple labels
◯ Use the last value of "label" as the label for each sample (see the table on the right)
Congratulations! 🚀 You have successfully created a CSV transform with the CSV Wizard. You can now save this transform and use it to process your data.
Any CSV files that you upload into your project - whether it's through the uploader, the CLI, the API or through data sources - will now be processed according to the rules you set up with the CSV Wizard!
You can upload your existing data samples and datasets to your project directly through the Edge Impulse Studio Uploader.
The uploader signs local files and uploads them to the ingestion service. This is useful to upload existing data samples and entire datasets, or to migrate data between Edge Impulse instances.
The uploader currently handles these types of files:
.cbor - Files in the Edge Impulse data acquisition format. The uploader will not resign these files, only upload them.
.json - Files in the Edge Impulse data acquisition format. The uploader will not resign these files, only upload them.
.csv - Files in the Edge Impulse data acquisition format. If you have configured the CSV Wizard, those settings will be used to parse your CSV files.
.wav - Lossless audio files. It's recommended to use the same frequency for all files in your data set, as signal processing output might be dependent on the frequency.
.jpg and .png - Image files. It's recommended to use the same ratio for all files in your data set.
.mp4 and .avi - Video files. You can then, from the Studio, split these video files into images at a configurable frame rate.
info.labels - JSON-like file (without the .json extension). You can use it to add metadata and for custom labeling strategies (single-label vs multi-label, float value labels, etc.). See the info.labels description below.
The uploader currently handles these types of :
Need more?
To upload data using the uploader, go to the Data acquisition page and click on the uploader button as shown in the image below:
Bounding boxes?
Select individual files: This option lets you select multiple individual files within a single folder. If you want to upload images with bounding boxes, make sure to also select the label files.
Select a folder: This option lets you select one folder, including all the subfolders.
Select which category you want to upload your dataset into. Options are training, testing, or an 80/20 split between your data samples.
When a labeling method is not provided, the labels are automatically inferred from the filename through the following regex: ^[a-zA-Z0-9\s-_]+. For example, idle.01 will yield the label idle.
Thus, if you want to use labels (string values) containing float values (e.g. "0.01", "5.02", etc.), automatic labeling won't work.
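A quick sanity check of that rule in Python (the hyphen is escaped here because Python's re module is stricter about character ranges than the pattern as written):

```python
import re

# Label = everything the pattern matches at the start of the filename
pattern = re.compile(r"^[a-zA-Z0-9\s\-_]+")

print(pattern.match("idle.01.cbor").group())  # -> "idle"
print(pattern.match("0.01.csv").group())      # -> "0"  (why float-like labels don't work)
```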
Edge Impulse exporter format (info.labels files)
The Edge Impulse Exporter acquisition format provides a simple and intuitive way to store files and associated labels. Folders containing data in this format will take the following structure:
The subdirectories contain files in any Edge Impulse-supported format (see above). Each file represents a sample and is associated with its respective labels in the info.labels file.
The info.labels file (can be located in each subdirectory or at the folder root) provides detailed information about the labels. The file follows a JSON format, with the following structure (an example sketch follows the list below):
version: Indicates the version of the label format.
files: A list of objects, where each object represents a supported file format and its associated labels.
path: The path or file name.
category: Indicates whether the image belongs to the training or testing set.
label (optional): Provides information about the labeled objects.
type: Specifies the type of label - unlabeled, label, multi-label.
label (optional): The actual label or class name of the sample.
label: Label for the given period.
startIndex: Timestamp in milliseconds.
endIndex: Timestamp in milliseconds.
metadata (Optional): Additional metadata associated with the image, such as the site where it was collected, the timestamp or any useful information.
boundingBoxes (Optional): A list of objects, where each object represents a bounding box for an object within the image.
label: The label or class name of the object within the bounding box.
x, y: The coordinates of the top-left corner of the bounding box.
width, height: The width and height of the bounding box.
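A minimal sketch of an info.labels file assembled from the fields above; the file path, label, metadata and bounding box values are hypothetical, and multi-label entries are omitted, so export a dataset from an existing project to confirm the exact structure.

```python
import json

info = {
    "version": 1,
    "files": [
        {
            "path": "cars.01741.jpg",                     # hypothetical file
            "category": "training",                       # or "testing"
            "label": {"type": "label", "label": "cars"},  # single-label sample
            "metadata": {"site": "parking-lot-A"},        # optional
            "boundingBoxes": [                            # optional (object detection)
                {"label": "car", "x": 12, "y": 30, "width": 64, "height": 48}
            ],
        }
    ],
}

with open("info.labels", "w") as f:
    json.dump(info, f, indent=4)
```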
The Studio Uploader will automatically detect the info.labels file:
Image datasets can be found in a range of different formats. Different formats have different directory structures, and require annotations (or labels) to follow a particular structure. We support uploading data in many different formats in the Edge Impulse Studio.
Image datasets usually consist of a bunch of image files, and one (or many) annotation files, which provide labels for the images. Image datasets may have annotations that consist of:
A single-label: each image has a single label
Bounding boxes: used for object detection; images contain 'objects' to be detected, given as a list of labeled 'bounding boxes'
When you upload an image dataset, we try to automatically detect the format of that data (in some cases, we cannot detect it and you will need to manually select it).
Once the format of your dataset has been selected, click on Upload Data and let the Uploader parse your dataset:
Leave the data unlabeled; you can manually label your data samples in the Studio.
The Edge Impulse object detection acquisition format provides a simple and intuitive way to store images and associated bounding box labels. Folders containing data in this format will take the following structure:
The subdirectories contain image files in JPEG or PNG format. Each image file represents a sample and is associated with its respective bounding box labels in the bounding_boxes.labels file.
The bounding_boxes.labels file in each subdirectory provides detailed information about the labeled objects and their corresponding bounding boxes. The file follows a JSON format, with the following structure:
version: Indicates the version of the label format.
files: A list of objects, where each object represents an image and its associated labels.
path: The path or file name of the image.
category: Indicates whether the image belongs to the training or testing set.
label (optional): Provides information about the labeled objects.
type: Specifies the type of label (e.g., a single label).
label: The actual label or class name of the object.
metadata (Optional): Additional metadata associated with the image, such as the site where it was collected, the timestamp or any useful information.
boundingBoxes: A list of objects, where each object represents a bounding box for an object within the image.
label: The label or class name of the object within the bounding box.
x, y: The coordinates of the top-left corner of the bounding box.
width, height: The width and height of the bounding box.
bounding_boxes.labels example:
The COCO JSON (Common Objects in Context JSON) format is a widely used standard for representing object detection datasets. It provides a structured way to store information about labeled objects, their bounding boxes, and additional metadata.
A COCO JSON dataset can follow this directory structure:
The _annotations.coco.json file in each subdirectory provides detailed information about the labeled objects and their corresponding bounding boxes. The file follows a JSON format, with the following structure:
Categories
The "categories" component defines the labels or classes of objects present in the dataset. Each category is represented by a dictionary containing the following fields:
id: A unique integer identifier for the category.
name: The name or label of the category.
supercategory (Optional): A higher-level category that the current category belongs to, if applicable. This supercategory is not used or imported by the Uploader.
Images
The "images" component stores information about the images in the dataset. Each image is represented by a dictionary with the following fields:
id: A unique integer identifier for the image.
width: The width of the image in pixels.
height: The height of the image in pixels.
file_name: The file name or path of the image file.
Annotations
The "annotations" component contains the object annotations for each image. An annotation refers to a labeled object and its corresponding bounding box. Each annotation is represented by a dictionary with the following fields:
id: A unique integer identifier for the annotation.
image_id: The identifier of the image to which the annotation belongs.
category_id: The identifier of the category that the annotation represents.
bbox: A list representing the bounding box coordinates in the format [x, y, width, height].
area (Optional): The area (in pixels) occupied by the annotated object.
segmentation (Optional): The segmentation mask of the object, represented as a list of polygons.
iscrowd (Optional): A flag indicating whether the annotated object is a crowd or group of objects.
The Edge Impulse uploader currently doesn't import the area, segmentation and iscrowd fields.
_annotations.coco.json example:
The OpenImage dataset provides object detection annotations in CSV format. The _annotations.csv file is located in the same directory as the images it references. A class-descriptions.csv mapping file can be used to give short descriptions or human-readable classes for the MID LabelName.
An OpenImage CSV dataset usually has this directory structure:
Annotation Format:
Each line in the CSV file represents an object annotation.
The values in each line are separated by commas.
CSV Columns:
The CSV file typically includes several columns, each representing different attributes of the object annotations.
The common columns found in the OpenImage CSV dataset include:
ImageID: An identifier or filename for the image to which the annotation belongs.
Source: The source or origin of the annotation, indicating whether it was manually annotated or obtained from other sources.
LabelName: The class label of the object.
Confidence: The confidence score or probability associated with the annotation.
XMin, YMin, XMax, YMax: The coordinates of the bounding box that encloses the object, usually represented as the top-left (XMin, YMin) and bottom-right (XMax, YMax) corners.
IsOccluded, IsTruncated, IsGroupOf, IsDepiction, IsInside: Binary flags indicating whether the object is occluded, truncated, a group of objects, a depiction, or inside another object.
Currently, Edge Impulse only imports these fields:
Class Labels:
Each object in the dataset is associated with a class label.
The class labels in the OpenImage dataset are represented as LabelName in the CSV file.
The LabelName values correspond to specific object categories defined in the OpenImage dataset's ontology (MID).
Note that Edge Impulse does not enforce this ontology; if you have an existing dataset using the MID LabelName, simply provide a class-descriptions.csv mapping file to see your classes in Edge Impulse Studio.
Bounding Box Coordinates:
The bounding box coordinates define the normalized location and size of the object within the image.
The coordinates are represented as the X and Y pixel values for the top-left corner (XMin, YMin) and the bottom-right corner (XMax, YMax) of the bounding box.
class-descriptions.csv mapping file:
To be ingested in Edge Impulse, the mapping file name must end with *class-descriptions.csv.
_annotations.csv example:
The Pascal VOC (Visual Object Classes) format is another widely used standard for object detection datasets. It provides a structured format for storing images and their associated annotations, including bounding box labels.
A Pascal VOC dataset can follow this directory structure:
The Pascal VOC dataset XML format typically consists of the following components:
Image files: The dataset includes a collection of image files, usually in JPEG or PNG format. Each image represents a sample in the dataset.
Annotation files: The annotations for the images are stored in XML files. Each XML file corresponds to an image and contains the annotations for that image, including bounding box labels and class labels.
Class labels: A predefined set of class labels is defined for the dataset. Each object in the image is assigned a class label, indicating the category or type of the object.
Bounding box annotations: For each object instance in an image, a bounding box is defined. The bounding box represents the rectangular region enclosing the object. It is specified by the coordinates of the top-left corner, width, and height of the box.
Additional metadata: Pascal VOC format allows the inclusion of additional metadata for each image or annotation. This can include information like the source of the image, the author, or any other relevant details. The Edge Impulse uploader currently doesn't import these metadata.
The structure of an annotation file in Pascal VOC format typically follows this pattern:
cubes.23im33f2.xml:
The Plain CSV format is a very simple format: a CSV annotation file is stored in the same directory as the images. We support both "Single Label" and "Object Detection" labeling methods for this format.
A Plain CSV dataset can follow this directory structure:
Annotation Format:
Each line in the CSV file represents an object annotation.
The values in each line are separated by commas.
CSV Columns (Single Label):
The Plain CSV format (single Label) just contains the file_name and the class:
file_name: The filename of the image.
class_name: The class label or category of the image.
_annotations_single_label.csv example:
CSV Columns (Object Detection):
This Plain CSV format is similar to the TensorFlow Object Detection Dataset format. In this format, the CSV file contains the following columns:
file_name: The filename of the image.
classes: The class label or category of the object.
xmin: The x-coordinate of the top-left corner of the bounding box.
ymin: The y-coordinate of the top-left corner of the bounding box.
xmax: The x-coordinate of the bottom-right corner of the bounding box.
ymax: The y-coordinate of the bottom-right corner of the bounding box.
Each row represents an annotated object in an image. In the following example, there are three objects in cubes_training_0.jpg: a blue, a green and a red cube, two objects in cubes_training_1.jpg, etc... The bounding box coordinates are specified as the top-left corner (xmin, ymin) and the bottom-right corner (xmax, ymax).
_annotations_bounding_boxes.csv example:
The YOLO TXT format is a specific text-based annotation format mostly used in conjunction with the YOLO object detection algorithm. This format represents object annotations for an image in a plain text file.
File Structure:
Each annotation is represented by a separate text file.
The text file has the same base name as the corresponding image file.
The file extension is .txt.
Example:
Annotation Format:
Each line in the TXT file represents an object annotation.
Each annotation line contains space-separated values representing different attributes.
The attributes in each line are ordered as follows: class_label, normalized bounding box coordinates (center_x, center_y, width, height).
Class label:
The class label represents the object category or class.
The class labels are usually represented as integers, starting from 0 or 1.
Each class label corresponds to a specific object class defined in the dataset.
Normalized Bounding Box Coordinates:
The bounding box coordinates represent the location and size of the object in the image.
The coordinates are normalized to the range [0, 1], where (0, 0) represents the top-left corner of the image, and (1, 1) represents the bottom-right corner.
The normalized bounding box coordinates include the center coordinates (center_x, center_y) of the bounding box and its width and height.
The center coordinates (center_x, center_y) are relative to the width and height of the image, where (0, 0) represents the top-left corner, and (1, 1) represents the bottom-right corner.
The width and height are also relative to the image size.
Here's an example of a YOLO TXT annotation file format for a single object:
For instance, cubes-23im33f2.txt: each line represents a normalized bounding box for the corresponding cubes-23im33f2.jpg image.
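For illustration, a hedged sketch of the conversion from a pixel-space bounding box to one YOLO TXT line (the image size, box and class ID are made up):

```python
def to_yolo_line(class_id, x, y, w, h, img_w, img_h):
    # Normalize: center coordinates plus width/height, all relative to the image size
    cx = (x + w / 2) / img_w
    cy = (y + h / 2) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w / img_w:.6f} {h / img_h:.6f}"

# A 64x48 px box at (12, 30) in a 320x320 image, class 0:
print(to_yolo_line(0, 12, 30, 64, 48, 320, 320))
# -> "0 0.137500 0.168750 0.200000 0.150000"
```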
Mapping the Class Label:
The classes.txt, classes.names or data.yaml (used by the Roboflow YOLOv5 PyTorch export format) files contain configuration values used by the model to locate images and map class names to class_ids.
For example, with the cubes on a conveyor belt dataset and its classes.txt file:
Using the preprocessing blocks in your impulse - here we skip the embeddings, and just use your selected signal processing blocks to create the data explorer. This creates a similar visualization to the feature explorer, but in a 2D space and with extra labeling tools. This is very useful if you don't have any labeled data yet, or if you have new clusters of data that your neural network hasn't learned yet.
Use the uploader and select the 'Leave data unlabeled' option.
When uploading data through the ingestion API, set the x-no-label header to 1, and the x-label to an empty string.
So what are these embeddings actually? Let's imagine you have the model from the Continuous motion recognition tutorial. Here we slice data up in 2-second windows and run a signal processing step to extract features. Then we use a neural network to classify between motions. This network consists of:
If you have any questions about the data explorer or embeddings, we'd be happy to help on the forum, or reach out to your solutions engineer. Excited? Get access to the data explorer, and finally be able to label all that sensor data you've collected!
See the documentation.
Through .
Using the .
From the .
The WebUSB and the Edge Impulse daemon work with any fully supported device by flashing the pre-built Edge Impulse firmware to your board. See the list of .
If your device is not in the officially supported list, you can also collect data using the data forwarder by directly writing the sensor values over a serial connection. The data forwarder then signs the data and sends it to the ingestion service.
Edge Impulse also supports different image dataset annotation formats (Pascal VOC, YOLO TXT, COCO JSON, Edge Impulse Object Detection, OpenImage CSV) that you can import into your project to build your edge AI models.
In December 2023, we released the multi-label feature. See the dedicated page to understand how to import multi-label data samples.
You can change the default view (list) to a grid view to quickly overview your datasets by clicking on the icon.
See for more information.
The and the will only appear on your data acquisition page if you are dealing with object detection tasks.
Also, see our tutorial to see how to leverage the power of LLMs to automatically label your data samples based on simple prompts.
(enterprise feature)
(enterprise feature)
The data explorer gives you a one-look view of your dataset, letting you quickly label unknown data. If you enable this you'll also get an email with a screenshot of the data explorer whenever there's new data.
You can also define who can receive the email. The users have to be part of your project. See: .
If you are part of an , you can use your custom transformation jobs in the pipeline. In your organization workspace, go to Custom blocks -> Transformation and select Run job on the job you want to add.
If your dataset is in the CSV format and contains a label column, the CSV Wizard is probably the easiest method to import your tabular data.
Once your CSV Wizard is configured, you can use the , the or the .
Once you have your info.labels file available, to upload it, you can use the , the or the .
If none of the blocks from Edge Impulse fit your needs, you can modify them or develop from scratch to create a custom AI labeling block. This allows you to integrate your own models or prompts for unique project requirements. See the example below.
If you have a suggestion for an AI labeling block that you would like to see Edge Impulse develop, please let us know in our .
Select which data items in your dataset you want to label. You can use the metadata attached to your data samples to define your own labeling strategy.
Use a model from an existing Edge Impulse project to label images (classification or object detection). You can also upload your pretrained models to Edge Impulse using the .
Label audio samples with multiple labels per sample using an Audio Spectrogram Transformer (AST) model trained on AudioSet. Use only AudioSet labels (see for reference).
Try our FREE Enterprise Trial today.
AI labeling blocks follow the structure. Please refer to that documentation for further details. Inputs and outputs for the block are defined below.
Argument | Description |
---|
After cloning and modifying an existing AI labeling block or developing one from scratch, you can publish the block to your organization to make it available to everyone in your organization. Initialize the block and push it to your organization using the edge-impulse-blocks tool within the Edge Impulse CLI:
Update parameters.json to update the name and description of your block. See the example above in the custom blocks section.
No common issues have been identified thus far. If you encounter an issue, please reach out on the forum or, if you are an Enterprise customer, to your Solutions engineer.
If your dataset is in the CSV format and contains a label column, the CSV Wizard is probably the easiest method to import your multi-label data.
Once your CSV Wizard is configured, you can use the , the or the :
Check the section for multi-label public projects.
If you want to use the , you need to use the structured_labels.labels format:
You can have a look at this tutorial for a better understanding: .
If you already have a labeled dataset containing bounding boxes, you can use the to import your data.
See below.
If your CSV contains multiple labels, like in this , in the final step, select:
See the dedicated documentation page.
If none of these above choices are suitable for your project, you can also have a look at the Transformation blocks to parse your data samples to create a dataset supported by Edge Impulse. See
If you have existing bounding boxes for your images dataset, make sure your project's labeling method is set to Bounding Boxes (object detection); you can change this parameter in your project dashboard.
Then you need to upload any label files with your images. You can upload object detection datasets in any . Select both your images and the labels file when uploading to apply the labels. The uploader will try to automatically detect the right format.
If needed, you can always perform a split later from your .
To bypass this limitation, you can make an info.labels JSON file containing your dataset files' info. We also support adding metadata to your samples. See below to understand the structure.
labels (optional): The labels in the :
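As an illustration, the snippet below generates a minimal info.labels file for one labeled training image with metadata and a bounding box. The field names follow the Edge Impulse dataset annotation format as we understand it; treat this as a sketch and check the annotation format documentation for the authoritative schema.

```python
# Hedged sketch of an info.labels file: one training image with a label,
# custom metadata and a bounding box. Field names are based on the Edge
# Impulse annotation format docs; double-check them before importing data.
import json

info = {
    "version": 1,
    "files": [
        {
            "path": "images/factory-01.jpg",
            "category": "training",                       # or "testing" / "split"
            "label": {"type": "label", "label": "hard-hat"},
            "metadata": {"site": "Site A"},               # optional, arbitrary key/values
            "boundingBoxes": [                            # optional, object detection only
                {"label": "hard-hat", "x": 12, "y": 30, "width": 80, "height": 60}
            ],
        }
    ],
}

with open("info.labels", "w") as f:
    json.dump(info, f, indent=4)
```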
Want to try it yourself? You can export any dataset from once you have cloned it.
Want to try it yourself? Check this in the Edge Impulse Object Detection format. You can also retrieve this dataset from this . Data exported from an object detection project in Edge Impulse Studio uses this format.
Want to try it yourself? Check this in the COCO JSON format.
Here is an example of the mapping file:
Want to try it yourself? Check this in the OpenImage CSV format.
Want to try it yourself? Check this in the Pascal VOC format.
Want to try it yourself? Check this in the Plain CSV (object detection) format.
Want to try it yourself? Check this in the YOLOv5 format.
You can add arbitrary metadata to data items. You can use this for example to track on which site data was collected, where data was imported from, or where the machine that generated the data was placed. Some key use cases for metadata are:
Prevent leaking data between your train and validation set. See: Using metadata to control your train/validation split below.
Synchronisation actions in data pipelines, for example to remove data in a project if the source data was deleted in the cloud.
Get a better understanding of real-world accuracy by seeing how well your model performs when grouped by a metadata key. E.g. whether data on site A performs better than site B.
Metadata is shown on Data acquisition when you click on a data item. From here you can add, edit and remove metadata keys.
It's pretty impractical to manually add metadata to each data item, so the easiest way is to add metadata when you upload data. You can do this either by:
Providing an info file when uploading data (this works both in the CLI and in the Studio).
Setting the x-metadata header to a JSON string when calling the ingestion service:
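For example, a minimal sketch using Python's requests library is shown below. The ingestion endpoint, the x-api-key and x-label headers, and the multipart field name reflect the ingestion API as we understand it; verify them against the ingestion service documentation before relying on this.

```python
# Hedged sketch: upload one file to the ingestion service with metadata attached
# via the x-metadata header (a JSON string). Endpoint and header names should be
# checked against the ingestion API reference.
import json
import requests

API_KEY = "ei_..."                      # your project API key
FILE_PATH = "running-faucet.01.wav"

with open(FILE_PATH, "rb") as f:
    res = requests.post(
        "https://ingestion.edgeimpulse.com/api/training/files",
        headers={
            "x-api-key": API_KEY,
            "x-label": "faucet",
            "x-metadata": json.dumps({"site": "Site A", "collected_by": "rig-3"}),
        },
        files={"data": (FILE_PATH, f, "audio/wav")},
    )
res.raise_for_status()
print(res.status_code, res.text)
```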
You can read samples, including their metadata via the List samples API call, and then use the Set sample metadata API to update the metadata. For example, this is how you add a metadata field to the first data sample in your project using the Python API Bindings:
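A hedged sketch using the edgeimpulse_api package is shown below; the class and method names (RawDataApi, list_samples, set_sample_metadata) follow the generated API bindings as we understand them, so check the API reference for the exact signatures.

```python
# Hedged sketch: list samples with the Edge Impulse Python API bindings, then
# set a metadata key on the first one. Verify class/method names and request
# models against the API reference.
import edgeimpulse_api as ei_api

PROJECT_ID = 12345                      # your project ID
API_KEY = "ei_..."                      # your project API key

config = ei_api.Configuration(host="https://studio.edgeimpulse.com/v1")
config.api_key["ApiKeyAuthentication"] = API_KEY
client = ei_api.ApiClient(config)
raw_data = ei_api.RawDataApi(client)

# List samples (training category) and take the first one
samples = raw_data.list_samples(project_id=PROJECT_ID, category="training")
first_sample = samples.samples[0]

# Attach a metadata key/value pair to that sample
raw_data.set_sample_metadata(
    project_id=PROJECT_ID,
    sample_id=first_sample.id,
    set_sample_metadata_request=ei_api.SetSampleMetadataRequest(
        metadata={"site": "Site A"}
    ),
)
```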
When training an ML model we split your data into a train and a validation set. This is done so that during training you can evaluate whether your model works on data that it has seen before (train set) and on data that it has never seen before (validation set) - ideally your model performs similarly well on both data sets: a sign that your model will perform well in the field on completely novel data.
However, this can give a false sense of security if data that is very similar ends up in both your train and validation set ("data leakage"). For example:
You split a video into individual frames. These images don't differ much from frame to frame, and you don't want some frames in the train set and others in the validation set.
You're building a sleep staging algorithm and look at 30-second windows. From window to window the data for one person looks similar, so you don't want one window in the train set and another in the validation set for the same person on the same night.
By default, we randomly split your training data into a train and a validation set (80/20 split). This does not prevent data leakage, but if you tag your data items with metadata you can avoid it. To do so:
Tag all your data items with metadata.
Go to any ML block and, under Advanced training settings, set 'Split train/validation set on metadata key' to a metadata key (e.g. video_file).
Now every data item with the same metadata value for video_file will always be grouped together in either the train or the validation set, so there is no more data leakage.
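To make the idea concrete, here is a small illustration (not Edge Impulse's internal implementation) of what splitting on a metadata key achieves: all samples that share the same video_file value end up in the same split.

```python
# Illustration only: group-aware train/validation split on a metadata key.
# Samples sharing the same metadata value always land in the same split.
import random
from collections import defaultdict

samples = [
    {"id": 1, "metadata": {"video_file": "clip_a.mp4"}},
    {"id": 2, "metadata": {"video_file": "clip_a.mp4"}},
    {"id": 3, "metadata": {"video_file": "clip_b.mp4"}},
    {"id": 4, "metadata": {"video_file": "clip_c.mp4"}},
]

def split_on_metadata(samples, key, validation_fraction=0.2, seed=42):
    groups = defaultdict(list)
    for s in samples:
        groups[s["metadata"][key]].append(s)
    group_keys = sorted(groups)
    random.Random(seed).shuffle(group_keys)
    n_val = max(1, int(len(group_keys) * validation_fraction))
    val_keys = set(group_keys[:n_val])
    train = [s for k in group_keys if k not in val_keys for s in groups[k]]
    validation = [s for k in val_keys for s in groups[k]]
    return train, validation

train, validation = split_on_metadata(samples, "video_file")
print(len(train), "train samples /", len(validation), "validation samples")
```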
This feature is no longer available
On November 18, 2024, we replaced the auto-labeler with a new AI-enabled labeling flow, which allows prompt-based labeling (and much more).
See the AI labeling documentation page.
Our auto-labeling feature relies on the Segment Anything foundation model. It creates embeddings and segmentation maps for your image datasets, then clusters (groups) these embeddings based on your settings. In the Studio, you can then associate a label with a cluster, and labeled bounding boxes will automatically be created around each of the objects present in that cluster.
We developed this feature to ease your labeling tasks in your object detection projects.
Only available with Edge Impulse Professional and Enterprise Plans
Try our Professional Plan or FREE Enterprise Trial today.
Also, see our Label image data using GPT-4o tutorial to learn how to leverage the power of LLMs to automatically label your data samples based on simple prompts.
Make sure your project belongs to an organization. See transfer ownership for more info.
Make sure your project is configured as an object detection project. You can change the labeling method in your project's dashboard. See Dashboard for more info.
Add some images to your project, either by collecting data or by uploading existing datasets. See Data acquisition for more info.
You should now be able to see the Auto-labeler tab in your Data acquisition view:
Which items to include:
All data items present in your dataset
Data items in the labeling queue
Data items without a given class
Minimum object size (pixels):
Objects smaller than this value are discarded; for example, an object of 20x10 pixels is 200 pixels.
Maximum object size (pixels):
Objects bigger than this value are discarded; for example, an object of 150x100 pixels is 15,000 pixels.
Sim threshold:
The Sim threshold corresponds to similarity, where 1.0 means items are exactly the same and 0.0 means they are totally different. Ideal values are usually between 0.9 and 0.999. Lower this value if you have too many clusters, or increase it if you notice that different objects end up in the same cluster.
Click on Run the auto-labeler to generate the segmentation maps and the clusters.
Note that this process is slow (a few seconds per image, even on GPUs). However, the results are strongly cached, so after the first run your iterations will be much faster. This lets you adjust the settings with less friction.
Once the process is finished, you will be redirected to a new page to associate a label with a cluster:
Select your class, or create a new one, for each of the clusters you want to label, and click Save the labels once you are happy with the result.
Do not hesitate to go back and adjust the parameters if you don't see a clear separation between the clusters, if very different objects end up in the same cluster, or if you have too many clusters.
Each project is different. To write this documentation page, we collected images containing several dice. This dataset can be used in several ways: you can label the dice only, the dice colors, or the dice figures.
You can find the dataset, with the dice labeled per color in this public project.
To adjust the granularity, you can use the Sim threshold parameter.
Here the Sim threshold is set to 0.915
Here the Sim threshold is set to 0.945
Here the Sim threshold is set to 0.98
Voilà! Now that you have labeled your dataset, you can create an Impulse and train your object detection project.
In the public project shared above, here are the results of the trained model using the mobile phone deployment option: