AI labeling

The AI labeling feature is an extensible way to integrate existing AI models into your workflow and use them to label your datasets automatically. You can use the ready-made blocks provided by Edge Impulse or develop custom ones to meet your specific needs. Whether you are labeling images, bounding boxes, or audio samples, AI labeling blocks can save you time and make your labels more consistent.

AI labeling actions

To exit an AI labeling action configuration and return to the overview page, you can click on the < button found to the left of the block configuration title (AI Labeling - Step 1) or click the AI labeling tab.

You can create multiple AI labeling actions that contain one or more AI labeling blocks, each with different prompts, parameters and filters. From the AI labeling actions overview page you can add new actions, delete existing ones, access their configurations, or run them directly.

AI labeling blocks

There are several AI labeling blocks that have been developed by Edge Impulse and are available for your use. These are listed below with links to their associated code in public GitHub repositories:

If none of the blocks from Edge Impulse fit your needs, you can modify one of them or develop a custom AI labeling block from scratch. This allows you to integrate your own models or prompts for unique project requirements. See the Custom AI labeling blocks section below.

If you have a suggestion for an AI labeling block that you would like to see Edge Impulse develop, please let us know in our forum.

Configuration

To begin, proceed to the Data acquisition view and ensure you have data samples in the Dataset tab. Then, continue to the AI labeling tab.

Click on an existing AI labeling action to enter the configuration view for that action. If you do not yet have an AI labeling action, you can create one using the + Add new label action button.

Select an AI labeling block

The first step is to select an AI labeling block that you would like to use. By default, blocks that are not compatible with your data modality or labeling objective are greyed out. Once you have selected an AI labeling block, the parameters specific to that block are presented.

Some blocks require an API key to interact with other providers, such as OpenAI or Hugging Face. You can set your API key directly in the AI labeling block configuration panel the first time you use the block. The key you enter will be stored in Secrets. Once created, the key value will no longer be visible anywhere in the platform.

To manage your secrets as an Enterprise customer, go to your organization and select the Secrets menu item. If you are not an Enterprise customer, you can access your secrets through the settings in your developer profile: click on your avatar and go to Account settings -> Secrets.

Add multiple AI labeling blocks

You can chain several AI labeling blocks together to create an AI labeling action with multiple steps. For example, you can first use a zero-shot object detector to automatically detect high-level objects within an image, then follow this with a step that re-labels the bounding boxes with more precise labels or removes them entirely.

To add multiple AI labeling blocks, click on the button at the bottom of the block configuration panel to add an extra step.

Filter which data to label

Select which data items in your dataset you want to label. You can use the metadata attached to your data samples to define your own labeling strategy.

Preview

Tip: If you want to change the number of data samples or the number of columns shown in the preview, click on the view settings icon. Changing the number of columns can be useful for object detection use cases where your objects are small and you want to see larger images.

Before running the AI labeling action on your entire dataset, we recommend previewing the label results on a small subset of your dataset. This helps you validate your prompt and parameters so that you can iterate faster.

When you click the Label preview data button, the proposed changes are staged but not applied to your dataset.

Set metadata (optional)

You can add metadata such as ai-labeled: true, labeling-source: GPT-4o or labeled-on: Nov 2024 that will be set after running the AI labeling action. This is particularly useful if you plan to add more data samples over time and need to filter out your already-labeled samples.
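To illustrate why setting metadata helps with later filtering, here is a minimal sketch of the idea in Python. It is not how the platform implements its filter: samples are represented as plain dictionaries, and the assumption that metadata values are stored as strings (for example "true") is ours.

```python
def unlabeled_samples(samples, key="ai-labeled", labeled_value="true"):
    """Return the samples whose metadata does not mark them as AI-labeled.

    `samples` is a hypothetical list of dicts with an optional "metadata" dict;
    the key and value mirror the ai-labeled: true example above.
    """
    return [
        sample for sample in samples
        if (sample.get("metadata") or {}).get(key) != labeled_value
    ]


# Illustrative data only: two samples, one already labeled by an earlier run.
samples = [
    {"id": 1, "metadata": {"ai-labeled": "true", "labeling-source": "GPT-4o"}},
    {"id": 2, "metadata": {}},
]
remaining = unlabeled_samples(samples)
print([sample["id"] for sample in remaining])
```

Only the sample without the ai-labeled metadata is selected, which is the behavior you would want when labeling newly added data.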

Run the labeling process

Once you are satisfied with your configuration, click on the Label all data button. This will run the AI labeling action and apply the labeling updates to your dataset.

Examples

Bounding box labeling with OWL-ViT

A zero-shot object detector that uses OWL-ViT to label objects with bounding boxes. For complex objects, pair with "Bounding box re-labeling with GPT-4o" to refine labels.

Bounding box re-labeling with GPT-4o

OpenAI API key needed

Take existing bounding boxes (e.g. from a zero-shot object detector) and use GPT-4o to re-label or remove them as needed. This can be configured as a two-step process in a single AI labeling action.

Image labeling with GPT-4o

OpenAI API key needed

Use GPT-4o to apply a single label to images. Customize prompts to return a single label, for example “Is there a person in this picture? Answer with 'yes' or 'no'.”
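A prompt that constrains the answer to a fixed set of words makes the response easy to turn into a label. The sketch below shows one way this mapping could work. It is not the code used by the Edge Impulse block: the function name, the cleanup rules, and the fallback label "other" are all assumptions made for illustration.

```python
def answer_to_label(answer, allowed_labels, other_label="other"):
    """Map a free-text model answer onto one of the allowed labels.

    Whitespace, quotes, and trailing punctuation are stripped before comparing.
    Anything that does not match exactly falls back to `other_label`.
    """
    cleaned = answer.strip().strip("'\".!").lower()
    for label in allowed_labels:
        if cleaned == label.lower():
            return label
    return other_label


print(answer_to_label(" Yes. ", ["yes", "no"]))
print(answer_to_label("I cannot tell", ["yes", "no"]))
```

An answer that strays from the requested format ends up under the fallback label, so these samples are easy to find and review.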

Image labeling with pretrained models

Use a model from an existing Edge Impulse project to label images (classification or object detection). You can also upload your pretrained models to Edge Impulse using the BYOM (Bring Your Own Model) feature.

Audio labeling with AudioSet

Hugging Face API key needed

Label audio samples with multiple labels per sample using an Audio Spectrogram Transformer (AST) model trained on AudioSet. Use only AudioSet labels (see AudioSet Dataset for reference).

Custom AI labeling blocks

Only available with Edge Impulse Enterprise Plans

Try our FREE Enterprise Trial today.

Interface

AI labeling blocks follow the Transformation blocks structure. Please refer to that documentation for further details. Inputs and outputs for the block are defined below.

Inputs

The parameters defined in your parameters.json file will be passed into your block as command line arguments. For example, a parameter named other-label will be passed to your block as --other-label <value>.

In addition to the parameters defined by you, the following arguments will be automatically passed to your AI labeling block.

| Argument | Description |
| --- | --- |
| --data-ids-file <file> | Always passed. Provides an ids.json file that lists the data sample IDs to operate on as integers. |
| --propose-actions <job-id> | Passed only when the user wants to preview label changes. If passed, changes should be staged and not directly applied. Provides a job ID as an integer. See Preview mode. |
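As a minimal sketch, a Python block could parse these arguments with argparse as shown below. The --other-label parameter and its default value are hypothetical and stand in for whatever you define in parameters.json. The preview flag is spelled --propose-actions here, following the Preview mode section.

```python
import argparse


def parse_block_args(argv=None):
    """Parse the command line arguments passed to an AI labeling block."""
    parser = argparse.ArgumentParser(description="Example AI labeling block")
    # Hypothetical user-defined parameter from parameters.json.
    parser.add_argument("--other-label", type=str, default="other")
    # Always passed: path to the JSON file listing the sample IDs to process.
    parser.add_argument("--data-ids-file", type=str, required=True)
    # Passed only in preview mode: job ID under which changes are staged.
    parser.add_argument("--propose-actions", type=int, default=None)
    return parser.parse_args(argv)


# Example invocation with an explicit argument list instead of sys.argv.
args = parse_block_args(["--data-ids-file", "ids.json", "--propose-actions", "123"])
preview_mode = args.propose_actions is not None
print(args.data_ids_file, args.propose_actions, args.other_label, preview_mode)
```

argparse converts dashes to underscores, so --propose-actions becomes args.propose_actions. Comparing against None rather than relying on truthiness avoids treating a job ID of 0 as "not in preview mode".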

Secrets are passed to your block as environment variables. You can declare additional required environment variables using requiredEnvVariables in the info section of the parameters.json file, as in the following example. Their values can then be set within your organization by editing the block in the AI labeling section found under custom blocks.

{
    "version": 1,
    "type": "ai-action",
    "info": {
        "name": "<block-name>",
        "description": "<block-description>",
        "requiredEnvVariables": [
            "<required-env-var-1>",
            "<required-env-var-2>"
        ],
        "operatesOn": [
            "<data-type>"
        ]
    },
    "parameters": [
        { ... },
        { ... }
    ]
}

<data-type> options:

images_object_detection | images_single_label | audio | other
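Because required environment variables are declared in parameters.json, a block can check at startup that they are actually set. The following is a sketch under the assumption that the block can read its own parameters.json at runtime. The variable names in the example are placeholders, not names required by Edge Impulse.

```python
import json
import os


def load_parameters(path="parameters.json"):
    """Load the block definition from parameters.json."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def missing_env_variables(parameters, environ=None):
    """Return the names in info.requiredEnvVariables that are unset or empty."""
    environ = os.environ if environ is None else environ
    required = parameters.get("info", {}).get("requiredEnvVariables", [])
    return [name for name in required if not environ.get(name)]


# Illustrative definition and environment (placeholder names and values).
example = {"info": {"requiredEnvVariables": ["OPENAI_API_KEY", "HF_API_KEY"]}}
missing = missing_env_variables(example, environ={"OPENAI_API_KEY": "placeholder"})
print(missing)
```

In a real block you would call missing_env_variables(load_parameters()) and exit with a clear error message if the returned list is not empty.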

Example ids.json (the file passed via --data-ids-file):

{
    "ids": [id1, id2, ..., idn]
}
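Reading this file takes only a few lines. The sketch below loads the IDs and checks that each one is an integer, as described for --data-ids-file. The helper name and the temporary file used for the demonstration are our own.

```python
import json
import os
import tempfile


def load_sample_ids(data_ids_file):
    """Read the list of sample IDs from the file passed via --data-ids-file."""
    with open(data_ids_file, "r", encoding="utf-8") as f:
        payload = json.load(f)
    ids = payload["ids"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not all(isinstance(i, int) and not isinstance(i, bool) for i in ids):
        raise ValueError("expected every entry in 'ids' to be an integer")
    return ids


# Demonstration with a temporary file containing made-up IDs.
with tempfile.TemporaryDirectory() as tmp_dir:
    ids_path = os.path.join(tmp_dir, "ids.json")
    with open(ids_path, "w", encoding="utf-8") as f:
        json.dump({"ids": [101, 102, 103]}, f)
    sample_ids = load_sample_ids(ids_path)

print(sample_ids)
```

Your block would then loop over the returned IDs and fetch, label, and update each sample through the API.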

Outputs

There are no required outputs from the AI labeling block. In general, all changes are applied inside the block itself using API calls.

Push to Edge Impulse

After cloning and modifying an existing AI labeling block or developing one from scratch, you can publish the block to your organization to make it available to everyone in your organization. Initialize the block and push it to your organization using the edge-impulse-blocks tool within the Edge Impulse CLI:

Edit parameters.json to set the name and description of your block. See the example in the Interface section above.

Initialize and push the block:

$ edge-impulse-blocks init

? Choose a type of block
Transformation block
Synthetic data block
❯ AI labeling block
Deployment block
DSP block
Machine learning block

$ edge-impulse-blocks push

Your AI labeling block is available in your organization. To run the block, open or create a project belonging to your organization and go to Data acquisition > AI labeling and create an AI labeling action that uses your block.

Preview mode

AI labeling blocks can run in 'preview' mode (triggered when you click Label preview data within an AI labeling action configuration). The changes are staged but not directly applied.

For preview mode, the --propose-actions <job-id> flag and argument are passed into your block. When you see this flag you should not apply changes directly to the data samples (e.g. via raw_data_api.set_sample_bounding_boxes or raw_data_api.set_sample_structured_labels) but rather use the raw_data_api.set_sample_proposed_changes API call.

# Python
if args.propose_actions:
    # Preview mode: stage the labels and metadata under the given job ID
    # instead of writing them to the sample.
    raw_data_api.set_sample_proposed_changes(project_id=project_id, sample_id=sample.id, set_sample_proposed_changes_request={
        'jobId': args.propose_actions,
        'proposedChanges': {
            'structuredLabels': structured_labels,
            'metadata': new_metadata
        }
    })
else:
    # Normal run: apply the labels and the metadata directly to the sample.
    raw_data_api.set_sample_structured_labels(
        project_id, sample.id, set_sample_structured_labels_request={
            'structuredLabels': structured_labels
        }
    )
    raw_data_api.set_sample_metadata(project_id=project_id, sample_id=sample.id, set_sample_metadata_request={
        'metadata': new_metadata
    })

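// JavaScript / TypeScript
if (proposeActionsJobId) {
    // Preview mode: stage the new bounding boxes under the given job ID.
    await api.rawData.setSampleProposedChanges(project.id, sample.id, {
        jobId: proposeActionsJobId,
        proposedChanges: {
            boundingBoxes: newBbs,
        },
    });
}
else {
    // Normal run: apply the new bounding boxes directly to the sample.
    await api.rawData.setSampleBoundingBoxes(project.id, sample.id, {
        boundingBoxes: newBbs,
    });
}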
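One way to verify this branching without touching real data is to wrap it in a function and pass in a stand-in API object that only records calls. The sketch below does that. RecordingRawDataApi and write_labels are hypothetical helpers, not part of the Edge Impulse SDK, and the label payload is placeholder data. Only the two method names and their request keys are taken from the Python snippet above.

```python
class RecordingRawDataApi:
    """Hypothetical stand-in that records calls instead of contacting the API."""

    def __init__(self):
        self.calls = []

    def set_sample_proposed_changes(self, project_id, sample_id,
                                    set_sample_proposed_changes_request):
        self.calls.append(("proposed", sample_id, set_sample_proposed_changes_request))

    def set_sample_structured_labels(self, project_id, sample_id,
                                     set_sample_structured_labels_request):
        self.calls.append(("applied", sample_id, set_sample_structured_labels_request))


def write_labels(raw_data_api, project_id, sample_id, structured_labels,
                 propose_job_id=None):
    """Stage labels when a preview job ID is given, otherwise apply them."""
    if propose_job_id is not None:
        raw_data_api.set_sample_proposed_changes(
            project_id=project_id, sample_id=sample_id,
            set_sample_proposed_changes_request={
                "jobId": propose_job_id,
                "proposedChanges": {"structuredLabels": structured_labels},
            })
    else:
        raw_data_api.set_sample_structured_labels(
            project_id, sample_id,
            set_sample_structured_labels_request={
                "structuredLabels": structured_labels,
            })


api = RecordingRawDataApi()
labels = [{"label": "placeholder"}]  # placeholder payload, not a documented schema
write_labels(api, project_id=1, sample_id=42, structured_labels=labels, propose_job_id=7)
write_labels(api, project_id=1, sample_id=43, structured_labels=labels)
print([kind for kind, _, _ in api.calls])
```

The first call is staged under job ID 7 and the second is applied directly, which matches the behavior expected from a block in preview mode and in a normal run.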

Troubleshooting

No common issues have been identified thus far. If you encounter an issue, please reach out on the forum or, if you are an Enterprise customer, to your Solutions engineer.

Last updated 8 months ago