Auto-labeler
Our auto-labeling feature relies on the Segment Anything foundation model. It creates embeddings or segmentation maps for your image datasets, then clusters (groups) these embeddings based on your settings. In the Studio, you can then associate a label with a cluster, and the auto-labeler automatically creates labeled bounding boxes around each of the objects in that cluster.
We developed this feature to ease your labeling tasks in your object detection projects.
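As an illustration of the last step, turning a segmentation map into a bounding box, here is a minimal sketch. It assumes a mask is a rectangular grid of 0/1 values and that a box is stored as (x, y, width, height); `mask_to_bounding_box` is a hypothetical helper written for this page, not part of the Studio or its API.

```python
from typing import List, Optional, Tuple

# Assumed layout for this sketch: (x, y, width, height) in pixels.
BoundingBox = Tuple[int, int, int, int]


def mask_to_bounding_box(mask: List[List[int]]) -> Optional[BoundingBox]:
    """Return the tightest box around all non-zero pixels, or None for an empty mask."""
    # Rows that contain at least one object pixel.
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    # Columns that contain at least one object pixel.
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    x_min, x_max = cols[0], cols[-1]
    y_min, y_max = rows[0], rows[-1]
    return (x_min, y_min, x_max - x_min + 1, y_max - y_min + 1)


if __name__ == "__main__":
    example_mask = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0],
    ]
    print(mask_to_bounding_box(example_mask))
```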
Only available for enterprise customers
- 1.
- 2. Make sure your project is configured as an object detection project. You can change the labeling method in your project's dashboard. See Dashboard for more info.
- 3. Add some images to your project, either by collecting data or by uploading existing datasets. See Data acquisition for more info.
- 4. You should now be able to see the Auto-labeler tab in your Data acquisition view:

Data acquisition with auto-labeler available when using an enterprise project
Which items to include:
- All data items present in your dataset
- Data items in the labeling queue
- Data items without a given class
Minimum object size (pixels):
Objects smaller than this value are discarded. Size is the object's area in pixels: for example, an object of 20x10 pixels has a size of 200 pixels.
Maximum object size (pixels):
Objects bigger than this value are discarded. For example, an object of 150x100 pixels has a size of 15,000 pixels.
Sim threshold:
The Sim threshold corresponds to the "similarity" between items, where 1.0 means items are exactly the same and 0.0 means they are totally different. Ideal values are usually between 0.9 and 0.999. Lower this value if you have too many clusters, or increase it if you notice that different objects end up in the same cluster.

Auto-labeler settings
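To make the minimum and maximum object size settings concrete, here is a minimal sketch of the filtering arithmetic. It assumes size is computed as width times height, as in the 20x10 and 150x100 examples above, and that values exactly equal to a limit are kept; `filter_by_area` and the (x, y, width, height) box layout are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

# Assumed layout for this sketch: (x, y, width, height) in pixels.
BoundingBox = Tuple[int, int, int, int]


def filter_by_area(
    boxes: List[BoundingBox], min_area: int, max_area: int
) -> List[BoundingBox]:
    """Keep boxes whose pixel area lies within [min_area, max_area]."""
    kept = []
    for box in boxes:
        _, _, width, height = box
        area = width * height  # e.g. 20 x 10 -> 200 pixels
        if min_area <= area <= max_area:
            kept.append(box)
    return kept


if __name__ == "__main__":
    candidates = [
        (0, 0, 5, 5),      # 25 pixels: smaller than the minimum
        (0, 0, 20, 10),    # 200 pixels
        (0, 0, 150, 100),  # 15,000 pixels
        (0, 0, 200, 100),  # 20,000 pixels: bigger than the maximum
    ]
    print(filter_by_area(candidates, min_area=200, max_area=15_000))
```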
Click on Run the auto-labeler to generate the segmentation maps and the clusters.
Note that this process is slow (a few seconds per image, even on GPUs). However, we apply a strong cache on the results, so once you have run the auto-labeler once, your next iterations will be much faster. This allows you to change the settings with less friction.
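The idea behind such a cache can be sketched as follows. This is not how the Studio implements it; it is a minimal illustration that assumes results are keyed by a hash of the image bytes, so the slow step runs once per image and later runs with different settings reuse the stored result. `SegmentationCache` and the `segment` callable are hypothetical names.

```python
import hashlib
from typing import Callable, Dict, List


class SegmentationCache:
    """Store the result of a slow per-image step, keyed by image content."""

    def __init__(self, segment: Callable[[bytes], List[dict]]) -> None:
        self._segment = segment  # the slow step (placeholder in this sketch)
        self._results: Dict[str, List[dict]] = {}
        self.misses = 0  # number of times the slow step actually ran

    def get(self, image_bytes: bytes) -> List[dict]:
        key = hashlib.sha256(image_bytes).hexdigest()
        if key not in self._results:
            self.misses += 1
            self._results[key] = self._segment(image_bytes)
        return self._results[key]


if __name__ == "__main__":
    def fake_segment(image_bytes: bytes) -> List[dict]:
        # Stand-in for the slow segmentation step.
        return [{"size_in_bytes": len(image_bytes)}]

    cache = SegmentationCache(fake_segment)
    cache.get(b"image-1")  # slow path
    cache.get(b"image-1")  # served from the cache
    print(cache.misses)
```

Because the cached value does not depend on the clustering settings, changing the size limits or the Sim threshold only repeats the cheap steps.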
Once the process is finished, you will be redirected to a new page to associate a label with a cluster:

Add a label to a cluster
Select your class or create a new one for each of the clusters you want to label, then click on Save the labels once you are happy with them.
Do not hesitate to go back and adjust the parameters if you don't see a clear separation between the clusters, if objects that are too different end up in the same cluster, or if you have too many clusters.
Each project is different. To write this documentation page, we collected images containing several dice. This dataset can be used in several ways: you can label the dice only, the dice color, or the dice figures.
To adjust the granularity, you can use the Sim threshold parameter.
Here we set the Sim threshold to 0.915:

Auto-labeler clusters
Here we set the Sim threshold to 0.945:

Auto-labeler clusters
Here we set the Sim threshold to 0.98:

Auto-labeler clusters
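The effect of the Sim threshold on granularity can be illustrated with a small sketch. The clustering method used by the auto-labeler is not described on this page, so the code below swaps in a simple stand-in: cosine similarity between embeddings and a greedy grouping against the first member of each cluster. The function names and the toy 2D embeddings are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_by_threshold(
    embeddings: Sequence[Sequence[float]], sim_threshold: float
) -> List[List[int]]:
    """Greedy grouping: join the first cluster whose representative is similar enough."""
    clusters: List[List[int]] = []
    for index, embedding in enumerate(embeddings):
        for cluster in clusters:
            representative = embeddings[cluster[0]]
            if cosine_similarity(embedding, representative) >= sim_threshold:
                cluster.append(index)
                break
        else:
            # No existing cluster was similar enough: start a new one.
            clusters.append([index])
    return clusters


if __name__ == "__main__":
    toy_embeddings = [(1.0, 0.0), (0.99, 0.1), (0.0, 1.0)]
    for threshold in (0.9, 0.999):
        print(threshold, cluster_by_threshold(toy_embeddings, threshold))
```

With this stand-in, a lower threshold merges the two nearby embeddings into one cluster, while a higher threshold keeps them apart, which matches the guidance to lower the value when there are too many clusters.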
Voilà! Now that you have labeled your dataset, you can create an Impulse and train your object detection project.
In the public project shared above, here are the results of the trained model using the mobile phone deployment option:

Model trained with FOMO