Generate audio event datasets
Last updated
Use the Eleven Labs block to generate audio data other than human voice, such as sound effects and acoustic events. In this guide and video, we explore how generative AI can produce training data for small edge AI models. This is particularly useful when high-quality, diverse training data is scarce or expensive to collect.
There is also a video version of this guide:
Prerequisites
Eleven Labs: account and API Key
Only available with Edge Impulse Enterprise Plan
Try our FREE Enterprise Trial today.
Transformation blocks are one of the most advanced features Edge Impulse provides and can be complex to set up. Feel free to ask your customer solution engineer for help and examples. Our engineers have set up complex pipelines for customers and have built up considerable expertise with transformation blocks.
Introduction
Edge AI enables smart devices to perform machine learning tasks right at the source of the data collection. These small models perform well on narrowly scoped tasks, as long as they are trained on quality data. As the saying goes, "garbage in, garbage out."
Data Quality and Diversity
To help us improve the quality of our sound datasets, we've been working on an integration between Edge Impulse and ElevenLabs.io. Edge Impulse excels in the creation and optimization of Edge AI models, while Eleven Labs offers advanced capabilities to create realistic sound effects. This integration allows us to expand our datasets with sounds that may be difficult or expensive to record naturally. This approach not only saves time and money but also enhances the accuracy and reliability of the models we deploy on edge devices.
Practical Application: Glass Breaking Sound
The demonstration in the video focuses on a practical application: detecting the sound of glass breaking. This can be used in a smart security system, or in a factory to detect incidents.
Getting Started
Steps to Generate Sound Samples with ElevenLabs.io
Navigate to Data Acquisition: Once you're in your project, navigate to the Data Acquisition section and go to Data Sources. Under Data Sources, you can add a new one and select the transformation block for sound generation using generative AI.
Step 1: Generate a Sound Sample with ElevenLabs.io
First, get your Eleven Labs API Key. Navigate to the Eleven Labs web interface to get your key and test your prompt.
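If you prefer to test your key and prompt from a script rather than the web interface, the request can be sketched roughly as follows. This is a minimal sketch, not the block's actual implementation: the endpoint URL, the `xi-api-key` header, and the `text` field are assumptions based on the public Eleven Labs API and should be checked against the current API reference. The environment variable name `ELEVENLABS_API_KEY`, the function names, and the output file name are hypothetical.

```python
import json
import os
import urllib.request

# Assumed endpoint; verify against the current Eleven Labs API reference.
SOUND_GENERATION_URL = "https://api.elevenlabs.io/v1/sound-generation"


def build_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for one sound effect."""
    body = json.dumps({"text": prompt}).encode("utf-8")
    return urllib.request.Request(
        SOUND_GENERATION_URL,
        data=body,
        # Assumed header name for the API key.
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def generate_sound(api_key: str, prompt: str, timeout: float = 60.0) -> bytes:
    """Send the request and return the raw audio bytes from the response."""
    request = build_request(api_key, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # Hypothetical variable name; keep the key out of source control.
    key = os.environ.get("ELEVENLABS_API_KEY")
    if key:
        audio = generate_sound(key, "glass breaking")
        # The response is assumed to be MP3 audio.
        with open("glass_breaking_test.mp3", "wb") as f:
            f.write(audio)
```

Keeping the request construction separate from the network call makes it possible to inspect the URL, headers, and body before spending any API credits.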
Step 2: Define the Prompt
Here we want to collect a glass breaking or impact sound. Navigate to the settings, enter the prompt "glass breaking", and set the length (e.g., 2 seconds) and the prompt influence.
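The three settings above (prompt, length, and prompt influence) can be represented as a small validated structure, which also makes it easy to enumerate several combinations for a more varied dataset. This is an illustrative sketch only: the field names `text`, `duration_seconds`, and `prompt_influence` and the numeric limits are assumptions about the Eleven Labs API, and the helper names and the second prompt are hypothetical.

```python
from itertools import product

# Assumed limits; check the Eleven Labs documentation for the actual ranges.
MIN_SECONDS, MAX_SECONDS = 0.5, 22.0


def sound_settings(prompt: str, duration_seconds: float = 2.0,
                   prompt_influence: float = 0.3) -> dict:
    """Validate one combination of settings and return it as a dict."""
    prompt = prompt.strip()
    if not prompt:
        raise ValueError("prompt must not be empty")
    if not MIN_SECONDS <= duration_seconds <= MAX_SECONDS:
        raise ValueError(f"duration_seconds out of range: {duration_seconds}")
    if not 0.0 <= prompt_influence <= 1.0:
        raise ValueError(f"prompt_influence out of range: {prompt_influence}")
    return {
        "text": prompt,
        "duration_seconds": duration_seconds,
        "prompt_influence": prompt_influence,
    }


def settings_grid(prompts, durations, influences) -> list:
    """Return validated settings for every combination of the inputs."""
    return [sound_settings(p, d, i)
            for p, d, i in product(prompts, durations, influences)]


if __name__ == "__main__":
    grid = settings_grid(
        ["glass breaking", "window shattering"],  # second prompt is illustrative
        [2.0],
        [0.3, 0.7],
    )
    for settings in grid:
        print(settings)
```

A lower prompt influence generally leaves the generator more freedom, so sweeping a few values is one plausible way to add variety; how much it helps depends on the prompt and should be checked by listening to the results.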
Step 3: Run the Pipeline
Once you've set up your prompt and API key, run the pipeline to generate the sound samples. You can then view the output in the Data Acquisition section.
Benefits of Using Generative AI for Sound Generation
Enhance Data Quality: Generative AI can create high-quality sound samples that are difficult to record naturally.
Increase Dataset Diversity: Access a wide range of sounds to enrich your training dataset and improve model performance.
Save Time and Resources: Quickly generate the sound samples you need without the hassle of manual recording.
Improve Model Accuracy: High-quality, diverse sound samples can help fill gaps in your dataset and enhance model performance.
Conclusion
By leveraging generative AI for sound generation, you can enhance the quality and diversity of your training datasets, leading to more accurate and reliable edge AI models. This innovative approach saves time and resources while improving the performance of your models in real-world applications. Try out the Eleven Labs block in Edge Impulse today and start creating high-quality sound datasets for your projects.