
Synthetic data


Last updated 6 months ago

The Synthetic data integration lets you create and manage synthetic data from within Edge Impulse Studio, so you can extend your datasets and improve model performance. Integrations are available for images, speech, and sound effects.

There is also a video version demonstrating the Synthetic data workflow and features:

Only available with Edge Impulse Professional and Enterprise Plans

Supported Blocks

To use these features, navigate to Data Sources, add new data source transformation blocks, set up actions, run a pipeline, and then go to Data Acquisition to view the output. If you want to make changes or refine your prompts, you have to delete the pipeline and start over.

Benefits of Synthetic data management

  • Enhance Your Datasets: Easily augment your datasets with high-quality synthetic data.

  • Improve Model Accuracy: Synthetic data can help fill gaps in your dataset, leading to better model performance.

  • Save Time and Resources: Quickly generate the data you need without the hassle of manual data collection.

Accessing the Synthetic data

To access the Synthetic data tab, follow these steps:

  1. Navigate to Your Project: Open your project in Edge Impulse Studio.

  2. Open Synthetic data Tab: Click on the "Synthetic Data" tab in the left-hand menu.

Generating Synthetic Images with GPT-4 (DALL-E)

  • Create Realistic Images: Use DALL-E to generate realistic images for your datasets.

  • Customize Prompts: Tailor the prompts to generate specific types of images suited to your project needs.

  1. Select Image Generation: Choose the GPT-4 (DALL-E) option.

  2. Enter a Prompt: Describe the type of images you need, e.g., "A photo of a factory worker wearing a hard hat". To generate background images for an object detection dataset (of cars, for example), a prompt such as "aerial view images of deserted streets" works well.

  3. Generate and Save: Click "Generate" to create the images. Review and save the generated images to your dataset.
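
Tailoring prompts often means writing several variations of one description. The following is a minimal sketch for preparing such variations before entering them in the prompt field. The helper `build_prompts` and the template placeholders are hypothetical names used for illustration only; they are not part of Edge Impulse Studio or the DALL-E block.

```python
from itertools import product


def build_prompts(template, **options):
    """Expand a template into one prompt per combination of option values."""
    keys = list(options)
    return [
        template.format(**dict(zip(keys, combo)))
        for combo in product(*(options[k] for k in keys))
    ]


# Hypothetical template and option values, chosen for illustration.
prompts = build_prompts(
    "A photo of a factory worker wearing a {item}, {lighting}",
    item=["hard hat", "safety vest"],
    lighting=["in daylight", "under fluorescent light"],
)

for prompt in prompts:
    print(prompt)
```

Each combination of option values yields one prompt, so two items and two lighting conditions give four prompts to try.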

Generating Human Speech with Whisper

  • Human-like Speech Data: Utilize Whisper to generate human-like speech data.

  • Versatile Applications: Ideal for voice recognition, command-and-control systems, or any application requiring natural language processing.

  1. Select Speech Generation: Choose the Whisper option.

  2. Enter Text: Provide the text you want to be converted into speech (e.g., "Hello Edge!").

  3. Generate and Save: Click "Generate" to create the speech data. Review and save the generated audio files.

Eleven Labs Sound Effects models

  • Realistic Sound Effects: Use Eleven Labs to generate realistic sound effects for your projects.

  • Customize Sound Prompts: Define the type of sound you need (e.g., "Glass breaking" or "Car engine revving").

Custom Synthetic data blocks (Enterprise Plan only)

You can also create custom transformation blocks to generate synthetic data using your own models or APIs. This feature allows you to integrate your custom generative models into Edge Impulse Studio for data augmentation.

x-synthetic-data-job-id header

    parser.add_argument('--synthetic-data-job-id', type=int, required=False, help="If specified, sets the synthetic_data_job_id metadata key")

Then, pass the argument to the ingestion API via the x-synthetic-data-job-id header field:
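    # Excerpt: assumes json, os and requests are imported, and that INGESTION_URL,
    # API_KEY, upload_category, label, prompt, fullpath, png and args are defined.
    res = requests.post(url=INGESTION_URL + '/api/' + upload_category + '/files',
        headers={
            'x-label': label,
            'x-api-key': API_KEY,
            'x-metadata': json.dumps({
                'generated_by': 'dall-e-3',
                'prompt': prompt,
            }),
            # Optional synthetic data job ID; requests omits headers whose value is None
            'x-synthetic-data-job-id': str(args.synthetic_data_job_id) if args.synthetic_data_job_id is not None else None,
        },
        files={ 'data': (os.path.basename(fullpath), png, 'image/png') }
    )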
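
The argument parsing and header construction can also be tried out without sending a request. This is a minimal sketch under stated assumptions: `build_ingestion_headers` is a hypothetical helper, and the label, API key and prompt values are placeholders. It adds the x-synthetic-data-job-id header only when a job ID was passed on the command line.

```python
import argparse
import json


def build_ingestion_headers(label, api_key, prompt, generated_by, job_id=None):
    """Build ingestion headers; include x-synthetic-data-job-id only if a job ID is set."""
    headers = {
        'x-label': label,
        'x-api-key': api_key,
        'x-metadata': json.dumps({'generated_by': generated_by, 'prompt': prompt}),
    }
    if job_id is not None:
        # The job ID is parsed as an int, so convert it to a string for the header.
        headers['x-synthetic-data-job-id'] = str(job_id)
    return headers


parser = argparse.ArgumentParser(description='Hypothetical synthetic data block')
parser.add_argument('--synthetic-data-job-id', type=int, required=False,
                    help='If specified, sets the synthetic_data_job_id metadata key')

# Parse example argument lists instead of sys.argv so the sketch runs anywhere.
args = parser.parse_args(['--synthetic-data-job-id', '42'])
headers = build_ingestion_headers(
    label='worker',                      # placeholder label
    api_key='ei_0000',                   # placeholder API key
    prompt='A photo of a factory worker wearing a hard hat',
    generated_by='dall-e-3',
    job_id=args.synthetic_data_job_id,
)
print(headers['x-synthetic-data-job-id'])

# Without the flag, the argument is None and the header is left out.
args_without_job = parser.parse_args([])
headers_without_job = build_ingestion_headers(
    'worker', 'ei_0000', 'a prompt', 'dall-e-3', args_without_job.synthetic_data_job_id)
print('x-synthetic-data-job-id' in headers_without_job)
```

Leaving the key out of the dictionary makes the optional nature of the header explicit and does not depend on how the HTTP client treats None values.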

Follow our Custom Transformation Blocks guide to learn how to create and use custom transformation blocks in Edge Impulse Studio.

Data ingestion from a synthetic data block should also include the optional x-synthetic-data-job-id header, which indicates that the uploaded data is synthetic. To handle this flag, the block has to parse an extra argument, as the DALL-E block example above does.

Read more in our DALL-E 3 Image Generation Block guide and its repository.

Summary

To start using the Synthetic Data tab, log in to your Edge Impulse account (Professional or Enterprise plan) and open a project. Navigate to the "Synthetic Data" tab and explore the available blocks:

  • DALL-E Image Generation Block: Generate image datasets using the DALL-E model.

  • Whisper Keyword Spotting Generation Block: Generate keyword-spotting datasets using the Whisper model. Ideal for keyword spotting and speech recognition applications.

  • Eleven Labs Sound Generation Block: Generate sound datasets using the Eleven Labs model. Ideal for generating realistic sound effects for various applications.

If you don't have an account yet, sign up for free at Edge Impulse, or try our Professional Plan or Enterprise Trial for free today.

For further assistance, visit our forum or check out our tutorials.

Stay tuned for more updates on what we're doing with generative AI. Exciting times ahead!
