
Android series overview

These tutorials walk you through deploying Edge Impulse models on Android: running inference on static test data, building real-time camera and audio apps, and accelerating inference on Qualcomm’s NPU. All examples use the Android NDK to call the Edge Impulse C++ SDK directly from Kotlin or Java, giving you low-latency, fully on-device inference without a network connection. If you’re new to Edge Impulse, first complete a model training tutorial so you have a trained model to export, then come back here to deploy it to Android.
No dataset yet? Start with the Android data collector. It streams phone, Wear OS, and BLE sensor data (plus camera images) straight into your Edge Impulse project and supports voice‑controlled capture (“hey android, capture 10 seconds with label circle”). It can also collect directly from Arduino boards over USB OTG, including IMU and camera frames from the Nano 33 BLE Sense + TinyML Kit and ESP32-S3-EYE. Train a model on the data you collect, then return here to deploy it.

Prerequisites

Before starting any tutorial in this series, make sure you have the following:
  • An Edge Impulse account with a trained model
  • Android Studio (Ladybug Feature Drop 2024.2.2 or later)
  • Android SDK tools: API 35, NDK 27.0.12077973, CMake 3.22.1
  • Basic familiarity with Android development
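
The SDK components listed above can also be installed from a terminal instead of the Android Studio SDK Manager. The following is a minimal sketch, assuming the sdkmanager tool from the Android command-line tools is on your PATH. The function name is illustrative and not part of the repository.

```shell
# Install the platform, NDK, and CMake versions named in the prerequisites.
# Assumption: `sdkmanager` (Android command-line tools) is on PATH.
install_sdk_packages() {
    # Each quoted string is one sdkmanager package ID; the quotes keep the
    # shell from treating ';' as a command separator.
    sdkmanager "platforms;android-35" "ndk;27.0.12077973" "cmake;3.22.1"
}
```

Run install_sdk_packages once. sdkmanager may ask you to accept license agreements the first time.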

How the examples work

All examples in this series share the same underlying architecture. The Edge Impulse C++ SDK handles signal processing and neural network execution in native code, while your Kotlin or Java app manages the UI, sensor access, and data capture. A JNI bridge in native-lib.cpp connects the two layers, keeping the performance-critical inference path in C++ while letting you build a standard Android UI around it. See the Android NDK documentation for more on integrating native C++ into Android apps.
| Layer | Components | Responsibility |
|---|---|---|
| Data source | Camera2, AudioRecord, SensorManager | Captures images, audio, or IMU data |
| Java/Kotlin | Activities, UI, preprocessing | User interface and data conversion |
| JNI bridge | native-lib.cpp | Type conversion between Java and C++ |
| C++ native | Edge Impulse SDK | Signal processing and feature extraction |
| TensorFlow Lite | Runtime libraries | Neural network execution |
| Hardware | CPU, NPU, DSP | Physical computation |
| Optional delegate | QNN | Hardware acceleration on Qualcomm Snapdragon devices |

Quick start

All examples live in the same repository. Clone it once, then navigate into whichever example you want to run:
git clone https://github.com/edgeimpulse/example-android-inferencing.git
cd example-android-inferencing
If this is your first time, start with the Static Buffer Inference tutorial. It runs inference on hardcoded test data with no sensors required, and gives you a clear understanding of how the C++ SDK integrates with Android before you add camera, audio, or motion input.

Tutorials

Step 0 — Collect your dataset

Android data collector

Collect phone sensor data, Wear OS heart‑rate, IMU, and GPS data, camera images, BLE‑relayed results from a Zephyr device, and USB OTG serial data from Arduino boards (including IMU and camera data from the Nano 33 BLE Sense + TinyML Kit), all in one app with optional voice‑controlled capture. Upload directly to your Edge Impulse project. Skip this step if you already have a dataset in Studio.

Deploy a trained model

Static buffer inference

Run inference on hardcoded test data. The simplest starting point, with no sensors required.

Keyword spotting

Recognize wake words and voice commands from your phone’s microphone in real time.

Camera inference

Run object detection or image classification on a live camera feed using Camera2.

Wear OS motion inference

Classify motion data from a Wear OS device’s accelerometer and gyroscope.

QNN hardware acceleration

Speed up inference by 10× or more on tested devices using Qualcomm’s Hexagon NPU on Snapdragon devices.

QNN speech to image GenAI

Generate images from spoken prompts using a quantized GenAI pipeline running entirely on-device.
More tutorials coming soon.

Common deployment workflow

Every deployment tutorial in this series (everything except the Android data collector, which is a data‑acquisition app and does not embed a model) follows the same five‑step process to go from a trained Edge Impulse model to a running Android app.
  1. Export your model: In Edge Impulse Studio, go to Deployment → Android (C++ library) and click Build. Download the .zip archive.
  2. Download TensorFlow Lite libraries: Each example includes a script that fetches the required TFLite runtime binaries:
    cd app/src/main/cpp/tflite
    
    # macOS/Linux
    sh download_tflite_libs.sh
    
    # Windows
    download_tflite_libs.bat
    
  3. Copy your model files: Extract the downloaded .zip and copy all files into app/src/main/cpp/. Don’t overwrite the existing CMakeLists.txt; the project’s build configuration is already set up for you.
  4. Update the test features: Open native-lib.cpp and paste in the raw features from one of your test samples in Studio. This lets you verify the model produces correct output before wiring up a live sensor.
  5. Build and run: Open the project in Android Studio, click Build → Make Project, then run it on a device or emulator.
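
Step 3 above can be sketched as a small shell helper that copies the extracted export into the example while leaving the project's own CMakeLists.txt alone. This is only a sketch: copy_model_files is an illustrative name, not a script shipped in the repository, and it assumes the export's CMakeLists.txt sits at the top level of the extracted archive.

```shell
# Copy an extracted Edge Impulse export into an example's cpp directory
# without overwriting the CMakeLists.txt that is already there.
#   $1 = directory where the downloaded .zip was extracted
#   $2 = target directory, e.g. app/src/main/cpp
copy_model_files() {
    export_dir=$1
    cpp_dir=$2

    for entry in "$export_dir"/*; do
        # An empty directory leaves the glob unexpanded; skip that case.
        [ -e "$entry" ] || continue

        # Skip only the top-level CMakeLists.txt from the export.
        if [ "$(basename "$entry")" = "CMakeLists.txt" ]; then
            continue
        fi

        # -R copies SDK and model directories recursively.
        cp -R "$entry" "$cpp_dir"/
    done
}
```

For example, with the export extracted to a hypothetical ~/Downloads/ei-export directory, run copy_model_files ~/Downloads/ei-export app/src/main/cpp from the example's root. If the example includes the Gradle wrapper, ./gradlew assembleDebug is a command-line alternative to Build → Make Project.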

Platform support

| ABI | Status | Target devices |
|---|---|---|
| arm64-v8a (64-bit) | Recommended | All modern Android devices |
| armeabi-v7a (32-bit) | Requires config | Older devices; see the 32-bit setup steps in the repository README |
The minimum supported Android API level is 24 for the deployment examples; the build target is API 35. The Android data collector requires API 28 or later because the bundled full TensorFlow Lite runtime uses aligned_alloc.
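
To check a connected device against these requirements before building, you can read its primary ABI and API level over adb. The following is a sketch, assuming adb is installed and exactly one device is connected. check_device and check_connected_device are illustrative helper names, and the check uses the deployment examples' minimum of API 24 (the data collector needs API 28).

```shell
# Compare an ABI and API level against the platform support table.
#   $1 = primary ABI, $2 = API level
check_device() {
    abi=$1
    api=$2

    # Reject empty or non-numeric API levels before comparing.
    case "$api" in
        ''|*[!0-9]*) echo "could not read API level"; return 1 ;;
    esac

    if [ "$api" -lt 24 ]; then
        echo "unsupported: API $api is below the minimum of 24"
        return 1
    fi

    case "$abi" in
        arm64-v8a)   echo "ok: arm64-v8a, API $api" ;;
        armeabi-v7a) echo "needs config: 32-bit ABI, see the repository README" ;;
        *)           echo "not listed: $abi is not in the platform support table"
                     return 1 ;;
    esac
}

# Read both values from the connected device using standard system properties.
check_connected_device() {
    # tr strips the carriage return that some adb versions append.
    abi=$(adb shell getprop ro.product.cpu.abi | tr -d '\r')
    api=$(adb shell getprop ro.build.version.sdk | tr -d '\r')
    check_device "$abi" "$api"
}
```

Call check_connected_device with a device attached, or call check_device directly with known values, for example check_device arm64-v8a 35.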

Performance optimization

By default, all examples use XNNPACK for CPU acceleration. XNNPACK is bundled with TensorFlow Lite and requires no extra configuration: it accelerates supported operations on any Android device and typically runs faster than plain TFLite out of the box.
If you’re running on a Qualcomm Snapdragon device, you can go significantly further with QNN (Qualcomm AI Engine Direct). QNN offloads compatible model operations to the Hexagon NPU or DSP and, on tested devices, delivers a 10× or greater speedup over CPU-only inference. For the best results, use an INT8-quantized model. See the QNN hardware acceleration tutorial for a full walkthrough.

Repositories

| Repository | Description |
|---|---|
| example-android-inferencing | Monorepo containing the static buffer, keyword spotting (KWS), camera, Wear OS, data collector, QNN object detection, and QNN speech‑to‑image GenAI examples. All Android tutorials in this series live here. |
| qnn-hardware-acceleration | Standalone mirror of the QNN object detection example (also bundled in the monorepo above as qnn-hardware-acceleration/). |
