“Google was running neural networks that were just 14 kilobytes (KB) in size! They needed to be so small because they were running on the digital signal processors (DSPs) present in most Android phones, continuously listening for the ‘OK Google’ wake words…”

This kind of breakthrough shows what’s possible when you focus on keeping models small and efficient. In this guide, we’ll follow the same philosophy: you’ll build your own custom wake-word detector that runs directly on your phone, using tools like Edge Impulse, TensorFlow Lite, and Android Studio. The system will be optimized to listen for a trigger phrase like “Neo” with minimal power usage: no cloud calls, no bulky models, just fast, local inference.
You can export the trained model as a `.tflite` model and run inference through TFLite’s Interpreter API. In this guide, we will focus on building in native C++ and including it in our Android application.
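For reference, this is roughly what loading a `.tflite` file and running it through the Interpreter API looks like in C++. This is a minimal sketch, not code from the project: the file name `model.tflite` is a placeholder, and it assumes the TensorFlow Lite C++ library is already linked.

```cpp
// Minimal sketch of the TFLite Interpreter API (assumes the TensorFlow
// Lite C++ library is linked; "model.tflite" is a placeholder path).
#include <cstdio>
#include <memory>
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"

int main() {
    // Load the flatbuffer model from disk.
    auto model = tflite::FlatBufferModel::BuildFromFile("model.tflite");
    if (!model) return 1;

    // Build an interpreter with the built-in op resolver and allocate
    // memory for all input/output tensors.
    tflite::ops::builtin::BuiltinOpResolver resolver;
    std::unique_ptr<tflite::Interpreter> interpreter;
    tflite::InterpreterBuilder(*model, resolver)(&interpreter);
    if (!interpreter || interpreter->AllocateTensors() != kTfLiteOk) return 1;

    // Fill the input tensor with preprocessed audio features, then run.
    float *input = interpreter->typed_input_tensor<float>(0);
    (void)input;  // populate with MFCC features in a real pipeline
    interpreter->Invoke();

    // Read back the per-class probabilities.
    float *output = interpreter->typed_output_tensor<float>(0);
    std::printf("first class score: %f\n", output[0]);
    return 0;
}
```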
Fig 1: Creating a New Project in Edge Impulse
Fig 2: The SDK Tools Window Showing the NDK Option
Fig 3: Data Acquisition Page in Edge Impulse
Fig 4: Collecting Data From Your Computer
Fig 5: Collecting Data by Recording Audio
Fig 6: Overview of the Dataset Created
Fig 7: Waveform of the Wake Word
Fig 8: Creating an Impulse
Fig 9: Settings for MFCC
Fig 10: NN Classifier in Edge Impulse
Fig 11: Confusion Matrix of the Model
Fig 12: Results of the Model on the Testing Data
Fig 13: Building the C++ Library with TensorFlow Lite Enabled for Optimizations
The export includes `.tflite` files (compressed and optimized versions of your trained model) plus C++ wrapper code that handles loading and running inference.
Inside your app’s `src/main` directory, create a `cpp` folder and copy these three directories from your Edge Impulse export: `edge-impulse-sdk`, `model-parameters`, and `tflite-model`.
Without a carefully written `CMakeLists.txt`, you’d hit hundreds of build errors. The Edge Impulse SDK relies on precise compiler flags, include paths, and libraries like XNNPACK (for faster inference) and Ruy (for optimized matrix operations).
File organization is also critical, as CMake needs to locate CMSIS-DSP, the Edge Impulse utilities, and the TensorFlow Lite components in the right order. For more information, see the Android inferencing example.
To compile and link your C++ inference code with TensorFlow Lite, we need to install the native TensorFlow Lite C++ static library (`libtensorflow-lite.a`) for Android ARM64. For that, clone the repo and run the script:
The resulting `tensorflow-lite` directory contains the source code and headers for the TensorFlow Lite runtime. This is essential for running on-device inference.
Now create `native-lib.cpp` in your `cpp` directory. This file connects your Java code to the Edge Impulse inference engine:
The `getModelInfo` function helps verify that your model is loaded correctly. The main `classifyAudio` function takes audio samples from Java, wraps them in Edge Impulse’s signal structure, runs inference, and returns classification probabilities. Edge Impulse uses a callback pattern for audio data rather than direct memory access; this allows efficient streaming and better memory management.
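The sketch below shows what that flow could look like inside `native-lib.cpp`. It is an illustration under assumptions, not the article’s exact code: the JNI symbol name assumes a hypothetical `com.example.audiospot.MainActivity` class, and error handling is omitted.

```cpp
// Hedged sketch of classifyAudio; the Java package/class in the JNI
// symbol name (com.example.audiospot.MainActivity) is an assumption.
#include <jni.h>
#include <cstring>
#include <vector>
#include "edge-impulse-sdk/classifier/ei_run_classifier.h"

// Native buffer holding the samples copied from the Java float[].
static std::vector<float> audio_buffer;

// Edge Impulse pulls audio through this callback rather than reading
// the buffer directly, which is what enables streamed processing.
static int get_audio_data(size_t offset, size_t length, float *out_ptr) {
    std::memcpy(out_ptr, audio_buffer.data() + offset,
                length * sizeof(float));
    return 0;
}

extern "C" JNIEXPORT jfloatArray JNICALL
Java_com_example_audiospot_MainActivity_classifyAudio(
        JNIEnv *env, jobject /* this */, jfloatArray samples) {
    // Copy the Java samples into the native buffer.
    jsize len = env->GetArrayLength(samples);
    audio_buffer.resize(static_cast<size_t>(len));
    env->GetFloatArrayRegion(samples, 0, len, audio_buffer.data());

    // Wrap the buffer in Edge Impulse's signal structure.
    signal_t signal;
    signal.total_length = audio_buffer.size();
    signal.get_data = &get_audio_data;

    // Run the impulse (MFCC extraction + NN classifier).
    ei_impulse_result_t result = {};
    run_classifier(&signal, &result, false);

    // Return one confidence score per label back to Java.
    jfloatArray scores = env->NewFloatArray(EI_CLASSIFIER_LABEL_COUNT);
    for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) {
        jfloat value = result.classification[ix].value;
        env->SetFloatArrayRegion(scores, static_cast<jsize>(ix), 1, &value);
    }
    return scores;
}
```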
Next, hook the native build into your app-level `build.gradle`:
System.loadLibrary("audiospot")
loads the compiled native library, and the native
method declarations create the bridge to C++ functions that handle the actual ML inference.
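On the C++ side, each declared `native` method must be exported with a symbol that follows JNI’s `Java_<package>_<class>_<method>` naming convention. As a sketch (again assuming a hypothetical `com.example.audiospot.MainActivity`), `getModelInfo` might report the compiled-in model metadata like this:

```cpp
// Hedged sketch of getModelInfo; the package/class name is assumed.
// The EI_CLASSIFIER_* macros come from the exported model metadata.
#include <jni.h>
#include <cstdio>
#include "model-parameters/model_metadata.h"

extern "C" JNIEXPORT jstring JNICALL
Java_com_example_audiospot_MainActivity_getModelInfo(JNIEnv *env, jobject) {
    // Summarize the linked model so a quick log line can confirm the
    // right impulse (labels, window size, sample rate) was compiled in.
    char info[128];
    std::snprintf(info, sizeof(info),
                  "labels=%d, window=%d samples, rate=%d Hz",
                  EI_CLASSIFIER_LABEL_COUNT,
                  EI_CLASSIFIER_RAW_SAMPLE_COUNT,
                  EI_CLASSIFIER_FREQUENCY);
    return env->NewStringUTF(info);
}
```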
The `audioRecord.read()` method fills the buffer with raw audio samples from the device microphone.
The app then passes these samples to the `classifyAudio()` method via JNI, receives confidence scores back, and updates the UI based on the detection threshold. If the confidence exceeds 70%, the wake word is considered detected. We can change this threshold according to our use case.
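The decision itself reduces to a simple comparison. Here is a sketch in C++ for consistency with the native code above; the article applies this check on the Java side, and the wake-word label index here is an assumption:

```cpp
// Hedged sketch of the detection decision; kWakeWordIndex is an
// assumed label position for "neo" in the model's output ordering.
#include <cstddef>

constexpr float kDetectionThreshold = 0.70f;  // 70% confidence
constexpr std::size_t kWakeWordIndex = 1;

// True when the wake-word score clears the threshold.
inline bool wakeWordDetected(const float *scores) {
    return scores[kWakeWordIndex] >= kDetectionThreshold;
}
```

Raising the threshold trades missed detections for fewer false triggers, which is usually the safer direction for an always-listening feature.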