Deploy audio keyword spotting models on Android for wake word detection, voice commands, and audio event recognition.

What You’ll Build

An Android app that:
  • Captures real-time audio from the device microphone
  • Recognizes spoken keywords continuously
  • Displays classification results with confidence scores
  • Runs entirely on-device with low latency
Time: 30 minutes
Difficulty: Intermediate

Prerequisites

  • Trained audio keyword spotting model
  • Android Studio with NDK and CMake
  • Android device with a microphone (a USB camera with a built-in mic also works)
  • Basic familiarity with Android development

Step 1: Clone the Repository

git clone https://github.com/edgeimpulse/example-android-inferencing.git
cd example-android-inferencing/example_kws

Step 2: Download TensorFlow Lite Libraries

cd app/src/main/cpp/tflite
sh download_tflite_libs.sh  # or .bat for Windows

Step 3: Export Your Audio Model

  1. In Edge Impulse Studio, go to Deployment
  2. Select Android (C++ library)
  3. Enable EON Compiler (recommended for audio)
  4. Click Build and download the .zip

Step 4: Integrate the Model

  1. Extract the downloaded .zip file
  2. Copy all files except CMakeLists.txt to:
    example_kws/app/src/main/cpp/
    
Your structure should be:
app/src/main/cpp/
├── edge-impulse-sdk/
├── model-parameters/
├── tflite-model/
├── native-lib.cpp
└── CMakeLists.txt (existing)

Step 5: Configure Audio Permissions

Permissions are already set in AndroidManifest.xml:
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-feature android:name="android.hardware.microphone" />
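
Because RECORD_AUDIO is a runtime ("dangerous") permission, the app must also request it when it starts capturing — this is the prompt you grant in Step 6. A minimal sketch of that check; the helper name and request code are illustrative, not taken from the example app:

// MainActivity.kt — illustrative runtime permission check
// (uses androidx.core ActivityCompat / ContextCompat)
private val RECORD_AUDIO_REQUEST_CODE = 1001

private fun ensureMicPermission() {
    if (ContextCompat.checkSelfPermission(this, Manifest.permission.RECORD_AUDIO)
        != PackageManager.PERMISSION_GRANTED
    ) {
        // Shows the system dialog; handle the result in onRequestPermissionsResult()
        ActivityCompat.requestPermissions(
            this,
            arrayOf(Manifest.permission.RECORD_AUDIO),
            RECORD_AUDIO_REQUEST_CODE
        )
    } else {
        startAudioRecording()
    }
}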

Step 6: Build and Run

  1. Open in Android Studio
  2. Build > Make Project
  3. Connect your Android device
  4. Run the app
  5. Grant microphone permission when prompted
Speak your keywords and see real-time recognition results!

Understanding the Code

Audio Capture

// MainActivity.kt
private fun startAudioRecording() {
    val bufferSize = AudioRecord.getMinBufferSize(
        SAMPLE_RATE,
        AudioFormat.CHANNEL_IN_MONO,
        AudioFormat.ENCODING_PCM_16BIT
    )
    
    audioRecord = AudioRecord(
        MediaRecorder.AudioSource.MIC,
        SAMPLE_RATE,
        AudioFormat.CHANNEL_IN_MONO,
        AudioFormat.ENCODING_PCM_16BIT,
        bufferSize
    )
    
    audioRecord.startRecording()
    isRecording = true
    
    // Read audio in a background thread so the UI stays responsive
    Thread {
        val buffer = ShortArray(bufferSize)
        while (isRecording) {
            val read = audioRecord.read(buffer, 0, buffer.size)
            if (read > 0) {
                processAudioBuffer(buffer, read)
            }
        }
    }.start()
}
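
The capture loop relies on a few members that are not shown above: the sample rate constant, the `isRecording` flag, and the `audioRecord` property, plus a way to stop cleanly. A minimal sketch of those pieces, assuming a 16 kHz impulse (match SAMPLE_RATE to your model's sampling frequency; the exact declarations in the example project may differ):

// MainActivity.kt — assumed declarations backing the capture loop above
companion object {
    const val SAMPLE_RATE = 16000           // must match the impulse's sampling frequency
    const val CONFIDENCE_THRESHOLD = 0.8f   // minimum score used later in updateUI()
}

private lateinit var audioRecord: AudioRecord
@Volatile private var isRecording = false   // written on stop, read by the capture thread

private fun stopAudioRecording() {
    isRecording = false    // lets the background thread exit its read loop
    audioRecord.stop()
    audioRecord.release()  // frees the native recording resources
}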

Ring Buffer for Continuous Inference

private val audioRingBuffer = RingBuffer(16000) // 1 second at 16kHz

private fun processAudioBuffer(buffer: ShortArray, size: Int) {
    // Add to ring buffer
    audioRingBuffer.write(buffer, size)
    
    // Run inference when buffer is full
    if (audioRingBuffer.isFull()) {
        val features = audioRingBuffer.read()
        val result = runInference(features)
        
        runOnUiThread {
            updateUI(result)
        }
    }
}
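
The RingBuffer class itself is not shown; what matters is that it keeps the most recent second of samples and hands back one full window in chronological order. A minimal, thread-safe sketch (illustrative only; the implementation in the example repository may differ):

// RingBuffer.kt — illustrative sketch of the buffer used above
class RingBuffer(private val capacity: Int) {
    private val data = ShortArray(capacity)
    private var writeIndex = 0
    private var filled = 0

    @Synchronized
    fun write(samples: ShortArray, size: Int) {
        for (i in 0 until size) {
            data[writeIndex] = samples[i]
            writeIndex = (writeIndex + 1) % capacity   // wrap around, overwriting the oldest samples
        }
        filled = minOf(filled + size, capacity)
    }

    @Synchronized
    fun isFull(): Boolean = filled == capacity

    /** Returns the last `capacity` samples in chronological order. */
    @Synchronized
    fun read(): ShortArray {
        val out = ShortArray(capacity)
        for (i in 0 until capacity) {
            out[i] = data[(writeIndex + i) % capacity]
        }
        return out
    }
}

Synchronizing write() and read() matters here because the capture thread writes into the buffer while the inference path reads from it.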

Native Inference

// native-lib.cpp
extern "C" JNIEXPORT jobject JNICALL
Java_com_example_kws_EIClassifierAudio_run(
    JNIEnv* env, jobject, jshortArray audioData) {
    
    // Convert 16-bit PCM samples to float features
    jshort* audio = env->GetShortArrayElements(audioData, nullptr);
    int length = env->GetArrayLength(audioData);
    
    std::vector<float> features(length);
    for (int i = 0; i < length; i++) {
        features[i] = (float)audio[i] / 32768.0f;
    }
    // Release the JNI array; JNI_ABORT because we never modified it
    env->ReleaseShortArrayElements(audioData, audio, JNI_ABORT);
    
    // Create signal; get_feature_data (defined elsewhere in native-lib.cpp)
    // streams the feature buffer to the SDK in chunks
    signal_t signal;
    signal.total_length = features.size();
    signal.get_data = &get_feature_data;
    
    // Run classifier
    ei_impulse_result_t result = { 0 };
    run_classifier(&signal, &result, false);
    
    // Return result object
    return createResultObject(env, result);
}
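
The JNI symbol above implies a Kotlin wrapper class named EIClassifierAudio in the com.example.kws package that loads the native library and declares the external entry point. A plausible sketch, assuming the library target is called native-lib and that ClassificationResult is the result type consumed by updateUI() below:

// EIClassifierAudio.kt — sketch of the Kotlin binding implied by the JNI symbol above
package com.example.kws

class EIClassifierAudio {
    companion object {
        init {
            // Library name must match the target defined in CMakeLists.txt (assumed here)
            System.loadLibrary("native-lib")
        }
    }

    /** Runs the classifier on one window of 16-bit PCM samples. */
    external fun run(audioData: ShortArray): ClassificationResult
}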

Result Display

private fun updateUI(result: ClassificationResult) {
    // Find highest confidence prediction
    val topResult = result.classification.maxByOrNull { it.score } ?: return
    
    if (topResult.score > CONFIDENCE_THRESHOLD) {
        // Show detected keyword
        keywordTextView.text = topResult.label
        confidenceTextView.text = "${(topResult.score * 100).toInt()}%"
        
        // Highlight detection
        detectionIndicator.setBackgroundColor(Color.GREEN)
        
        // Trigger action (optional)
        onKeywordDetected(topResult.label)
    } else {
        keywordTextView.text = "Listening..."
        detectionIndicator.setBackgroundColor(Color.GRAY)
    }
}
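
onKeywordDetected() is where you attach app behavior to a recognized keyword. As one illustration, a small when-based dispatcher; the labels and the toggleFeature() helper are placeholders, not part of the example project:

// MainActivity.kt — illustrative keyword-to-action dispatcher (labels are placeholders)
private fun onKeywordDetected(label: String) {
    when (label) {
        "yes" -> toggleFeature(enabled = true)    // toggleFeature() is a hypothetical helper
        "no" -> toggleFeature(enabled = false)
        "noise", "unknown" -> { /* background classes: ignore */ }
        else -> Log.d("KWS", "Unhandled keyword: $label")
    }
}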

Summary

You’ve built a simple keyword spotting app on Android using Edge Impulse. You can now expand this foundation by integrating more complex models, adding custom actions on keyword detection, or optimizing performance further. See the Android series overview for more tutorials!

Resources