EON Compiler

This page documents the EON Compiler, the Edge Optimized Neural (EON) compiler. It aims to give you a detailed understanding of EON's capabilities and the benefits it brings to deploying neural networks on edge AI devices.

The EON Compiler runs neural networks with less RAM and flash than TensorFlow Lite for Microcontrollers, while maintaining the same accuracy. Instead of relying on a generic interpreter on the device, as conventional embedded solutions do, the EON Compiler compiles your neural network directly to C++. This removes the interpreter overhead, significantly reduces device resource usage, and saves development time.

Key benefits of the EON Compiler:

  • 25-55% less RAM

  • 35% less flash

  • Same accuracy as TFLite

  • Faster inference

  • Cheaper hardware

  • Faster time to market

Before we dive into the finer details of the EON Compiler's performance, let's look at how it works in practice in the Getting Started section below, and at the approach that sets it apart from traditional TensorFlow Lite for Microcontrollers.

Getting Started

First, make sure you have an audio, motion, or image classification project in your Edge Impulse account and a target device selected to work with. No project yet? Follow one of our tutorials to create one. Then:

  1. Log in to the Edge Impulse Studio and open a project.

  2. Select the Deployment tab in the left-hand menu.

  3. Search for and select your target device from the search bar.

  4. Click the Enable EON Compiler button to enable the EON compiler for your project.

  5. Click the Run Model Testing button; you will see Calculating model accuracy... for your target device.

  6. In Quantized (Int8) mode, you can see the RAM and ROM usage of your model. In Unoptimized (Float32) mode, you can see the RAM and ROM usage as well, along with the learn blocks performance and inferencing time.

What do these metrics mean?

  • Spectral Features (processing blocks performance): the optimizations for the DSP components of the compiled model, e.g. Spectral Features, MFCC, FFT.

  • Classifier (learn blocks performance): the performance of the compiled model on the device; here we see the time it takes to run inference.

  • Latency: the time it takes to run the model on the device

  • RAM: the amount of RAM the model uses

  • Flash: the amount of ROM the model uses

  • Accuracy: the accuracy of the model

  7. Click Build and, once the build has finished, select Deploy.

  8. Wait for the EON Compiler to finish building your model. This can take a moment, but you can continue working in the Studio while you wait.

  9. Now you're ready to deploy your automatically configured Edge Impulse model to your target edge device!
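Once deployed, inference is driven from the Edge Impulse C++ SDK included in the deployment. As a minimal sketch, assuming a raw_features buffer that stands in for one window of sensor data captured on your device, calling the model looks roughly like this:

```cpp
#include <cstdio>
#include <cstddef>
#include "edge-impulse-sdk/classifier/ei_run_classifier.h"

// Hypothetical buffer holding one window of raw sensor data; fill it from
// your sensor. The size constant comes from the SDK's model metadata.
static float raw_features[EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE];

int main() {
    // Wrap the buffer in a signal_t that the classifier can read from.
    signal_t signal;
    numpy::signal_from_buffer(raw_features, EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE, &signal);

    // Run DSP + inference. With the EON Compiler enabled, this calls the
    // compiled C++ model rather than the TFLite Micro interpreter.
    ei_impulse_result_t result = { 0 };
    EI_IMPULSE_ERROR err = run_classifier(&signal, &result, false);
    if (err != EI_IMPULSE_OK) {
        printf("run_classifier failed (%d)\n", err);
        return 1;
    }

    // Print the score for each class.
    for (size_t i = 0; i < EI_CLASSIFIER_LABEL_COUNT; i++) {
        printf("%s: %.5f\n", result.classification[i].label,
               result.classification[i].value);
    }
    return 0;
}
```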

How does it work?

At its core, the EON Compiler is a proprietary compiler and state serializer for TensorFlow Lite.

It is a neural network compiler for general-purpose MCUs built on top of TensorFlow Lite for Microcontrollers, which means it can use TFLite Micro's operators and kernels in addition to our own.

First, we take the TensorFlow Lite model's kernels and operators. This is done by loading the model into the TensorFlow Lite interpreter and then serializing its state.

Finally, we remove the interpreter and serialize the model to C++ code, producing a complete snapshot of the model's state in memory. Everything that is not used can then be removed by the linker.

Detailed info

The EON Compiler is a code generator that produces highly optimized variants of ML models.

The input to the EON Compiler is a TensorFlow Lite Flatbuffer file containing the model weights. The output is a .cpp and a .h file containing the unpacked model weights and the functions to prepare and run the model inference.

Regular TFLite Micro is based on TensorFlow Lite and contains all the machinery needed to read the model weights in Flatbuffer format (the content of the .tflite file), construct the inference graph, plan the memory allocation for tensors and data, run initialization and preparation, and finally invoke the operators in the inference graph to get the inference results.
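For reference, this is roughly what that interpreter-based flow looks like on the device. The arena size and the three registered kernels below are placeholder values for illustration, not taken from a real model:

```cpp
#include <cstdint>
#include <cstddef>
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/schema/schema_generated.h"

// The .tflite flatbuffer, compiled into the firmware as a byte array.
extern const unsigned char model_data[];

// Working memory for tensors; the size here is a placeholder.
constexpr size_t kArenaSize = 32 * 1024;
static uint8_t tensor_arena[kArenaSize];

int run_inference() {
    // Parse the flatbuffer on-device.
    const tflite::Model* model = tflite::GetModel(model_data);

    // Register the kernels this model needs (placeholder set).
    static tflite::MicroMutableOpResolver<3> resolver;
    resolver.AddConv2D();
    resolver.AddFullyConnected();
    resolver.AddSoftmax();

    // Construct the graph and plan tensor memory at runtime.
    static tflite::MicroInterpreter interpreter(model, resolver,
                                                tensor_arena, kArenaSize);
    if (interpreter.AllocateTensors() != kTfLiteOk) return -1;

    // ... fill interpreter.input(0) with data, then:
    return interpreter.Invoke() == kTfLiteOk ? 0 : -1;
}
```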

The advantage of the traditional TFLite Micro approach is that it is very versatile and flexible. The disadvantage is that all the code needed to get the model ready on the device is quite heavy for embedded systems.

To overcome these limitations, our solution performs the resource-intensive tasks, such as reading the model from the Flatbuffer, constructing the graph, and planning the memory allocation, on a more capable machine, such as a server.

The EON Compiler then generates C++ files containing the necessary functions for the init, prepare, and invoke stages.

These C++ files can be deployed on the embedded system, alleviating the computational burden on the device.
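To make this concrete, here is a hypothetical sketch of the shape of the generated output. All identifiers and sizes are illustrative assumptions, not the actual names the EON Compiler emits:

```cpp
#include <cstdint>

// Model weights unpacked from the flatbuffer into const arrays at compile
// time, so they live in flash and are never parsed on the device.
// (Values elided; illustrative only.)
const int8_t layer0_weights[] = { 12, -7, 3 /* ... */ };
const int32_t layer0_biases[] = { 104, -56 /* ... */ };

// Tensor arena sized exactly by the offline memory planner.
static uint8_t tensor_arena[4 * 1024];

// Init, prepare, and invoke are emitted as plain functions that call the
// kernels directly: no interpreter, no graph walk, no op resolver. The
// linker can then strip every kernel the model does not use.
int model_init();
int model_prepare();
int model_invoke(const int8_t* input, int8_t* output);
```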

Supported Operators

TensorFlow Lite for Microcontrollers supports a subset of TensorFlow Lite operators. See their Operator Support page for more information.

At Edge Impulse, we have our own set of operators that we support. These operators are optimized for our supported embedded devices and are adapted to the needs of our users and partners.

Currently we support the following operators, although we are adding more all the time:

| Operator | Description |
| --- | --- |
| AddAbs() | Computes the element-wise absolute value of input tensor elements. |
| AddAdd() | Performs element-wise addition of two input tensors. |
| AddAddN() | Adds multiple tensors element-wise, combining their values. |
| AddArgMax() | Finds the indices of the maximum values along specified dimensions in the input tensor. |
| AddArgMin() | Finds the indices of the minimum values along specified dimensions in the input tensor. |
| AddAveragePool2D() | Applies 2D average pooling to the input tensor, reducing spatial dimensions. |
| AddBatchMatMul() | Computes batched matrix multiplication between two input tensors. |
| AddBatchToSpaceNd() | Rearranges data in the batch dimension based on the block size specified. |
| AddCeil() | Rounds up each element of the input tensor to the nearest integer greater than or equal to it. |
| AddComplexAbs() | Computes the absolute values of complex numbers in the input tensor. |
| AddConcatenation() | Concatenates multiple input tensors along a specified axis. |
| AddConv2D() | Performs 2D convolution on the input tensor using specified filters and strides. |
| AddCos() | Computes the element-wise cosine of the input tensor. |
| AddDepthwiseConv2D() | Applies depthwise 2D convolution to the input tensor. |
| AddDequantize() | Converts quantized input tensor elements to floating-point representation. |
| AddRsqrt() | Computes the element-wise reciprocal square root of the input tensor. |
| AddSelect() (if available) | Selects elements from the two input tensors based on a condition tensor. |
| AddSelectV2() (if available) | Selects elements from two input tensors based on a condition tensor (version 2). |
| AddShape() | Computes the shape of the input tensor and returns it as a new tensor. |
| AddSin() | Computes the element-wise sine of the input tensor. |
| AddSlice() | Extracts a slice of the input tensor based on specified starting and ending indices. |
| AddSoftmax() | Computes the softmax activation function along a specified axis. |
| AddSpaceToBatchNd() | Rearranges data in the batch dimension, creating spatial dimensions. |
| AddSplit() | Splits the input tensor into multiple tensors along the specified axis. |
| AddSplitV() | Splits the input tensor into multiple tensors along the specified axis (version 2). |
| AddSqrt() | Computes the element-wise square root of the input tensor. |
| AddSquare() | Computes the element-wise square of the input tensor. |
| AddSquaredDifference() | Computes the element-wise squared difference between two input tensors. |
| AddSqueeze() | Removes dimensions with size 1 from the input tensor. |
| AddStridedSlice() | Extracts a strided slice of the input tensor based on specified parameters. |
| AddSub() | Performs element-wise subtraction of two input tensors. |
| AddSum() | Computes the sum of input tensor elements along specified dimensions. |
| AddSvdf() | Applies the singular value decomposition filter to the input tensor. |
| AddTanh() | Computes the element-wise hyperbolic tangent of the input tensor. |
| AddTranspose() | Transposes the input tensor based on specified axes. |
| AddTransposeConv() | Performs transposed convolution on the input tensor. |
| AddTreeEnsembleClassifier() | Applies a tree ensemble model for classification tasks. |
| AddUnpack() | Unpacks the input tensor along a specified axis into multiple tensors. |
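The Add*() names in the table mirror the kernel-registration methods on TFLite Micro's MicroMutableOpResolver: registering only what a model needs is what keeps unused kernels out of the binary. A minimal sketch of such a registration, with an arbitrary operator set chosen for illustration:

```cpp
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"

// The template parameter is the maximum number of registered operators.
static tflite::MicroMutableOpResolver<4> resolver;

void register_model_ops() {
    // Register only the kernels this particular model uses; anything
    // not registered here never ends up in the firmware image.
    resolver.AddConv2D();
    resolver.AddDepthwiseConv2D();
    resolver.AddAveragePool2D();
    resolver.AddSoftmax();
}
```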
