The Edge Optimized Neural (EON) Compiler is a tool developed by Edge Impulse that lets you run neural networks with less RAM and flash, while maintaining accuracy comparable to TensorFlow Lite for Microcontrollers. The EON Compiler incorporates a proprietary compiler that compiles neural networks to C++ source code. This approach removes the need for a runtime interpreter, which significantly reduces device resource utilization and saves inference time.
Key advantages of the EON Compiler include:
- 25-55% less RAM
- 35% less flash
- Same accuracy as TFLite
- Faster inference
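To make these percentages concrete, the arithmetic can be sketched as below. This is only an illustration: `estimate_eon_footprint` and `FootprintEstimate` are hypothetical names, and the sketch assumes the headline figures (25-55% less RAM, 35% less flash) apply uniformly, which will not hold exactly for every model.

```cpp
#include <cstdint>

// Hypothetical helper: applies the headline savings to a baseline footprint.
struct FootprintEstimate {
    std::uint64_t ram_best_bytes;   // assumes the full 55% RAM reduction
    std::uint64_t ram_worst_bytes;  // assumes only the 25% RAM reduction
    std::uint64_t flash_bytes;      // assumes the 35% flash reduction
};

constexpr FootprintEstimate estimate_eon_footprint(std::uint64_t baseline_ram_bytes,
                                                   std::uint64_t baseline_flash_bytes) {
    return FootprintEstimate{
        baseline_ram_bytes * 45 / 100,    // 100% - 55%
        baseline_ram_bytes * 75 / 100,    // 100% - 25%
        baseline_flash_bytes * 65 / 100,  // 100% - 35%
    };
}
```

For a model that needs 100,000 bytes of RAM and 200,000 bytes of flash under the baseline runtime, this gives an estimated 45,000 to 75,000 bytes of RAM and 130,000 bytes of flash.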
EON Compiler Accuracy
What do these metrics mean?
- MFE (Processing Blocks Performance): the performance of the DSP components of the compiled model, e.g. Spectral Features, MFCC, FFT, etc.
- NN Classifier (Learn Blocks Performance): The performance of the compiled model on the device. Here we see the time it takes to run inference.
- Latency: the time it takes to run the model on the device.
- RAM: the amount of RAM the model uses.
- Flash: the amount of flash (ROM) the model uses.
- Accuracy: the accuracy of the model.
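As a rough illustration of how a latency figure like the one above can be obtained, the sketch below averages the wall-clock time of repeated calls. It is a host-side approximation under stated assumptions: `mean_latency_us` is a hypothetical helper, `run_inference` stands in for whatever function runs your model, and on a real microcontroller you would typically read a hardware timer instead of `std::chrono`.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: returns the mean wall-clock time of one call to
// `run_inference`, in microseconds, averaged over `runs` calls.
template <typename Fn>
double mean_latency_us(Fn&& run_inference, std::size_t runs) {
    if (runs == 0) return 0.0;  // nothing to measure
    using clock = std::chrono::steady_clock;  // monotonic, safe for intervals
    const auto start = clock::now();
    for (std::size_t i = 0; i < runs; ++i) {
        run_inference();
    }
    const auto stop = clock::now();
    const std::chrono::duration<double, std::micro> total = stop - start;
    return total.count() / static_cast<double>(runs);
}
```

Averaging over many runs smooths out timer resolution and one-off effects such as cold caches.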
The input of the EON Compiler is a TensorFlow Lite FlatBuffer file containing the model weights. The output is a pair of .cpp and .h files containing the unpacked model weights and the functions needed to prepare and run model inference.
Regular TensorFlow Lite Micro (TFLite Micro) is based on TensorFlow Lite and contains everything needed to read the model weights in FlatBuffer format (the content of the .tflite file), construct the inference graph, plan memory allocation for tensors and data, run initialization and preparation, and finally invoke the operators in the inference graph to produce the inference results.
The advantage of the traditional TFLite Micro approach is that it is very versatile and flexible. The disadvantage is that all the code needed to get the model ready on the device is heavy for embedded systems.
To overcome these limitations, our solution performs the resource-intensive tasks, such as reading the model from the FlatBuffer, constructing the graph, and planning memory allocation, directly on our servers.
The EON Compiler then generates C++ files containing the functions needed for the Init, Prepare, and Invoke stages.
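The interpreter-style flow described above can be sketched with a toy example. This is not the TFLite Micro API: `OpCode`, `OpRecord`, and `ToyInterpreter` are hypothetical names, and the "model" is a flat array of op records rather than a real FlatBuffer. The point is only that parsing, memory planning, and operator dispatch all happen on the device at runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical op codes for a toy model format (not the real TFLite schema).
enum class OpCode : std::uint8_t { kScale = 0, kAddBias = 1, kRelu = 2 };

// One serialized operator: an op code plus a single float parameter.
struct OpRecord {
    OpCode code;
    float param;
};

// Toy interpreter: every step below runs on the device.
class ToyInterpreter {
public:
    // "Read the model": copy the op records out of the serialized buffer.
    ToyInterpreter(const OpRecord* records, std::size_t count, std::size_t tensor_len)
        : ops_(records, records + count), tensor_len_(tensor_len) {}

    // "Plan memory": size the working buffer at runtime.
    void AllocateTensors() { arena_.assign(tensor_len_, 0.0f); }

    float* input() { return arena_.data(); }
    const float* output() const { return arena_.data(); }

    // "Invoke": walk the op list and dispatch each operator by its op code.
    void Invoke() {
        assert(!arena_.empty() && "call AllocateTensors() first");
        for (const OpRecord& op : ops_) {
            for (float& v : arena_) {
                switch (op.code) {
                    case OpCode::kScale:   v *= op.param; break;
                    case OpCode::kAddBias: v += op.param; break;
                    case OpCode::kRelu:    v = v < 0.0f ? 0.0f : v; break;
                }
            }
        }
    }

private:
    std::vector<OpRecord> ops_;
    std::size_t tensor_len_;
    std::vector<float> arena_;
};
```

The flexibility comes from the fact that any sequence of supported ops can be loaded without recompiling; the cost is that the parsing, allocation, and dispatch code must all be present in the firmware.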
These C++ files can then be deployed on the embedded systems, alleviating the computational burden on those devices.
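To give a feel for what compiled-style code looks like, here is a toy sketch for a model that scales its input, adds a bias, and applies ReLU. It is not the actual output of the EON Compiler: the `toy_compiled` namespace and the `model_init`, `model_prepare`, and `model_invoke` names are hypothetical, chosen only to mirror the Init, Prepare, and Invoke stages mentioned above.

```cpp
#include <cassert>
#include <cstddef>

namespace toy_compiled {

// Shapes and parameters are baked in as constants when the code is generated.
constexpr std::size_t kTensorLen = 2;
constexpr float kScale = 2.0f;
constexpr float kBias = -3.0f;

// Memory was planned offline: one statically sized buffer, reused in place.
static float arena[kTensorLen];
static bool prepared = false;

// Init: there is no model file to parse; just reset the static arena.
void model_init() {
    for (std::size_t i = 0; i < kTensorLen; ++i) arena[i] = 0.0f;
    prepared = false;
}

// Prepare: a real kernel might validate shapes or precompute constants here;
// this toy only records that setup is complete.
void model_prepare() { prepared = true; }

float* model_input() { return arena; }
const float* model_output() { return arena; }

// Invoke: the operator sequence is fixed, so each step is written out
// directly, with no graph walk and no dispatch on op codes.
void model_invoke() {
    assert(prepared && "call model_init() and model_prepare() first");
    for (std::size_t i = 0; i < kTensorLen; ++i) arena[i] *= kScale;  // scale
    for (std::size_t i = 0; i < kTensorLen; ++i) arena[i] += kBias;   // add bias
    for (std::size_t i = 0; i < kTensorLen; ++i) {                    // ReLU
        if (arena[i] < 0.0f) arena[i] = 0.0f;
    }
}

}  // namespace toy_compiled
```

Because the graph, tensor sizes, and operator order are known ahead of time, the device needs no parser, no runtime allocator, and no kernels for operators the model does not use. The trade-off is that changing the model means regenerating and recompiling the code.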
TensorFlow Lite for Microcontrollers supports a subset of TensorFlow Lite operators. See their Operator Support page for more information.
At Edge Impulse, we have our own set of operators that we support. These operators are optimized for our supported embedded devices and are adapted to the needs of our users and partners.
Currently, we support the following operators, although we are adding more all the time: