Optimize a cloud-based Visual Anomaly Detection Model for Edge Deployments

Advanced ML workflow with available Jupyter Notebook using computer vision, AWS SageMaker and MLFlow to benchmark industry visual anomaly models.

Last updated 4 months ago

Created By: Mathieu Lescaudron

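Public Project Link: https://studio.edgeimpulse.com/public/376268/latest

GitHub Repo: https://github.com/emergy-official/anomaly.parf.ai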

Introduction

Let's explore the development and optimization of a cloud-based visual anomaly detection model designed for edge deployments, featuring real-time and serverless inference.

We will cover the following topics:

  • Datasets: Creation of our own datasets.

  • Models: Development of three different models:

    • A baseline model + usage of BYOM (Bring Your Own Model on Edge Impulse),

    • EfficientAD model,

    • FOMO-AD model by Edge Impulse (automated).

  • Web App:

    • Setting up a real-time and serverless inference endpoint,

    • Dataset explorer,

    • Automating deployments with GitHub Actions and Terraform on AWS.

This is a demo project. All code is provided for you to implement any or all parts yourself.

Software used

Context

Imagine we are a commercial baking company that produces cookies. Our goal is to sort cookies to identify those with and without defects (anomalies), so that any broken cookies do not get packaged and sent to retailers.

We are developing a cloud-based proof-of-concept to understand the feasibility of this technique, before deploying it on edge devices.

Although this is only a hypothetical example and demonstration, this quality inspection process and computer vision workflow could absolutely be leveraged by large-scale food service providers, commercial kitchens that make packaged retail food items, or makers of many other mass-produced retail products, even beyond the food industry.

Step 1: Create the Datasets

We'll create three datasets using three different types of cookies:

  • One with texture,

  • One thicker cookie,

  • One plain cookie.

Each dataset will consist of 200 images, totaling 600 images:

  • 100 without any anomalies

  • 100 with anomalies

    • 50 easy to recognize with a clear, strong separation down the middle,

    • 25 medium difficulty with a separation that has no gap,

    • 25 hard to detect, with small defects and knife marks.

We take around five pictures of each cookie, making slight rotations each time. The pictures are then resized with ImageMagick:

mogrify -resize 1024x1024 *.jpg

The folder structure looks like this:

- cookies_1
    - anomaly_lvl_1
    - anomaly_lvl_2
    - anomaly_lvl_3
    - no_anomaly
- cookies_2
...
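To make the folder layout above concrete, here is a minimal Python sketch that walks one dataset folder and pairs each image with a label. The folder names come from the structure above; the helper names (list_images, count_by_label) and the mapping of the three anomaly_lvl_* folders to a single "anomaly" label are assumptions for illustration, not code from the project.

```python
from pathlib import Path

# Folder names taken from the structure above; the label mapping is an assumption.
ANOMALY_FOLDERS = ("anomaly_lvl_1", "anomaly_lvl_2", "anomaly_lvl_3")
NORMAL_FOLDER = "no_anomaly"


def list_images(dataset_dir):
    """Return sorted (image_path, label) pairs for one cookie dataset folder."""
    dataset_dir = Path(dataset_dir)
    samples = []
    for folder in (*ANOMALY_FOLDERS, NORMAL_FOLDER):
        label = "no_anomaly" if folder == NORMAL_FOLDER else "anomaly"
        for image_path in sorted((dataset_dir / folder).glob("*.jpg")):
            samples.append((image_path, label))
    return samples


def count_by_label(samples):
    """Count how many images carry each label."""
    counts = {}
    for _, label in samples:
        counts[label] = counts.get(label, 0) + 1
    return counts
```

Calling count_by_label(list_images("cookies_1")) on a complete dataset should report 100 images for each label if the folders follow the composition described earlier.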

Step 2: Create the models

Baseline model

The first model we will develop will be our baseline, serving as our starting point.

It consists of a categorical image classification using a pre-trained MobileNet.

We use categorical (rather than binary) classification so that more categories of anomalies can be added in the future.

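import tensorflow as tf
from tensorflow.keras.applications import MobileNet
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

# Pre-trained MobileNet used as a frozen feature extractor
base_model = MobileNet(input_shape=(160, 160, 3),
                       include_top=False,
                       weights='imagenet',
                       pooling='avg')

base_model.trainable = False

# Small classification head with two outputs ("anomaly" and "no_anomaly")
model = Sequential([
    base_model,
    Dense(128, activation='relu'),
    Dense(2, activation='softmax')
])

# The legacy Adam optimizer requires a TensorFlow version that still ships
# tf.keras.optimizers.legacy
model.compile(
    optimizer=tf.keras.optimizers.legacy.Adam(learning_rate=1e-5),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)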

Here's how the images are distributed for this model:

  • Training: 144 images (72%)

  • Validation: 16 images (8%)

  • Test: 40 images (20%)

Both "anomaly" and "no anomaly" images are used during training.
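The 72% / 8% / 20% distribution above can be reproduced with a few lines of Python. This is a minimal sketch using a plain random shuffle; the source does not say whether the split was stratified by class, and the function name, ratios as defaults, and seed are assumptions for illustration.

```python
import random


def split_dataset(samples, train=0.72, val=0.08, seed=42):
    """Shuffle and split a list into train/validation/test parts.

    The remaining share (20% with the defaults) goes to the test set.
    """
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_train = round(len(items) * train)
    n_val = round(len(items) * val)
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]
```

With 200 images per dataset, this yields 144 training, 16 validation, and 40 test images, matching the counts listed above.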

The model is trained on a Mac using the CPU, running through 50 epochs.

Convert Baseline with BYOM

In our case, let's use a Jupyter notebook that converts the baseline model to a macOS version using the Edge Impulse API. (You can also target a specific edge device, Linux, WebAssembly, etc.) This can save you quite some time compared to doing the conversion yourself.

import os
import edgeimpulse as ei
from dotenv import load_dotenv

# Load the variables from the .env file, then read the project's API key
load_dotenv()
ei.API_KEY = os.getenv("EDGE_IMPULSE_API_KEY_BASELINE")

After that, define the input and output types for your model:

model_input_type = ei.model.input_type.ImageInput()
model_output_type = ei.model.output_type.Classification(labels=["anomaly","no_anomaly"])

And then, convert it to the format that fits your needs (in this case, MacOS for the demo):

deploy_bytes_mac_os = ei.model.deploy(model=model,
                                      model_output_type=model_output_type,
                                      model_input_type=model_input_type,
                                      deploy_target='runner-mac-x86_64')

# Write the returned bytes to disk as an .eim executable
if deploy_bytes_mac_os:
    with open("baseline.eim", 'wb') as f:
        f.write(deploy_bytes_mac_os.getvalue())

You'll need to make it executable by using the command chmod +x baseline.eim. And you're all set! Create an inference function to use it with this model:

import os

import cv2
from edge_impulse_linux.image import ImageImpulseRunner

modelfile = os.path.abspath("baseline.eim")  # the .eim file exported above

def ei_inference(img_path):
    with ImageImpulseRunner(modelfile) as runner:
        model_info = runner.init()

        # OpenCV loads images as BGR, so convert to RGB before extracting features
        original_image = cv2.imread(img_path, cv2.IMREAD_COLOR)
        img = cv2.cvtColor(original_image, cv2.COLOR_BGR2RGB)

        features, cropped = runner.get_features_from_image(img)
        res = runner.classify(features)

        anomaly = res["result"]["classification"]["anomaly"]
        no_anomaly = res["result"]["classification"]["no_anomaly"]

        # Keep the label with the higher score
        classification = "anomaly" if anomaly > no_anomaly else "no_anomaly"
        print(res["result"]["classification"], f"Classification: {classification}")

EfficientAD

EfficientAD combines a student-teacher approach with an autoencoder to identify anomalies in images quickly and effectively.

The network, named PDN (Patch Description Network), has 4 convolutional layers and 2 pooling layers. It examines each 33 x 33 pixel patch of the image and produces a feature vector of 384 values.

Two models, a student and a teacher, are trained on the same data. The teacher guides the student through a loss function, which helps the student improve its performance in detecting anomalies.

At test time, an anomaly is detected when the student fails to predict the characteristics of an image. In addition to the student-teacher method, EfficientAD introduces an autoencoder that gives a broader view of the image, improving overall detection performance.
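The student-teacher idea can be illustrated with a toy NumPy sketch: wherever the student's features diverge from the teacher's, the anomaly score is high. This is not the EfficientAD implementation (which also normalizes the maps and combines them with the autoencoder output); the random arrays below merely stand in for real network outputs, and the function name is an assumption.

```python
import numpy as np


def anomaly_map(teacher_features, student_features):
    """Per-location anomaly score: mean squared teacher/student difference over channels.

    Both inputs have shape (channels, height, width).
    """
    diff = teacher_features - student_features
    return np.mean(diff ** 2, axis=0)


# Toy example with random features standing in for real network outputs.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(384, 8, 8))
student = teacher + rng.normal(scale=0.01, size=teacher.shape)  # student imitates the teacher well
student[:, 2, 5] += 1.0  # ...except at one location, as it might on an unseen defect

scores = anomaly_map(teacher, student)
row, col = np.unravel_index(np.argmax(scores), scores.shape)
```

Here the highest score lands at location (2, 5), the only place where the student disagrees with the teacher. Upsampling such a map to the image size is what produces a heatmap.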

Experimenting with MLFlow

To run an MLFlow server, either locally or remotely, use the following command:

# pip install mlflow
mlflow server \
    --host 0.0.0.0 \
    --port 5000 \
    --artifacts-destination s3://artifact-dev.anomaly.parf.ai \
    --gunicorn-opts --timeout=300 \
    --gunicorn-opts --keep-alive=300

Here, we're using the --artifacts-destination argument to specify where to store our models. You can omit this argument if you're not using an S3 bucket on AWS, and it will default to storing the models on the disk.

In your code, you define an experiment like this:

mlflow.set_tracking_uri(uri=self.cfg["mlflow_tracking_uri"])
mlflow.set_experiment("Name of the experiment")

with mlflow.start_run(): # Start the experiments and timer

    mlflow.log_params({**self.cfg}) 

    # your code, training, ...

    mlflow.log_metric("score", score)
    mlflow.log_metric("threshold", threshold)

    mlflow.log_artifact("./models.pth") # teacher, student & autoencoder
    # mlflow.log_artifact(...) # you can log multiple artifacts

We primarily use MLFlow to track experiments and store artifacts, although it offers many other powerful features including model registry, model deployments, model serving, and more.

Training in the cloud

The specific instance type we use is g4dn.xlarge. To get access to it, you need to create a support ticket requesting access to G-type instances in your region. It will cost us 0.526 USD per hour, and we plan to use it for approximately 3 hours.
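As a quick sanity check on the budget, the hourly rate and planned duration quoted above can be multiplied out. This small sketch only covers on-demand compute time; storage and data transfer are ignored, and the function name is an assumption.

```python
HOURLY_RATE_USD = 0.526  # g4dn.xlarge price quoted above
PLANNED_HOURS = 3        # approximate usage quoted above


def training_cost(hourly_rate, hours):
    """Estimated on-demand cost in USD, ignoring storage and data transfer."""
    return hourly_rate * hours


estimated_cost = training_cost(HOURLY_RATE_USD, PLANNED_HOURS)
print(f"Estimated training cost: {estimated_cost:.2f} USD")
```

Remember that the instance keeps billing until it is terminated, so the real figure depends on how long it is left running.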

For our setup, we'll use a pre-configured AMI with PyTorch named Deep Learning OSS Nvidia Driver AMI GPU PyTorch 2.2.0.

Here is the CLI:

INSTANCE_NAME=ModelTraining
SG_GROUP_ID="..." # The security group ID allowing ssh from your computer.
KEY_NAME="..." # Your PEM key to connect through ssh

# Create the instance
aws ec2 run-instances --image-id ami-02e407fb981a2b5e3 \
--instance-type g4dn.xlarge \
--key-name $KEY_NAME \
--security-group-ids $SG_GROUP_ID \
--block-device-mappings '[{"DeviceName":"/dev/sda1","Ebs":{"VolumeSize":60}}]' \
--tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=$INSTANCE_NAME}]"

# To terminate the instance
aws ec2 terminate-instances --instance-ids $(aws ec2 describe-instances \
--filters "Name=tag:Name,Values=$INSTANCE_NAME" \
--query "Reservations[*].Instances[*].InstanceId" \
--output text)

Once you've connected to the instance using ssh and cloned the repository along with the datasets, you can run the following command to start the jupyter notebook:

jupyter notebook --no-browser >jupyter.log 2>&1 &

Make sure you've enabled port forwarding so you can connect to the remote Jupyter notebook locally:

ssh -N -f -L 8888:localhost:8888 ubuntu@44.200.180.25 # Change using your instance IP

You can now access Jupyter Notebook on the remote instance from your local computer.

For the training, we will only use the images without anomalies. Here's how the data is distributed:

  • Training

    • No anomaly: 72 images (36%)

  • Validation

    • No Anomaly: 8 images (4%)

    • Anomaly: 20 images (10%)

  • Testing

    • No Anomaly: 20 images (10%)

    • Anomaly: 80 images (40%)

Once the model is trained, you can compare the different runs in MLFlow and create charts to build reports.

For cookies dataset three, the best model used 3,200 steps, pretrained weights, and the small network. In the study, they used 70,000 steps. We added early stopping based on the F1 score from the evaluation dataset. Modify this for your needs.

We use the same config for training datasets one and two.
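Early stopping on the F1 score, as mentioned above, can be sketched as a small helper that counts evaluations without improvement. The class name, the patience value, and the score history are hypothetical; the project's own training loop may implement this differently.

```python
class EarlyStopping:
    """Stop training when the monitored score has not improved for `patience` checks."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_score = None
        self.checks_without_improvement = 0

    def should_stop(self, score):
        """Record a new evaluation score and report whether training should stop."""
        if self.best_score is None or score > self.best_score + self.min_delta:
            self.best_score = score
            self.checks_without_improvement = 0
        else:
            self.checks_without_improvement += 1
        return self.checks_without_improvement >= self.patience


# Hypothetical F1 scores measured at regular evaluation intervals.
f1_history = [0.61, 0.72, 0.80, 0.79, 0.80, 0.78, 0.77]
stopper = EarlyStopping(patience=3)
stopped_at = None
for check, f1 in enumerate(f1_history):
    if stopper.should_stop(f1):
        stopped_at = check
        break
```

In this example, training stops at the sixth evaluation (index 5), after three checks in a row fail to beat the best F1 of 0.80.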

Here's an example of the inference results with EfficientAD. It localizes the anomaly within the image through a heatmap.

FOMO-AD model (automated)

The last model we will build is called FOMO-AD, a visual anomaly detection learning block developed by Edge Impulse. It's based on the FOMO architecture, specifically designed for constrained devices.

Let's automate the entire process using the Edge Impulse API:

  • Import the dataset,

  • Create an impulse,

  • Generate features,

  • Train the model,

  • Export the model.

We separate our dataset as follows:

  • Training set

    • No Anomaly: 80 images (40%)

  • Testing set

    • No Anomaly: 20 images (10%)

    • Anomaly: 100 images (50%)

The best part of the notebook is that it includes a pre-built pipeline in Edge Impulse, named "Find the best Visual AD Model", that searches for the best model using our dataset. All you need to do is provide the dataset and run the pipeline. After that, you'll have the optimal model set up in your project, and you can find the best threshold to use in the logs (refer to the Option 2 section in the notebook for more details).
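To give an intuition for what "finding the best threshold" means, here is a minimal sketch that tries each observed anomaly score as a threshold and keeps the one with the highest F1. It is an illustration only, not the logic of the Edge Impulse pipeline; the function names, scores, and labels are hypothetical.

```python
def f1_score(labels, predictions):
    """F1 for the positive class (True = anomaly)."""
    tp = sum(1 for y, p in zip(labels, predictions) if y and p)
    fp = sum(1 for y, p in zip(labels, predictions) if not y and p)
    fn = sum(1 for y, p in zip(labels, predictions) if y and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(scores, labels):
    """Try each observed score as a threshold and keep the one with the highest F1."""
    best = (None, -1.0)
    for threshold in sorted(set(scores)):
        predictions = [score >= threshold for score in scores]
        f1 = f1_score(labels, predictions)
        if f1 > best[1]:
            best = (threshold, f1)
    return best


# Hypothetical anomaly scores and ground truth (True = anomaly).
scores = [0.10, 0.20, 0.35, 0.40, 0.70, 0.90]
labels = [False, False, False, True, True, True]
threshold, f1 = best_threshold(scores, labels)
```

With these made-up values, the sweep settles on 0.40, the lowest score that still separates every anomaly from every normal image.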

Edge Impulse lets you classify your entire dataset or just one image at a time.

Once the model is exported, you can create an inference function in Python to run it locally:

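import cv2
from edge_impulse_linux.image import ImageImpulseRunner

model_path = "fomo_ad.eim"  # hypothetical file name; point this at your exported model

def ei_inference(img_path):
    with ImageImpulseRunner(model_path) as runner:
        model_info = runner.init()

        # OpenCV loads images as BGR, so convert to RGB before extracting features
        original_image = cv2.imread(img_path, cv2.IMREAD_COLOR)
        img = cv2.cvtColor(original_image, cv2.COLOR_BGR2RGB)

        features, cropped = runner.get_features_from_image(img)
        res = runner.classify(features)

    return res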

Step 3: Benchmarking

Now that we've trained all the models, it's time to evaluate how well they perform using the F1 Score. (The F1 Score is a way to measure a model's accuracy, taking into account both precision and recall).
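For reference, the F1 Score is the harmonic mean of precision and recall. The sketch below computes all three from confusion-matrix counts; the counts are invented for illustration and are not results from this benchmark.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical outcome on a test set of 80 anomalies and 20 normal images:
# 72 anomalies caught, 8 missed, 4 normal images wrongly flagged.
precision, recall, f1 = precision_recall_f1(tp=72, fp=4, fn=8)
```

Because F1 ignores true negatives, it stays informative on a test set like this one, where anomalies outnumber normal images.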

Since each model was trained on different sets of data, we will use the test dataset from the EfficientAD model for comparison.

The benchmark was run on a MacBook.

FOMO-AD performs the best on most datasets. Although EfficientAD could be improved to score higher, it would require more time.

EfficientAD is better suited to modern GPUs, where the inference time is about 3ms.

Step 4: API & Web App

The models are trained and ready to be used, so let's build an app to showcase our proof of concept.

We'll include two features, each covered in its own section below: serverless inference and real-time inference.

All the code is available in the public repository.

SageMaker Serverless Inference

This is the infrastructure of our serverless inference endpoint:

When a user uploads an image to get the anomaly result, it will go through:

  • CloudFront (which is also used by the front end; users are redirected to the API Gateway when the request path matches /api*),

  • An API Gateway (which communicates with Lambda and allows for future API expansions),

  • A Lambda that communicates with the SageMaker endpoint securely,

  • A Serverless SageMaker endpoint (executes the inference using a Docker container).
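The Lambda step in the flow above can be sketched as follows. This is a minimal illustration, not the project's actual handler: it assumes the function is written in Python, that API Gateway delivers the image as a base64-encoded body, and that the endpoint name comes from a hypothetical ENDPOINT_NAME environment variable. The only AWS call used is invoke_endpoint on the boto3 sagemaker-runtime client.

```python
import base64
import json
import os


def lambda_handler(event, context, runtime_client=None):
    """Forward a base64-encoded image from API Gateway to a SageMaker endpoint."""
    if runtime_client is None:
        import boto3  # imported lazily so the sketch can be exercised without AWS access
        runtime_client = boto3.client("sagemaker-runtime")

    image_bytes = base64.b64decode(event["body"])
    response = runtime_client.invoke_endpoint(
        EndpointName=os.environ.get("ENDPOINT_NAME", "anomaly-inference"),  # hypothetical name
        ContentType="application/octet-stream",
        Body=image_bytes,
    )
    result = json.loads(response["Body"].read())
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(result),
    }
```

The optional runtime_client parameter is a design choice for testability: passing a stub object lets you check the handler locally before wiring it to a real endpoint.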

The SageMaker endpoint operates using a Docker image. You can build your Dockerfile like this:

FROM python:3.11.7  
  
WORKDIR /app  
  
COPY requirements.txt .  
RUN apt-get update && apt-get install -y \  
    libgl1-mesa-glx \  
    build-essential \
    && rm -rf /var/lib/apt/lists/*

RUN pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cpu
RUN pip install --no-cache-dir -r requirements.txt

COPY app.py .
COPY inference.py .  

RUN mkdir efficientad_cookies_3

# Artifacts from MLFLOW
COPY efficientad_cookies_3/all_models.pth ./efficientad_cookies_3/
COPY efficientad_cookies_3/best_threshold.pkl ./efficientad_cookies_3/
COPY efficientad_cookies_3/map_normalization.pth ./efficientad_cookies_3/

ENTRYPOINT ["gunicorn", "-b", "0.0.0.0:8080", "app:app"]  
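A container served by SageMaker is expected to answer GET /ping (health check) and POST /invocations (inference) on port 8080, which is why the ENTRYPOINT above binds gunicorn to 0.0.0.0:8080 and loads app:app. The project's app.py is a Flask application; the sketch below is a dependency-free stand-in written as a plain WSGI callable, which gunicorn can serve the same way. The predict function is a placeholder that returns made-up values.

```python
import json


def predict(image_bytes):
    """Placeholder for the real model call; returns a fixed, made-up result."""
    return {"anomaly": False, "score": 0.0, "bytes_received": len(image_bytes)}


def app(environ, start_response):
    """Minimal WSGI app exposing the two routes SageMaker calls on port 8080."""
    method = environ["REQUEST_METHOD"]
    path = environ.get("PATH_INFO", "")

    if method == "GET" and path == "/ping":
        # Health check: SageMaker expects an HTTP 200 when the container is ready.
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"ok"]

    if method == "POST" and path == "/invocations":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        payload = environ["wsgi.input"].read(length)
        body = json.dumps(predict(payload)).encode()
        start_response("200 OK", [("Content-Type", "application/json")])
        return [body]

    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]
```

In the real service, predict would load the EfficientAD artifacts copied into the image and return the anomaly score for the uploaded picture.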

Then, upload the Docker image to an ECR repository (an Elastic Container Registry).

# Tag version image
export IMG_VERSION=latest
# ACCOUNT_ID must already be set to your AWS account ID

# Login to AWS ECR
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin "$ACCOUNT_ID".dkr.ecr.us-east-1.amazonaws.com/

# Build, tag and push the image with the same version tag
docker build --platform linux/amd64 -t anomaly-inference-api:$IMG_VERSION .
docker tag anomaly-inference-api:$IMG_VERSION "$ACCOUNT_ID".dkr.ecr.us-east-1.amazonaws.com/anomaly-inference-api:$IMG_VERSION
docker push "$ACCOUNT_ID".dkr.ecr.us-east-1.amazonaws.com/anomaly-inference-api:$IMG_VERSION

You can also test the inference locally without using Docker:

FLASK_APP=app.py flask run --port=8080
python local.py

The serverless inference is quite slow (12 sec per inference). You can speed it up by increasing the memory allocated to the endpoint, switching to a provisioned endpoint, or using a real-time endpoint within AWS. However, these options will increase the cost. The current setup costs $0.20 per 1,000 inferences, an affordable way to create demos without impacting your wallet.

Real-time inference

If you've previously played with Edge Impulse, you might be familiar with the Launch in browser feature that lets you test your model in real-time.

Wouldn't it be great to include this feature directly in our web app?

The way it works is that the client downloads a WebAssembly .zip file of the model through the Edge Impulse API, using your project's API KEY. Then, it unzips the export and loads the model along with multiple scripts to enable real-time inference.

We're going to modify this a bit.

  • We'll no longer use the API KEY,

  • We'll include the WebAssembly zip file directly in the website's assets (you can download this file manually from Edge Impulse, or it can be downloaded automatically using the API when building the website assets),

  • We'll keep only the essential code and update what's needed to make it work the new way,

  • We'll add a colormap function for fun to show the model's confidence.
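The colormap idea in the last item can be illustrated with a short sketch. The web client itself runs in the browser, so this Python version only shows the mapping; the green-to-yellow-to-red palette and the function name are assumptions, not the colours used on the website.

```python
def confidence_to_rgb(confidence):
    """Map a confidence in [0, 1] to an RGB colour from green (0) through yellow to red (1)."""
    c = min(max(confidence, 0.0), 1.0)  # clamp out-of-range values
    if c < 0.5:
        red, green = int(round(510 * c)), 255
    else:
        red, green = 255, int(round(510 * (1.0 - c)))
    return red, green, 0
```

Applying such a function to each cell of the anomaly grid gives an overlay where hotter colours mark regions the model considers more anomalous.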

This is what we obtain:

Website

The website is hosted on AWS in an S3 bucket, behind a CloudFront distribution.

It also features a dataset explorer that showcases the data used for benchmarking.

It includes all the images, scores, predictions, and timings for all the models and cookies.

Improvements

One key improvement could be enhancing the dataset. We used a mobile phone with a combination of natural and artificial lighting. The model's performance might improve if you create a synthetic dataset using Omniverse Replicator, featuring different lighting conditions, backgrounds, and more.

Notes and links by section

  • Software used: Edge Impulse Studio, Edge Impulse Mobile client, Visual Studio Code, Amazon Web Services, Terraform, MLFlow, Astro.

  • Context: The proof-of-concept will eliminate manual processing, and you won't need to run 10 km to burn off all the cookies you've eaten.

  • Step 1: Create the Datasets

    • We assume we don't have access to Omniverse Replicator to create a synthetic dataset. Instead, we manually create our own. The first step is to carefully review which cookies to use.

    • Each picture, taken with a mobile phone in a 1:1 ratio at an original size of 2992 x 2992 pixels, is resized to 1024 x 1024 pixels using the mogrify command from ImageMagick. This saves computing resources for both the training process and the inference endpoint.

    • You can download the datasets (95MB) here and the raw images (1GB) here.

  • Baseline model

    • Have a look at the training in this notebook.

    • You can find the results in the Step 3: Benchmarking section.

  • Convert Baseline with BYOM

    • With Edge Impulse's Bring Your Own Model feature, you can easily upload your own model and use all their features.

    • You can find detailed steps in this notebook (scroll down to the section titled Edge Impulse conversion).

    • First, start by importing the Edge Impulse Python SDK. Then load your project's API KEY.

  • EfficientAD

    • Let's use another method called EfficientAD (detailed in a study from arXiv.org).

    • Take a look at their video presentation for a brief overview.

    • We're going to reuse some of the code from nelson1425/EfficientAD and update it to suit our needs. You can find the updated code here.

    • We will test different parameters to build a model that performs well. In the study, they used 70,000 iterations (steps) and pretrained weights from WideResNet-101.

    • We will experiment with different numbers of steps, enabling or disabling the pretrained weights, and using the small or medium size of the patch description network (the medium size includes another layer and twice as many features). Each test is called an experiment, and we will use MLFlow to log the parameters and store the results, including the scores and the models.

    • You can find the full setup instructions for MLFlow for this demo here.

    • Let's train our models in the cloud using our notebook. We are using a Jupyter notebook, but you could also use a Python script.

    • There are many different cloud providers that allow you to train a model. We will use an AWS instance that includes an Nvidia Tesla T4 GPU.

    • The complete commands used are detailed here.

    • Once you're finished, terminate the remote instance. You can find the results in the Step 3: Benchmarking section.

  • FOMO-AD model (automated)

    • Check the FOMO-AD documentation for more information.

    • There's too much code to detail here. If you want to replicate it yourself step by step, check out this notebook.

  • Step 3: Benchmarking

    • Take a look at this notebook where all the benchmarking is done.

    • For additional details on performance, including difficulty, time, and RAM usage, check out this notebook. Usually, the inference time of EfficientAD is 300ms, whereas FOMO-AD is 35ms.

  • Step 4: API & Web App

    • The two features are a serverless endpoint using SageMaker Serverless Inference with EfficientAD, and real-time inference using a compact version of the Edge Impulse mobile client with FOMO-AD.

    • In the public repository, you will find the API Code, the Automated Infrastructure Code (using Terraform), and the Website Code.

  • SageMaker Serverless Inference: Check out the terraform code to configure the SageMaker endpoint, or you can do it manually in the AWS Console.

  • Real-time inference

    • Thanks to Edge Impulse, this feature is open source!

    • All the modifications are detailed here in the Mobile Client compressed version detail section.

  • Website

    • For the website, we're using Astro with React, based on the AstroWind template.

    • To automatically deploy the website, we use this github action. It triggers a deployment whenever the commit message includes deploy:website.
https://studio.edgeimpulse.com/public/376268/latest
https://github.com/emergy-official/anomaly.parf.ai