Smart Grocery Cart Using Computer Vision
Keep track of the items in a shopping basket using computer vision and an OpenMV Cam H7.
Created By: Kutluhan Aktar
Especially after the recent success of Amazon Go cashierless convenience stores, there has been a surge in adaptations of this relatively new approach to the shopping experience, which combines computer vision, sensor fusion, and deep learning. Since the core concept of cashierless stores is to spare shoppers tedious checkout lines and self-checkout stations, stores equipped with this technology improve the customer experience and increase profit margins. When implementing this technology in a grocery or convenience store, smart grocery carts are the most prominent asset for providing a customer experience comparable to Amazon Go.
Although smart grocery carts improve the customer experience and reduce maintenance costs by providing an automated product tracking and payment system, the current integration methods are expensive investments for small businesses in the food retail industry, since these methods require remodeling store layouts or paying monthly fees to cloud services.
After perusing recent research papers on smart grocery carts, I could not find any device designed to convert regular grocery carts into smart grocery carts without changing anything else in an existing establishment. Therefore, I decided to build a budget-friendly, easy-to-use device with a user-friendly interface that gives regular grocery carts the perks of smart grocery carts.
To detect different food retail products accurately, I needed to create a valid data set to train my object detection model reliably. Since OpenMV Cam H7 is a compact, high-performance microcontroller board designed specifically for machine vision applications, I decided to utilize it in this project. Also, since OpenMV Cam H7 has a built-in MicroSD card module, I was able to capture product images easily while collecting data and store them on an SD card. Then, I employed a color TFT screen (ST7735) to display a real-time video stream, the prediction (detection) results, and the selection (options) menu.
After completing my data set, which includes various food retail products, I built my object detection model with Edge Impulse to detect products added to or removed from the grocery cart. To train my model, I utilized the Edge Impulse FOMO (Faster Objects, More Objects) algorithm, a novel machine learning algorithm that brings object detection to highly constrained devices. Since Edge Impulse is compatible with nearly all microcontrollers and development boards, I did not encounter any issues while uploading and running my model on OpenMV Cam H7. As labels, I utilized the product brand names, such as Nutella and Snickers.
After training and testing my object detection (FOMO) model, I deployed the model as OpenMV firmware and uploaded it to OpenMV Cam H7. Therefore, the device can detect products by running the model independently, without any additional procedures or latency.
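To make the detection step more concrete, here is a minimal sketch of how a set of detection results could be reduced to a single product label. It assumes the model's output has already been converted into `(label, score)` pairs. The function name `best_detection` and the 0.7 threshold are hypothetical and not taken from the project code.

```python
def best_detection(detections, min_confidence=0.7):
    """Return the label of the highest-scoring detection, or None.

    `detections` is assumed to be an iterable of (label, score) pairs;
    the 0.7 default threshold is an illustrative value, not a project setting.
    """
    best = None
    for label, score in detections:
        # Skip weak detections, then keep the strongest one seen so far.
        if score >= min_confidence and (best is None or score > best[1]):
            best = (label, score)
    return best[0] if best else None
```

Returning `None` when nothing passes the threshold lets the caller skip the selection menu instead of showing an unreliable guess.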
A complementary web application lets customers create accounts via its interface, receives requests from the device to add products to or remove them from the customer's database table, and builds an up-to-date shopping list from the products added to the grocery cart. Also, when the customer finishes shopping and is ready to leave the store, the application sends an HTML email to the customer's registered email address, including the generated shopping list and the payment link.
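The bookkeeping behind such a shopping list can be sketched in a few lines. This is an in-memory illustration only: the actual application stores products in a database table per customer, and the class and method names below are hypothetical.

```python
from collections import Counter


class ShoppingList:
    """In-memory stand-in for a customer's product table (illustrative only)."""

    def __init__(self):
        self.items = Counter()

    def add(self, product):
        # Each add request increases the product's quantity by one.
        self.items[product] += 1

    def remove(self, product):
        # Ignore remove requests for products that are not in the cart.
        if self.items[product] > 0:
            self.items[product] -= 1
            if self.items[product] == 0:
                del self.items[product]

    def summary(self):
        # One "name x quantity" line per product, sorted by name.
        return "\n".join(
            "%s x%d" % (name, qty) for name, qty in sorted(self.items.items())
        )
```

For example, adding Nutella twice and Snickers once, then removing Snickers, leaves a summary with the single line `Nutella x2`.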
Since OpenMV Cam H7 does not provide Wi-Fi or cellular connectivity, I employed Beetle ESP32-C3, an ultra-small development board intended for IoT applications, to get commands from OpenMV Cam H7 via serial communication and to communicate with the web application via HTTP GET requests. To send commands via serial communication and control the selection menu after a product is detected by the model, I connected a joystick to OpenMV Cam H7. I also utilized the joystick while taking and storing pictures of various food retail products.
To enable the device to determine when the customer completes shopping and is ready to leave the store, I connected an MFRC522 RFID reader to Beetle ESP32-C3 to detect the RFID key tag the store assigns to each grocery cart. Also, I connected a buzzer and an RGB LED to Beetle ESP32-C3 to inform the customer of the device status.
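The hand-off from a serial command to an HTTP GET request can be illustrated with a short sketch. The command format (`add,Nutella`), the endpoint URL, and the query parameter names are all assumptions made for this example. The text does not specify them, and on the actual board this logic would run as firmware on Beetle ESP32-C3 rather than as desktop Python.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; the real web application's URL is not given in the text.
BASE_URL = "http://example.local/smart_cart/update.php"


def build_request_url(serial_line, table):
    """Turn a serial line such as 'add,Nutella' into a GET request URL.

    The 'action,product' format and the parameter names are illustrative.
    """
    parts = serial_line.strip().split(",")
    if len(parts) != 2 or parts[0] not in ("add", "remove") or not parts[1]:
        raise ValueError("unrecognized command: %r" % serial_line)
    action, product = parts
    # urlencode escapes characters that are not safe in a query string.
    query = urlencode({"table": table, "action": action, "product": product})
    return BASE_URL + "?" + query
```

Validating the command before building the URL keeps a garbled serial line from turning into a malformed request to the web application.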
After completing the wiring on a breadboard for the prototype and testing my code and object detection model, I decided to design a PCB for this project to make the device assembly effortless. Since Scrooge McDuck is one of my favorite cartoon characters and is famous for his wealth and stinginess, I thought it would be hilarious to design a shopping-related PCB based on him.
Lastly, to make the device as robust and sturdy as possible while attached to a grocery cart and utilized by customers, I designed a 3D-printable, semi-transparent hinged case whose hooks and snap-fit joints make it compatible with any grocery cart.
So, this is my project in a nutshell 😃
In the following steps, you can find more detailed information on coding, capturing food retail product images, storing pictures on an SD card, building an object detection (FOMO) model with Edge Impulse, running the model on OpenMV Cam H7, developing a full-fledged web application, and communicating with the web application via Beetle ESP32-C3.