Sensor Data Fusion with Spresense and CommonSense
Use the SensiEDGE CommonSense board to capture multiple sensor values and perform sensor fusion to identify locations.
Created By: Marcelo Rovai
Public Project Link: https://studio.edgeimpulse.com/public/281425/latest
GitHub Repo: https://github.com/Mjrovai/Sony-Spresense
This tutorial will develop a model based on data captured with SensiEDGE's CommonSense, a sensor extension board for the Sony Spresense.
The general idea is to explore sensor fusion techniques, capturing environmental data such as temperature, humidity, and pressure, adding light and VOC (Volatile Organic Compounds) data to estimate what room the device is located within.
We will develop a project where our "smart device" will indicate where it is located among four different locations of a house:
Kitchen,
Laboratory (Office),
Bathroom, or
Service Area
The project will be divided into the following steps:
Sony's Spresense main board installation and test (Arduino IDE 2.x)
Spresense extension board installation and test (Arduino IDE 2.x)
Connecting the CommonSense board to the Spresense
Connecting the CommonSense board to the Edge Impulse Studio
Creating a Sensor Log for Dataset capture
Dataset collection
Dataset Pre-Processing (Data Curation)
Uploading the Curated data to Edge Impulse Studio
Training and testing the model
Deploying the trained model on the Spresense-CommonSense board
Doing Real Inference
Conclusion
You can follow this link for a more detailed explanation.
Installing USB-to-serial drivers (CP210x)
Download and install the USB-to-serial drivers that correspond to your operating system from the following links:
CP210x USB to serial driver (v11.1.0) for Windows 10/11
CP210x USB to serial driver for Mac OS X
If you use the latest Silicon Labs driver (v11.2.0) in a Windows 10/11 environment, USB communication may cause an error and fail to flash the program. Please download v11.1.0 from the above URL and install it.
Install Spresense Arduino Library
Copy and paste the following URL into the field called Additional Boards Managers URLs:
https://github.com/sonydevworld/spresense-arduino-compatible/releases/download/generic/package_spresense_index.json
Install the Spresense Reference Board package via the Boards Manager:
Select Board and Port
The Board and port selection can also be done by selecting them on the Top Menu:
Install BootLoader
5.1 Select Programmer → Spresense Firmware Updater
5.2 Select Burn Bootloader
During the process, it will be necessary to accept the License agreement.
Run the Blink sketch from Examples → Basics → Blink.ino
Testing all 4 LEDs:
The Spresense main board has 4 LEDs; the built-in one is LED0 (the far right). Each of them can be accessed individually.
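Here is a minimal sketch that cycles through all four of them (LED0–LED3 are constants pre-defined by the Spresense Arduino core; the timing is arbitrary):

```cpp
// Cycle through the four LEDs on the Spresense main board.
// LED0..LED3 are pre-defined by the Spresense Arduino core.
const int leds[] = {LED0, LED1, LED2, LED3};

void setup() {
  for (int i = 0; i < 4; i++) {
    pinMode(leds[i], OUTPUT);
  }
}

void loop() {
  // Turn the LEDs on one by one, then off one by one
  for (int i = 0; i < 4; i++) {
    digitalWrite(leds[i], HIGH);
    delay(250);
  }
  for (int i = 0; i < 4; i++) {
    digitalWrite(leds[i], LOW);
    delay(250);
  }
}
```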
Main Features:
Audio input/output - 4ch analog microphone input or 8ch digital microphone input, headphone output
Digital input/output - 3.3V or 5V digital I/O
Analog input - 6ch (5.0V range)
External memory interface - microSD card slot
It is important to note that the Spresense main board is a low-power device running on 1.8V (including its I/Os). So, installing the main board on the extension board, which has an Arduino UNO form factor and accepts up to 5V on its GPIOs, is advised. In addition, the microSD card slot will be used for our datalog.
The package of the Spresense board includes 4 spacers to attach the Spresense main board.
Insert them on the extension board and connect the main board as shown below:
Once the main board is attached to the extension board, insert an SD card (formatted as FAT32).
Run: Examples → File → read_write.ino under Spresense.
You should see messages on the Serial Monitor showing that "testando…" was written to the SD card. Remove the SD card and check it on your computer. Note that I named my card DATASET; for new cards, you will usually see something like NO NAME.
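For reference, a minimal write test using the Spresense SDHCI library looks like this (file name and message are just examples):

```cpp
#include <SDHCI.h>

SDClass theSD;

void setup() {
  Serial.begin(115200);
  while (!theSD.begin()) {
    Serial.println("Insert SD card.");
  }
  File f = theSD.open("test.txt", FILE_WRITE);  // creates the file or appends to it
  if (f) {
    f.println("testando...");
    f.close();
    Serial.println("Done writing to the SD card.");
  }
}

void loop() {}
```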
The CommonSense expansion board, produced by SensiEDGE, provides an array of new sensor capabilities to the Spresense, including an accelerometer, gyroscope, magnetometer, temperature, humidity, pressure, proximity, ambient light, IR, microphone, and air quality (VOC). As a user interface, the board contains a buzzer, a button, an SD card reader, and an RGB LED.
The CommonSense board also features an integrated rechargeable battery connection, eliminating the necessity for a continuous power supply and allowing finished products to be viable for remote installations where a constant power source might be challenging to secure.
Below is a block diagram showing the board's main components:
Note that the sensors are connected via the I2C bus, except for the digital microphone.
So, before installing the board, let's map the main board's I2C bus. Run the sketch Examples → Wire → I2CScanner.ino in the Arduino IDE. On the Serial Monitor, we confirm that no I2C devices are found:
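If that example is not available in your IDE, a generic scanner using only standard Wire calls does the same job:

```cpp
#include <Wire.h>

void setup() {
  Serial.begin(115200);
  Wire.begin();
}

void loop() {
  int count = 0;
  for (byte addr = 1; addr < 127; addr++) {
    Wire.beginTransmission(addr);
    if (Wire.endTransmission() == 0) {  // a device ACKed at this address
      Serial.print("I2C device found at 0x");
      Serial.println(addr, HEX);
      count++;
    }
  }
  Serial.print(count);
  Serial.println(" device(s) found");
  delay(5000);
}
```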
Now, connect the CommonSense board on top of the Spresense main board as shown below:
Reconnect the main board to your computer (use the Spresense main board USB connector) and run the I2C mapping sketch once again. As a result, 12 I2C devices are now found:
For example: the SGP40 (VOC sensor) is at address 0x59, the APDS-9250 (light sensor) at 0x52, the HTS221 (temperature & humidity sensor) at 0x5F, the LPS22HH (pressure sensor) at 0x5D, the VL53L1X (distance sensor) at 0x29, the LSM6DSOX (accelerometer & gyroscope) at 0x6A, the LIS2MDL (magnetometer) at 0x1E, and so on.
We have confirmed that the main MCU recognizes the sensors on the CommonSense board. Now, it is time to access and test them. For that, we will connect the board to the Edge Impulse Studio.
Go to EdgeImpulse.com, create a Project, and connect the device:
Search for supported devices and click on Sony's Spresense:
On the page that opens, go to the final portion of the document: Sensor Fusion with Sony Spresense and SensiEDGE CommonSense, and download the latest Edge Impulse Firmware for the CommonSense board: https://cdn.edgeimpulse.com/firmware/sony-spresense-commonsense.zip.
Unzip the file and run the flash script corresponding to your operating system:
And flash your board:
Run the Edge Impulse CLI (the edge-impulse-daemon command) and access your project:
Returning to your project, on the Devices Tab, you should confirm that your device is connected:
You can select all sensors individually or combined on the Data Acquisition Tab.
For example:
It is possible to use the Studio to collect data online, but we will use the Arduino IDE to create a datalogger that can run offline, disconnected from our computer. The dataset can be uploaded later as .CSV files.
For our project, we will need to install the libraries for the following sensors:
VOC - SGP40
Temperature & Humidity - HTS221TR
Pressure - LPS22HH
Light - APDS9250
Below are the required libraries:
APDS-9250: Digital RGB, IR, and Ambient Light Sensor Download the Arduino Library and install it (as .zip): https://www.artekit.eu/resources/ak-apds-9250/doc/Artekit_APDS9250.zip
HTS221 Temperature & Humidity Sensor Install the STM32duino HTS221 directly on the IDE Library Manager
SGP40 Gas Sensor Install the Sensirion I2C SGP40
LPS22HH Pressure Sensor Install the STM32duino LPS22HH
VL53L1X Time-of-Flight (Distance) Sensor (optional*) Install the VL53L1X by Pololu
LSM6DSOX 3D accelerometer and 3D gyroscope Sensor (optional*) Install the Arduino_LSM6DSOX by Arduino
LIS2MDL - 3-Axis Magnetometer Sensor (optional*) Install the STM32duino LIS2MDL by SRA
*We will not use those sensors here, but I listed them in case they are needed for another project.
The code is simple: at a specified interval, the data is stored on the SD card. The sampling period is set by a constant in the sketch; for example, I will have a new log every 10 s in my data collection.
Also, the built-in LED will blink for each correct datalog, helping to verify if the device is working correctly during offline operation.
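Below is a condensed sketch of the logging loop (the full code is in the project repo; readSensors() is a hypothetical placeholder for the actual sensor library calls):

```cpp
#include <SDHCI.h>

SDClass theSD;

const unsigned long SAMPLING_INTERVAL_MS = 10000;  // one log entry every 10 s

// Hypothetical helper: fills vals[] with the 8 readings
// (pressure, temperature, humidity, VOC, red, green, blue, IR)
void readSensors(float vals[8]) {
  for (int i = 0; i < 8; i++) vals[i] = 0.0f;  // placeholder values
}

void setup() {
  Serial.begin(115200);
  pinMode(LED0, OUTPUT);
  while (!theSD.begin()) {
    Serial.println("Insert SD card.");
  }
}

void loop() {
  float vals[8];
  readSensors(vals);

  File logFile = theSD.open("datalog.csv", FILE_WRITE);  // opens in append mode
  if (logFile) {
    for (int i = 0; i < 8; i++) {
      logFile.print(vals[i]);
      if (i < 7) logFile.print(",");
    }
    logFile.println();
    logFile.close();

    // Blink the built-in LED to signal a successful log entry
    digitalWrite(LED0, HIGH);
    delay(100);
    digitalWrite(LED0, LOW);
  }

  delay(SAMPLING_INTERVAL_MS - 100);
}
```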
Here is how the data will be shown on the Serial Monitor (for testing only).
The datalogger will capture data from the eight sensors (pressure, temperature, humidity, VOC, light-red, light-green, light-blue, and IR). I captured around two hours of data (one sample every 10 seconds) in each area of the house (Laboratory, Bathroom, Kitchen, and Service Area).
The CommonSense device worked offline, powered by a 5V power bank, as shown below:
Here is the raw dataset, as stored on the SD card:
As a first test, I uploaded the data to the Studio using the "CSV Wizard" tool, letting the Studio split the data into Train and Test sets. Since the TimeStamp column of my raw data was a sequential number, the Studio assumed a sampling frequency of 1 Hz, which is fine for this first test.
For the Impulse, I considered a window of 3 samples (here, 3,000 ms) with a slide of 1 sample (1,000 ms). "Flatten" was chosen as the Processing Block; since it reduces each axis to single values, it is helpful for slow-moving averages like the data we are capturing. For Learning, we will use "Classification" and Anomaly Detection (the latter only for testing).
For Pre-Processing, we will choose Average, Minimum, Maximum, RMS, and Standard Deviation as parameters, applied to each sensor over the window. So, the original 24 raw features (3 samples × 8 sensors) will result in 40 features (5 parameters for each of the eight sensors).
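To make the arithmetic concrete, here is a sketch (not the Studio's actual implementation) of how those five parameters can be computed for one sensor over a 3-sample window:

```cpp
#include <math.h>

const int N = 3;  // window of 3 samples

// Compute the five Flatten features (avg, min, max, RMS, std dev)
// for one sensor axis over a window w[] of N samples.
void flattenFeatures(const float w[N], float out[5]) {
  float sum = 0.0f, sumSq = 0.0f, mn = w[0], mx = w[0];
  for (int i = 0; i < N; i++) {
    sum += w[i];
    sumSq += w[i] * w[i];
    if (w[i] < mn) mn = w[i];
    if (w[i] > mx) mx = w[i];
  }
  float avg = sum / N;
  float var = sumSq / N - avg * avg;  // population variance
  out[0] = avg;                       // average
  out[1] = mn;                        // minimum
  out[2] = mx;                        // maximum
  out[3] = sqrtf(sumSq / N);          // RMS
  out[4] = sqrtf(var > 0 ? var : 0);  // standard deviation
}
```

Applied to all eight sensors, this yields the 40 features per window.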
The final generated features seem promising, with a good visual separation from the data points:
Now, it is time to define our Classification model and train it. A simple DNN with 2 hidden layers was chosen, with 30 epochs and a learning rate (LR) of 0.0005 as the main hyperparameters.
The result: a complete disaster!
Let's examine why:
First, all the steps defined and performed in the Studio are correct. The problem is the raw data that was uploaded. In tasks like sensor fusion, where data from multiple sensors, each with its own measurement units and scales, are combined to create a more comprehensive view of a system, normalization and standardization are crucial preprocessing steps.
So, before uploading the data to the Studio, we should "curate" it: normalize or standardize the sensor data to ensure faster model convergence, better performance, and more reliable sensor fusion outcomes.
In the tutorial "Using Sensor Fusion and Machine Learning to Create an AI Nose", Shawn Hymel explains how to have a sound Sensor Fusion project. In this project, we will follow his advice.
Use the notebook [data_preparation.ipynb](https://github.com/Mjrovai/Sony-Spresense/blob/main/notebooks/Spresence-CommonSense/data_preparation.ipynb) for data curation, following these steps:
Open the Notebook on Google Colab
Open the File Manager on the left panel, go to the "three dots" menu, and create a new folder named "data"
In the data folder, go to the three dots menu and choose "upload"
Select the raw data .csv files on your computer. They should appear in the Files directory on the left panel
Create four data frames, one for each file:
bath → bathroom - Shape: (728, 9)
kit → kitchen - Shape: (770, 9)
lab → lab - Shape: (719, 9)
serv → service - Shape: (765, 9)
Here is what one of them looks like:
Plotting the data, we can see that the initial samples (around ten) present some instability.
So, we should delete them. Here is what the final data looks like:
We should proceed with the same cleaning for all 4 data frames.
We should split the data into Train and Test sets at this early stage because the standardization or normalization parameters must be computed on the Train data and only then applied to the Test data.
To start, let's create a new column with the corresponding label.
We will set aside 100 data points from each dataset for testing later.
Then, we concatenate the data frames into two single datasets, one for Train and one for Test:
We should plot pairwise relationships between variables within a dataset using the function "plot_pairplot()".
Looking at the sensor measurements on the left, we can see that each sensor's data spans a very different range of values, so we need to standardize or normalize each of the numerical columns. But which technique should we use? Looking at the plot's diagonal, it is possible to see that the data distribution for each sensor does not follow a normal distribution, so normalization is the better option in this case.
Also, the data from the light sensors (red, green, blue, and IR) are strongly correlated (their plots appear as diagonal lines). This means that only one of those features (or a combination of them) would be enough. Keeping them separate will not hurt the model; it will only make it a little bigger, and as the model is small, we will keep all of them.
We should apply the normalization to the numerical features of the training data, saving the mins and ranges found for each column.
Those same values (train mins and ranges) should then be applied to the Test dataset. Remember that the Test dataset should be new data for the model, simulating "real data": the model never sees it during training. The notebook implements both steps in Python.
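Here is a sketch of the same min-max scheme in C++, matching what must eventually run on-device (names are illustrative, not the notebook's actual functions):

```cpp
const int NUM_FEATURES = 8;

// Fit: compute the min and range of each column on the Train data only.
// 'train' is row-major, with nRows rows of NUM_FEATURES values each.
void fitMinMax(const float *train, int nRows,
               float mins[NUM_FEATURES], float ranges[NUM_FEATURES]) {
  for (int j = 0; j < NUM_FEATURES; j++) {
    float mn = train[j], mx = train[j];
    for (int i = 1; i < nRows; i++) {
      float v = train[i * NUM_FEATURES + j];
      if (v < mn) mn = v;
      if (v > mx) mx = v;
    }
    mins[j] = mn;
    ranges[j] = mx - mn;
  }
}

// Apply: scale any row (Train, Test, or live sensor data) with the
// parameters learned on the Train set.
void applyMinMax(float row[NUM_FEATURES],
                 const float mins[NUM_FEATURES],
                 const float ranges[NUM_FEATURES]) {
  for (int j = 0; j < NUM_FEATURES; j++) {
    row[j] = (row[j] - mins[j]) / ranges[j];
  }
}
```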
Both files will have this format:
The last step in the preparation should be saving both datasets (Train and Test) and also the train mins and ranges parameters to be used during inference.
Save the files to your computer using the option "Download" on the three dots menu in front of the four files on the left panel.
As we did before, we will upload the curated data to the Studio using the "CSV Wizard" tool, this time as two separate files, Train and Test. Since we did not include a timestamp (or count) column when saving the .csv files, the CSV Wizard asks for the sampling frequency (in our case, 0.1 Hz). I also let the Studio define the labels from the "class" column.
For the Impulse, I again considered a window of 3 samples with a slide of 1 sample. As before, "Flatten" was chosen as the Processing Block, and for Learning we will use "Classification" and Anomaly Detection (the latter only for testing).
The main difference now, after we upload the files, is that the total collected data time shows more than 8 hours, which is correct, since I captured around 2 hours in each of the four rooms.
The window size for the Impulse will now be 30,000 ms, equivalent to 3 samples, sliding by one sample (10,000 ms) at a time. For Pre-Processing, we will again choose Average, Minimum, Maximum, RMS, and Standard Deviation, applied to each sensor over the window, so the original 24 raw features (3 samples × 8 sensors) will result in 40 features (5 parameters for each of the eight sensors).
The final generated features are very similar to what we got with the first version (raw data).
For the Classification model definition and training, we will keep the same hyperparameters as before: a simple DNN with 2 hidden layers, 30 epochs, and a learning rate (LR) of 0.0005.
And the result now was great!
For the Anomaly Detection training, we used all RMS values. Testing the model with the Test data confirmed the good result; it seems we have no issues with Anomaly Detection either.
For Deployment, we will select an Arduino library and a non-optimized (floating point) model. Again, the cost in memory and latency is very small, and we can afford it on this device.
To start, let's run the Static Buffer example. For that, we should copy one raw sample to use as the model's input tensor (in our case, a data point from the Service Area, class serv).
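The copied values go into the features[] array near the top of the static_buffer sketch (it will not compile until the array is filled):

```cpp
// Paste the raw features of one sample, copied from the Studio,
// between the braces (here, a data point of the 'serv' class).
static const float features[] = {
    // e.g., 24 comma-separated raw values
};
```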
Connect the Spresense board to your computer, select the appropriate port, and upload the sketch. On the Serial Monitor, you should see the classification result showing serv with the highest score.
Based on the work done by Shawn Hymel, I adapted his code to our Spresense-CommonSense board. The complete code can be found here: Spresense-Commonsense-inference.ino.
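A condensed sketch of the inference flow is shown below (the inferencing header name depends on your Studio project and is hypothetical here; sensor reading and normalization are abstracted into a placeholder helper):

```cpp
// Condensed sketch of the on-device inference flow.
#include <Spresense_CommonSense_inferencing.h>  // hypothetical header name

static float input_buf[EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE];

// Hypothetical helper: reads the 8 sensors and applies the saved
// train mins/ranges (see the data preparation section)
void readAndNormalize(float *dest) {
  for (int i = 0; i < 8; i++) dest[i] = 0.0f;  // placeholder values
}

void setup() {
  Serial.begin(115200);
}

void loop() {
  // Fill the window with consecutive normalized samples (8 values each)
  for (size_t i = 0; i < EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE; i += 8) {
    readAndNormalize(&input_buf[i]);
    delay(10000);  // 0.1 Hz sampling, as in the dataset
  }

  // Wrap the buffer in a signal and run the impulse
  signal_t signal;
  numpy::signal_from_buffer(input_buf, EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE, &signal);

  ei_impulse_result_t result = { 0 };
  if (run_classifier(&signal, &result, false) == EI_IMPULSE_OK) {
    // Print the score of each class (bath, kit, lab, serv)
    for (size_t i = 0; i < EI_CLASSIFIER_LABEL_COUNT; i++) {
      Serial.print(result.classification[i].label);
      Serial.print(": ");
      Serial.println(result.classification[i].value, 3);
    }
  }
}
```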
Upload the code to the device and proceed with the inference in the four locations (note: wait around 2 minutes for the sensors to stabilize):
SensiEDGE's CommonSense board is a good choice for developing machine learning projects that involve multiple sensors. It provides accurate sensor data and can be used for sensor fusion techniques. This tutorial went step by step on a successfully developed model to estimate the location of a device in different rooms of a house using the CommonSense board, Arduino IDE, and Edge Impulse Studio.
All the code and notebook used in this project can be found in the Project Repo: Sony-Spresense.
And the Edge Impulse Studio project is located here: CommonSense-Sensor-Fusion-Preprocessed-data-v2