Data acquisition format
To send data from any sensor into Edge Impulse you'll need to deliver the data in the Data acquisition format. This is a small specification that describes the type of data, the sensor data itself, and information about the device that generated the data. You can optionally sign this data so you have cryptographic proof on the origin and integrity of the data. There are end-to-end examples in C++, Python and Node.js here.
If you access to a fully-supported development board it will already send data in this format, and if you want to collect data from other devices the data forwarder will automatically convert and sign your data.
The Edge Impulse Data Acquisition format is an optimized binary format for time-series data. It allows cryptographic signing of the data when sampling is complete to prove authenticity of data. Data is encoded using CBOR or JSON and signed according to the JSON Web Signature specification.
Data can easily be generated by any language that has a CBOR or JSON library available, but there are examples available for C, C++, Node.js and Python.
Example in CBOR diagnostic notation:
{
"protected": {
"ver": "v1",
"alg": "HS256",
"iat": 1564128599
},
"signature": "b0ee0572a1984b93b6bc56e6576e2cbbd6bccd65d0c356e26b31bbc9a48210c6",
"payload": {
"device_name": "ac:87:a3:0a:2d:1b",
"device_type": "DISCO-L475VG-IOT01A",
"interval_ms": 10,
"sensors": [
{ "name": "accX", "units": "m/s2" },
{ "name": "accY", "units": "m/s2" },
{ "name": "accZ", "units": "m/s2" }
],
"values": [
[ -9.81, 0.03, 1.21 ],
[ -9.83, 0.04, 1.27 ],
[ -9.12, 0.03, 1.23 ],
[ -9.14, 0.01, 1.25 ]
]
}
}
protected
- information about the signature format.ver
- alwaysv1
(required).alg
- the algorithm used to sign this file. EitherHS256
(HMAC-SHA256) ornone
(required).iat
- date when the file was created in seconds since epoch. Only set this when the device creating the file has an accurate clock (optional).
payload
- sensor data.device_name
- unique identifier for this device. Only set this when the device has a globally unique identifier (e.g. MAC address). If this field is set the device shows up on the 'Devices' page in the studio (optional).device_type
- device type, for example the exact model of the device. Should be the same for all similar devices (required).interval_ms
- the frequency of the data in this file (in milliseconds). E.g. for 100Hz fill in10
(new data every 10 ms.). You can use a float here if you need the precision (required). If you have sensors with different sampling frequencies, you should upscale to the highest frequency to keep the finest granularity. Example: sensor A is 100 Hz, sensor B is 5 Hz. Upscale sensor B to 100 Hz by duplicating each value 20 times (100/5). You could also smooth values over between samples.sensors
- array with sensor axes.name
- name of the axis.
values
- array of sensor values. One array item per interval, and as many items in this array as there are sensor axes. If you have a single sensor, you are allowed to flatten this array to save space. An array of string values can be used to refer to binary attachments sent to the ingestion service using multipart requests (see Binary payloads and Binary attachments).
Files can be signed to establish a trust chain between device and Edge Impulse. Because the
iat
and device_name
fields are included in the signed file you can validate authenticity of the data, which device created the data, and when the data was captured. Currently data can be signed with a symmetric HMAC key. You can create HMAC keys under 'Dashboard' in the studio.To sign data using HMAC SHA256:
- 1.Set the
signature
field to0000000000000000000000000000000000000000000000000000000000000000
(64 times ASCII0
). - 2.Create the full CBOR or JSON object.
- 3.Use your language's crypto module to sign the object with HMAC-SHA256. This should give you a 32 byte signature.
- 4.Replace the empty signature with the HEX encoded signature (64 characters ASCII).
The C, C++, Node.js and Python SDKs all support signing data using HMAC SHA256. See the examples section.
a3 69 70 72 6f 74 65 63 74 65 64 a3 63 76 65 72 62 76 31 63 61 6c 67 65 48 53 32 35 36 63 69 61 74 1a 5d 3a dd 3d 69 73 69 67 6e 61 74 75 72 65 78 40 62 30 65 65 30 35 37 32 61 31 39 38 34 62 39 33 62 36 62 63 35 36 65 36 35 37 36 65 32 63 62 62 64 36 62 63 63 64 36 35 64 30 63 33 35 36 65 32 36 62 33 31 62 62 63 39 61 34 38 32 31 30 63 36 67 70 61 79 6c 6f 61 64 a5 6b 64 65 76 69 63 65 5f 6e 61 6d 65 71 61 63 3a 38 37 3a 61 33 3a 30 61 3a 32 64 3a 31 62 6b 64 65 76 69 63 65 5f 74 79 70 65 73 44 49 53 43 4f 2d 4c 34 37 35 56 47 2d 49 4f 54 30 31 41 6b 69 6e 74 65 72 76 61 6c 5f 6d 73 0a 67 73 65 6e 73 6f 72 73 83 a2 64 6e 61 6d 65 64 61 63 63 58 65 75 6e 69 74 73 64 6d 2f 73 32 a2 64 6e 61 6d 65 64 61 63 63 59 65 75 6e 69 74 73 64 6d 2f 73 32 a2 64 6e 61 6d 65 64 61 63 63 5a 65 75 6e 69 74 73 64 6d 2f 73 32 66 76 61 6c 75 65 73 9f 83 fa c1 1c f5 c3 fa 3c f5 c2 8f fa 3f 9a e1 48 83 fa c1 1d 47 ae fa 3d 23 d7 0a fa 3f a3 d7 0a 83 fa c1 11 eb 85 fa 3c f5 c2 8f fa 3f 9d 70 a4 83 fa c1 12 3d 71 fa 3c 23 d7 0a f9 3d 00 ff
C/C++
Node.js
Python
// First, add the Edge Impulse C Ingestion SDK to your project.
// https://github.com/edgeimpulse/ingestion-sdk-c
// See 'C SDK Usage Guide' for more information and porting information
#include "sensor_aq.h"
#include "sensor_aq_mbedtls_hs256.h"
const char *hmac_key = "fed53116f20684c067774ebf9e7bcbdc";
int main() {
// The sensor format supports signing the data, set up a signing context
sensor_aq_signing_ctx_t signing_ctx;
// We'll use HMAC SHA256 signatures, which can be created through Mbed TLS
// If you use a different crypto library you can implement your own context
sensor_aq_mbedtls_hs256_ctx_t hs_ctx;
// Set up the context, the last argument is the HMAC key
sensor_aq_init_mbedtls_hs256_context(&signing_ctx, &hs_ctx, hmac_key);
// Set up the sensor acquisition basic context
sensor_aq_ctx ctx = {
// We need a single buffer. The library does not require any dynamic allocation (but your TLS library might)
{ (unsigned char*)malloc(1024), 1024 },
// Pass in the signing context
&signing_ctx,
// And pointers to fwrite and fseek - note that these are pluggable so you can work with them on
// non-POSIX systems too. Just override the EI_SENSOR_AQ_STREAM macro to your custom file type.
&fwrite,
&fseek,
// if you set the time function this will add 'iat' (issued at) field to the header with the current time
// if you don't include it, this will be omitted
&time
};
// Payload header
sensor_aq_payload_info payload = {
// Unique device ID (optional), set this to e.g. MAC address or device EUI **if** your device has one
"ac:87:a3:0a:2d:1b",
// Device type (required), use the same device type for similar devices
"DISCO-L475VG-IOT01A",
// How often new data is sampled in ms. (100Hz = every 10 ms.)
10,
// The axes which you'll use. The units field needs to comply to SenML units (see https://www.iana.org/assignments/senml/senml.xhtml)
{ { "accX", "m/s2" }, { "accY", "m/s2" }, { "accZ", "m/s2" } }
};
// Place to write our data.
// The library streams data, and does not cache everything in buffers
FILE *file = fopen("test/encoded.cbor", "w+");
// Initialize the context, this verifies that all requirements are present
// it also writes the initial CBOR structure
int res;
res = sensor_aq_init(&ctx, &payload, file, false);
if (res != AQ_OK) {
printf("sensor_aq_init failed (%d)\n", res);
return 1;
}
// Periodically call `sensor_aq_add_data` (every 10 ms. in this example) to append data
float values[][3] = {
{ -9.81, 0.03, 1.21 },
{ -9.83, 0.04, 1.28 },
{ -9.12, 0.03, 1.23 },
{ -9.14, 0.01, 1.25 }
};
for (size_t ix = 0; ix < sizeof(values) / sizeof(values[0]); ix++) {
res = sensor_aq_add_data(&ctx, values[ix], 3);
if (res != AQ_OK) {
printf("sensor_aq_add_data failed (%d)\n", res);
return 1;
}
}
// When you're done call sensor_aq_finish - this will calculate the finalized signature and close the CBOR file
res = sensor_aq_finish(&ctx);
if (res != AQ_OK) {
printf("sensor_aq_finish failed (%d)\n", res);
return 1;
}
// Use the HTTP libraries available for your platform to upload
// test/encoded.cbor
// to
// https://ingestion.edgeimpulse.com/api/training/data
}
const fs = require('fs');
const crypto = require('crypto');
const Path = require('path');
const request = require('request');
const hmac_key = "fed53116f20684c067774ebf9e7bcbdc";
const API_KEY = "ei_fd83...";
// empty signature (all zeros). HS256 gives 32 byte signature, and we encode in hex, so we need 64 characters here
let emptySignature = Array(64).fill('0').join('');
let data = {
protected: {
ver: "v1",
alg: "HS256",
iat: Math.floor(Date.now() / 1000) // epoch time, seconds since 1970
},
signature: emptySignature,
payload: {
device_name: "ac:87:a3:0a:2d:1b",
device_type: "DISCO-L475VG-IOT01A",
interval_ms: 10,
sensors: [
{ name: "accX", units: "m/s2" },
{ name: "accY", units: "m/s2" },
{ name: "accZ", units: "m/s2" }
],
values: [
[ -9.81, 0.03, 1.21 ],
[ -9.83, 0.04, 1.27 ],
[ -9.12, 0.03, 1.23 ],
[ -9.14, 0.01, 1.25 ]
]
}
};
let encoded = JSON.stringify(data);
// now calculate the HMAC and fill in the signature
let hmac = crypto.createHmac('sha256', hmac_key);
hmac.update(encoded);
let signature = hmac.digest().toString('hex');
// update the signature in the message and re-encode
data.signature = signature;
encoded = JSON.stringify(data);
// now upload the buffer to Edge Impulse
request.post('https://ingestion.edgeimpulse.com/api/training/data', {
headers: {
'x-api-key': API_KEY,
'x-file-name': 'test01',
'Content-Type': 'application/json'
},
body: encoded,
encoding: 'binary'
}, function (err, response, body) {
if (err) return console.error('Request failed', err);
console.log('Uploaded file to Edge Impulse', response.statusCode, body);
});
# First, install the dependencies via:
# $ pip3 install requests
import json
import time, hmac, hashlib
import requests
HMAC_KEY = "fed53116f20684c067774ebf9e7bcbdc"
API_KEY = "ei_fd83..."
# empty signature (all zeros). HS256 gives 32 byte signature, and we encode in hex, so we need 64 characters here
emptySignature = ''.join(['0'] * 64)
data = {
"protected": {
"ver": "v1",
"alg": "HS256",
"iat": time.time() # epoch time, seconds since 1970
},
"signature": emptySignature,
"payload": {
"device_name": "ac:87:a3:0a:2d:1b",
"device_type": "DISCO-L475VG-IOT01A",
"interval_ms": 10,
"sensors": [
{ "name": "accX", "units": "m/s2" },
{ "name": "accY", "units": "m/s2" },
{ "name": "accZ", "units": "m/s2" }
],
"values": [
[ -9.81, 0.03, 1.21 ],
[ -9.83, 0.04, 1.27 ],
[ -9.12, 0.03, 1.23 ],
[ -9.14, 0.01, 1.25 ]
]
}
}
# encode in JSON
encoded = json.dumps(data)
# sign message
signature = hmac.new(bytes(HMAC_KEY, 'utf-8'), msg = encoded.encode('utf-8'), digestmod = hashlib.sha256).hexdigest()
# set the signature again in the message, and encode again
data['signature'] = signature
encoded = json.dumps(data)
# and upload the file
res = requests.post(url='https://ingestion.edgeimpulse.com/api/training/data',
data=encoded,
headers={
'Content-Type': 'application/json',
'x-file-name': 'idle01',
'x-api-key': API_KEY
})
if (res.status_code == 200):
print('Uploaded file to Edge Impulse', res.status_code, res.content)
else:
print('Failed to upload file to Edge Impulse', res.status_code, res.content)
What about applying labels to your samples? You can use the optional
x-label
header parameter when creating your request. See the header parameters available in the ingestion API here.For some payloads, like audio data, it's not feasible to convert the raw sensor data into CBOR values while capturing data on the device. For these cases you can send a binary payload. The same header structure as above applies, but instead of writing the values in the
values
CBOR field you can append a binary array at the end of the file (in uint8
, uint16
, uint32
, int8
, int16
, int32
or float32
format). To do so:- 1.Limit your data to a single sensor.
- 2.Set the value of the
values
field to:[ "Ref-Binary-i16" ]
(so a string within an array, wherei16
indicatesint16
, alternatively you can useu16
foruint16
orf32
forfloat32
). - 3.Write the CBOR header. Note that this needs to be a valid CBOR structure, and that it requires termination with
0xFF
. The easiest to achieve this is by writing thevalues
field as an indefinite length array (which needs to be terminated by0xFF
. - 4.Write your payload (little endian).
- If you need to align your writes the easiest is to pad the
Ref-Binary-i16
string with spaces on the right.
- 5.Update the signature with the signature for the full file, including header and payload.
- 6.Upload the data to the ingestion service with the
Content-Type: application/octet-stream
header.
Here is an example of a valid audio sample that includes a binary payload:
a36970726f746563746564a26376657262763163616c67654853323536697369676e6174757265784038346432316234343665323831383561356336303761396437656134636336333735356137363261666164643036343334613136353766623433386237656630677061796c6f6164a56b6465766963655f6e616d657143343a37463a35313a38443a31383a34416b6465766963655f7479706573444953434f5f4c34373556475f494f543031416b696e74657276616c5f6d73f92c006773656e736f727381a2646e616d6565617564696f65756e697473637761766676616c7565739f6e5265662d42494e4152592d693136fff918fa18e718de18e218dd18e118e218f218f018f918f418e918ee18f618f918fc180919101915191119fe18f6180b190e190a19021912191419fc18f218f718
The header stops at byte
ff
(you can copy the beginning of the data up until the ff
into cbor.me to validate) and the payload starts at f918fa
.For non-time-series data - currently only implemented for images - you can send binary attachments to the ingestion service. Here the same header structure is used as above, but instead of writing the values in the
values
CBOR field, you reference the content type, the size, and the signed hash (with the same HMAC key) of your attachment. You then need to send both the header and the actual file to the Ingestion service via a multipart/form request (see 'Multipart requests' section).An end-to-end example that demonstrates how to create a valid message, and post it to the ingestion service can be found here: example-ingestion-jpg.
Note: Including more than 1 binary attachment is currently not supported by the Edge Impulse Studio, such messages won't be rejected by the ingestion service though, these attachments will merely be 'invisible' from the studio.
Last modified 1mo ago