- `.cbor` - Files in the Edge Impulse Data Acquisition format. The uploader will not re-sign these files, only upload them.
- `.json` - Files in the Edge Impulse Data Acquisition format. The uploader will not re-sign these files, only upload them.
- `.csv` - Files in the Edge Impulse Comma Separated Values (CSV) format. If you have configured the “CSV wizard”, its settings will be used to parse your CSV files.
- `.wav` - Lossless audio files. It’s recommended to use the same sampling frequency for all files in your dataset, as signal processing output might depend on the frequency.
- `.jpg` and `.png` - Image files. It’s recommended to use the same aspect ratio for all files in your dataset.
- `.mp4` and `.avi` - Video files. In the studio you can then split a video file into images at a configurable number of frames per second.

Files are uploaded to the `training` category by default, but you can override the category with the `--category` option.
Set the category to `split` to automatically split data between the training and testing sets (recommended for a balanced dataset). The split is based on the hash of each file, so the process is deterministic.
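To illustrate why a hash-based split is deterministic, here is a minimal Python sketch. It is not the uploader's implementation: the choice of SHA-256, the 20% test fraction, and the function name `assign_category` are all assumptions made for illustration.

```python
import hashlib


def assign_category(file_bytes: bytes, test_fraction: float = 0.2) -> str:
    """Map file contents to 'training' or 'testing' based on a hash.

    Assumptions (not taken from the uploader): SHA-256 and a 20% test fraction.
    """
    digest = hashlib.sha256(file_bytes).digest()
    # Treat the first 8 bytes of the digest as an integer in [0, 2**64).
    bucket = int.from_bytes(digest[:8], "big")
    # Identical bytes always produce the same bucket, hence the same category.
    return "testing" if bucket < test_fraction * 2**64 else "training"


if __name__ == "__main__":
    sample = b"hypothetical file contents"
    # Repeated calls on identical contents always agree.
    print(assign_category(sample) == assign_category(sample))  # True
```

Because the category depends only on the file contents, re-running the split on the same file gives the same result.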
To set the label explicitly, use the `--label` option.
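A minimal sketch of how the `--category` and `--label` options could be combined when calling the uploader from a Python script. The executable name `edge-impulse-uploader`, the helper `build_upload_command`, and the file path are assumptions for illustration; only the two option names come from the text.

```python
from typing import List, Optional, Sequence


def build_upload_command(
    paths: Sequence[str],
    category: Optional[str] = None,
    label: Optional[str] = None,
) -> List[str]:
    """Build an argument list for the uploader (executable name is assumed)."""
    command = ["edge-impulse-uploader"]
    if category is not None:
        command += ["--category", category]
    if label is not None:
        command += ["--label", label]
    # File paths go last, after the options.
    command += list(paths)
    return command


if __name__ == "__main__":
    # Hypothetical file; pass the list to subprocess.run(cmd, check=True) to upload.
    cmd = build_upload_command(["path/to/a/file.wav"], category="testing", label="noise")
    print(" ".join(cmd))
```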
Labels are inferred automatically from the file name using the regular expression `^[a-zA-Z0-9\s-_]+`. For example, `idle.01` yields the label `idle`.
Consequently, if you want to use labels (string values) that contain float values (e.g. “0.01”, “5.02”), automatic labeling won’t work.
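The effect of this pattern can be checked with a short Python sketch. Two assumptions: the pattern is applied to the file name as given, and the hyphen inside the character class is escaped so that Python's `re` module reads it as a literal character. `infer_label` is a hypothetical helper, not part of the uploader.

```python
import re

# Documented pattern: ^[a-zA-Z0-9\s-_]+
# The hyphen is escaped here so Python treats it as a literal, not a range.
LABEL_PATTERN = re.compile(r"^[a-zA-Z0-9\s\-_]+")


def infer_label(file_name: str):
    """Return the leading run of allowed characters, or None if there is none."""
    match = LABEL_PATTERN.match(file_name)
    return match.group(0) if match else None


if __name__ == "__main__":
    print(infer_label("idle.01"))   # idle
    print(infer_label("0.01.wav"))  # 0
```

The second call shows the limitation: the match stops at the first dot, so a label such as "0.01" is cut down to "0".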
To bypass this limitation, you can create a JSON file, `info.labels`, containing your dataset files’ info. We also support adding metadata to your samples. In this file, a sample without a label is written as `"label": { "type": "unlabeled" }`.
To attach bounding boxes to your images, create a `bounding_boxes.labels` file in the same folder as your image files. The contents of this file are formatted as JSON, with a `boundingBoxes` object that holds one entry for each file name. If you have data in multiple folders, you can create a `bounding_boxes.labels` file in each folder.
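As a sketch of what such a file might look like, the Python snippet below writes a `bounding_boxes.labels` file. The `boundingBoxes` key and the one-entry-per-file-name layout come from the text; the `version` and `type` fields, the per-box keys (`label`, `x`, `y`, `width`, `height`), and the sample values are assumptions that should be checked against the official format description.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def write_bounding_boxes(folder: Path, boxes_by_file: Dict[str, List[dict]]) -> Path:
    """Write bounding_boxes.labels next to the images in `folder`."""
    payload = {
        "version": 1,  # assumed field
        "type": "bounding-box-labels",  # assumed field
        "boundingBoxes": boxes_by_file,  # one key per image file name
    }
    target = folder / "bounding_boxes.labels"
    target.write_text(json.dumps(payload, indent=4), encoding="utf-8")
    return target


if __name__ == "__main__":
    # Hypothetical image name and box; coordinates are assumed to be in pixels.
    boxes = {
        "mypicture.jpg": [
            {"label": "jan", "x": 119, "y": 64, "width": 206, "height": 291}
        ]
    }
    with tempfile.TemporaryDirectory() as tmp:
        path = write_bounding_boxes(Path(tmp), boxes)
        print(path.read_text(encoding="utf-8"))
```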
When uploading one or more images, the uploader checks whether a labels file is present in the same folder and automatically attaches the bounding boxes to the images, so you can simply upload the images as usual.

The `bounding_boxes.labels` file will be included in the exported archive.

Pass `--format-openmv` along with the folder of your dataset to upload the data automatically. Data is automatically split between the testing and training sets.
- `--silent` - omits information on startup. Still prints progress information.
- `--dev` - lists development servers; use in conjunction with `--clean`.
- `--hmac-key <key>` - sets the HMAC key. Only used for files that need to be signed, such as `wav` files.
- `--concurrency <count>` - number of files to upload in parallel (default: 20).
- `--progress-start-ix <index>` - when set, the progress index will start at this number. Useful for splitting a large upload across multiple commands while the user still sees it as one command.
- `--progress-end-ix <index>` - when set, the progress index will end at this number. Useful for splitting a large upload across multiple commands while the user still sees it as one command.
- `--progress-interval <interval>` - when set, the uploader will not print an update for every line, but once every `interval` period (in ms).
- `--allow-duplicates` - to avoid polluting your dataset with duplicates, the hash of a file is checked against known files in your dataset before uploading. Enable this flag to skip this check.

If the number of `.wav` files exceeds the total number of arguments allowed for a single command in your shell, you can work around this shell limitation by using the `find` command to call the uploader for manageable batches of files.
You can add uploader options in the `xargs` portion, for example if you wish to specify a category.
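The batching idea can also be sketched in Python rather than with `find` and `xargs`. This is an alternative illustration, not the command from the original documentation: the executable name `edge-impulse-uploader`, the helper `batched_commands`, the batch size, and the file names are assumptions.

```python
from typing import Iterator, List, Sequence


def batched_commands(
    files: Sequence[str],
    batch_size: int = 500,
    extra_args: Sequence[str] = (),
) -> Iterator[List[str]]:
    """Yield one uploader command per batch of at most `batch_size` files."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(files), batch_size):
        batch = list(files[start:start + batch_size])
        # Options (e.g. a category) are repeated for every batch.
        yield ["edge-impulse-uploader", *extra_args, *batch]


if __name__ == "__main__":
    files = [f"training/noise.{i}.wav" for i in range(5)]  # hypothetical names
    for cmd in batched_commands(files, batch_size=2, extra_args=["--category", "training"]):
        # Replace print with subprocess.run(cmd, check=True) to actually upload.
        print(" ".join(cmd))
```

Keeping each batch well below the shell's argument limit is the point; the right batch size depends on your system and on the length of the file paths.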