data package
Submodules
Sample
Wrapper class for sample data, labels, and associated metadata.
Sample data should be contained in a file or file-like object, for example, as the return from open(..., "rb"). The upload_samples() function expects Sample objects as input.
Parameters
data: Union[io.BufferedIOBase, _io.StringIO, str]
filename: Optional[str] = None
category: Optional[Literal['training', 'testing', 'anomaly', 'split']] = 'split'
label: Optional[str] = None
bounding_boxes: Optional[List[dict]] = None
metadata: Optional[dict] = None
sample_id: Optional[int] = None
structured_labels: Optional[List[dict]] = None
Class variables
bounding_boxes: Optional[List[dict]]
category: Optional[Literal['training', 'testing', 'anomaly', 'split']]
data: Union[io.BufferedIOBase, _io.StringIO, str]
filename: Optional[str]
label: Optional[str]
metadata: Optional[dict]
sample_id: Optional[int]
structured_labels: Optional[List[dict]]
delete_all_samples
Delete all samples in a given category.
If category is set to None, all samples in the project are deleted.
Parameters
category: Optional[str] = None
api_key: Optional[str] = None
timeout_sec: Optional[float] = None
Return
Optional[edgeimpulse_api.models.generic_api_response.GenericApiResponse]
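Because passing category=None removes every sample in the project, a thin guard around the call may help avoid accidental wipes. The sketch below is not part of the SDK: safe_delete_all and its confirm_all flag are hypothetical names, and it assumes delete_all_samples can be imported from edgeimpulse.data.

```python
from typing import Optional


def safe_delete_all(category: Optional[str] = None,
                    confirm_all: bool = False,
                    api_key: Optional[str] = None):
    """Hypothetical guard: refuse a project-wide delete unless confirmed."""
    if category is None and not confirm_all:
        raise ValueError(
            "category=None deletes every sample in the project; "
            "pass confirm_all=True to proceed"
        )
    # Imported lazily so the guard itself does not depend on the SDK.
    # Assumption: delete_all_samples is exposed by edgeimpulse.data.
    from edgeimpulse.data import delete_all_samples

    return delete_all_samples(category=category, api_key=api_key)
```

Calling safe_delete_all(category="testing") would forward straight to the SDK, while safe_delete_all() raises before any request is made.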
delete_sample_by_id
Delete a particular sample from a project given the sample ID.
Parameters
sample_id: int
api_key: Optional[str] = None
timeout_sec: Optional[float] = None
Return
Optional[edgeimpulse_api.models.generic_api_response.GenericApiResponse]
delete_samples_by_filename
Delete any samples from an Edge Impulse project that match the given filename.
Note: the filename argument must not include the original extension. For example, if you uploaded a file named my-image.01.png, you must provide the filename as my-image.01.
Parameters
filename: str
category: Optional[str] = None
api_key: Optional[str] = None
timeout_sec: Optional[float] = None
Return
Tuple[Optional[Any], ...]
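One way to derive the expected filename argument from the name of the uploaded file is to drop only its final extension. This is a minimal sketch: lookup_name and delete_uploaded_file are hypothetical helpers, and the import path edgeimpulse.data is an assumption.

```python
from pathlib import Path


def lookup_name(uploaded_filename: str) -> str:
    """Drop only the final extension, keeping any earlier dots."""
    return Path(uploaded_filename).stem


def delete_uploaded_file(uploaded_filename: str, api_key=None):
    # Assumption: delete_samples_by_filename is exposed by edgeimpulse.data.
    from edgeimpulse.data import delete_samples_by_filename

    return delete_samples_by_filename(
        filename=lookup_name(uploaded_filename), api_key=api_key
    )


print(lookup_name("my-image.01.png"))  # my-image.01
```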
download_samples_by_ids
Download samples by their associated IDs from an Edge Impulse project.
Downloaded sample data is returned as a DownloadSample object, which contains the raw data in a BytesIO object along with associated metadata.
Important! All time series data is returned as a JSON file (in BytesIO format) with a timestamp column. This includes files originally uploaded as CSV, JSON, and CBOR. Edge Impulse Studio removes the timestamp column from any uploaded CSV files and computes an estimated sample rate. The timestamps are computed based on the sample rate, will always start at 0, and will be in milliseconds. These timestamps may not be the same as the original timestamps in the uploaded file.
Parameters
sample_ids: Union[int, List[int]]
api_key: Optional[str] = None
timeout_sec: Optional[float] = None
max_workers: Optional[int] = None
show_progress: Optional[bool] = False
pool_maxsize: Optional[int] = 20
pool_connections: Optional[int] = 20
Return
List[edgeimpulse.data.sample_type.Sample]
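As a rough illustration of the timestamp behavior described above, the following sketch regenerates evenly spaced timestamps from a sample rate. It is not Studio's actual code: regenerate_timestamps is a hypothetical helper, and the rate is assumed to be given in Hz.

```python
from typing import List


def regenerate_timestamps(num_rows: int, sample_rate_hz: float) -> List[float]:
    """Evenly spaced timestamps in milliseconds, always starting at 0."""
    interval_ms = 1000.0 / sample_rate_hz
    return [i * interval_ms for i in range(num_rows)]


# Timestamps (ms) from an uploaded CSV that started at an arbitrary offset.
original = [1712.0, 1722.0, 1732.0, 1742.0]
print(regenerate_timestamps(len(original), sample_rate_hz=100.0))
# [0.0, 10.0, 20.0, 30.0]
```

The regenerated values keep the spacing implied by the sample rate but lose the original offset, which is why downloaded timestamps may differ from those in the uploaded file.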
get_filename_by_id
Given an ID for a sample in a project, return the filename associated with that sample.
Note that while multiple samples can have the same filename, each sample has a unique sample ID that is provided by Studio when the sample is uploaded.
Parameters
sample_id: int
api_key: Optional[str] = None
timeout_sec: Optional[float] = None
Return
Optional[str]
get_sample_ids
Get the sample IDs and filenames for all samples in a project, filtered by category, labels, or filename.
Note that filenames are given by the root of the filename when uploaded. For example, if you upload my-image.01.png, it will be stored in your project with a hash such as my-image.01.png.4f262n1b.json. To find the ID(s) that match this sample, you must provide the argument filename=my-image.01. Notice the lack of extension and hash.
Because of the potential for multiple samples (i.e., different sample IDs) with the same filename, we recommend providing unique filenames for your samples when uploading.
Parameters
filename: Optional[str] = None
category: Optional[str] = None
labels: Optional[str] = None
api_key: Optional[str] = None
num_workers: Optional[int] = 4
timeout_sec: Optional[float] = None
Return
List[edgeimpulse.data.sample_type.SampleInfo]
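To see why unique filenames matter, the sketch below groups sample IDs by filename and reports names that map to more than one ID. It works on plain (sample_id, filename) pairs; extracting those pairs from the returned SampleInfo objects is left out because their attribute names are not listed here. The IDs and the helper name duplicated_filenames are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def duplicated_filenames(pairs: Iterable[Tuple[int, str]]) -> Dict[str, List[int]]:
    """Return filenames that are shared by more than one sample ID."""
    ids_by_name: Dict[str, List[int]] = defaultdict(list)
    for sample_id, filename in pairs:
        ids_by_name[filename].append(sample_id)
    return {name: ids for name, ids in ids_by_name.items() if len(ids) > 1}


# Hypothetical IDs and filenames.
pairs = [(101, "my-image.01"), (102, "my-image.02"), (103, "my-image.01")]
print(duplicated_filenames(pairs))  # {'my-image.01': [101, 103]}
```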
infer_from_filename
Extract label and category information from the filename and assign them to the sample object.
Files should look like my-dataset/training/wave.1.cbor, where wave is the label and training is the category. The function checks whether training, testing, or anomaly appears in the filename to determine the sample category.
Parameters
sample: edgeimpulse.data.sample_type.Sample
file: str
Return
None
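The naming convention above can be approximated with a few lines of standard-library code. This is only a sketch of the described behavior, not the SDK's implementation; parse_label_and_category is a hypothetical name, and taking the label as the text before the first dot is an assumption based on the wave.1.cbor example.

```python
from pathlib import PurePosixPath
from typing import Optional, Tuple

CATEGORIES = ("training", "testing", "anomaly")


def parse_label_and_category(file: str) -> Tuple[str, Optional[str]]:
    """Guess (label, category) from a path such as my-dataset/training/wave.1.cbor."""
    name = PurePosixPath(file).name
    # Assumption: the label is the text before the first dot of the file name.
    label = name.split(".")[0]
    # The category is the first entry of CATEGORIES that appears in the path.
    category = next((c for c in CATEGORIES if c in file), None)
    return label, category


print(parse_label_and_category("my-dataset/training/wave.1.cbor"))
# ('wave', 'training')
```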
numpy_timeseries_to_sample
Convert numpy values to a sample that can be uploaded to Edge Impulse.
Parameters
values
sensors: List[edgeimpulse.data.sample_type.Sensor]
sample_rate_ms: int
Return
edgeimpulse.data.sample_type.Sample
pandas_dataframe_to_sample
Convert a dataframe to a single sample. Can handle both timeseries and non-timeseries data.
To be inferred as timeseries, the dataframe must have:
More than one row
A sample rate, or an index from which the sample rate can be inferred
The index must therefore be monotonically increasing
The index must be of an int or date type
Parameters
df
sample_rate_ms: Optional[int] = None
label: Optional[str] = None
filename: Optional[str] = None
axis_columns: Optional[List[str]] = None
metadata: Optional[dict] = None
label_col: Optional[str] = None
category: Literal['training', 'testing', 'split'] = 'split'
Return
edgeimpulse.data.sample_type.Sample
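The timeseries conditions listed above can be read as a small decision procedure. The sketch below mirrors them using a plain Python sequence in place of a dataframe index; looks_like_timeseries is a hypothetical helper, and the SDK's real check may differ in detail (for example, in how ties in the index are treated).

```python
from datetime import date
from typing import Optional, Sequence


def looks_like_timeseries(index: Sequence,
                          sample_rate_ms: Optional[int] = None) -> bool:
    """Approximate the listed conditions for a given index."""
    if len(index) <= 1:  # needs more than one row
        return False
    if sample_rate_ms is not None:  # an explicit sample rate is enough
        return True
    # Otherwise the rate must be inferred from the index, so every value
    # has to be an int or a date (datetime is a subclass of date).
    all_ints = all(isinstance(v, int) and not isinstance(v, bool) for v in index)
    all_dates = all(isinstance(v, date) for v in index)
    if not (all_ints or all_dates):
        return False
    # Conservative reading of "monotonically increasing": strictly increasing.
    return all(a < b for a, b in zip(index, index[1:]))


print(looks_like_timeseries([0, 10, 20]))  # True
print(looks_like_timeseries([0, 20, 10]))  # False
```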
upload_directory
Upload a directory of files to Edge Impulse.
Tries to autodetect whether it's an Edge Impulse exported dataset, or a standard directory. The files can be in CBOR, JSON, image, or WAV file formats. You can read more about the different file formats accepted by the Edge Impulse ingestion service here:
https://docs.edgeimpulse.com/reference/ingestion-api
Parameters
directory: str
category: Optional[str] = None
label: Optional[str] = None
metadata: Optional[dict] = None
transform: Optional[callable] = None
allow_duplicates: Optional[bool] = False
show_progress: Optional[bool] = False
batch_size: Optional[int] = 1024
Return
edgeimpulse.data.sample_type.UploadSamplesResponse
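How the autodetection works is not spelled out here. One plausible heuristic, shown purely as an assumption, is to treat a directory as an exported dataset when it contains an info.labels file at its top level. The helper name looks_like_exported_dataset is hypothetical.

```python
from pathlib import Path


def looks_like_exported_dataset(directory: str) -> bool:
    """Assumption: an exported dataset has an info.labels file at its top level."""
    return (Path(directory) / "info.labels").is_file()
```

A check like this could be used to decide up front whether upload_exported_dataset or upload_plain_directory is the better fit for a given folder.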
upload_exported_dataset
Upload samples from a downloaded Edge Impulse dataset and preserve the info.labels information.
Use this when you've exported your data in the studio, via the export functionality.
Parameters
directory: str
transform: Optional[callable] = None
allow_duplicates: Optional[bool] = False
show_progress: Optional[bool] = False
batch_size: Optional[int] = 1024
Return
edgeimpulse.data.sample_type.UploadSamplesResponse
upload_numpy
Upload numpy arrays as timeseries using the Edge Impulse data acquisition format.
Parameters
data
labels: List[str]
sensors: List[edgeimpulse.data.sample_type.Sensor]
sample_rate_ms: int
metadata: Optional[dict] = None
category: Literal['training', 'testing', 'split', 'anomaly'] = 'split'
Return
edgeimpulse.data.sample_type.UploadSamplesResponse
upload_pandas_dataframe
Upload non-timeseries data to Edge Impulse where each dataframe row becomes a sample.
Parameters
df
feature_cols: List[str]
label_col: Optional[str] = None
category_col: Optional[str] = None
metadata_cols: Optional[List[str]] = None
Return
edgeimpulse.data.sample_type.UploadSamplesResponse
upload_pandas_dataframe_wide
Upload a dataframe to Edge Impulse where each column represents a value in the timeseries data and the rows become the individual samples.
Parameters
df
sample_rate_ms: int
data_col_start: Optional[int] = None
label_col: Optional[str] = None
category_col: Optional[str] = None
metadata_cols: Optional[List[str]] = None
data_col_length: Optional[int] = None
data_axis_cols: Optional[List[str]] = None
Return
edgeimpulse.data.sample_type.UploadSamplesResponse
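To picture the wide layout, the sketch below turns each row of values into one timestamped series. It uses plain lists instead of a dataframe, and it assumes sample_rate_ms is the interval between consecutive values in milliseconds; row_to_timeseries is a hypothetical helper, not an SDK function.

```python
from typing import List, Tuple


def row_to_timeseries(values: List[float],
                      sample_rate_ms: int) -> List[Tuple[int, float]]:
    """Pair each value of one wide row with a timestamp in milliseconds."""
    return [(i * sample_rate_ms, v) for i, v in enumerate(values)]


# Two samples (rows), three readings (columns) each.
wide_rows = [[0.1, 0.2, 0.3], [0.7, 0.8, 0.9]]
samples = [row_to_timeseries(row, sample_rate_ms=10) for row in wide_rows]
print(samples[0])  # [(0, 0.1), (10, 0.2), (20, 0.3)]
```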
upload_pandas_dataframe_with_group
Upload a dataframe where the rows contain multiple samples and timeseries data for those samples.
It uses the group_by column to determine which timeseries values belong to which sample.
Parameters
df
timestamp_col: str
group_by: str
feature_cols: List[str]
label_col: Optional[str] = None
category_col: Optional[str] = None
metadata_cols: Optional[List[str]] = None
Return
edgeimpulse.data.sample_type.UploadSamplesResponse
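The grouping idea can be sketched without pandas: rows that share the same group_by value are collected into one sample. The helper split_rows_by_group and the column names are hypothetical, and ordering each group by its timestamp is an assumption rather than documented behavior.

```python
from collections import defaultdict
from typing import Dict, List


def split_rows_by_group(rows: List[dict], group_by: str, timestamp_col: str,
                        feature_cols: List[str]) -> Dict[str, List[List[float]]]:
    """Collect the feature values of each group, ordered by timestamp."""
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for row in rows:
        grouped[row[group_by]].append(row)
    return {
        key: [[r[c] for c in feature_cols]
              for r in sorted(members, key=lambda r: r[timestamp_col])]
        for key, members in grouped.items()
    }


rows = [
    {"sample": "a", "t": 0, "accX": 0.1},
    {"sample": "b", "t": 0, "accX": 0.5},
    {"sample": "a", "t": 10, "accX": 0.2},
    {"sample": "b", "t": 10, "accX": 0.6},
]
print(split_rows_by_group(rows, "sample", "t", ["accX"]))
# {'a': [[0.1], [0.2]], 'b': [[0.5], [0.6]]}
```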
upload_pandas_sample
Upload a single dataframe sample to Edge Impulse.
Parameters
df
label: Optional[str] = None
sample_rate_ms: Optional[int] = None
filename: Optional[str] = None
axis_columns: Optional[List[str]] = None
metadata: Optional[dict] = None
label_col: Optional[str] = None
category: Literal['training', 'testing', 'split'] = 'split'
Return
edgeimpulse.data.sample_type.UploadSamplesResponse
upload_plain_directory
Upload a directory of files to Edge Impulse.
The samples can be in CBOR, JSON, image, or WAV file formats.
Parameters
directory: str
category: Optional[str] = None
label: Optional[str] = None
metadata: Optional[dict] = None
transform: Optional[callable] = None
allow_duplicates: Optional[bool] = False
show_progress: Optional[bool] = False
batch_size: Optional[int] = 1024
Return
edgeimpulse.data.sample_type.UploadSamplesResponse
upload_samples
Upload one or more samples to an Edge Impulse project using the ingestion service.
Each sample must be wrapped in a Sample object, which contains metadata about that sample. Give this function a single Sample or a List of Sample objects to upload to your project. The data field of the Sample must be a raw binary stream, such as a BufferedIOBase object (which you can create with the open(..., "rb") function).
Parameters
samples: Union[edgeimpulse.data.sample_type.Sample, List[edgeimpulse.data.sample_type.Sample]]
allow_duplicates: Optional[bool] = False
api_key: Optional[str] = None
timeout_sec: Optional[float] = None
max_workers: Optional[int] = None
show_progress: Optional[bool] = False
pool_maxsize: Optional[int] = 20
pool_connections: Optional[int] = 20
Return
edgeimpulse.data.sample_type.UploadSamplesResponse
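A minimal sketch of wrapping local files in Sample objects and uploading them might look like the following. The helpers sample_kwargs and upload_files are hypothetical, and the import paths (edgeimpulse.data and edgeimpulse.data.sample_type) are assumptions based on the type names shown on this page. The file contents are read into an io.BytesIO so that no file handles are left open.

```python
import io
from pathlib import Path
from typing import List, Optional


def sample_kwargs(path: str, label: Optional[str] = None,
                  category: str = "split") -> dict:
    """Build keyword arguments matching the documented Sample parameters."""
    p = Path(path)
    return {
        "filename": p.name,
        # io.BytesIO is a subclass of io.BufferedIOBase.
        "data": io.BytesIO(p.read_bytes()),
        "label": label,
        "category": category,
    }


def upload_files(paths: List[str], label: Optional[str] = None,
                 api_key: Optional[str] = None):
    # Assumption: the SDK is installed and exposes these names as shown.
    from edgeimpulse.data import upload_samples
    from edgeimpulse.data.sample_type import Sample

    samples = [Sample(**sample_kwargs(p, label=label)) for p in paths]
    return upload_samples(samples, api_key=api_key)
```

Keeping the argument preparation separate from the SDK call makes it possible to inspect what would be uploaded before any request is sent.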