Use the Edge Impulse Python SDK with SageMaker Studio
Amazon SageMaker Studio is an integrated development environment (IDE) that provides a single web-based visual interface where you can access purpose-built tools to perform all machine learning (ML) development steps, from preparing data to building, training, and deploying your ML models, improving data science team productivity by up to 10x. You can quickly upload data, create new notebooks, train and tune models, move back and forth between steps to adjust experiments, collaborate seamlessly within your organization, and deploy models to production without leaving SageMaker Studio.
SageMaker Studio
To learn more about using the Python SDK, see the Edge Impulse Python SDK Overview.
This guide is built from the AWS reference project Introduction to SageMaker TensorFlow - Image Classification; see the AWS documentation page for that project for details.
Below are the changes made to the original training script and configuration:
The Python 3 (Data Science 3.0) kernel was used.
We used a dataset to classify images as car vs unknown.
The dataset was imported into the Edge Impulse S3 bucket configured when creating the SageMaker Studio domain. Adapt the path below to your own bucket, or use the dataset from the AWS reference project.
import boto3

# `sess` is assumed to be the sagemaker.Session() created in the setup cell of the AWS reference notebook
bucket = sess.default_bucket()
subfolder = 'car-vs-unknown/training/'

s3 = boto3.client('s3')
files = s3.list_objects(Bucket=bucket, Prefix=subfolder)['Contents']
print(f"Number of images: {len(files)}")

# or print the files
# for f in files:
#     print(f['Key'])
You can continue with the default model or choose a different model from the list. Note that this tutorial has been tested with MobileNetV2-based models.
A complete list of SageMaker pre-trained models is also available at SageMaker pre-trained Models.
from sagemaker.jumpstart.notebook_utils import list_jumpstart_models

# Retrieves all image classification models available by SageMaker Built-In Algorithms.
filter_value = "task in ['ic']"
ic_models = list_jumpstart_models(filter=filter_value)
# od_models = list_jumpstart_models()
print(f"Number of models available for inference: {len(ic_models)}")

# display the model-ids.
for model in ic_models:
    print(model)
from sagemaker import image_uris, model_uris

# You can replace the base model with one from the list generated above
model_id, model_version = "tensorflow-ic-imagenet-mobilenet-v3-small-075-224", "*"

# Retrieve the base model uri
base_model_uri = model_uris.retrieve(
    model_id=model_id, model_version=model_version, model_scope="inference"
)
print(base_model_uri)
Optional: skip the next cell if you don't want to retrain the model. In that case, uncomment the last line of the cell after it, which downloads the base model instead.
from sagemaker import image_uris, model_uris, script_uris, hyperparameters
from sagemaker.estimator import Estimator

training_instance_type = "ml.m5.large"

# Retrieve the Docker image
train_image_uri = image_uris.retrieve(
    model_id=model_id,
    model_version=model_version,
    image_scope="training",
    instance_type=training_instance_type,
    region=None,
    framework=None,
)

# Retrieve the training script
train_source_uri = script_uris.retrieve(model_id=model_id, model_version=model_version, script_scope="training")

# Retrieve the pretrained model tarball for transfer learning
train_model_uri = model_uris.retrieve(model_id=model_id, model_version=model_version, model_scope="training")

# Retrieve the default hyper-parameters for fine-tuning the model
hyperparameters = hyperparameters.retrieve_default(model_id=model_id, model_version=model_version)

# [Optional] Override default hyperparameters with custom values
hyperparameters["epochs"] = "5"

# The sample training data is available in the following S3 bucket
training_data_bucket = f"{bucket}"
training_data_prefix = f"{subfolder}"
# training_data_bucket = f"jumpstart-cache-prod-{aws_region}"
# training_data_prefix = "training-datasets/tf_flowers/"

training_dataset_s3_path = f"s3://{training_data_bucket}/{training_data_prefix}"

output_bucket = sess.default_bucket()
output_prefix = "ic-car-vs-unknown"
s3_output_location = f"s3://{output_bucket}/{output_prefix}/output"

# Create SageMaker Estimator instance
tf_ic_estimator = Estimator(
    role=aws_role,
    image_uri=train_image_uri,
    source_dir=train_source_uri,
    model_uri=train_model_uri,
    entry_point="transfer_learning.py",
    instance_count=1,
    instance_type=training_instance_type,
    max_run=360000,
    hyperparameters=hyperparameters,
    output_path=s3_output_location,
)

# Use S3 path of the training data to launch SageMaker TrainingJob
tf_ic_estimator.fit({"training": training_dataset_s3_path}, logs=True)
def download_from_s3(url):
    # Remove 's3://' prefix from URL
    url = url[5:]
    # Split URL by '/' to extract bucket name and key
    parts = url.split('/')
    bucket_name = parts[0]
    s3_key = '/'.join(parts[1:])
    # Download the file from S3
    s3.download_file(bucket_name, s3_key, 'model.tar.gz')

# Download the trained model
trained_model_s3_path = f"{s3_output_location}/{tf_ic_estimator._current_job_name}/output/model.tar.gz"
print(trained_model_s3_path)
download_from_s3(trained_model_s3_path)

# or if you just want to use the base model
# download_from_s3(base_model_uri)
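The download helper above strips the `s3://` prefix and splits on `/` by hand. As an alternative sketch, the standard library's `urllib.parse.urlparse` can perform the same split and reject anything that is not an S3 URI. The helper name `split_s3_uri` and the example bucket name are hypothetical and are not part of boto3 or the SageMaker SDK.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/key' into (bucket, key). Hypothetical helper."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    # The path keeps its leading '/', which is not part of the S3 key
    return parsed.netloc, parsed.path.lstrip("/")


# Example with a made-up bucket name
bucket_name, s3_key = split_s3_uri("s3://my-bucket/ic-car-vs-unknown/output/model.tar.gz")
print(bucket_name, s3_key)
```

The returned pair can then be passed to `s3.download_file(bucket_name, s3_key, 'model.tar.gz')` as in the cell above.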
import shutil, os

# Extract the .tar.gz file to a temporary directory
temp_directory = 'tmp'  # Replace with your actual temporary directory
tar_gz_file = 'model.tar.gz'  # Replace with the path to your .tar.gz file

# Create the directory if it does not exist
if not os.path.exists(temp_directory):
    os.makedirs(temp_directory)

shutil.unpack_archive(tar_gz_file, temp_directory)
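The later cells in this guide read `tmp/1/` and `tmp/labels_info.json`, so it can help to check what the archive actually contains. The following is a minimal sketch using the standard `tarfile` module; `top_level_entries` is a hypothetical helper name, and the archive layout is whatever your training job produced.

```python
import os
import tarfile


def top_level_entries(archive_path):
    """Return the sorted top-level names inside a tar archive. Hypothetical helper."""
    with tarfile.open(archive_path, "r:*") as tar:
        names = tar.getnames()
    tops = set()
    for name in names:
        # Drop empty components and a leading './' before taking the first component
        parts = [p for p in name.split("/") if p not in ("", ".")]
        if parts:
            tops.add(parts[0])
    return sorted(tops)


# Only inspect the archive if it has already been downloaded
if os.path.exists("model.tar.gz"):
    print(top_level_entries("model.tar.gz"))
```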
import tensorflow as tf

print(tf.__version__)

model = tf.keras.models.load_model('tmp/1/')
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Save the model.
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)
You will need to obtain an API key from an Edge Impulse project. Log into edgeimpulse.com and create a new project. Open the project, navigate to Dashboard and click on the Keys tab to view your API keys. Double-click on the API key to highlight it, right-click, and select Copy.
Copy API key from Edge Impulse project
Note that you do not actually need to use the project in the Edge Impulse Studio; only the API key is required.
Paste that API key string in the ei.API_KEY value in the following cell:
import edgeimpulse as ei

ei.API_KEY = "ei_0a85c3a5ca92a35ee6f61aab18aadb9d9e167bd152f947f2056a4fb6a60977d8"  # Change to your key
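If you would rather not paste the key into a notebook that may be shared, one option is to read it from an environment variable. This is a minimal sketch, not part of the original guide: the helper `load_api_key` is hypothetical, and the variable name `EI_API_KEY` is an assumption chosen for illustration.

```python
import os


def load_api_key(env_var="EI_API_KEY"):
    """Read the API key from an environment variable (the name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your Edge Impulse API key")
    return key


# Usage in the notebook (commented out so this sketch runs without the SDK installed):
# import edgeimpulse as ei
# ei.API_KEY = load_api_key()
```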
ei.model.list_profile_devices()
# Estimate the RAM, ROM, and inference time for our model on the target hardware family
try:
    profile = ei.model.profile(model='model.tflite', device='raspberry-pi-4')
    print(profile.summary())
except Exception as e:
    print(f"Could not profile: {e}")
# List the available deployment targets
ei.model.list_deployment_targets()
# Get the labels from labels_info.json
import json

with open('tmp/labels_info.json') as labels_info:
    labels_obj = json.load(labels_info)

labels = labels_obj['labels']
print(labels)
# Set model information, such as your list of labels
model_output_type = ei.model.output_type.Classification(labels=labels)
deploy_filename = "my_model_cpp.zip"

# Create C++ library with trained model
deploy_bytes = None
try:
    deploy_bytes = ei.model.deploy(model=model,
                                   model_output_type=model_output_type,
                                   engine='tflite',
                                   deploy_target='zip')
except Exception as e:
    print(f"Could not deploy: {e}")

# Write the downloaded raw bytes to a file
if deploy_bytes:
    with open(deploy_filename, 'wb') as f:
        f.write(deploy_bytes.getvalue())
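Before handing the archive to a build system, you may want to confirm that the bytes written above really form a zip file and see which files it contains. The sketch below uses only the standard `zipfile` module; `zip_member_names` is a hypothetical helper, and the actual contents of the library archive are not assumed here.

```python
import io
import zipfile


def zip_member_names(data):
    """Return the file names inside zip bytes, or raise ValueError if not a zip."""
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        raise ValueError("Data is not a valid zip archive")
    buffer.seek(0)
    with zipfile.ZipFile(buffer) as archive:
        return archive.namelist()


# Usage in the notebook, after a successful deploy call:
# print(zip_member_names(deploy_bytes.getvalue()))
```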
Voila!
You now have a C++ library ready to be compiled and integrated into your embedded targets. See the Edge Impulse deployment options in the documentation to understand how to integrate it into your embedded systems.
You can also have a look at the deployment page of your project to test your model on a web browser or test