Edge AI

Introduction to edge AI

Edge AI is the process of running artificial intelligence (AI) algorithms on devices at the edge of the Internet or other networks. The traditional approach to AI and machine learning (ML) is to use powerful, cloud-based servers to perform model training as well as inference (prediction serving). While edge devices might have limited resources compared to their cloud-based cousins, they offer reduced bandwidth usage, lower latency, and additional data privacy.

Edge AI series

The following series of articles and videos will guide you through the various concepts and techniques that make up edge AI. We will also present a few case studies that demonstrate how edge AI is being used to solve real-world problems. We encourage you to work through each video and reading section.

You will find a quiz at the end of each written section to test your knowledge. At the end of the course, you will find a comprehensive test. If you pass it with a score of at least 80%, you will be sent a digital certificate showing your completion of the course. You may take the test as many times as you like.

This series can be viewed as a course. We will cover the following concepts with the given learning objectives:

What is edge computing?
- Understand the differences between cloud and edge computing
- Advantages and disadvantages of processing data on edge devices
- What is the Internet of Things (IoT)
What is machine learning (ML)?
- What are the differences between artificial intelligence, machine learning, and deep learning
- Understand the history of AI
- What are the different categories of machine learning, and what problems do they tackle
What is edge AI?
- Articulate the difference between training and inference
- How does traditional cloud-based AI inference work
- What are the benefits of running AI algorithms on edge devices
- Examples of edge AI systems
- What are the business implications for future edge AI growth
How to choose an edge AI device
- Define and provide examples for the different edge computing devices
- How to choose a particular edge computing device for your edge AI application
Edge AI lifecycle
- How to identify a use case where edge AI can uniquely solve a problem
- Identify constraints to edge AI implementations
- Understand the edge AI pipeline of collecting data, analyzing the data, feature engineering, training a model, testing the model, deploying the model, and monitoring the model's performance
What is edge MLOps?
- Identify the three principles of MLOps: version control, automation, governance
- Describe the benefits of automating various parts of the edge AI lifecycle
- Define operations and maintenance (O&M)
- How does edge MLOps differ from cloud-based MLOps
- Define the causes of model drift: data drift and concept drift
What is Edge Impulse?
- How does a short learning curve lead to faster go-to-market times
- Articulate the advantages and disadvantages of using an edge AI platform versus building one from scratch
Case study: Tunstall Healthcare - Coming soon!
- How is edge AI being used to improve existing fall detection technology
- Why does reducing false positives and false negatives reduce costs and save lives
- How is edge AI used to improve healthcare technology beyond fall detection
Case study: Izoelectro
- How is edge AI used to detect anomalies on power lines
- How anomaly detection on edge devices saves power over cloud-based approaches
Going further and certification
- Resources to dive deeper into the technology and use cases of edge AI
- How to get started with Edge Impulse
- Comprehensive test and certification

The network edge

Edge computing is a strategy where data is processed and stored at the periphery of a computer network. In most cases, processing and storing data on remote servers, especially internet servers, is known as "cloud computing." The edge includes all computing devices not part of the cloud.

Edge computing devices include personal computers, smartphones, IoT devices, home and enterprise routing equipment, and remote or regional servers. As these devices become more powerful, we can start to run various AI algorithms on them, which opens up new ways to solve problems.

In the next section, we will dive into the advantages and disadvantages of edge computing.

Quiz

Practice your understanding with the quiz below. Submit your answer and click View accuracy to see your score. Note that this will open a new browser tab.

What is edge computing?

Edge computing is a computer networking strategy where data is processed and stored at the periphery of the network. The "periphery" includes end-user devices and equipment that connects those devices to larger networking infrastructure, such as the internet. For example, laptops, smartphones, IoT devices, routers, and local switches count as edge computing devices.

In the previous article, we introduced this edge AI series. We start the series by examining the advantages and disadvantages of edge computing and how it differs from cloud computing.

By processing data closer to where the data is generated, we can reduce latency, limit bandwidth usage, improve reliability, and increase data privacy.

Network architecture overview

Most networking architectures can be divided into the "cloud" and the "edge." Cloud computing consists of applications and services running on remote, internet-connected devices. Edge computing is essentially everything that is not part of the cloud (i.e. everything not running on those remote servers across the internet).

Typically, local IT infrastructure, such as servers and databases, is not considered either "edge" or "cloud." For our purposes, we will consider it part of the "edge," as running services on this gear often requires on-site customization and maintenance.

In general, data will be created by end-point devices. "End-point devices" or "end devices" refer to physical equipment at the very edge of the network, such as laptops, smartphones, and connected sensors. Sometimes, these end devices have a user interface where a person can interact with various applications, enter data, etc. Other times, the device is embedded into other equipment or offers no user interface. These embedded devices, if connected to the internet or other networks, are collectively referred to as the Internet of Things (IoT).

Examples of IoT devices include smart speakers, smart thermostats, doorbell cameras, GPS trackers, and networked pressure sensors in factories used to provide flow metrics and detect anomalies.

Note: a sensor is a device that measures a physical property in its environment (such as temperature, pressure, humidity, acceleration, etc.) and converts that measurement into a signal (often an electrical signal) that can be interpreted by a human or computer.

Sometimes, data can be stored and processed on the end device, like saving a local spreadsheet or playing a single-player game. In other cases, you need the power of cloud computing to stream movies, host websites, perform complex data analysis, and so on.

Cloud computing

You are likely already familiar with many cloud computing services, such as Netflix, Spotify, Salesforce, HubSpot, Dropbox, and Google Drive. These services run on powerful, internet-connected servers that you access through a client application, such as a browser.

Most of the time, these services run on top of one of the major cloud computing platforms, like Amazon Web Services, Microsoft Azure, or Google Cloud Platform. Such platforms offer containerized operating systems that allow you to easily build your application in a modular fashion and scale up production to meet the demand of thousands or millions of users.

The benefits of cloud computing include:

- Large servers offer powerful computing capabilities that can crunch numbers and run complex algorithms quickly
- Remote access to services from any device (as long as you have an internet connection)
- Processing and storage can be scaled on demand
- Physical servers are managed by large companies (e.g. Google, Amazon, Microsoft) so that you do not need to handle the infrastructure and maintenance

Edge computing

In addition to cloud computing, you also have the option of running services directly on the end devices or on local network servers. Processing such edge data might include running a user application (e.g. a word processor), analyzing sensor data to look for anomalies, identifying faces in a doorbell camera feed, and hosting an intranet website accessible only to local users.

According to Ericsson, there will be over 7 billion smartphones in the world by 2025. Additionally, the International Data Corporation (IDC) predicts a staggering 41.6 billion IoT devices will be in use by 2025. These devices will produce nearly 80 zettabytes of data that year, which amounts to about 200 million terabytes every day. The sheer amount of raw data is likely to strain existing infrastructure. One way to handle such data is to process it locally, on the edge, rather than transmit everything to the cloud.
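
The daily figure follows from a simple unit conversion. The sketch below assumes decimal (SI) units, where 1 zettabyte is 10^21 bytes and 1 terabyte is 10^12 bytes, and a 365-day year. Under those assumptions the result is roughly 219 million terabytes per day, which the text rounds to about 200 million.

```python
# Convert the projected yearly IoT data volume into a daily figure.
# Assumes SI (decimal) units: 1 ZB = 10**21 bytes, 1 TB = 10**12 bytes.
ZETTABYTE = 10**21
TERABYTE = 10**12

yearly_bytes = 80 * ZETTABYTE  # ~80 ZB per year (IDC projection quoted above)
daily_terabytes = yearly_bytes / 365 / TERABYTE

print(f"{daily_terabytes / 1e6:.0f} million TB per day")
```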

The network edge can be divided into "near" edge and "far" edge. Near edge equipment consists of on-premises or regional servers and routing equipment controlled by you or your business. Near refers to the physical proximity or relatively low number of router hops it takes for traffic to go from the border of the internet to your equipment. In other words, "near" and "far" are from the perspective of the internet service provider (ISP) or cloud service provider.

Far edge consists of the devices further away from the internet gateway on your network. Examples include user end-devices, such as laptops and smartphones, as well as IoT devices and local networking equipment, such as routers and switches.

The border between the cloud, near edge, and far edge can often be nebulous. In fact, a relatively recent trend includes fog computing, which is a term coined by Cisco in 2012. In fog computing, edge devices (often near edge servers) are used to store and process data, often replicating the functionality of cloud services on the edge.

Advantages

Edge computing offers a number of benefits:

- Reduced bandwidth usage - You no longer need to constantly stream raw data to have it stored, analyzed, or processed by a cloud computing service. Instead, you can simply transmit the results of such processing.
- Reduced network latency - Network latency is the round-trip time it takes for information to travel to its destination (e.g. a cloud server) and for the response to return to the end-point device. For cloud computing, this can be hundreds of milliseconds or more. If processing is performed locally, such latency is often reduced to almost nothing.
- Improved energy efficiency - Transmitting data, especially via a wireless connection like Wi-Fi, usually requires more electrical power than processing the data locally.
- Increased reliability - Edge computing means that data processing can often be done without an internet connection.
- Better data privacy - If raw data is processed directly on an end device without travelling across the network, it becomes harder for malicious parties to access. This means that user data can be made more secure, as there are fewer avenues to access that raw data.

These benefits can easily be remembered with the acronym BLERP: bandwidth, latency, energy usage, reliability, and privacy.
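
To make the bandwidth point concrete, here is a back-of-the-envelope comparison between streaming raw sensor data and sending only processed results. Every number is an illustrative assumption (a hypothetical 3-axis accelerometer and a one-byte result per second), not a measurement from a real device.

```python
# Hypothetical sensor setup (all numbers are illustrative assumptions).
AXES = 3                   # 3-axis accelerometer
BYTES_PER_SAMPLE = 2       # one 16-bit reading per axis
SAMPLE_RATE_HZ = 100       # samples per second
RESULT_BYTES = 1           # one class ID per inference result
RESULTS_PER_SECOND = 1

# Option 1: stream every raw sample to the cloud.
raw_bytes_per_second = AXES * BYTES_PER_SAMPLE * SAMPLE_RATE_HZ

# Option 2: process on the device and transmit only the result.
result_bytes_per_second = RESULT_BYTES * RESULTS_PER_SECOND

reduction = raw_bytes_per_second / result_bytes_per_second
print(f"raw stream:   {raw_bytes_per_second} B/s")
print(f"results only: {result_bytes_per_second} B/s")
print(f"reduction:    {reduction:.0f}x")
```

With these assumed values, the raw stream needs 600 bytes per second while the results need 1 byte per second. Real savings depend on the sensor, the sample rate, and how often results are reported.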

Disadvantages

While edge computing offers a host of benefits, there are several limitations:

- Resource constraints - Most edge devices do not offer the same level of raw computing as most cloud servers. If you need to crunch numbers quickly or run complex algorithms, you might have to rely on cloud computing.
- Limited remote access - Services running locally on edge or end devices might not be easily accessed via remote clients. To provide such remote access, you often need to run additional services (such as a web server) and/or configure a VPN on your local network.
- Security - Many IoT devices come from the manufacturer with default login credentials and open ports, making them prime targets for attackers (such as with the infamous Mirai botnet attack in 2016). You and your network administrators are responsible for implementing and enforcing up-to-date security plans for all edge devices.
- Scaling - Adding more computing power and resources is often easy in cloud computing; you just pay the cloud service provider more money. Scaling your resources for edge computing often requires purchasing and installing additional hardware along with maintaining the infrastructure.

Examples of edge computing

Anything that runs locally on your computer or smartphone is considered edge computing. That includes word processing, spreadsheets, most programming development environments, and many video games. Some applications require both edge computing and cloud computing elements, such as video conferencing applications (e.g. Zoom). Cloud-based applications that you use in your browser (such as Google Docs or Netflix) require heavy processing on cloud servers as well as some light local processing on your phone or computer.

In addition to user applications, you can also find IoT devices performing local processing of data. Some examples of this include smartwatches monitoring exercise levels, smart speakers waiting for a keyword (such as "Alexa"), and industrial controllers automatically operating machinery based on input sensor values.

One example of edge computing on networking gear is quality of service (QoS), a technique in which your home or office router monitors traffic to determine packet priority. As QoS requires the router to monitor traffic destinations (and sometimes content) to quickly make such prioritization decisions, edge computing on the router is a natural fit.
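
The prioritization idea can be sketched with a strict-priority queue. This is a toy illustration, not how any particular router implements QoS: the traffic classes, their priorities, and the `QosQueue` class are all hypothetical.

```python
import heapq
from itertools import count

# Hypothetical priority table: a lower number is sent first.
PRIORITY = {"voip": 0, "video": 1, "web": 2, "bulk": 3}


class QosQueue:
    """Toy strict-priority packet queue (not a real router implementation)."""

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker keeps arrival order within a class

    def enqueue(self, traffic_class, payload):
        # Unknown traffic classes fall back to the lowest priority.
        priority = PRIORITY.get(traffic_class, max(PRIORITY.values()))
        heapq.heappush(self._heap, (priority, next(self._order), payload))

    def dequeue(self):
        _, _, payload = heapq.heappop(self._heap)
        return payload


queue = QosQueue()
queue.enqueue("bulk", "backup chunk")
queue.enqueue("web", "page request")
queue.enqueue("voip", "voice frame")
queue.enqueue("web", "image request")

sent = [queue.dequeue() for _ in range(4)]
print(sent)
```

The voice frame leaves first even though it arrived third, and the two web packets keep their arrival order relative to each other.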

Quiz

Edge computing offers a number of advantages over cloud computing, but it comes with some limitations. You should consider your options carefully before investing in either strategy for your computing needs.

Test your knowledge on edge computing with this quiz:

What is machine learning (ML)?

Machine learning (ML) is a branch of artificial intelligence (AI) and computer science that focuses on developing algorithms and programs that can learn over time. ML specifically focuses on building systems that learn from data.

In the last article, we discussed the advantages and disadvantages of edge computing. This time, we define machine learning and explain how it relates to AI and how it differs from traditional, rules-based programming.

Differences between human and artificial intelligence

One way to understand AI is to compare it to human intelligence. In general, we consider human intelligence in terms of our ability to solve problems, set and achieve goals, analyze and reason through problems, communicate and collaborate with others, as well as an awareness of our own existence (consciousness).

AI is the ability for machines to simulate and enhance human intelligence. Unlike humans, AI is still a rules-based system and does not need elements of emotions or consciousness to be useful.

In their 2016 book, Artificial Intelligence: A Modern Approach, Stuart Russell and Peter Norvig define AI as "the designing and building of intelligent agents that receive percepts from the environment and take actions that affect that environment."

Machine learning vs. artificial intelligence

Machine learning is a subset of artificial intelligence. AI is a broad category that covers many systems and algorithms.

Both AI and ML can be considered subsets of data science, which is the application of the scientific method to extract insights from data to make decisions or predictions. For example, an investment banker might look at stock trends or other factors to figure out the best time to buy and sell securities. Additionally, a software engineer might develop a computer vision model to identify cars in images (as images are a form of data).

As described earlier, AI is the development of algorithms and systems to simulate human intelligence. This can include automatically making decisions based on input data as well as systems that can learn over time.

Machine learning, on the other hand, is the development of algorithms and systems that learn over time from data. Often, such algorithms include the development of mathematical and statistical models that have been trained on input data. These models are capable of extracting patterns from the input data to come up with rules that can make decisions and predictions.

Deep learning, a term coined by computer scientist Rina Dechter in 1986, describes ML models that are more complex and can learn representations from the data in successive layers. Deep learning has been the most studied and hyped form of ML since 2010.

A brief history of AI

While AI seems like a recent invention, the study of mathematical models that can update themselves dates back to the 1700s.

Carl Friedrich Gauss studied linear regression, as evidenced by the Gauss-Markov theorem. Gauss published the underlying result in 1821, and Andrey Markov, who was born after Gauss's death, later rederived it. As regression algorithms are a form of mathematical model that improves over time given additional data, we consider them part of machine learning (and, as a result, a part of AI).

The term "artificial intelligence" came from John McCarthy's proposal to host a conference in 1956 for academics to discuss the possibility of developing intelligent machines. This gathering was known as the "Dartmouth Summer Research Project on Artificial Intelligence."

After the Dartmouth conference, research and public interest in AI flourished. Computer technology improved exponentially, which allowed for early AI and ML systems to be developed. However, AI eventually stalled in the 1970s, when computers could not store information or process data fast enough to keep up with the algorithm research.

The late 1970s and early 1980s witnessed the first "AI winter," where public interest in AI died, and research funding evaporated. The mid-1980s saw a resurgence of interest with symbolic AI and expert systems. As computers at the time were powerful enough to run these complex algorithms, AI programs could be employed to solve real problems in industry.

The technique of automatically adjusting weights in a weighted sum of inputs was well known in Gauss's time. This formed the basis for the "perceptron" in the 1950s, which eventually gave way to the "multilayer perceptron." The idea of using multiple perceptrons to predict values as a machine learning tool was inspired by the human brain's massive interconnected network of neurons, thus inspiring the name "neural network" (or more specifically, "artificial neural network"). Even though Rina Dechter published her paper in 1986 coining the term "deep learning," it would be at least 20 years before neural networks became popular.

The second AI winter occurred in the early 2000s. Government research funding died as did public interest in AI. However, the current deep learning revolution started in 2010 with newfound interest in large, complex machine learning models, mostly built using neural networks. This AI renaissance came about thanks to several important factors:

- Massive amounts of data were being generated by personal computers and made easily accessible via the internet
- Computers, including accelerators like graphics processing units (GPUs), became powerful enough to run deep learning models with relative ease
- New, complex deep learning models were developed that surpassed classical, non-neural-network algorithms in accuracy
- Public interest surged with renewed vigor after several high-profile, widely publicized events, including Microsoft's Kinect for Xbox 360 released in 2010, IBM Watson winning on Jeopardy! in 2011, and Apple unveiling Siri in 2011

AlexNet, a deep neural network designed in 2012, surpassed all previous computer vision models at recognizing images. This marked the turning point in AI development, where deep learning became the primary model architecture and focus for researchers.

Since 2010, we have seen a resurgence in AI funding and interest. The availability of large amounts of data and capable computers has kept pace with machine learning research. As a result, ML has entered our lives through nearly every piece of computing equipment.

Categories of ML

Machine learning can be broadly categorized into supervised learning, unsupervised learning, and reinforcement learning.

Supervised learning is concerned with finding a function (or mathematical model) that maps input data to some output, such as a predicted value or classification. Supervised learning requires ground-truth labels to be present with the data during training. Such labels are usually set by humans.

Supervised learning can be subdivided into two further categories. In regression, the model attempts to predict a continuous value. For example, regression can be used to predict a house's price based on various input factors (such as livable area, location, and size). Classification is the process of predicting which of several discrete classes the input data belongs to (or how strongly it belongs to each).
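
The difference between the two can be sketched with a few lines of plain Python. The data is made up, and the two helper functions (`fit_line` for least-squares regression and `nearest_neighbor` for classification) are deliberately simple stand-ins for real ML libraries.

```python
# Toy, made-up data for illustration only.

def fit_line(xs, ys):
    """Ordinary least squares fit for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def nearest_neighbor(samples, labels, x):
    """1-nearest-neighbor classifier on a single numeric feature."""
    distances = [abs(x - s) for s in samples]
    return labels[distances.index(min(distances))]


# Regression: predict a continuous value (price in thousands) from area.
areas = [50, 80, 100, 120]
prices = [150, 240, 300, 360]
slope, intercept = fit_line(areas, prices)
predicted_price = slope * 90 + intercept

# Classification: predict a discrete label from a vibration reading.
readings = [0.1, 0.2, 0.9, 1.1]
states = ["normal", "normal", "faulty", "faulty"]
predicted_state = nearest_neighbor(readings, states, 0.8)

print(round(predicted_price, 1), predicted_state)
```

Both models learn from labeled examples. The regression output is a number on a continuous scale, while the classification output is one of a fixed set of labels.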

Unsupervised learning is used to identify patterns in data. As such, no ground-truth labels are used. Examples of unsupervised learning are clustering, outlier (anomaly) detection, and segmentation.
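
As a minimal sketch of outlier detection without labels, the snippet below flags readings that sit far from the mean of the data. The z-score approach and the threshold of 2.0 are illustrative assumptions; practical anomaly detection often uses more robust methods.

```python
from statistics import mean, stdev


def find_outliers(values, z_threshold=2.0):
    """Flag values more than z_threshold standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    return [v for v in values if abs(v - mu) / sigma > z_threshold]


# Made-up temperature readings; no labels say which ones are "bad".
temperatures = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 35.0, 20.1]
outliers = find_outliers(temperatures)
print(outliers)
```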

Reinforcement learning focuses on models that learn a policy that selects actions based on provided input. Such models attempt to achieve goals through trial and error by interacting with the environment.

Other categories of ML exist, such as semi-supervised learning, and they often involve combinations of the main three categories.

Traditional vs. machine learning algorithms

When developing traditional algorithms, the parameters and rules of the system are designed by a human. Such algorithms accept data as input and produce results.

Some examples of traditional programming algorithms include:

- Edge detection filters are used to extract meaning from images
- Sorting algorithms are used by search engines to rank and present web search results
- The Fourier transform is used in signal processing to convert a time-series data sample into its various frequency components
- Advanced Encryption Standard (AES) is a popular encryption algorithm used to keep data secret during transmission

Note that some artificial intelligence algorithms, including classical symbolic AI and expert systems, fall into this category, as the rules are built by humans. They are AI algorithms but not considered "machine learning."

In machine learning, the ML training algorithm automatically develops the parameters and rules based on the input data. For supervised learning, you provide the input data along with the ground-truth answers or labels. During the training phase, the ML algorithm develops the rules to classify the input data as accurately as possible.

The set of rules developed during the training phase forms a mathematical or statistical model and is often simply referred to as a "model."

The rules (model) can then be used to predict answers and values from new data that was never seen during training. This process is known as "inference," as the model is attempting to infer values or meaning from new data. If the rules perform well on this task (with never-before-seen input data), then we can say that the ML model is "generalizing" well.
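
The contrast between a hand-written rule and a learned one can be sketched as follows. The temperature data, the 80-degree rule, and the `train_threshold` helper are hypothetical; the "model" here is just a single learned threshold, which is far simpler than most real ML models.

```python
# Traditional approach: a human picks the rule.
def is_overheating_rule(temp_c):
    return temp_c > 80.0  # threshold chosen by an engineer


# ML approach: a training algorithm picks the rule from labeled data.
def train_threshold(samples, labels):
    """Try midpoints between sorted samples; keep the most accurate one."""
    ordered = sorted(samples)
    candidates = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]

    def accuracy(threshold):
        correct = sum((x > threshold) == y for x, y in zip(samples, labels))
        return correct / len(samples)

    return max(candidates, key=accuracy)


# Made-up training data: temperatures with ground-truth labels.
train_temps = [60.0, 65.0, 70.0, 72.0, 78.0, 85.0]
train_labels = [False, False, False, True, True, True]
model_threshold = train_threshold(train_temps, train_labels)  # the "model"


def predict(temp_c):
    """Inference: apply the learned rule to data not seen in training."""
    return temp_c > model_threshold


print(model_threshold, predict(90.0), predict(50.0))
print(is_overheating_rule(78.0), predict(78.0))
```

The training data places the boundary lower than the engineer's guess, so the two approaches disagree on a reading of 78 degrees.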

Going further

Machine learning can help solve unique problems where traditional rules-based designs fall short. If you would like to dive into the technical details of how neural networks operate, see our guides here.

Quiz

Test your knowledge of machine learning with the following quiz.

What is edge AI?

Edge AI is the development and deployment of artificial intelligence (AI) algorithms and programs on edge devices. It is a form of edge computing where data is analyzed and processed near where the data is generated or collected. Edge AI contrasts with cloud-based AI, which involves data being transmitted across the internet to be processed on a remote server.

Machine learning training and deployment

In machine learning (ML), data is fed into the training process. For supervised learning, the ground-truth labels are also provided along with each sample. The training algorithm automatically updates the parameters (also known as "weights") in the ML model.

During each step of the training process, we evaluate the model to see how good it is at predicting the correct label given some data. Over time, we ideally want this accuracy to increase to some acceptable level. In most cases, training a machine learning model is computationally expensive, and training does not need to be performed on an edge device. As a result, we can do model training in the cloud with the help of powerful accelerator hardware, such as graphics processing units (GPUs).
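
A minimal sketch of such a training loop is shown below, using gradient descent on a model with one weight and one bias. The data, learning rate, and step count are assumptions chosen so the example converges; real training uses far larger models and datasets.

```python
# Made-up training data that follows y = 2x + 1.
inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
targets = [1.0, 3.0, 5.0, 7.0, 9.0]

weight, bias = 0.0, 0.0   # model parameters ("weights")
learning_rate = 0.05      # assumed step size


def mean_squared_error(w, b):
    """Evaluate how far the model's predictions are from the labels."""
    return sum((w * x + b - y) ** 2 for x, y in zip(inputs, targets)) / len(inputs)


initial_loss = mean_squared_error(weight, bias)

for step in range(2000):
    # Gradients of the mean squared error with respect to each parameter.
    errors = [weight * x + bias - y for x, y in zip(inputs, targets)]
    grad_w = 2 * sum(e * x for e, x in zip(errors, inputs)) / len(inputs)
    grad_b = 2 * sum(errors) / len(inputs)
    # Update the parameters a small step against the gradient.
    weight -= learning_rate * grad_w
    bias -= learning_rate * grad_b

final_loss = mean_squared_error(weight, bias)
print(f"weight={weight:.3f} bias={bias:.3f}")
print(f"loss: {initial_loss:.2f} -> {final_loss:.6f}")
```

The parameters start at zero and move toward the values that generated the data, while the error shrinks with each update.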

Once we are happy with the performance of the model, we can deploy it to our end device. At this point, the model accepts new, never-before-seen data and produces an output. For supervised learning and classification, this output is a label that the model believes most accurately represents the input data. In regression, this output is a numerical value (or values). This process of making predictions based on new data after training is known as inference.

In traditional, cloud-based ML model deployment, inference is run on a remote server. Clients connect to the inference service, supply new data along with their request, and the server responds with the result. This cloud-based inference process is known as prediction serving.

In the majority of cases, inference is not nearly as computationally intensive as training. As a result, we could run inference on an edge device instead of on a powerful cloud server.

Because edge devices often offer less compute power than their cloud counterparts, ML models trained for the edge often need to be less complex. With that in mind, edge AI offers several benefits over cloud AI.

Benefits of edge AI

Assuming that you can run your ML model on an edge device, such as a laptop, smartphone, single-board computer, or embedded Internet of Things (IoT) device, edge AI has the following advantages over a cloud-based approach:

- Reduced bandwidth - Rather than transmitting raw data over the network, you can perform inference on the edge device directly. From there, you would only need to transmit the results, which is often much less data than the raw input.
- Reduced latency - Transmitting data across networks (including the internet) can take time, as that data has to travel through multiple switches, routers, and servers. The round-trip latency is often measured in hundreds of milliseconds when waiting for a response from a cloud server. On the other hand, there is little or no network latency with edge AI, as inference is performed on or relatively close to where the data was collected.
- Better energy efficiency - Most cloud servers require large overhead with containerized operating systems and various abstraction layers. By running inference on edge devices, you can often do away with these layers and overhead.
- Increased reliability - If you are operating in an environment with little or no internet connection, your edge devices can still continue to operate. This is important in remote environments or applications like self-driving cars.
- Improved data privacy - While IoT devices require care when implementing security plans, you can rest assured that your raw data does not leave your device or edge network. Users can be confident that raw data, such as images of their faces, is not leaving the network to be intercepted by malicious actors.

Just like with edge computing, the benefits can be summarized by the acronym BLERP: bandwidth, latency, energy usage, reliability, and privacy.

Limitations of edge AI

Edge AI has a number of limitations that you should take into consideration, and you should weigh your options carefully against cloud deployment.

- Resource constraints - In general, edge devices offer fewer computational resources than their cloud-based counterparts. Cloud servers can offer powerful processors and large amounts of memory. If your ML model cannot be optimized or constrained to run on an edge device, you should consider a cloud-based solution.
- Limited remote access - Prediction serving from the cloud offers easy access from any device that has internet access. Remotely accessing edge devices often requires special network configuration, such as running a VPN service.
- Scaling - Scaling prediction services of cloud models usually requires simply cloning your server and paying the service provider more money for additional computing power. With edge computing, you need to purchase and configure additional hardware.

Examples of edge AI

Edge AI is already used in our everyday lives, and it offers cost savings as an extension of industrial IoT applications. One of the most prominent home automation examples of edge AI is the smart speaker.

The speaker is constantly listening for a key word or phrase ("Alexa" or "Hey Google"). This process is known as "keyword spotting," and it involves performing inference on incoming sound data with a small ML model trained to recognize only that word or phrase. Latency is important here; the speaker needs to respond to the listener within a few milliseconds. It also saves on bandwidth, as the raw audio does not need to be constantly transmitted over the network.

Once the speaker recognizes the keyword, it "wakes up" and begins streaming audio over the internet to a powerful server where a more complex model can perform intent analysis to determine what the user is requesting. The smart speaker is a perfect combination of edge AI and cloud AI working in tandem to provide a unique user interaction.
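
The hand-off between the edge and the cloud can be sketched as a simple gate. Everything here is hypothetical: `keyword_score` stands in for a small on-device model, the scores are made up, and the 0.8 threshold is an assumed cutoff. This sketch also does not forward the wake-word frame itself, which is a simplification rather than a claim about real products.

```python
WAKE_THRESHOLD = 0.8  # assumed confidence cutoff


def keyword_score(audio_frame):
    """Stand-in for a small on-device keyword spotting model.

    A real model would run inference on audio features. Here, each frame
    is a dict that already carries a made-up confidence score.
    """
    return audio_frame["score"]


def process_stream(frames):
    """Return the audio that would be streamed to the cloud."""
    streamed = []
    awake = False
    for frame in frames:
        if not awake:
            # Edge inference runs on every frame; nothing leaves the device.
            if keyword_score(frame) >= WAKE_THRESHOLD:
                awake = True
            continue
        # After waking, audio is sent on for cloud-side analysis.
        streamed.append(frame["audio"])
    return streamed


frames = [
    {"audio": "background noise", "score": 0.05},
    {"audio": "wake word", "score": 0.93},
    {"audio": "what's the weather", "score": 0.10},
    {"audio": "tomorrow", "score": 0.02},
]
streamed_audio = process_stream(frames)
print(streamed_audio)
```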

Many smartwatches also rely on edge AI.

Some can perform keyword spotting directly on the watch hardware or are capable of streaming that audio to a connected smartphone for analysis. Either way, the processing is performed on an edge device. They also work with smartphones to analyze sleep patterns and track fitness activities.

Factories and industrial plants are turning to edge AI to help monitor equipment and measure workflows. For example, the Lexmark Optra is a single-board computer that acts as an IoT hub and can perform important analysis jobs like automated optical inspection of assembly line parts.

Finally, a popular example of edge AI is the self-driving vehicle. These cars, trucks, and buses promise to transport people and goods without needing a human driver.

Because vehicles cannot rely on a constant internet connection, much of the data processing from the myriad sensors must be performed on the vehicle itself. This means engineers must find a balance between computing power, size, and ML model complexity.

Market size

The International Data Corporation (IDC) predicts that 41.6 billion IoT devices will produce nearly 80 zettabytes of data in 2025. Additionally, Gartner predicts that 55% of all data analysis by AI and ML algorithms will occur on the same device that captured the raw data in 2025. This figure shows massive growth in edge AI capabilities, up from 10% of on-device processing in 2021. Gartner also predicts that revenue from specialized AI processors, such as GPUs and neural processing units (NPUs), will be "$137 billion by 2027, growing by a five-year CAGR of 26.5%."
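
A compound annual growth rate (CAGR) can be worked backward to see what starting point the forecast implies. The sketch below assumes the five-year window runs from 2022 to 2027; the source does not state the base year, so the starting figure is only an implied estimate.

```python
# Work a CAGR forecast backward to its implied starting value.
projected_revenue_2027 = 137.0  # billions of USD (Gartner figure quoted above)
cagr = 0.265                    # 26.5% compound annual growth rate
years = 5                       # assumed window: 2022 to 2027

# final = start * (1 + cagr) ** years, so solve for start.
implied_revenue_2022 = projected_revenue_2027 / (1 + cagr) ** years

print(f"implied starting revenue: ${implied_revenue_2022:.1f} billion")
```

Under that assumption, the forecast implies a starting revenue of roughly $42 billion, which means the market would a little more than triple over five years.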

The rapid adoption of AI technology and deployment of IoT devices show how the market is expanding to include edge AI solutions. Note that this is not a shift from cloud-based AI; cloud solutions will continue to grow in addition to edge deployments.

Going further

Edge AI can be seen as an extension of IoT where data analysis and processing are performed on or close to the sensors that captured the data. While edge AI does not offer the same raw compute power as cloud-based applications, it does help limit bandwidth usage, lower latency, reduce energy consumption, avoid reliance on a constant network connection, and enhance data privacy.

To learn more about edge AI, see our guides on embedded ML and edge ML.

Quiz

Test your knowledge on edge AI with this quiz:

How to choose an edge AI device

Choosing a device for edge AI can be tricky, as the plethora of computing devices available is daunting. We consider popular edge AI use cases, such as time-series classification and object detection, along with other design constraints to offer a helpful guide for choosing the best hardware.

In the previous section, we defined edge AI. In this article, we examine popular edge AI use cases and offer some guidance on choosing the best hardware for an edge AI project.

Click here to watch the video

Design considerations

Choosing the right hardware requires careful consideration of your problem or project along with its design constraints. Let us begin by looking at the various use cases and constraints.

Is edge AI the right approach?

Before looking at hardware, you should consider if edge AI is the right approach for your particular problem. In many cases, a traditional rules-based approach with classical algorithms may be enough to tackle the issue.

For example, if you are creating an anomaly detection system based on vibration sensor data, perhaps a fast Fourier transform (FFT) to give you the various frequency components is sufficient. You could set a simple threshold to see if the machine in question is vibrating at a particular frequency. This approach usually requires enough domain knowledge around your particular problem to identify which data is important and how to analyze it.
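
As a rough sketch of the rules-based approach described above, the following Python checks whether a signal contains energy at one frequency of interest. It evaluates a single DFT bin directly rather than running a full FFT, which is enough when only one frequency matters. The sample rate, the 120 Hz target, the threshold, and the synthetic signals are all assumptions chosen for illustration.

```python
import cmath
import math


def magnitude_at(samples, sample_rate_hz, target_hz):
    """Return the amplitude of the `target_hz` component in `samples`."""
    n = len(samples)
    # Correlate the signal with a complex sinusoid at the target frequency.
    acc = sum(
        s * cmath.exp(-2j * math.pi * target_hz * i / sample_rate_hz)
        for i, s in enumerate(samples)
    )
    # Scale so a pure sine of amplitude A at target_hz yields roughly A.
    return 2 * abs(acc) / n


def is_vibrating_at(samples, sample_rate_hz, target_hz, threshold):
    """Simple rule: flag the machine if the target component exceeds a threshold."""
    return magnitude_at(samples, sample_rate_hz, target_hz) > threshold


# Hypothetical data: one second of vibration samples at 1000 Hz.
SAMPLE_RATE_HZ = 1000
healthy = [
    0.05 * math.sin(2 * math.pi * 50 * i / SAMPLE_RATE_HZ) for i in range(1000)
]
# The "faulty" machine adds a strong 120 Hz component on top.
faulty = [
    h + 0.5 * math.sin(2 * math.pi * 120 * i / SAMPLE_RATE_HZ)
    for i, h in enumerate(healthy)
]

print("healthy flagged:", is_vibrating_at(healthy, SAMPLE_RATE_HZ, 120, 0.2))
print("faulty flagged:", is_vibrating_at(faulty, SAMPLE_RATE_HZ, 120, 0.2))
```

Choosing the target frequency and threshold is where the domain knowledge comes in: someone has to know which frequency indicates a problem and how strong it must be before it matters.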

Use cases

While the idea behind edge AI is to run any AI algorithm on edge devices, the compute limitations of edge devices restrict most edge AI to a few popular use cases at this time. As hardware and AI technology improve, possible use cases will continue to expand.

Often, edge AI works on data collected from sensors, which are devices that detect and react to their physical environment. In most cases, we work with electrical sensors that convert measurable environmental factors into electrical signals. Examples of such sensors include digital thermometers (temperature), accelerometers (acceleration and vibration), current sensors (electrical current), microphones (audio), and cameras (images).

Time-series sensor data - Classify occurrences (e.g. sleep patterns) or identify anomalies (e.g. arrhythmia, mechanical equipment failure) from sensor data patterns over time. These time-series data often have relatively slow sample rates, ranging from less than 1 sample per second (1 Hz) to around 1000 Hz.
Audio - Identify wake words, classify sounds (e.g. animals, machinery), or identify anomalies (e.g. mechanical failure). Audio is a form of time-series data, but it usually requires a higher sample rate, often in the 10 kHz to 40 kHz range.
Image classification - Identify if an image contains a particular object, animal, or person. Such processing requires a camera for the sensor. Resolution can be low (e.g. 96x96 pixels) to very high (e.g. 15360x8640 pixels). Response time can be slow, such as 1 frame per second (fps), to very fast (e.g. 60+ fps).
Object detection - Detect one or more target objects, animals, or people in an image, and determine the relative position of each target object in the image. Object detection requires more complex models than image classification. Cameras are also used, and detection can be performed on low to high resolution images. Response times can vary depending on your particular needs.
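
To get a feel for how much raw data the use cases above can produce, a back-of-the-envelope calculation can help. The sensor configurations below (a 3-axis accelerometer at 100 Hz, 16 kHz mono audio, and 96x96 grayscale images at 1 fps) are illustrative assumptions picked from within the ranges mentioned above, not requirements.

```python
def raw_bytes_per_second(values_per_sample, bytes_per_value, samples_per_second):
    """Raw, uncompressed data rate produced by a sensor."""
    return values_per_sample * bytes_per_value * samples_per_second


# Assumed configurations for illustration only.
accel_rate = raw_bytes_per_second(3, 2, 100)      # 3 axes, 16-bit values, 100 Hz
audio_rate = raw_bytes_per_second(1, 2, 16_000)   # mono, 16-bit values, 16 kHz
image_rate = raw_bytes_per_second(96 * 96, 1, 1)  # 8-bit grayscale pixels, 1 fps

for name, rate in [
    ("accelerometer", accel_rate),
    ("audio", audio_rate),
    ("image", image_rate),
]:
    print(f"{name}: {rate} bytes/s")
```

Higher data rates generally mean more memory for buffering and more computation per second, which feeds directly into the hardware choice.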

Example of object detection identifying a dog, ball, and toy

See if your particular project is close to one of the use cases listed above. If not, then you may need to dig into the technical details about the problem's domain, possible machine learning (ML) approaches, and ML model compute requirements.

Design constraints

Whether you are building an edge AI device for sale or buying off-the-shelf (OTS) components to solve a business need, you should consider your environmental and use constraints.

Interfaces - Does your device connect to sensors? Will it need a connection to the internet (e.g. WiFi) or a smartphone (e.g. Bluetooth)? Does it need to have a user interface (screen, buttons, etc.), or can it be embedded in another device without human interaction?
Power constraints - If the device is battery-powered, how long does it need to operate on a single charge? Even if the device can be plugged into the wall, optimizing for energy savings means you can save money on electricity usage.
Form factor - Do you have the space for a large, powerful server? If not, can you mount a small box containing your device somewhere? Alternatively, is the device wearable, or does it need to conform to some unique shape?
Operating environment - Most electronics work best in a climate-controlled environment, free from moisture and debris. Can you place your device in a climate-controlled room like a server room or office? If not, does your device need to be hardened for a specific operating environment, such as outdoors, in a vehicle, or in space?
Code portability - If you are designing an edge AI application, you should weigh your available options for code portability. Code optimized for a particular piece of hardware can often execute faster and with less energy usage. However, optimized code can often be difficult to port to different hardware and may require unique expertise and extra time to develop. Portable code, on the other hand, usually requires some overhead in the form of an operating system, but it is often easier to run on different hardware (i.e. port to a different device).
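
For the power constraint question above, a first-order battery life estimate can be sketched from the average current draw. The battery capacity, current figures, and duty cycle below are hypothetical values, and the estimate ignores effects such as battery aging and regulator losses.

```python
def battery_life_hours(capacity_mah, active_ma, sleep_ma, active_fraction):
    """Estimate runtime from the time-weighted average current draw."""
    average_ma = active_ma * active_fraction + sleep_ma * (1 - active_fraction)
    return capacity_mah / average_ma


# Hypothetical wearable: 200 mAh battery, awake 1% of the time to run inference.
hours = battery_life_hours(
    capacity_mah=200, active_ma=20, sleep_ma=0.05, active_fraction=0.01
)
print(f"Estimated battery life: {hours:.0f} hours ({hours / 24:.1f} days)")

# The same device running inference continuously.
always_on_hours = battery_life_hours(
    capacity_mah=200, active_ma=20, sleep_ma=0.05, active_fraction=1.0
)
print(f"Always-on battery life: {always_on_hours:.0f} hours")
```

The gap between the two results shows why how often the model runs can matter as much as which processor it runs on.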

Off the shelf (OTS) versus do it yourself (DIY)

You have the option of buying any or all parts of an edge AI solution from a third-party provider. OTS usually involves a higher unit price, as vendors have overhead and profit margins built in. However, purchasing the device, software, or framework likely means faster setup and time-to-market. Additionally, some of the support/maintenance needs can be passed on to the vendor.

If you are developing or selling an electrical device, OTS options often include compliance testing, such as UL, FCC, and CE. Such testing can be expensive and time-consuming, but it is almost always necessary for selling devices in a given country.

On the other hand, developing the device or solution yourself requires more up-front time and costs in engineering, programming, and compliance testing. However, the device can be customized and optimized for particular use cases and environments. You also gain economies of scale if you plan to manufacture and sell hundreds or thousands of devices.

The following chart summarizes the tradeoffs between OTS and DIY.

Buy (OTS) | Build (DIY)
Time efficiency | More engineering effort
Ease of use | Customization
Higher unit cost | Potential hidden costs
Third-party support | Independence from third-party vendors

Choosing hardware

Once you have an idea of your problem scope and design constraints, you can choose the appropriate hardware. Most edge AI is performed by one of the following hardware categories:

Low-end microcontroller - A microcontroller (also known as a microcontroller unit or MCU) is a self-contained central processing unit (CPU) and memory on a single chip, much like a tiny, low-power computer. Low-end microcontrollers are often optimized for a single or few tasks with a focus on collecting sensor data or communicating with other devices. Such MCUs usually have little or no user interface, as they are intended to be embedded in other equipment. Examples include controllers for microwave ovens, fitness trackers, TV remote controls, IoT sensors, modern thermostats, and smart lights.
High-end microcontroller - High-end MCUs offer more powerful CPUs, more memory, and more peripherals (built-in WiFi, sensors, etc.) than their low-end counterparts. You can find high-end microcontrollers in vehicle engine control units (ECUs), infotainment systems in cars, industrial robotics, smart watches, networking equipment (e.g. routers), and medical imaging systems (e.g. MRI, X-ray). You can read more about microcontrollers here.
Microprocessor unit (MPU) - An MPU is a CPU (often more than one CPU core) packaged on a single chip for general purpose computing. MPUs can be found in laptops, tablets, and smartphones. Unlike MCUs, they require external memory (e.g. RAM, hard drive) to function. They are almost always more powerful than MCUs and capable of crunching numbers at a faster rate. However, they also generally require more energy to function versus MCUs. You can read more about microprocessors here.
Graphics processing unit (GPU) - Graphics processing units were originally designed to render complex 2D and 3D graphics to a computer screen. They are sold either as coprocessors on the same motherboard as an MPU (known as integrated graphics) or as a separate graphics card that can be plugged into a motherboard. In both cases, they require another processor (usually an MPU) to handle the general computing needs. Because graphics are generally created using parallel matrix operations, GPUs have also seen success performing similar matrix operations for activities like cryptocurrency mining and machine learning. NVIDIA is the most popular GPU maker. You can read more about GPUs here.
Neural processing unit (NPU) - NPUs are special-purpose AI accelerator chips designed to perform neural network calculations quickly and efficiently. Like GPUs, they almost always require a coprocessor in the form of an MCU or MPU to handle the general purpose computing needs. NPUs range from tiny coprocessors in the same chip as an MCU to powerful, card-based options that can be plugged into a motherboard. The Google Tensor Processing Unit (TPU) is one example of an NPU. You can read more about AI accelerators and NPUs here.

The boundary between low- and high-end microcontrollers is not clearly defined. However, we try to differentiate them here to demonstrate that your choice of hardware can affect your ability to execute different edge AI tasks.

The above chart makes general suggestions for which class of hardware is best suited for each edge AI task. Not all AI tasks are included, as some are better suited for cloud AI, and AI is an evolving field where such needs are constantly changing.

Hardware combinations

As noted, many of the processor types are not intended to operate alone. For example, GPUs are optimized for a particular type of operation (e.g. matrix math) and need to be paired with another processor (e.g. MPU) for general purpose computing needs. In some cases, you can create processor-specific modules, such as GPUs and NPUs on cards that easily slot into many personal computer (PC) motherboards.

In some cases, you may come across single-chip solutions that contain multiple processors and various peripherals. For example, a chip might contain a high-end MCU for general processing, a specialized radio MCU for handling WiFi traffic, a low-end MCU for managing power systems, random-access memory (RAM), and a specialized NPU for tackling AI tasks. This type of chip is often marketed as a system on a chip (SoC).

Conclusion

Asking the right questions when creating the scope of your edge AI project is crucial for choosing the right hardware to meet your needs. In many cases, you can simply purchase an off-the-shelf solution, such as buying a doorbell camera, person counting security camera, smart speaker, etc. If your project requires customization, optimization, economies of scale, or a specific operating environment, you may need to develop your own edge AI solution.

Understanding the use case and computing needs for the model can help direct your purchasing or development decisions when it comes to choosing hardware.

Quiz

Test your knowledge on choosing edge AI hardware with the following quiz:

Edge AI lifecycle

The edge AI lifecycle includes the steps involved in planning, implementing, and maintaining an edge AI project. It follows the same general flow as most engineering and programming undertakings with the added complexity of managing data and models.

Previously, we examined techniques for choosing hardware for edge AI projects. In this lesson, we will look at the machine learning (ML) pipeline and how to approach an edge AI project.

Click here to watch the video

Identify need and scope

Before starting a machine learning project, it is imperative that you examine the actual need for such a project: what problem are you trying to solve? For example, you could improve user experience, such as creating a more accurate fall detection system or voice-activated smart speaker. You might want to monitor machinery to identify anomalies before problems become unmanageable, which could save you time and money in the long run. Alternatively, you could count people in a retail store to identify peak times and shopping trends.

Once you have identified your requirements, you can begin scoping your project:

Can the project be solved through traditional, rules-based methods, or is AI needed to solve the problem?
Is cloud AI or edge AI the better approach?
What kind of hardware is the best fit for the problem?

Note that the hardware selection might not be apparent until you have constructed a prototype ML model, as that will determine the amount of processing power required. As a result, it can be helpful to quickly build a proof-of-concept and iterate on the design, including hardware selection, to arrive at a complete solution.

Machine learning pipeline

Most ML projects follow a similar flow when it comes to collecting data, examining that data, training an ML model, and deploying that model.

This complete process is known as a machine learning pipeline.

Data collection

To start the process, you need to collect raw data. For most deep learning models, you need a lot of data (think thousands or tens of thousands of samples).

In many cases, data collection involves deploying sensors to the field or your target environment and letting them collect raw data. You might collect audio data with a smartphone or vibration data using an IoT sensor. You can create custom software that automatically transmits the data to a data lake or stores it directly in an Edge Impulse project. Alternatively, you can store data directly on the device, such as on an SD card, and later upload it to your data storage.

Examples of data can include raw time-series data in a CSV file, audio saved as a WAV file, or images in JPEG format.
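
As a minimal sketch of what storing raw time-series data might look like, the following writes accelerometer readings to CSV using Python's standard csv module. The column names, the 10 ms sample interval, and the readings themselves are assumptions for illustration. Check the format expected by whichever tool will ingest the file.

```python
import csv
import io


def write_samples_csv(stream, samples, interval_ms):
    """Write (x, y, z) readings as CSV rows with a millisecond timestamp."""
    writer = csv.writer(stream)
    writer.writerow(["timestamp_ms", "accX", "accY", "accZ"])
    for index, (x, y, z) in enumerate(samples):
        writer.writerow([index * interval_ms, x, y, z])


# Hypothetical readings in m/s^2, sampled every 10 ms (100 Hz).
readings = [(0.1, 0.0, 9.8), (0.2, -0.1, 9.7), (0.1, 0.1, 9.8)]

# An in-memory buffer stands in for a file on an SD card.
buffer = io.StringIO()
write_samples_csv(buffer, readings, interval_ms=10)
print(buffer.getvalue())
```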

Note that sensors can vary. As a result, it's usually a good idea to collect data using the same device and/or sensors that you plan to ultimately deploy to. For example, if you plan to deploy your ML model to a smartphone, you likely want to collect data using smartphones.

Data cleaning

Raw data often contains errors in the form of omissions (some fields missing), corrupted samples, or duplicate entries. If you do not fix these errors, the machine learning training process will either fail or produce a flawed model.

A common practice is to employ the medallion architecture for scrubbing data, which involves copying data, cleaning out any errors or filling missing fields, and storing the results into a different bucket. The buckets have different labels: bronze, silver, gold. As the data is successively cleaned and aggregated, it moves up from bronze to silver, then silver to gold. The gold bucket is ready for analysis or to be fed to a machine learning pipeline.
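
The bronze, silver, and gold progression can be sketched with plain Python lists standing in for storage buckets. The record fields (device, timestamp, temp_c) and the choice to drop incomplete records rather than fill them are assumptions made for this example. A real pipeline would read from and write to actual storage.

```python
def bronze_to_silver(bronze, required_fields):
    """Drop records with missing required fields and exact duplicates."""
    seen = set()
    silver = []
    for record in bronze:
        if any(record.get(field) is None for field in required_fields):
            continue  # omission: a required field is missing
        key = tuple(record[field] for field in required_fields)
        if key in seen:
            continue  # duplicate entry
        seen.add(key)
        silver.append(dict(record))  # copy so the bronze bucket stays untouched
    return silver


def silver_to_gold(silver):
    """Aggregate cleaned readings into a mean temperature per device."""
    totals = {}
    for record in silver:
        total, count = totals.get(record["device"], (0.0, 0))
        totals[record["device"]] = (total + record["temp_c"], count + 1)
    return {device: total / count for device, (total, count) in totals.items()}


# Hypothetical raw (bronze) records.
bronze = [
    {"device": "a", "timestamp": 1, "temp_c": 20.0},
    {"device": "a", "timestamp": 1, "temp_c": 20.0},  # duplicate
    {"device": "a", "timestamp": 2, "temp_c": None},  # missing value
    {"device": "a", "timestamp": 3, "temp_c": 22.0},
    {"device": "b", "timestamp": 1, "temp_c": 18.0},
]

silver = bronze_to_silver(bronze, ["device", "timestamp", "temp_c"])
gold = silver_to_gold(silver)
print(len(bronze), "bronze ->", len(silver), "silver ->", gold)
```

Keeping the bronze data unmodified means the cleaning rules can be changed and re-run later without recollecting anything.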

The process of downloading, manipulating, and re-uploading the data back into a separate storage is known as extract, transform, load (ETL). A number of tools, such as Edge Impulse transformation blocks and AWS Glue, can be used to build automated ETL pipelines once you have an understanding of how the data is structured and what cleaning processes are required.

Data analysis

Once the data is cleaned, it can be analyzed by domain experts and data scientists to identify patterns and extract meaning. This is often a manual process that utilizes various algorithms (e.g. unsupervised ML) and tools (e.g. Python, R). Such patterns can be used to construct ML models that automatically generalize meaning from the raw input data.

Additionally, data can contain any number of biases that can lead to a biased machine learning model. Analyzing your data for biases can create a much more robust and fair model down the road.

Feature extraction

Sometimes, the raw data is not sufficient or might cause the ML model to be overly complex. As a result, manual features can be extracted from the raw data to be fed into the ML model. While feature engineering is a manual step, it can potentially save time and inference compute resources by not having to train a larger model. In other words, feature extraction can simplify the data going to a model to help make the model smaller and faster.

For example, a time-series sample might have hundreds or thousands of data points. As the number of such points increases, the model complexity also often increases. To help keep the model small, we can extract some features from each sample. In this case, performing the Fast Fourier Transform (FFT) breaks the signal apart into its frequency components, which helps the model identify repeating patterns. Now, we have a few dozen data points going into a model rather than a few hundred.

In general, smaller models and fewer inputs mean faster execution times.
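
The FFT example above can be sketched as follows: a 256-point time-series sample is reduced to 32 spectral features by averaging neighboring frequency bins. The recursive radix-2 FFT here is a teaching implementation that assumes the input length is a power of two. The window length, band count, and test signal are assumptions, and a production system would normally use an optimized FFT library.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT (length must be a power of two)."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def spectral_features(samples, num_bands):
    """Average FFT magnitudes into `num_bands` equal-width frequency bands."""
    n = len(samples)
    # Keep only the first half of the spectrum (the rest mirrors it for real input).
    magnitudes = [abs(c) / n for c in fft(samples)[: n // 2]]
    band_size = len(magnitudes) // num_bands
    return [
        sum(magnitudes[i * band_size : (i + 1) * band_size]) / band_size
        for i in range(num_bands)
    ]


# Hypothetical sample: 256 points at 256 Hz containing a 20 Hz vibration.
samples = [math.sin(2 * math.pi * 20 * i / 256) for i in range(256)]
features = spectral_features(samples, num_bands=32)

print(len(samples), "raw points ->", len(features), "features")
print("strongest band:", features.index(max(features)))
```

With 128 usable bins grouped four at a time, the 20 Hz component lands in band 5, so the model sees one clearly elevated input instead of having to learn the pattern from 256 raw values.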

Train machine learning model

With the data cleaned and features extracted, you can select or construct an ML model architecture and train that model. In the training process, you attempt to generalize meaning in the input data such that the model's output matches expected values (even when presented with new data).

Deep neural networks are the current popular approach to solving a variety of supervised and unsupervised ML tasks. ML scientists and engineers use a variety of tools, such as TensorFlow and PyTorch, to build, train, and test deep neural networks.

In addition to using these lower-level tools to design your own model architecture, you can also rely on pre-built models or tools, like Edge Impulse, that contain the building blocks needed to tackle a wide variety of edge AI tasks.

Pretrained models, such as those available from NVIDIA TAO, can be retrained using custom data in a process known as transfer learning. Transfer learning is often faster and requires less data than training from scratch.

The combination of automated feature extraction and ML model is known as an impulse. This combination of steps can be deployed to cloud servers and edge devices. The impulse takes in raw data, performs any necessary feature extraction, and runs inference during prediction serving.

Model testing

In almost all cases, you want to test your model's performance. Good ML practices dictate keeping a part of your data separate from the training data (known as a test set, or holdout set). Once you have trained the model, you will use this test set to verify the model's functionality. If your model performs well on the training set but poorly on the test set, it might be overfit, which often requires you to rethink your dataset, feature extraction, and model architecture.
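
A holdout split and a simple overfitting check can be sketched as below. The 20% test fraction, the fixed seed, and the 0.1 accuracy-gap threshold are arbitrary assumptions. What counts as an acceptable gap depends on the task.

```python
import random


def split_holdout(samples, test_fraction, seed):
    """Shuffle the samples and hold back a fraction as the test set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def accuracy(predict, labeled_samples):
    """Fraction of (input, label) pairs that `predict` gets right."""
    correct = sum(1 for x, label in labeled_samples if predict(x) == label)
    return correct / len(labeled_samples)


def looks_overfit(train_accuracy, test_accuracy, max_gap=0.1):
    """Flag a model that does much better on training data than on held-out data."""
    return (train_accuracy - test_accuracy) > max_gap


train_set, test_set = split_holdout(list(range(100)), test_fraction=0.2, seed=42)
print(len(train_set), "training samples,", len(test_set), "test samples")

# Hypothetical accuracies measured after training.
print("overfit?", looks_overfit(train_accuracy=0.99, test_accuracy=0.70))
```

The important property is that the test samples never influence training, so the test accuracy approximates how the model behaves on new data.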

The process of data cleaning, feature extraction, model training, and model testing is almost always iterative. You will often find yourself revisiting each stage in the pipeline to create an impulse that performs well for your particular task and within your hardware constraints.

Additionally, you might need to collect new data if your current dataset does not produce an acceptable model. For example, vibration data from an accelerometer alone might prove insufficient for creating a robust model, so you have to collect supplemental data, such as audio data from a microphone. The combination of vibration and audio data is usually better at identifying mechanical anomalies than one sensor type alone.

Model deployment

For cloud-based AI, you can use tools like SageMaker to deploy your model to a server as part of a prediction serving application. Edge AI can be somewhat trickier, as you often need to optimize your model for particular hardware and develop an application around that model.

Optimization can involve a number of processes that reduce the size and complexity of the ML model, such as pruning unimportant nodes from the neural network, quantizing operations to run more efficiently on low-end hardware, and compiling models to run on specialized hardware (e.g. GPUs and NPUs).
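
To make two of these optimizations concrete, here is a simplified sketch of symmetric 8-bit quantization and of magnitude-based weight pruning. Note that the pruning shown zeroes individual small weights, which is a simpler stand-in for removing whole nodes. The weights and threshold are made-up values, and real toolchains apply these steps per layer with additional calibration.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the quantized integers."""
    return [q * scale for q in quantized]


def prune_small(weights, threshold):
    """Zero out weights whose magnitude falls below the threshold."""
    return [0.0 if abs(w) < threshold else w for w in weights]


# Hypothetical layer weights.
weights = [0.5, -1.27, 0.01, 0.0]

quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
print("quantized:", quantized)
print("largest rounding error:", max(abs(a - b) for a, b in zip(weights, restored)))
print("pruned:", prune_small(weights, threshold=0.05))
```

Each quantized weight fits in one byte instead of four, at the cost of a rounding error no larger than half the scale.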

The ML model is simply a collection of mathematical operations. On its own, it cannot do much. Due to this limitation, an application needs to be built around the model to collect data, feed data to the impulse for feature extraction and inference, and take some action based on the inference results.

In cloud-based AI, this application is often a prediction serving program that waits for web requests containing raw data. The application can then respond with inference results. On the other hand, edge AI usually requires a tighter integration between performing inference and doing something with the results, such as notifying a user, stopping a machine, or making a decision on how to steer a car.
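
The shape of such an edge application can be sketched as a loop that collects a window of data, extracts features, runs inference, and acts on the result. Everything here is a placeholder: the RMS feature, the threshold-based classify function standing in for a trained model, and the stop_machine callback are hypothetical names invented for this example.

```python
import math


def rms(window):
    """Feature extraction: root mean square of one window of samples."""
    return math.sqrt(sum(v * v for v in window) / len(window))


def classify(features):
    """Stand-in for a trained model: returns (label, score)."""
    level = features[0]
    return ("anomaly", level) if level > 1.0 else ("normal", level)


def run_edge_app(windows, stop_machine):
    """Collect data, run the impulse, and act on each inference result."""
    labels = []
    for raw in windows:               # 1. collect a window of sensor data
        features = [rms(raw)]         # 2. feature extraction
        label, _score = classify(features)  # 3. inference
        if label == "anomaly":        # 4. take action locally
            stop_machine()
        labels.append(label)
    return labels


# Hypothetical sensor windows: one quiet, one with strong vibration.
windows = [[0.1, -0.1, 0.1, -0.1], [2.0, -2.0, 2.0, -2.0]]
stops = []
print(run_edge_app(windows, stop_machine=lambda: stops.append("stopped")))
print("machine stopped", len(stops), "time(s)")
```

Because the decision and the action happen in the same loop on the device, no network round trip sits between detecting the anomaly and stopping the machine.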

Programmers and software engineers are often needed to build the application. In many cases, these developers are experts with the target deployment hardware, such as a particular microcontroller, embedded Linux, or smartphone app creation. They work with the ML engineering team to ensure that the model can run on the target hardware.

Operations and maintenance (O&M)

As with any software deployment, operations and maintenance is important to provide continuing support to the edge AI solution. As the data or operating environment changes over time, model performance can begin to degrade. As a result, such deployments often require monitoring model performance, collecting new data, and updating the model.

In the next section on edge MLOps, we will examine the different types of model drift and how parts of the ML pipeline can be automated to create a repeatable system for O&M.

Quiz

Test your knowledge on the edge AI lifecycle with the following quiz:

What is edge MLOps?

Edge machine learning operations (MLOps) is the set of practices and techniques used to automate and unify the various parts of machine learning (ML), system development (dev), and system operation (ops) for edge deployments. Such activities include data collection, processing, model training, deployment, application development, application/model monitoring, and maintenance. Edge MLOps follows many of the same principles of MLOps but with a focus on edge computing.

In the previous section, we discussed the edge AI lifecycle. We will build on that knowledge by examining how to monitor model performance in the field and how to automate various parts of the lifecycle.

Click here to watch the video

DevOps

DevOps is the collaboration between software development teams and IT operations to formalize and automate various parts of both cycles in order to deliver and maintain software.

In this cycle, the software development team works with management and business teams to identify requirements, plan the project, create the required software, verify the code, and package the application for consumption. In many instances, this packaged software is simply "thrown over the fence" to the operations team to manage the release, which consists of pushing the software to users, configuring and installing the software for users, and monitoring the deployment for any issues.

The concept of DevOps comes into play when these two teams work together to ensure smooth delivery and operation of the software. Many aspects of the packaging and delivery can be automated in a process known as continuous integration and continuous delivery (CI/CD). Any problems or maintenance needs can be identified by the operations team and fed back to the development team for fixes and improvements in future releases.

MLOps

Machine learning operations extends the DevOps cycle by adding the design and development of ML models into the mix.

Data collection, model creation, training, and testing are added to the flow. The machine learning team must work closely with the software development and operations teams to ensure that the model meets the needs of the customer and can operate within the parameters of the application, hardware, and environment.

For cloud-based deployments, the application may be a simple prediction serving web interface, or the model may be fully integrated into the application. In most edge AI deployments, an application is built around the model, as inference is often performed locally on the edge device.

Building frameworks for inter-team operation and lifecycle automation offers a number of benefits:

Shorter development cycles and time to market
Increased reliability, performance, scalability, and security
Standardized and automated model development/deployment frees up time for developers to tackle new problems
Streamlined operations and maintenance (O&M) for efficient model deployment

Team effort

In most cases, implementing an edge MLOps framework is not the work of a single person. It involves the cooperation of several teams. These teams can include some of the following experts:

Data scientists - analyze raw data to find patterns and trends, create algorithms and data models to predict outcomes (which can include machine learning)
Data engineers - build systems to collect, manage, and transform raw data into useful information for data scientists, ML researchers/engineers, and business analysts
ML researchers - similar to data scientists, they work with data and build mathematical models to meet various business or academic needs
ML engineers - build systems to train, test, and deploy ML models in a repeatable and robust manner
Software developers - create computer applications and underlying systems to perform specific tasks for users
Operations specialists - oversee the daily operation of network equipment and software maintenance
Business analysts - form business insights and market opportunities by analyzing data

Edge AI lifecycle

The edge AI lifecycle consists of the steps required to collect data, clean that data, extract required features, train one or more ML models, test the model, deploy the model, and perform necessary maintenance. Note that these steps do not include some of the larger project processes of identifying business needs and creating the application around the model.

In edge MLOps, we can automate many of these steps so that work flows through the process with little or no human intervention.

Principles

Edge MLOps is built on three main principles: version control, automation, and governance.

Version control

In software development, the ability to track code versions and roll back versions is incredibly important. It goes beyond simply "saving a copy," as it allows you to create branches to try new features and merge code from other developers. Tools like git and GitHub offer fantastic version control capabilities.

While these tools can be used for files and data beyond just code, they are mostly focused on text-based code. Versioning data can be tricky, as the storage requirements increase with the amount of data. You likely also want to version various ML pipelines in addition to the training/testing code and model itself.

Edge Impulse offers the ability to version control individual blocks as well as your entire project and pipeline.

Automation

Automating anything requires an initial, up-front investment to build the required processes and software. In cases where you need to use that process multiple times, such automation can pay off in the long run. Setting up automated tasks is a crucial step in edge MLOps, as it allows your teams to work on other tasks once the automation is built.

Almost anything in the edge AI lifecycle can be automated, including data collection, data cleaning, model training, and deployment. These often fall into one of the following categories:

Continuous collection - Data collection happens continuously or is triggered by some event.
Continuous training - Feature extraction and model training/testing can occur autonomously.
Continuous integration - Any code changes checked into a repository can trigger a series of unit and system tests to ensure correct operation before the code is merged into the main application.
Continuous delivery - Software is created in short cycles and can be reliably released to users on a continuous basis as needed. Some deployment steps in this stage can be automated.
Continuous monitoring - Automated tools are used to monitor the performance and security of an application or system to detect problems early to mitigate risks.

The development teams can decide how such automated processes are triggered. Examples of triggers include:

User-requested - the user makes a request to update or rebaseline the model
Time - one or more steps in the lifecycle can be executed on a set schedule, such as once per day or once per month
Data changes - the presence of newly collected data can trigger a new lifecycle execution to clean the data, train a model, and deploy the model
Code change - a new version of the application might necessitate a new model and thus trigger any of the collection, cleaning, training, testing, or deployment processes
Model monitoring - issues with deployed models (such as model drift) might require any or all of the lifecycle to execute in order to update the model
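
The triggers listed above could be combined into a single check that decides whether to start a new lifecycle run. This is only a sketch: the function name, its parameters, and the default limits (30 days, 500 new samples, 85% accuracy) are hypothetical and would be set by the development teams.

```python
def retraining_triggers(
    user_requested,
    days_since_training,
    new_samples,
    app_version_changed,
    field_accuracy,
    max_days=30,
    min_new_samples=500,
    min_accuracy=0.85,
):
    """Return the list of reasons (if any) to start a new lifecycle run."""
    reasons = []
    if user_requested:
        reasons.append("user-requested")
    if days_since_training >= max_days:
        reasons.append("time")
    if new_samples >= min_new_samples:
        reasons.append("data changes")
    if app_version_changed:
        reasons.append("code change")
    # field_accuracy may be None when no monitoring data is available yet.
    if field_accuracy is not None and field_accuracy < min_accuracy:
        reasons.append("model monitoring")
    return reasons


# Hypothetical status reports for two deployments.
print(retraining_triggers(False, 3, 40, False, 0.93))
print(retraining_triggers(False, 45, 800, False, 0.80))
```

Returning the reasons rather than a plain yes or no makes it easier to log why a run started and to run only the steps that the trigger calls for.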

Governance

Part of edge MLOps includes ensuring that your data and processes adhere to best practices and comply with any necessary regulations. Such regulations might include data privacy laws, such as HIPAA and GDPR. Similar rules are currently being enacted around AI, such as the EU AI Act. Be sure to become familiar with any potential governing regulations around data, privacy, and AI! The rules can vary by country and specific technology usage (e.g. medical vs. consumer electronics).

In addition to adhering to laws, you should check for fairness and bias in your data and model. Bias can come in many different forms and greatly impact your resulting model. The popular computer science phrase garbage in, garbage out applies here: if you train a model on biased data, the model will reflect that bias.

Finally, like with any computer system, you should design and implement best security practices to ensure:

Confidentiality to protect sensitive data from unauthorized access
Integrity to guarantee that data has not been altered
Availability of data to authorized users when needed

Machine learning can involve lots of (potentially personal) data that you must use and control carefully. Edge computing devices should also be secured to limit potential intrusion risks. For digging deeper into security, we recommend checking out CISA's guides on best practices and Amazon's ultimate IoT security best practices guide. Hiring or consulting with a cybersecurity expert is also highly advised.

As a good steward of AI, it is your responsibility to ensure that your systems comply with laws and regulations, data and models are free from bias, and devices are secured from unauthorized access.

Model drift

Model drift occurs when an ML model loses accuracy over time. This can happen over the course of days or years.

In reality, the model does not lose accuracy. Instead, the data being fed to the model or the relationships that data represents in the physical world change over time. Such drift can be placed into two categories:

Data drift occurs when the incoming data skews from the original training/test datasets. For example, the operating environment may change (e.g. collecting data on a machine in winter and expecting inference to work the same during the summer).
Concept drift happens when the relationship between the input data and target changes. For example, spammers discover a new tactic to outwit spam filters. The spam filters are still accurate, but only on older methods.
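
Data drift can sometimes be spotted by comparing simple statistics of incoming data against the training set. The sketch below measures how far the mean of live readings has moved, in units of the training set's standard deviation. The temperature values and the cutoff of 3.0 are assumptions for illustration. This approach does not detect concept drift, which requires knowing the true targets.

```python
from statistics import mean, stdev


def drift_score(train_values, live_values):
    """How many training standard deviations the live mean has moved."""
    return abs(mean(live_values) - mean(train_values)) / stdev(train_values)


# Hypothetical ambient temperatures (degrees C) around a machine
winter_training = [4.0, 5.0, 6.0, 5.0, 4.0, 6.0]
summer_live = [24.0, 25.0, 26.0, 25.0]
winter_live = [5.0, 4.5, 5.5, 5.0]

DRIFT_LIMIT = 3.0  # assumed cutoff; tune for your own data

print(drift_score(winter_training, summer_live) > DRIFT_LIMIT)
print(drift_score(winter_training, winter_live) > DRIFT_LIMIT)
```

A single-feature mean shift is a crude signal. Real inputs usually have many features, so a per-feature check or a statistical distance over the full distribution may be more appropriate.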

One way to combat model drift is to consistently monitor the model's performance over a period of time. If the accuracy dips below a threshold or users notice a decline in performance, then you may be experiencing such drift. At this point, you would need to collect new data (either from scratch or to supplement your existing dataset), retrain the model, and redeploy.
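
The monitoring idea described above can be sketched as a rolling accuracy check. This example assumes that ground-truth labels eventually become available (for example, from user feedback), which is not always the case on edge devices. The window size and threshold are illustrative values.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the most recent predictions."""

    def __init__(self, window=100, threshold=0.9):
        self.results = deque(maxlen=window)  # oldest results fall off
        self.threshold = threshold

    def record(self, predicted, actual):
        """Store whether a single prediction was correct."""
        self.results.append(predicted == actual)

    def accuracy(self):
        """Return rolling accuracy, or None if nothing is recorded yet."""
        if not self.results:
            return None
        return sum(self.results) / len(self.results)

    def needs_retraining(self):
        """Flag drift only once the window is full, to avoid noisy alarms."""
        window_full = len(self.results) == self.results.maxlen
        return window_full and self.accuracy() < self.threshold


monitor = AccuracyMonitor(window=4, threshold=0.75)
for predicted, actual in [("on", "on"), ("off", "off"), ("on", "off"), ("on", "off")]:
    monitor.record(predicted, actual)

print(monitor.accuracy(), monitor.needs_retraining())
```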

You can set up automatic processes to handle this. For example, perhaps an on-device process notices too many false positives, which triggers another process to collect data and send it to your data lake. The presence of new data in that store then triggers a retraining of the model, which can then be deployed back to the edge device.
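
The chain of triggers in this example can be simulated in a few lines. The sketch below is an in-memory stand-in: the class name, thresholds, and the "data lake" list are all hypothetical, and retraining is reduced to bumping a version number.

```python
class RetrainingLoop:
    """Simulate: false positives -> upload -> retrain -> redeploy."""

    def __init__(self, fp_limit=3, min_new_samples=6):
        self.fp_limit = fp_limit                # false positives before upload
        self.min_new_samples = min_new_samples  # new samples before retraining
        self.device_buffer = []                 # samples held on the device
        self.data_lake = []                     # stand-in for remote storage
        self.samples_trained_on = 0
        self.deployed_version = 1

    def report_false_positive(self, sample):
        """On-device process: buffer the sample, upload when the limit is hit."""
        self.device_buffer.append(sample)
        if len(self.device_buffer) >= self.fp_limit:
            self._upload_buffer()

    def _upload_buffer(self):
        self.data_lake.extend(self.device_buffer)
        self.device_buffer.clear()
        self._on_new_data()

    def _on_new_data(self):
        """Storage-side trigger: retrain once enough new data has arrived."""
        new_samples = len(self.data_lake) - self.samples_trained_on
        if new_samples >= self.min_new_samples:
            self._retrain_and_deploy()

    def _retrain_and_deploy(self):
        # Placeholder for real training and deployment steps
        self.samples_trained_on = len(self.data_lake)
        self.deployed_version += 1


loop = RetrainingLoop(fp_limit=2, min_new_samples=4)
for sample in ["s1", "s2", "s3"]:
    loop.report_false_positive(sample)
version_after_three = loop.deployed_version
loop.report_false_positive("s4")
version_after_four = loop.deployed_version
print(version_after_three, version_after_four)
```

In a real pipeline, each of these methods would typically be a separate service or job connected by events rather than direct function calls.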

Edge device updates

For cloud-based AI, updating the model involves relatively little effort. The new model is either requested by or pushed to the prediction server. Because most of these servers run operating systems (e.g. Linux), stopping a process or program and restarting it is often trivial. The same holds true for edge devices like laptops and smartphones.

On the other hand, updating models on microcontroller-based IoT devices is more involved. The model is usually baked into the firmware compiled for the device. As such, the firmware must be completely reloaded (flashed) onto the device. In general, these devices are created with the intention of requiring little or no interaction from the user to update their application.

If a model or application update is required, you could notify your users to manually update the firmware (e.g. by plugging the device into a computer). Alternatively, you could create an over-the-air (OTA) solution to push and update the firmware automatically.
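
Before installing an OTA update, a device typically decides whether the offered firmware is newer than what it is running and whether the download arrived intact. The sketch below shows that decision under stated assumptions: the manifest fields ("version" and "sha256") and the dotted version format are hypothetical, not taken from any particular OTA service.

```python
import hashlib


def parse_version(text):
    """Turn a dotted version such as "1.10.0" into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def should_install(current_version, manifest, firmware):
    """Install only if the firmware is newer and matches its checksum."""
    newer = parse_version(manifest["version"]) > parse_version(current_version)
    intact = hashlib.sha256(firmware).hexdigest() == manifest["sha256"]
    return newer and intact


# Hypothetical firmware image and its manifest
firmware = b"\x00\x01firmware-with-new-model\x02\x03"
manifest = {
    "version": "1.10.0",
    "sha256": hashlib.sha256(firmware).hexdigest(),
}

print(should_install("1.9.3", manifest, firmware))           # newer and intact
print(should_install("1.9.3", manifest, firmware + b"\xff"))  # corrupted download
print(should_install("1.10.0", manifest, firmware))          # already up to date
```

A checksum only detects corruption. To confirm that the firmware comes from a trusted source, production systems generally verify a cryptographic signature as well.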

You can see how Edge Impulse helps support OTA updates to create automated updates to IoT devices here.

Examples of MLOps tools

A number of MLOps tools exist to help data scientists, ML experts, and developers create fully automated ML pipelines. Here are a few examples:

Edge Impulse is unique in offering the tools necessary to build full MLOps pipelines optimized for the edge.

Quiz

Test your knowledge on edge MLOps with the following quiz:

What is Edge Impulse?

Edge Impulse is the leading edge AI platform for collecting data, training models, and deploying them to your edge computing devices. It provides an end-to-end framework that easily plugs into your edge MLOps workflow.

Previously, we looked at edge MLOps and how it can be used to standardize your edge AI lifecycle. This time, we introduce Edge Impulse as a platform for building edge AI solutions and edge MLOps pipelines.

Click here to watch the video

Edge AI lifecycle

Edge Impulse helps with every step along the edge AI lifecycle, from collecting data and extracting features to designing machine learning (ML) models, training and testing those models, and deploying them to end devices.

Edge Impulse easily plugs into other machine learning frameworks so that you can scale and customize your model or pipeline as needed.

Note that while we have some pre-compiled software for supported boards to help you get started, we offer a variety of ways to collect data. In many cases, data collection requires customized software (and sometimes custom hardware). This data can easily be stored in a third-party location, such as an AWS S3 bucket. From there, data can be fetched and transformed using custom blocks.

Deployment can also be tricky, as edge devices can vary in their processing power, operating system (or lack thereof), and supported languages. As a result, Edge Impulse offers a number of deployment options that you can build your application around. In most cases, these deployed options come as open-source libraries that make interacting with the models easy.

Finally, all aspects of Edge Impulse can be scripted using a web API. This allows you to complete the MLOps loop by monitoring models and triggering new data collection, model training, and redeployment as needed.

Edge Impulse Studio

Edge Impulse Studio is a web-based tool with a graphical interface to help you collect data, build an impulse, and deploy it to an end device.

Data can be stored, sorted, and labeled using the data acquisition tool.

From there, an impulse can be created that includes one or more feature extraction methods along with a machine learning model.

A number of off-the-shelf feature extraction methods can be used and modified to suit the needs of your particular project. You can also design your own feature extraction method using a custom processing block.

Next, you can train a machine learning model (including classification, regression, or anomaly detection) using a learning block. A number of pre-made learning blocks can be used, but you can also create your own custom learning block or use the expert mode to modify the ML training code.

Once trained, the models can be tested using a holdout set or by connecting your device to ingest live data.

Finally, your full impulse can be deployed in a variety of formats, including a C++ library, Linux process (controlled via Python, Node.js, Go, C++, and others), Docker container, WebAssembly executable, or a pre-built firmware for supported hardware.

Edge Impulse also includes advanced features, such as the EON Tuner, an AutoML tool that tries various impulse configurations to determine the best combination of blocks.

As mentioned previously, you can script all aspects of Studio using the web API, which allows you to construct full MLOps pipelines.

Enterprise features

Edge Impulse has a number of enterprise features to help you build full edge ML pipelines and scale your deployments. First, you have access to faster performance and more training time to create larger and more complex models.

You also gain access to an organization hub to easily monitor and maintain projects along with automated data pipelines, which allow you to configure and run transformation blocks in sequence to extract, transform, and load (ETL) data from a variety of sources.

You can look through this health machine learning example design to see how data is captured, stored, loaded, and transformed from production servers using Edge Impulse tools.

Try our Professional Plan or FREE Enterprise Trial today.

Getting started

One of the fastest ways to try Edge Impulse is to follow this guided tour of creating your own keyword spotting model in 5 minutes. No programming experience is required!

Even though Edge Impulse works well for beginners and students, it is highly extensible for experts and engineers alike. The following guides can help you get started depending on your background:

Quiz

Test your knowledge on Edge Impulse with the following quiz:

Case study: Izoelektro smart grid monitoring

The push towards more efficient and reliable energy distribution has highlighted the importance of addressing power grid vulnerabilities to preemptively prevent outages and infrastructure failures. In response to these challenges, Izoelektro, in collaboration with IRNAS, Arm, and Edge Impulse, developed the RAM-1, an innovative power grid monitoring device equipped with edge AI.

Click here to watch the video

Izoelektro RAM-1

The RAM-1 is an Internet of Things (IoT) device that monitors power grids for a variety of faults, including outage localization and load fluctuations. Because these devices are installed in remote locations, the only connections available are long-distance, low data rate wireless channels, such as NB-IoT and LoRaWAN. Raw sensor data cannot be transmitted over these connections. As a result, edge AI is a natural fit.

The RAM-1 performs anomaly detection locally on a low-power microcontroller and only uses the wireless connections to transmit infrequent updates and important notifications.
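
To illustrate the general pattern of detecting anomalies locally and transmitting only when something unusual happens, here is a minimal sketch. It is not the RAM-1's actual algorithm, which is not described here. It uses a running mean and standard deviation (Welford's method) with an assumed z-score limit, and the voltage readings are invented.

```python
import math


class RunningAnomalyDetector:
    """Flag readings far from the running mean of normal readings."""

    def __init__(self, z_limit=4.0, warmup=10):
        self.z_limit = z_limit  # assumed cutoff in standard deviations
        self.warmup = warmup    # readings to observe before flagging
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0           # running sum of squared deviations

    def update(self, value):
        """Return True if value is anomalous relative to readings so far."""
        anomalous = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(value - self.mean) / std > self.z_limit:
                anomalous = True
        if not anomalous:
            # Welford update, using normal readings only
            self.count += 1
            delta = value - self.mean
            self.mean += delta / self.count
            self.m2 += delta * (value - self.mean)
        return anomalous


def messages_to_send(readings):
    """Return (index, value) pairs that would be transmitted."""
    detector = RunningAnomalyDetector()
    return [(i, r) for i, r in enumerate(readings) if detector.update(r)]


# Hypothetical line voltage readings with one sudden sag
readings = [230.0 + 0.5 * (i % 3) for i in range(30)] + [180.0] + [230.0] * 5
sent = messages_to_send(readings)
print(len(readings), sent)
```

Out of 36 readings, only the sag is transmitted, which shows why local inference suits low data rate links.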

You can read the full Izoelektro RAM-1 case study here.

The advent of smart grid technologies offers a promising pathway to enhance grid reliability and helps prevent the rapid escalation of simple failures into widespread crises.

Quiz

Test your knowledge on the Izoelektro case study with the following quiz:

Test and certification

You have made it to the end of the edge AI course! In the previous section, we looked at Izoelektro's RAM-1 device for monitoring power grid anomalies. The following video and written sections provide guidance on how to continue your journey in edge and embedded machine learning (ML). Scroll to the bottom of this page to take the comprehensive test and earn your digital certificate.

Click here to watch the video

Going further

The following sections offer opportunities to continue your learning journey.

AI at the Edge book

If you would like to dive deeper into many of the topics presented in this course, we highly recommend checking out the AI at the Edge book by Daniel Situnayake and Jenny Plunkett.

You can download a free digital copy of the ebook here.

Hands-on experience with Edge Impulse

If you would like to try Edge Impulse, this 5-minute tutorial will walk you through the process of creating your own keyword spotting system. When you finish, you will be able to load the program onto your phone to watch the ML model identify your keyword in real time.

Case studies

While we just looked at case studies from Izoelektro and Tunstall in this course, Edge Impulse has worked with companies all over the world to solve complex problems in healthcare, agriculture, manufacturing, conservation, and more. You can read more of these case studies here.

Embedded ML course

Edge Impulse created a full technical course on Coursera. If you would like to learn the details behind neural networks, how to collect data, train models, and deploy them to embedded systems, we recommend taking this course. Accessing the materials on Coursera is free, and you can choose to pay for an official certificate.

University program

If you are looking to teach edge AI in your school, we recommend taking a look at the Edge Impulse university program. We offer a variety of free and open source content and example projects for you to use in your classroom.

Contact

If you have questions about Edge Impulse (or edge AI in general), you can reach out to us using one of the links here.

Test

The following test covers material from all sections in the edge AI course. When you submit your answers, you will receive an email in a few minutes with your score. To pass, you must score 80% or higher. You can take the test as many times as you would like. If you pass, you will receive a digital certificate via email.

What is edge MLOps?

Click here to watch the video

DevOps

DevOps is the collaboration between software development teams and IT operations to formalize and automate various parts of both cycles in order to deliver and maintain software.

MLOps

Machine learning operations extends the DevOps cycle by adding the design and development of ML models into the mix.

Building frameworks for inter-team operation and lifecycle automation offers a number of benefits:

Shorter development cycles and time to market
Increased reliability, performance, scalability, and security
Standardized and automated model development/deployment frees up time for developers to tackle new problems
Streamlined operations and maintenance (O&M) for efficient model deployment

Team effort

In most cases, implementing an edge MLOps framework is not the work of a single person. It involves the cooperation of several teams. These teams can include some of the following experts:

Data scientists - analyze raw data to find patterns and trends, create algorithms and data models to predict outcomes (which can include machine learning)
Data engineers - build systems to collect, manage, and transform raw data into useful information for data scientists, ML researchers/engineers, and business analysts
ML researchers - similar to data scientists, they work with data and build mathematical models to meet various business or academic needs
ML engineers - build systems to train, test, and deploy ML models in a repeatable and robust manner
Software developers - create computer applications and underlying systems to perform specific tasks for users
Operations specialists - oversee the daily operation of network equipment and software maintenance
Business analysts - form business insights and market opportunities by analyzing data

Edge AI lifecycle

In edge MLOps, we can automate many of the steps in the edge AI lifecycle so that work flows through the process more easily and without human intervention.

Principles

Edge MLOps is built on three main principles: version control, automation, and governance.

Version control

Edge Impulse offers the ability to version control individual blocks as well as your entire project and pipeline.

Automation

Almost anything in the edge AI lifecycle can be automated, including data collection, data cleaning, model training, and deployment. These often fall into one of the following categories:

Continuous collection - Data collection happens continuously or is triggered by some event.
Continuous training - Feature extraction and model training/testing can occur autonomously.
Continuous integration - Any code changes checked into a repository can trigger a series of unit and system tests to ensure correct operation before the code is merged into the main application.
Continuous delivery - Software is created in short cycles and can be reliably released to users on a continuous basis as needed. Some deployment steps in this stage can be automated.
Continuous monitoring - Automated tools are used to monitor the performance and security of an application or system to detect problems early to mitigate risks.

The development teams can decide how such automated processes are triggered. Examples of triggers include:

User-requested - the user makes a request to update or rebaseline the model
Time - one or more steps in the lifecycle can be executed on a set schedule, such as once per day or once per month
Data changes - the presence of newly collected data can trigger a new lifecycle execution to clean the data, train a model, and deploy the model
Code change - a new version of the application might necessitate a new model and thus trigger any of the collection, cleaning, training, testing, or deployment processes
Model monitoring - issues with deployed models (such as model drift) might require any or all of the lifecycle to execute in order to update the model
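
One way to organize these triggers in code is to map each one to the first lifecycle stage it should rerun, then execute every stage from that point on. The sketch below is only one possible arrangement. The stage names and the trigger-to-stage mapping are assumptions, and your team may decide on a different mapping.

```python
from enum import Enum

# Assumed, simplified lifecycle stages in execution order
STAGES = ["collect", "clean", "train", "test", "deploy"]


class Trigger(Enum):
    USER_REQUEST = "user_request"
    SCHEDULE = "schedule"
    DATA_CHANGE = "data_change"
    CODE_CHANGE = "code_change"
    MODEL_MONITORING = "model_monitoring"


# Assumed mapping from each trigger to the first stage it reruns
FIRST_STAGE = {
    Trigger.USER_REQUEST: "train",
    Trigger.SCHEDULE: "collect",
    Trigger.DATA_CHANGE: "clean",
    Trigger.CODE_CHANGE: "test",
    Trigger.MODEL_MONITORING: "collect",
}


def stages_to_run(trigger):
    """Return the lifecycle stages to execute for a given trigger."""
    start = STAGES.index(FIRST_STAGE[trigger])
    return STAGES[start:]


print(stages_to_run(Trigger.DATA_CHANGE))
print(stages_to_run(Trigger.MODEL_MONITORING))
```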

Governance

As with any computer system, you should design and implement best security practices to ensure:

Confidentiality to protect sensitive data from unauthorized access
Integrity to guarantee that data has not been altered
Availability of data to authorized users when needed

As a good steward of AI, it is your responsibility to ensure that your systems comply with laws and regulations, data and models are free from bias, and devices are secured from unauthorized access.

Model drift

Model drift occurs when an ML model loses accuracy over time. This can happen over the course of days or years.

Data drift occurs when the incoming data skews from the original training/test datasets. For example, the operating environment may change (e.g. collecting data on a machine in winter and expecting inference to work the same during the summer).
Concept drift happens when the relationship between the input data and target changes. For example, spammers discover a new tactic to outwit spam filters. The spam filters are still accurate, but only on older methods.