Edge AI Architecture: The Edge is the Place to Be for AI

By Michael Kleiner · Categories: Tech Explained · Published On: March 8th, 2024 · 11.7 min read

AI seems to be part of every technology-related conversation lately. While for some time AI appeared to be more hype than reality, we now seem to be at an important inflection point. As one example of AI’s ubiquity, I had the pleasure of attending the “AI Hardware and Edge AI” summit in September 2023, where Andrew Ng, one of the thought leaders in the AI space, opened his talk by describing AI as the “New Electricity”.

According to the MIT Sloan School of Management, AI can boost the productivity of highly skilled workers by 40% compared to workers who do not use it. Now, that is a promising proposition. Historically, we would consider productivity gains of a few percent a huge improvement; a leap of this size is truly transformative. But why? Why, all of a sudden, is there so much momentum and enthusiasm around AI? AI is not new; the core concepts have been around since the 1950s. Let’s have a look at the factors fueling its growth.

What’s fueling the growth of AI?

Firstly, significant progress has been made in the design of deep neural networks. The introduction of the transformer model by Ashish Vaswani and his colleagues in 2017 is considered a watershed moment in the field. Transformer-based models are more accurate than their predecessors because they can understand the relationships among sequential elements that are far apart from each other. They are also fast because they pay more attention to a sequence’s most important parts. Both of these challenges, long-range relationships and processing speed, had plagued previous models. The transformer model has been key in enabling the powerful large language models we have today. In addition, over the last few years, a tremendous number of pre-trained models have been developed that are free to use, drastically shrinking the work required to create a customized model for a specific use case.

Next, there is the abundance of data. Tremendous amounts of data are being produced at the edge, in modern factories for example. In his keynote at CES 2024, Dr. Roland Busch, CEO of Siemens, estimated that a highly automated factory today generates about 2,000 TB of data every month (equivalent to the data volume of 500,000 movies). Today, only a very small portion of this data is actually being used, meaning there is huge potential in harvesting it and creating actionable insights from it.

Another important driver is the growth in compute power available to process AI workloads. Beyond the more traditional growth of CPU processing power according to Moore’s Law, we are seeing so-called “accelerator architectures” emerge, such as Neural Processing Units (NPUs), which are far more effective and efficient at processing the neural networks that form the basis for many AI models.

AI becomes a household name

OpenAI’s release of ChatGPT to the public in November 2022 represents an important milestone for AI. Why? Because it gave everyone, not just researchers and technology professionals, a sneak peek at what AI could do. It has helped, in a way that’s hard to overstate, to increase awareness, interest, and understanding of AI’s concepts and potential across a broad audience. This, in turn, has fueled the appetite for identifying ways to use AI.

Businesses are increasingly realizing the potential benefits of using AI across all the functions of a company. At OnLogic, for example, we have identified more than 100 use cases across the various functions of the business where AI could potentially help drive efficiency and effectiveness within the organization. That is a tremendous amount of opportunity, and it’s important to remember that we are only at the very beginning of harnessing the potential of AI. Companies are recognizing a real risk of being left behind if they don’t use AI to increase their competitiveness within their market. The question businesses are facing has quickly become: can you afford not to invest in AI? According to Forbes, 64% of businesses expect AI to increase productivity, yet research shows that only 35% of businesses were leveraging AI in 2023. Once again, it’s clear that there’s an AI wave cresting.

As a result, AI is expected to see a compound annual growth rate (CAGR) of 37.3% from 2023 to 2030. Let’s put that into perspective. Assume, for example, that you have $100k in your 401k at the beginning of 2023 and make no further contributions. If your balance grows at 37.3% per year, then by the end of 2030 it would be $1,262,886, more than a factor of 12 in growth versus the beginning of 2023. That is truly tremendous growth! This staggering expected growth of AI means that there is a lot of work to be done developing and implementing AI solutions and the devices and infrastructure to support them.
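
For the curious, the arithmetic behind that figure is simple compound growth. Here is a quick Python check using the numbers from the example above:

```python
# Compound annual growth: balance after n years at a fixed rate.
def compound(principal: float, rate: float, years: int) -> float:
    return principal * (1 + rate) ** years

# $100k compounding at the projected 37.3% AI CAGR across the
# eight growth years from the start of 2023 to the end of 2030.
balance = compound(100_000, 0.373, 8)
print(f"${balance:,.0f}")  # -> $1,262,886
```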

What and where is the edge?

So, we’ve established that AI has potential, but how does it relate to edge computing? First, let’s make sure we’re on the same page about what the edge is. Below is a diagram developed by the Linux Foundation that shows the edge-to-cloud continuum.

A graphic that shows the edge to cloud continuum

Our focus is on the “User Edge”, which describes the hardware on the business premises. This can include everything from endpoint devices such as PLCs controlling manufacturing equipment, to hardware on Automated Guided Vehicles (AGVs) moving products or components around a factory floor. The user edge also encompasses any industrial PCs used on the factory floor, which, for example, could be running inference workloads for a machine vision system. The diagram below shows manufacturing operations with two factories. Each factory can house many edge devices performing tasks such as inference operations. The user edge typically also includes a localized data center that can be used for data storage, model training, data analytics, and more.

A graphic explaining edge AI architecture

Why edge AI architecture?

Now we get to the heart of the question – why, in many cases, is it advantageous to run AI workloads at the edge instead of relying on the cloud?

To begin with, the edge is where the data is being created. It’s where all the Operational Technology (OT) of a manufacturing plant lives. Sending all of that data to the cloud is expensive. Remember the equivalent data volume of 500,000 movies being generated in a highly automated factory every month? It would be cost prohibitive to send and store all of this data in the cloud. Beyond the cost of sending and storing data, there is the cost of using compute resources in the cloud, which are in strong demand these days and priced accordingly. Conversely, if the data processing is performed at the edge on hardware the customer owns, the only costs are the hardware itself and its operation and maintenance, which can be further mitigated through the use of the proper industrial hardware.
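
To make the cost argument concrete, here is a minimal back-of-envelope sketch. The 2,000 TB/month figure comes from the Siemens example above; the storage and egress rates are purely illustrative placeholders, not actual prices from any provider, so substitute your own:

```python
# Rough monthly cost of shipping all factory data to the cloud.
# NOTE: the rates below are hypothetical placeholders, not real pricing.
monthly_data_gb = 2_000 * 1_000   # 2,000 TB from the factory example
storage_rate_per_gb = 0.02        # assumed $/GB-month for object storage
egress_rate_per_gb = 0.09         # assumed $/GB to pull data back out
egress_fraction = 0.10            # assume 10% of data is read back monthly

monthly_cost = (
    monthly_data_gb * storage_rate_per_gb
    + monthly_data_gb * egress_fraction * egress_rate_per_gb
)
print(f"~${monthly_cost:,.0f}/month before any cloud compute costs")
```

Even with these assumed rates, the bill lands around $58,000 per month, and that is before paying for any cloud compute to actually process the data.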

Beyond the cost aspect, latency is another important consideration. Cloud round trips can be problematic and limiting depending on the use case, because fast decisions are often needed on the factory floor, or at the edge in general. If, for example, freshly produced connectors are coming down the manufacturing line at a high rate, it is imperative that the decision about which connectors pass and which fail quality inspection is made near-instantly. In addition, if your production line relies on decisions made exclusively in the cloud, a loss of connection most likely results in downtime for the production line, costing money, time, and valuable production throughput.
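
A simple latency budget makes this concrete. The line speed and timing figures below are hypothetical, chosen only to illustrate how quickly the budget disappears:

```python
# Latency budget for in-line pass/fail inspection (all numbers assumed).
parts_per_second = 20
budget_ms = 1_000 / parts_per_second   # 50 ms total per part

capture_ms = 8     # camera exposure + readout
inference_ms = 15  # local inference on an industrial PC
actuation_ms = 10  # triggering the reject mechanism

slack_ms = budget_ms - (capture_ms + inference_ms + actuation_ms)
print(f"Remaining slack: {slack_ms:.0f} ms")  # 17 ms, with zero network hops
```

A cloud round trip commonly adds tens of milliseconds or more on its own, which in this hypothetical scenario would exhaust the budget before the model even runs.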

Another major advantage of an edge computing architecture is security. Many businesses are hesitant to send confidential company data to the cloud and have a strong preference for keeping it on premises, thereby reducing cybersecurity risks. The merger of IT and OT expands the potential attack surface, and it is important that businesses harden their edge computing nodes to protect their data. Many techniques, including SASE (Secure Access Service Edge) and Zero Trust, are increasingly being deployed to enhance data security, but an edge computing architecture is still preferred by many businesses.

Gartner summarizes the edge computing drivers in the following diagram:

A graphic of edge computing drivers from Gartner

Environmental considerations are an additional driver of edge computing adoption. Using the edge rather than the cloud can reduce a solution’s overall energy footprint: data centers consume large amounts of energy, contributing to greenhouse gas emissions, and additional energy is required to send data back and forth between the cloud and the edge. Companies can lower their energy costs by moving AI workloads from the cloud to the edge.

Let me add one more piece of evidence that edge AI is becoming reality: the 2023 Gartner Hype Cycle. The chart shown below was released in July 2023, and it shows “Edge AI” reaching the “Plateau of Productivity”, the stage at which a technology is fully mature for implementation, within two years from then. With that said, let’s review how we implement AI at the edge.

A graphic from Gartner showing the Hype Cycle for Artificial Intelligence, 2023

How to implement AI at the edge

So, you’ve considered your options (and the information above) and decided an edge AI implementation is the right fit for your business. Where do you begin? Inherently, AI is software, but to deploy that software you’ll need the right hardware. Here are a few things to consider.

Environment

In edge computing, the installation environment often differs dramatically from a climate-controlled data center or office. It may be hot, cold, or humid, located outside or in a boiling hot steel plant, for example. The air may be filled with particulates, ranging from sawdust at a cabinet manufacturer to powdered flavorings on a potato chip production line. The edge is made up of a huge range of places where a standard computer wouldn’t survive long. To overcome such harsh environments, industrial or rugged computers are needed. These systems are built for tough environments and provide reliability and peace of mind to operators.

Workloads

For inference operations, your compute power needs will depend on the exact use case: how much data needs to be processed and how fast it needs to be processed. This may come as a surprise to some, but for many AI use cases the onboard compute power of the CPU may be sufficient. Processor manufacturers such as Intel®, beyond simply growing CPU power from generation to generation, are adding more integrated GPU computing within the processor package. With the latest generations of processors, NPUs are being added as well, allowing a growing number of inference use cases to be handled without an additional accelerator. If the application does require more power than the CPU package can offer (multiple high-FPS camera feeds, for example), then industrial PCs with graphics cards or specialized accelerators will be required.
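
As one illustration of how the same model can target the CPU, integrated GPU, or NPU, here is a minimal sketch using Intel’s OpenVINO toolkit. It assumes OpenVINO is installed (pip install openvino) and that you have a converted model file, here hypothetically named model.xml:

```python
import numpy as np
import openvino as ov

core = ov.Core()
print("Available devices:", core.available_devices)  # e.g. ['CPU', 'GPU', 'NPU']

model = core.read_model("model.xml")                     # hypothetical model file
compiled = core.compile_model(model, device_name="CPU")  # or "GPU" / "NPU"

# Dummy input for a typical vision model; match your model's actual shape.
frame = np.random.rand(1, 3, 224, 224).astype(np.float32)
results = compiled(frame)  # run a single inference
```

Swapping the device string is often all it takes to move a workload onto an accelerator, which makes it easy to benchmark whether the built-in compute is sufficient before buying dedicated hardware.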

Model Training

Beyond the inference workload, in most applications the models need to be continually retrained to incorporate ongoing learning. This training typically requires a larger amount of compute power and can be handled by an edge server, which can reside on the manufacturing floor or in a dedicated server room. Edge servers can also be used to monitor and handle model drift, or to perform any analytics required by the business. For more, check out “AI acceleration at the Edge”, a session held at AWS re:Invent 2023 that focuses on various aspects of edge implementations.
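
What might drift monitoring look like in practice? One simple, common approach is to compare the distribution of recent inputs or confidence scores against a historical reference window and flag divergence. Below is a minimal sketch using the Population Stability Index; the stand-in data, window sizes, and 0.2 threshold are illustrative assumptions, not recommendations:

```python
import numpy as np

def psi(reference: np.ndarray, recent: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index between two samples of a score/feature."""
    edges = np.histogram_bin_edges(reference, bins=bins)
    ref_pct = np.histogram(reference, bins=edges)[0] / len(reference)
    new_pct = np.histogram(recent, bins=edges)[0] / len(recent)
    # Floor the percentages to avoid division by zero and log(0).
    ref_pct = np.clip(ref_pct, 1e-6, None)
    new_pct = np.clip(new_pct, 1e-6, None)
    return float(np.sum((new_pct - ref_pct) * np.log(new_pct / ref_pct)))

# Stand-ins for logged confidence scores: last quarter vs. this week.
reference_scores = np.random.beta(8, 2, 10_000)
recent_scores = np.random.beta(6, 3, 1_000)

score = psi(reference_scores, recent_scores)
if score > 0.2:  # a common rule of thumb for "significant" drift
    print(f"PSI = {score:.2f}: shift detected, consider retraining")
```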

Software Requirements

In order to operate this type of edge computing architecture, a number of software layers beyond the operating system are needed. Edge deployments often involve a large number of gateways, edge servers, inference devices, and so on, so the ability to seamlessly onboard, deploy, and efficiently update devices is key. This is especially important because there are often few to no technical resources on site, and businesses with multiple locations can face challenges when system updates or deployments fall out of sync. This is commonly handled by edge orchestration and remote management software. AI workloads are also frequently run as containers, requiring the appropriate container runtime. And on top of all of these layers sits the AI application software for the specific use case.
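
As a small illustration of the container layer, the sketch below launches a containerized inference workload from Python using the Docker SDK (pip install docker). The image name is hypothetical, and the GPU passthrough argument should be dropped on hosts without a GPU runtime:

```python
import docker

client = docker.from_env()
container = client.containers.run(
    "registry.example.com/vision-inference:1.4",  # hypothetical image
    detach=True,                                  # run in the background
    restart_policy={"Name": "always"},            # come back up after reboots
    device_requests=[                             # GPU passthrough (optional)
        docker.types.DeviceRequest(count=-1, capabilities=[["gpu"]])
    ],
)
print(container.short_id, container.status)
```

In a real deployment, an orchestration layer would manage these containers across the fleet rather than a hand-run script, but the principle is the same.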

Cloud repatriation

So, we’ve looked at how to successfully deploy AI at the edge, but companies often start with an AI solution implemented in the cloud. More and more companies are realizing the value of edge computing and are in the process of cloud repatriation, moving computing resources back to the edge. According to GFT, in a recent IDC survey 71% of respondents indicated they are planning to either partially or fully repatriate. The most common reasons cited for this move are reducing cost, improving performance, and increasing security. Do these sound familiar?

The fact is that every business is unique, and the optimal location of your computing resources will depend on exactly what you’re trying to do with your data. In many cases, the final implementation will not be a black-and-white choice between cloud and edge, but rather a hybrid solution. Many software packages can run workloads either in the cloud or at the edge and are agnostic to the specific hardware being used. This gives users a lot of flexibility to pursue a best-of-both-worlds approach: the compute power of the cloud combined with the physical control of the edge.

Conclusions

Let me close by citing one more statistic: according to Gartner, by year-end 2026, 70% of large enterprises will have a documented strategy for edge computing, compared to fewer than 10% in 2023. According to the same study, by 2026, 50% of edge computing deployments will involve machine learning, compared to 5% in 2022. All of this means that the edge will continue to be the place to be for businesses of every size in the coming years, and those who go into it with knowledge of its value will be best positioned to benefit.


About the Author: Michael Kleiner

Dr. Michael Kleiner is the VP of Edge AI Solutions at OnLogic, helping to bring our industry-tested computing hardware to the world of edge solutions and AI consulting. Michael joined OnLogic in 2015 to lead the Engineering and Product Management teams. Michael loves technology and is passionate about growing teams and building businesses.