Artificial Intelligence
The Future of AI is Embedded
What PCB designers need to know to bring AI hardware to the device level.
by Zachariah Peterson
A few buzzwords dominated headlines in 2020, many centered around Covid-19 and politics. Those who follow trends in technology probably noticed one area saw an explosion of growth: artificial intelligence. Unfortunately for the hardware developer, the tech world’s interest always seems to be drawn to the software side of AI.

The software industry has quickly embraced AI to the point where many software-driven services incorporate some element of AI to provide a meaningful user experience. As of the first quarter of 2021, it’s getting difficult to find a SaaS platform that doesn’t use AI for some specialized task. SaaS-ification is fine, and it’s creating a wealth of productivity tools that businesses can mix and match to make their processes more intelligent. And there are the big players like Facebook, whose AI models run quietly in the background, determining which advertisements and inflammatory memes you’re most likely to click.

What about the hardware side? What is its role in the AI ecosystem, and how is AI used in embedded systems? The reality is on-device AI with the same capabilities found in a data center is a long way off, but many companies are working to make these platforms a reality. When you examine many use cases, there is major motivation to move user-facing AI tasks outside the data center and run them directly on the device. What some in the software world don’t realize, and where embedded systems designers play a major role, is this requires specialized hardware that is starting to proliferate into the market. Some of these new hardware-driven AI developments are being created by familiar names in the electronics industry, while others are being introduced by startups, research labs, the US government, and some surprising names from the tech sector.

What does this mean for the average PCB designer, and what do they need to know to bring AI hardware to the device level? Although these tasks may appear challenging, PCB designers are instrumental and already have many of the skills needed to implement on-device AI.

On-Device AI Compute Challenges
The goal in bringing AI onto the device is to reduce reliance on the cloud, meaning there would be no requirement for an internet connection to do (at minimum) inference tasks. In fact, not all AI tasks need to be run in the cloud; the compute power in data centers is overkill for many practical inference tasks. To rely less on the cloud, you need to diligently design systems to provide comparable levels of computing power on the device. This is not, however, a simple matter of stacking multiple CPUs/GPUs on a PCB, as was the trend in 2019-20.

AI-driven hardware must overcome multiple “walls” to enable fast, power-efficient AI on the device:

  • Memory wall: The problem here is not just memory capacity but also memory speed. Faster read/write access to data in memory reduces total time required for inference and training tasks.
  • Heat wall: Anyone who has heard the fan on their CPU or GPU spin up to high speed knows high compute tasks generate a lot of heat. AI is no different, but reducing the number of bit transfers during computation reduces cooling requirements and total system size.
  • Size wall: Beyond the heat wall, it’s desirable to continue reducing system size by packing more features into smaller spaces. Some existing commercial off-the-shelf (COTS) components are simply too large to enable compact embedded AI products, while also satisfying the other areas on this list.
  • Compute wall: Efficient on-device AI requires much greater compute power in smaller packages, which then requires redesigning the fundamental architecture of transistor-based logic circuits.
  • Autonomy wall: I coined this term while writing this article, and hopefully it illustrates the type of ecosystem that can be created between embedded AI products, data centers, and edge computing assets. Certain tasks may be best left in the data center, while others are best performed on the device. The lack of efficient hardware in the latter area creates a situation where embedded AI systems are only “intelligent” when they have cell service or are in WiFi range.
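To put rough numbers on the memory and heat walls, the sketch below estimates how many bytes must be read from memory to fetch every weight in a model once, at different weight widths. It is only an illustration: the model size is hypothetical, and the calculation assumes each weight is read exactly once per inference pass while ignoring activations, caching, and any compression.

```python
def weight_traffic_bytes(num_weights: int, bits_per_weight: int) -> int:
    """Bytes read from memory to fetch every weight once (rounded up)."""
    return (num_weights * bits_per_weight + 7) // 8


# Hypothetical model size, chosen only for illustration.
NUM_WEIGHTS = 1_000_000

for bits in (32, 16, 8):
    kib = weight_traffic_bytes(NUM_WEIGHTS, bits) / 1024
    print(f"{bits:>2}-bit weights: {kib:,.0f} KiB read per inference pass")
```

Under these assumptions, narrower weights mean proportionally fewer bits moved per inference, which is the link between bit transfers, memory bandwidth, and cooling requirements described in the list above.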

As mentioned above, the goal is to eventually get AI out of the data center, but the industry initially responded by transplanting data center hardware to embedded devices, basically replicating data center computing (and hardware) on a smaller scale. Newer products are changing that dynamic, and we should expect the range of available products to continue growing.

What’s Available Now?
In 2021, multiple processor options and hardware platforms are addressing each of these areas. According to Fortune Business Insights, the size of the AI chipset market stood at $8.14 billion in 2019, and the industry is expected to grow to $108.85 billion by 2027, equivalent to a CAGR of 38.9% during the forecast period.[1] Available components and chipsets primarily span GPUs, FPGAs, specialty processors, and supporting components. Some newer options focusing on lightweight devices with small footprints are entering or are already in production, while others have been available for some time. Below are just a few examples of components targeting a range of applications essential for truly autonomous devices:

  • Google Coral: Google realizes the potential in the embedded AI market, and has responded with a purpose-built AI processor for a range of applications. This component connects over a single PCIe Gen2 lane or USB 2.0 and targets just about any model that can be compiled from TensorFlow Lite.
  • Intel Movidius: Some may remember the Movidius USB accelerator stick, which plugged into a desktop and could be used for AI development. Now Intel is marketing its Movidius VPU product for computer vision applications.
  • Nvidia GPUs: We shouldn’t be surprised Nvidia is pushing hard into this area with newer GPU products. As the undisputed leader in GPUs, with many of its products present in the world’s most powerful supercomputers, expect Nvidia to continue producing smaller, more power-efficient products like the Jetson Nano platform.
  • Baidu Kunlun 2: The Kunlun 2 AI processor was scheduled to go into production in December 2020. This component targets cloud-to-edge applications, with the goal of breaking the autonomy wall.
  • Maxim Integrated MAX78000: The MAX78000 is a lightweight AI processor targeting voice recognition, but it nicely illustrates the push to bring AI into embedded devices that have less resemblance to a typical computer. Don’t be surprised when other names in the semiconductor sector follow suit with competitive components.
  • MediaTek APUs: First released in 2019, MediaTek’s AI processor units (APUs) have been adapted into larger parts of its product portfolio. These processors target embedded devices at all levels, with the newest SoC product targeting AI on 5G smartphones.

This list is not exhaustive and is not intended to be an endorsement. In addition to the companies listed here, a smattering of startups and research laboratories are developing competitive products based on totally reengineered transistor architectures. Si is still the material of choice, but MoS₂ is proving an extremely promising alternative in terms of multiply-accumulate (MAC) count and power efficiency. These hardware platforms aim to streamline MAC operations in neural network models with an optimized cascaded fabric (targeting the compute wall), at lower power consumption and implementation cost than general computing architectures.
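For readers less familiar with the term, a multiply-accumulate (MAC) operation multiplies two values and adds the product to a running sum. The minimal sketch below computes one fully connected neural network layer in plain Python and counts the MACs it performs. The function name, layer size, and values are hypothetical, and real AI processors run these operations in parallel hardware rather than in a loop like this.

```python
def dense_layer(inputs, weights, biases):
    """Compute one fully connected layer and count its MAC operations.

    weights[j][i] is the weight connecting input i to output j.
    """
    outputs = []
    mac_count = 0
    for row, bias in zip(weights, biases):
        acc = bias
        for x, w in zip(inputs, row):
            acc += x * w  # one multiply-accumulate
            mac_count += 1
        outputs.append(acc)
    return outputs, mac_count


# Hypothetical 3-input, 2-output layer.
inputs = [1.0, 2.0, 3.0]
weights = [[0.5, -1.0, 0.25], [1.0, 1.0, 1.0]]
biases = [0.0, 0.5]

outputs, macs = dense_layer(inputs, weights, biases)
print(outputs, macs)  # [-0.75, 6.5] 6
```

The MAC count for a layer like this is simply inputs times outputs, which is why even modest models quickly reach millions of MACs per inference and why hardware built around this one operation is attractive.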

Device Architecture and the Embedded AI Ecosystem
The great thing about many of the examples listed above is they can be paired with a range of existing MPUs, FPGAs, and even MCUs, either directly or through an interface bridge. MCU/FPGA-SoC/MPU-based solutions cost less than pure FPGA or GPU-based systems. These types of designs are easier to implement for most embedded designers: Arm Cortex cores still dominate, and manufacturer support for existing MCU/MPU tools can be extended immediately to these newer platforms.
FIGURE 1. High level system block diagram for integrating AI processing into embedded systems.
FIGURE 2. AI-capable embedded products in the connected ecosystem.
The high-level block diagram in FIGURE 1 shows how these components can fit into an embedded system. The AI processor section is shown as an external component, as this is how many components are designed to be used, although newer AI-capable SoCs are likely to follow the broader integration trend seen with many other components.

FIGURE 2 shows how embedded AI-capable devices interface with the edge and the cloud, as well as the level of compute involved in each area. Lightweight inference tasks can be performed on the device, with the throughput, level of parallelization, and available tasks dependent on the type of processor placed on the device. Higher compute training and inference tasks can be performed at the edge or in the cloud, depending on the amount of data and time required. The benefit of this model is mission-critical inference tasks can be performed in the field without an internet connection. Once an embedded device encounters an internet connection, software/firmware updates can be pushed back to the device. Today’s cloud services already enable this type of ecosystem and integration among multiple services.
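The device/edge/cloud split described above can be sketched as a simple placement rule. This is only an illustration under stated assumptions: the `Tier` names, the `choose_tier` function, and the compute budgets (in arbitrary units) are all hypothetical, and a real product would also weigh latency, data volume, and power.

```python
from enum import Enum
from typing import Optional


class Tier(Enum):
    DEVICE = "device"
    EDGE = "edge"
    CLOUD = "cloud"


def choose_tier(compute_cost: float, device_budget: float,
                edge_budget: float, online: bool) -> Optional[Tier]:
    """Pick where a task runs; None means it must wait for a connection."""
    if compute_cost <= device_budget:
        return Tier.DEVICE  # lightweight task: no connection needed
    if not online:
        return None         # too heavy for the device and no link available
    if compute_cost <= edge_budget:
        return Tier.EDGE
    return Tier.CLOUD


# Hypothetical budgets in arbitrary compute units.
print(choose_tier(1, device_budget=5, edge_budget=50, online=False))
print(choose_tier(20, device_budget=5, edge_budget=50, online=True))
print(choose_tier(500, device_budget=5, edge_budget=50, online=True))
```

The point of the rule is that anything fitting within the device budget never depends on connectivity, which is what allows mission-critical inference to keep running in the field.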

Because these components run on the back of existing high-speed (PCIe, USB, MIPI, etc.) and low-speed (I2C, SPI, UART, etc.) protocols, the high-speed PCB designer will be familiar with implementing these components. What’s left for the designer to think about? The challenge isn’t necessarily one of layout and routing. It’s about creating a design that enables a meaningful user experience. At the end of the day, AI is worthless if it does not provide value to the user, even if it’s only perceived value.

The other point to consider is the place of the product in the larger computing ecosystem. With PCB designers also playing the role of embedded engineers, they’ll have to act more like systems engineers and less like layout engineers. Think about the level of autonomy the product requires, and select AI-enabled processors with this in mind. If we take AI out of the data center and embed it in the field, design reliability becomes critical in many applications, such as aerospace, smart infrastructure, industrial automation, and automotive. Security remains a critical challenge for IoT products and embedded systems in general,[2] and solutions to these challenges will permeate all levels of the AI-enabled embedded ecosystem (device, edge and cloud).

Looking to the Future
We live in a world where AI is used behind the scenes in many tech products and services, but it has also been SaaS-ified to death. Advances in AI-centric ICs provide a new generation of products that enable much more resilient applications in robotics, IoT, industrial automation, and much more. Eventually, as the foundational embedded AI hardware platforms and the ICs that enable them evolve, they will filter back into the data center. I predict less reliance on typical GPUs for the high compute power needed to train machine learning models used in AI, or a shift toward more specialized GPUs like those in Nvidia’s portfolio. PCB designers will continue to be a driving force at all levels, especially because these specialized components will use high-speed computing interfaces requiring meticulous layout and routing. As we gain a deeper understanding of human intelligence and perception, future ICs for embedded AI products may have an entirely new architecture we can’t imagine yet.
  1. Fortune Business Insights, “Artificial Intelligence (AI) Chipsets Market Share,” December 2020.
  2. B. Buntz, “Addressing IoT Security Challenges From the Cloud to the Edge,” IoT World Today, May 26, 2020.
Zachariah Peterson has an extensive technical background in academia and industry. He runs Northwest Engineering Solutions, a PCB design and technical marketing firm that serves industrial automation, defense and EDA software clients.