Powerful Hardware and a Strong Software Ecosystem Help Layerscape Excel at AI

October 31, 2018
by Joseph Byrne

Although known for their networking prowess, Layerscape processors are gaining traction in artificial-intelligence applications. These applications include security and surveillance, home and building automation, factory safety and machine inspection. The reason is that Layerscape’s connectivity and general-purpose processing let these processors address applications where wired and wireless communications are key requirements, and their powerful multicore CPUs can tackle multiple computationally intensive tasks.

For those surprised that the networking-centric Layerscape family is considered for AI designs, I’ve got news. Layerscape executes AI algorithms quite well, and it’s a good fit for a lot of designs. On the hardware side, Layerscape combines either the efficient Cortex-A53 or the powerful Cortex-A72 CPUs from Arm with sizeable caches and DRAM bandwidth.

Figure 1 shows how key functions in a design using Layerscape for AI-based image processing can map to a Layerscape LS1043A or LS1046A processor. Cameras and radar sensors connect via USB or Ethernet. Ethernet can also connect to a WAN uplink and to the LAN (also available via PCIe-connected Wi-Fi) if this system is an edge gateway. The four CPUs handle application logic, networking functions, capture of camera and radar data and AI-based classification of this data.

Figure 1: Mapping AI-Enabled Application to Layerscape

The software side is at least as important. Frameworks (software libraries for AI-related numerical computation) optimized for mobile and embedded devices instead of servers are coming to market, enabling performance increases. These include open source frameworks, such as Google’s TensorFlow Lite and Tencent’s NCNN, and commercial engines like DeepView from Au-Zone. By optimizing models through judicious pruning (eliminating less-useful neural-network parameters) and quantization (for example, mapping floating-point values to eight-bit integers), these frameworks reduce the memory and computation required to crunch models. In the case of video analysis, the speedup can show up as 5-10x gains in frames per second.
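To make pruning and quantization concrete, here is a minimal sketch in plain Python. It is not how TensorFlow Lite, NCNN or DeepView implement these steps; the function names, the pruning threshold and the sample weights are all assumptions chosen for illustration.

```python
def prune(values, threshold):
    """Magnitude pruning: zero out weights whose absolute value is small."""
    return [0.0 if abs(v) < threshold else v for v in values]


def quantize_uint8(values):
    """Affine quantization: map floats onto the integer range 0..255."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255.0
    if scale == 0.0:
        scale = 1.0  # constant input; any nonzero scale works
    zero_point = round(-lo / scale)
    # Clamp so rounding at the edges cannot leave the 8-bit range.
    q = [max(0, min(255, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate float values from the 8-bit codes."""
    return [(x - zero_point) * scale for x in q]


# Hypothetical weights, for illustration only.
weights = [-1.0, -0.02, 0.0, 0.01, 0.5, 1.0]
pruned = prune(weights, threshold=0.05)
q, scale, zero_point = quantize_uint8(pruned)
restored = dequantize(q, scale, zero_point)
```

Each weight now occupies one byte instead of a four-byte float, and the pruned entries are exact zeros that a runtime could skip. The cost is a rounding error of at most about one quantization step per value.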

Another software approach is to bypass generic frameworks and take a bespoke approach, developing models optimized for a specific hardware target. Optimizations beyond pruning and quantization (for example, relying on the similarity among adjacent frames in a video stream to quickly find previously detected objects) can extract further performance. Companies like Pilot.AI and Invision.AI have ported their object-detection models to Layerscape, achieving movie-quality frame rates.
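One simple way to exploit similarity among adjacent frames is to rerun the expensive detector only when the picture has changed enough. The sketch below is an assumption-laden illustration, not the method used by Pilot.AI or Invision.AI: frames are flat lists of grayscale pixel values, and `CachedDetector`, `mean_abs_diff`, the threshold and the toy detector are all hypothetical.

```python
def mean_abs_diff(frame_a, frame_b):
    """Average per-pixel absolute difference between two equal-size frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


class CachedDetector:
    """Reuse the previous detections while frames stay close to the key frame."""

    def __init__(self, detect, threshold):
        self.detect = detect        # expensive detection function (assumed)
        self.threshold = threshold  # hypothetical similarity cutoff
        self.key_frame = None       # last frame the detector actually ran on
        self.last_result = None
        self.full_runs = 0

    def process(self, frame):
        if (self.key_frame is not None
                and mean_abs_diff(frame, self.key_frame) < self.threshold):
            return self.last_result  # frame barely changed: skip the detector
        self.last_result = self.detect(frame)
        self.key_frame = frame
        self.full_runs += 1
        return self.last_result


# Toy detector standing in for a real model: reports an object on bright pixels.
def toy_detect(frame):
    return ["object"] if max(frame) > 200 else []


detector = CachedDetector(toy_detect, threshold=5.0)
frames = [[10, 10, 10, 10], [11, 10, 10, 10], [10, 10, 250, 250]]
outputs = [detector.process(f) for f in frames]
```

Here the second frame is nearly identical to the first, so the detector runs on only two of the three frames. Production trackers are considerably more sophisticated, but the saving comes from the same idea.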

Invision.AI, Au-Zone and a stealth startup with AI software optimized for edge computing and IoT endpoints recently presented their software at a webinar hosted by NXP. These companies made interesting points about the cost, risk and time-to-market advantages of performing AI on Layerscape. Companies already fielding Layerscape-based designs can add AI capability without redesigning their hardware, provided the design has CPU headroom. We’ve seen this with companies looking to add video surveillance to their enterprise access points or home automation to their residential gateways.

A system-level approach can also rationalize limited hardware resources available for retrofitting AI and streamline upgrading systems already in the field. For example, a first level of AI classification can be added to an IP camera, smart door lock or other device, taking advantage of any available processing headroom and memory. This level can extract features or do other preliminary classification, cascading the results downstream to the associated Layerscape-powered camera headend or home-automation hub to complete the analysis process. If a deployment has insufficient resources, it need not be ripped out and replaced but instead supplemented with an adjunct Layerscape system or module for the AI functions.
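The cascade described above can be sketched as two stages: a cheap filter on the end device and a heavier classifier on the hub. Everything in this example is hypothetical, including the field names, the score threshold and the shape-based rule that stands in for a real model on the hub.

```python
def edge_prefilter(detections, min_score):
    """Stage 1, on the camera or lock: forward only confident candidates."""
    return [d for d in detections if d["score"] >= min_score]


def hub_classify(candidate):
    """Stage 2, on the hub: placeholder for a heavier model."""
    # Hypothetical rule standing in for a real classifier.
    return "person" if candidate["height"] > candidate["width"] else "vehicle"


def cascade(detections, min_score=0.5):
    """Run stage 1, then classify only what it forwarded."""
    forwarded = edge_prefilter(detections, min_score)
    return [(d["id"], hub_classify(d)) for d in forwarded]


# Made-up preliminary detections from an end device.
raw = [
    {"id": 1, "score": 0.9, "width": 40, "height": 120},
    {"id": 2, "score": 0.2, "width": 30, "height": 30},
    {"id": 3, "score": 0.7, "width": 200, "height": 90},
]
results = cascade(raw)
```

Only two of the three candidates reach the hub in this run, which is the point of the design: the downstream system spends its cycles on inputs that the first stage already judged worth a closer look.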

Figure 2 shows this approach in the context of a roadside unit (RSU). These are systems deployed throughout a smart city to help implement an intelligent transportation system (ITS). They monitor roads and intersections with various sensors and communicate with vehicles and adjacent RSUs. NXP has shown RSU demos in the past; see https://www.nxp.com/intelligentRSU. In the Figure 2 example, the vehicles, cameras and radars preclassify the data they capture, communicating their findings to the RSU. The RSU tracks and plots vehicles and pedestrians, analyzes their motion and queuing, controls traffic signals and communicates with other systems. That is a big load, and it would be even bigger if a first level of processing hadn’t been done near the various sensors.

Figure 2: Cascaded AI Can Play a Role in the Smart City

The Layerscape recipe of combining processing and I/O is well suited to supporting AI. We find the Arm Cortex-A72 CPU—the workhorse used in many Layerscape processors with one to 16 cores—performs about as well as a single thread of a server-grade processor or a single core of a PC-grade processor. We’ve seen this result on benchmarks in the SPEC suite, in networking tasks and in video compression.

The Arm Cortex-A53 CPU, the lower-cost stablemate of the Cortex-A72, works well when paired with optimized software and in less-demanding situations. For example, a video surveillance system operating at only 8 fps can compress this video in the H.264 format using only a single Cortex-A53 CPU, with cycles remaining for other tasks. An adjacent Cortex-A53 CPU running commercial AI software can identify bodies at this frame rate or faster.

Layerscape’s abundant USB, Ethernet and PCIe ports can connect to cameras, radar modules and other sensors generating input to be analyzed. These I/O ports are also essential for LAN and WAN connections. It’s hard to imagine a system using AI that doesn’t also communicate. Competing processors may have useful multimedia engines but cannot match Layerscape’s interfacing options and networking performance.

In conclusion, Layerscape can support AI functions. Developers need not rely on an expensive coprocessor add-on or think their only option is a competing chip with hardware acceleration but without Layerscape’s networking and I/O or cost efficiency. Nor must one wait to implement AI. Get started today!

Tags: Artificial Intelligence and Machine Learning, Technologies

Author

Joseph Byrne

Senior strategic marketing manager for NXP's Digital Networking Group

Prior to joining NXP, Byrne was a senior analyst at The Linley Group, where he focused on communications and semiconductors, providing strategic guidance on product decisions to senior semiconductor executives. Prior to working at The Linley Group, he was a principal analyst at Gartner, leading the firm's coverage of wired communications semiconductors. There, he advised semiconductor suppliers on strategy, marketing and investing. Byrne started his career at SMOS Systems after graduating with a bachelor of science in engineering from Duke University. He spent three years at SMOS as part of the R&D engineering team working on 32-bit RISC microcontrollers. He then returned to school for an MBA, which he received with high distinction from the University of Michigan. He worked with Deloitte & Touche Consulting Group for a year before going on to work at Gartner, where he spent the next nine years until going to work for The Linley Group in 2005.
