Running the Zephyr real-time operating system (RTOS) on ARM® Cortex®-A or Cortex-M cores is a well-established practice,
supported by extensive documentation and examples. However, several processors in the NXP i.MX and i.MX RT families
include an additional compute engine: one or more Cadence Tensilica digital signal processor (DSP) cores designed for
high-performance audio, voice and neural network processing.
This blog article focuses on the Cadence Xtensa HiFi4 DSP, which is the most widely used DSP across NXP’s product
lineup. However, the concepts and methods described here also apply to other Cadence DSPs, which NXP selects
carefully for the best tradeoff between power efficiency and performance.
The HiFi4 DSP offloads compute-intensive workloads from the main ARM cores,
improving
overall system performance and energy efficiency. With Zephyr RTOS support, the HiFi4 DSP becomes an accessible,
open and highly flexible platform for embedded developers targeting heterogeneous NXP applications.
The Role of the HiFi4 DSP in NXP Architectures
The NXP i.MX 8M Plus is a
representative example of a heterogeneous
architecture,
integrating:
- Four ARM Cortex-A53 application cores (up to 1.8 GHz)
- One ARM Cortex-M7 real-time core (up to 800 MHz)
- One Cadence HiFi4 DSP (up to 800 MHz)
Similarly, several devices in the i.MX RT crossover microcontroller unit (MCU) family also include a HiFi4 DSP core,
combining
microcontroller
simplicity with DSP acceleration for advanced real-time and audio processing.
These heterogeneous designs enable workload partitioning according to performance, latency, and power requirements.
Linux typically operates on the Cortex-A cores, while Zephyr RTOS runs on the Cortex-M core. Zephyr can also be
deployed
on the HiFi4 DSP for signal or data processing tasks.
Figure: Heterogeneous NXP architectures partition workloads across ARM and DSP cores for optimal performance.
The DSP is optimized for:
- Audio and voice codecs
- AI and neural network pre- and post-processing
- Fast Fourier Transform (FFT), filtering and echo cancellation
- Low-latency communication with ARM cores through Open Asymmetric Multi-Processing (OpenAMP) inter-processor
communication (IPC)
By offloading such functions to the DSP, systems can achieve higher responsiveness, reduced CPU load and lower energy
consumption.
Zephyr RTOS on HiFi4 DSP
The Zephyr Project is an open source, scalable RTOS optimized for embedded and heterogeneous
environments. It supports multiple hardware architectures while providing a consistent, modular framework for device
drivers, IPC and synchronization.
NXP has contributed extensions to Zephyr RTOS to enable HiFi4 DSP support across both i.MX and i.MX RT product
families.
These enhancements make it easier for developers, and the wider community, to take full advantage of DSP
acceleration in
mixed-core systems.
Supported platforms include:
Additionally, some i.MX RT targets include other DSPs, such as the HiFi1 or Fusion F1:
| Device Family | Zephyr Target Board |
| --- | --- |
| i.MX RT500 | mimxrt595_evk/mimxrt595s/f1 |
| i.MX RT700 | mimxrt700_evk/mimxrt798s/hifi1 |
The same Zephyr build environment can be used for all of these targets, allowing a unified development workflow
across
ARM
and DSP cores.
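As a rough illustration of that unified workflow, the sketch below composes the same `west build` invocation for each DSP target in the table above. The helper name, the build directory naming scheme and the choice of `samples/hello_world` as the default application are assumptions made for illustration. The commands also assume a Zephyr workspace with a suitable Xtensa toolchain already installed.

```python
import shlex

# Board targets taken from the table above.
DSP_TARGETS = {
    "i.MX RT500": "mimxrt595_evk/mimxrt595s/f1",
    "i.MX RT700": "mimxrt700_evk/mimxrt798s/hifi1",
}


def west_build_command(board_target, sample="samples/hello_world", build_dir=None):
    """Return the argv list for building one sample for one board target.

    The build directory layout is a hypothetical convention: one directory
    per target so that builds for different cores do not overwrite each other.
    """
    if build_dir is None:
        build_dir = "build/" + board_target.replace("/", "_")
    return ["west", "build", "-p", "auto", "-b", board_target, "-d", build_dir, sample]


def print_build_plan():
    """Print one shell-ready command line per DSP target."""
    for family, target in DSP_TARGETS.items():
        print(f"# {family}")
        print(shlex.join(west_build_command(target)))
```

Calling `print_build_plan()` from the root of the Zephyr tree prints one command per target. Only the `-b` argument and the build directory change between cores, which is the practical meaning of a unified build environment.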
Firmware loading and runtime management are handled by the Linux remoteproc driver (on i.MX platforms) or multicore
management frameworks (on i.MX RT platforms), while OpenAMP provides robust intercore messaging.
Figure: OpenAMP enables fast, reliable intercore communication between ARM and DSP cores in NXP systems.
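On i.MX platforms, the Linux remoteproc framework exposes each remote core through sysfs, so loading and starting DSP firmware comes down to a few file writes. The following is a minimal sketch of that sequence. The instance name `remoteproc0` and the firmware file name are assumptions that depend on the board and kernel configuration, and the firmware file is expected to be present in the kernel firmware search path (typically `/lib/firmware`).

```python
from pathlib import Path


def load_and_start(firmware_name, instance="remoteproc0",
                   sysfs_root="/sys/class/remoteproc"):
    """Point a remoteproc instance at a firmware image and start it.

    `instance` and `firmware_name` are placeholders; check which
    remoteprocN entry corresponds to the DSP on the actual target.
    Returns the contents of the `state` file after the start request.
    """
    node = Path(sysfs_root) / instance
    state_file = node / "state"
    firmware_file = node / "firmware"

    # A running core has to be stopped before its firmware can be changed.
    if state_file.read_text().strip() == "running":
        state_file.write_text("stop")

    firmware_file.write_text(firmware_name)  # name relative to the firmware path
    state_file.write_text("start")
    return state_file.read_text().strip()
```

On a real target, this would be called as root, for example `load_and_start("zephyr.elf")`, and the returned state would be expected to read `running` once the DSP has booted. The `sysfs_root` parameter exists only so the logic can be exercised against an ordinary directory during development.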
From Basic Execution to Intercore Collaboration
The Zephyr project offers a variety of examples that demonstrate its capabilities—from basic system bring-up to
advanced processing and intercore communication. The following sections will walk you through several examples
that demonstrate how to use Zephyr on the HiFi4 DSP.
Hello World Example
The standard Zephyr hello_world sample
demonstrates successful boot and execution of Zephyr on the HiFi4 DSP. Once
the
firmware is built and loaded, the DSP console output confirms successful startup:
Figure: Example of Hello World console output from Zephyr OS booting on i.MX platforms.
This sample establishes a foundation for more advanced applications involving inter-processor communication and
workload offloading.
Number Crunching and DSP Acceleration
The number_crunching example highlights the computational advantages of the HiFi4 DSP. This sample performs vector
operations, FFT and filtering using either the CMSIS-DSP backend (the digital signal processing library of the
Cortex Microcontroller Software Interface Standard) or the highly optimized Cadence NatureDSP library.
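To make the workload concrete, here is a small pure-Python reference for one of the kernels mentioned above, a radix-2 decimation-in-time FFT. It is not the code used by the sample, CMSIS-DSP or NatureDSP. It is only a sketch of the computation those optimized libraries perform with fixed-size, vectorized routines.

```python
import cmath


def fft_radix2(samples):
    """Recursive radix-2 FFT; the input length must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(samples[0])]

    # Split into even- and odd-indexed halves and transform each half.
    even = fft_radix2(samples[0::2])
    odd = fft_radix2(samples[1::2])

    spectrum = [0j] * n
    for k in range(n // 2):
        # Twiddle factor rotates the odd half before the butterfly.
        twiddle = cmath.exp(-2j * cmath.pi * k / n)
        spectrum[k] = even[k] + twiddle * odd[k]
        spectrum[k + n // 2] = even[k] - twiddle * odd[k]
    return spectrum
```

A constant input such as `[1, 1, 1, 1]` concentrates all energy in bin 0, which makes a convenient sanity check when comparing a DSP implementation against a host-side reference.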
Execution cycle counts demonstrate the significant efficiency gains achieved by the NatureDSP backend, particularly
for
FFT and infinite impulse response (IIR) filter routines. These performance advantages make the HiFi4 DSP ideal for
tasks such as audio
post-processing, beamforming and real-time data filtering.
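Cycle counts are easier to compare once they are converted to time and to a ratio between backends. The helpers below are a minimal sketch of that arithmetic. The function names are illustrative, and any numbers passed in should come from an actual run of the sample, since no measured values are reproduced here.

```python
def cycles_to_microseconds(cycles, clock_hz):
    """Convert a cycle count to microseconds at a given core clock."""
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    return cycles * 1_000_000 / clock_hz


def speedup(baseline_cycles, optimized_cycles):
    """Return how many times faster the optimized backend is."""
    if optimized_cycles <= 0:
        raise ValueError("optimized_cycles must be positive")
    return baseline_cycles / optimized_cycles
```

For instance, at the 800 MHz maximum HiFi4 clock quoted earlier for the i.MX 8M Plus, `cycles_to_microseconds(800, 800_000_000)` evaluates to one microsecond, and `speedup(baseline, optimized)` gives the ratio between two backends for the same routine.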
OpenAMP Inter-Processor Communication
Many applications benefit from collaboration between the ARM and DSP cores. The openamp_rsc_table
sample demonstrates how Zephyr running on the HiFi4 DSP communicates with Linux running on an ARM
core, using OpenAMP and Remote Processor Messaging (RPMsg). This enables reliable and low-latency message passing
between heterogeneous cores.
For example, imagine a mixed-OS multicore system where a Cortex-A core runs Linux while the HiFi4 DSP runs Zephyr.
Linux
can handle user-space interfaces and high-level control, while the DSP executes computational tasks under Zephyr
RTOS,
exchanging data in real time through shared memory.
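Each RPMsg message placed in shared memory carries a small header in front of the payload. The sketch below packs and unpacks that header as defined by the Linux virtio RPMsg transport (source address, destination address, a reserved word, payload length and flags). It assumes little-endian byte order, and the endpoint addresses used in the usage note are hypothetical. It is meant only to show the framing, not to replace the OpenAMP library or the kernel driver.

```python
import struct

# src (u32), dst (u32), reserved (u32), len (u16), flags (u16): 16 bytes total.
RPMSG_HEADER = struct.Struct("<IIIHH")


def pack_rpmsg(src, dst, payload, flags=0):
    """Prefix a payload with an RPMsg header."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload does not fit the 16-bit length field")
    return RPMSG_HEADER.pack(src, dst, 0, len(payload), flags) + payload


def unpack_rpmsg(frame):
    """Split a frame into (src, dst, flags, payload)."""
    src, dst, _reserved, length, flags = RPMSG_HEADER.unpack_from(frame)
    payload = frame[RPMSG_HEADER.size:RPMSG_HEADER.size + length]
    if len(payload) != length:
        raise ValueError("frame is shorter than its header claims")
    return src, dst, flags, payload
```

For example, `pack_rpmsg(0x400, 0x401, b"hello")` yields a 21-byte frame, and `unpack_rpmsg` recovers the same addresses and payload. In a real system, the virtio ring and the RPMsg layers on both sides handle this framing, so application code on Linux and Zephyr only sees the payload.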
Audio Offload with Sound Open Firmware (SOF)
For advanced audio applications, SOF builds on Zephyr RTOS to provide a complete open source
audio
processing framework on the HiFi4 DSP.
SOF enables professional-grade, low-latency audio pipelines, fully integrated with Advanced Linux Sound Architecture
(ALSA) on Cortex-A platforms. It supports:
- Multi-channel audio routing
- Voice pre- and post-processing
- Audio effect chains and dynamic reconfiguration
This framework demonstrates how Zephyr enables scalable, production-ready DSP solutions for the i.MX product line.
Advantages of Running Zephyr on the DSP
Running Zephyr RTOS on the HiFi4 DSP provides multiple benefits:
- Unified development flow: Common APIs, tools and build systems across ARM and DSP targets running the RTOS
- Performance optimization: Offload high-intensity compute or signal processing workloads from ARM cores
- Open and extensible: Leverages the open source Zephyr and SOF ecosystems to minimize long-term technical debt
- Scalable system design: Enables seamless cooperation between Linux, Zephyr on ARM and Zephyr on DSP
Bringing it All Together
The Cadence HiFi4 DSP integrated in NXP i.MX and i.MX RT processors is a high-performance, low-power compute engine
well
suited for signal processing, audio and AI acceleration. Through Zephyr RTOS support, this DSP becomes an integral
part
of a unified, heterogeneous processing environment.
From basic Zephyr examples such as hello_world, to performance-oriented number-crunching routines and complex
intercore
communication with OpenAMP, Zephyr on the HiFi4 DSP delivers a scalable foundation for innovation. Together with
SOF, this capability extends to production-ready audio pipelines and advanced embedded workloads, offering
flexibility and openness across the NXP ecosystem.
Zephyr RTOS enables the HiFi4 DSP to operate as a powerful coprocessor, unlocking new performance and efficiency
opportunities for the next generation of embedded designs.