Software-defined cars will need various mechanisms to keep the vehicle safe and operational under
all circumstances. Proprietary solutions for these mechanisms require large verification efforts
and are hard to integrate with diverse software architectures. Is there a standardized software
framework for safety-critical distributed communications?
The Shift Towards Software-Defined Cars
For many years, traditional automotive systems have kept adding weakly programmable electronic
control units (ECUs) that perform isolated functions. Advanced automotive designs, however, are now
evolving towards flexible and interoperable software distributed across only a few (zonal)
processors. The distributed software performs the coordinated tasks of automated driving,
infotainment, powertrain and body control, while sharing processors, networks and sensors to reduce
system cost. The transition to software-defined cars is one of the most significant trends in the
automotive industry, making software features a key differentiator.
To compete in this market, car manufacturers need to build modular distributed applications quickly
and easily, and these applications need programmable, reliable and cost-effective semiconductor
devices to run on. Therefore, standardized software platforms with easy-to-use application
programming interfaces (APIs), such as POSIX and AUTOSAR, are becoming more popular. A key
component in these software platforms is the middleware, the software layer between the various
operating systems and the high-level applications (see the figure below). Simply put, the
middleware is a software library that enables distributed system components to communicate with
each other. The safety of software-defined cars depends heavily on the middleware and the
underlying network processors for reliable real-time data communication among distributed
processes.
A Safety Checker Prototype On the S32G Processor for Automated Driving
State-of-the-art automated driving (AD) systems often adopt the dual-channel architecture for
redundancy, i.e. a fallback channel is implemented next to the main channel that controls the AD
system in normal situations. If the main channel fails, vehicle control switches over to
the fallback channel. This way both safety and availability of the AD system are enhanced. Such an
architecture requires a safety checker to verify the health status of the main channel and trigger
a safety mechanism, such as a safe stop of the vehicle, when necessary. Obviously, the safety
checker’s computation and communication are safety-critical, which sets high demands on its fault
tolerance and reliability.
NXP S32G vehicle network processors are an ideal fit for implementing highly reliable AD systems
with various safety mechanisms. The Arm® Cortex®-A53 cores in the S32G offer high-performance
computing capabilities, and the ASIL D Cortex-M7 safety cores are suitable for running
safety-critical functionality in lockstep mode. Moreover, the SJA1110 Ethernet switch integrated
on the S32G GoldBox reference design for service-oriented gateways offers time-sensitive networking
(TSN) features for real-time and reliable communication to the higher-level AD applications
distributed on the network.
Besides high-integrity hardware, the Data Distribution Service (DDS) middleware running across the
Cortex-A53 and Cortex-M7 cores in the S32G manages the data and communication of the distributed
system. The DDS middleware protocol is based on the publish-subscribe pattern and is standardized
by the Object Management Group® (OMG). DDS has been integrated into various key automotive platform
ecosystems, such as AUTOSAR Adaptive and ROS2. DDS provides low-latency data connectivity,
reliability and scalable data-centric communication. Moreover, DDS comes with a rich set of
built-in quality of service (QoS) policies that control DDS behavior, such as resource consumption
and communication reliability. To learn the fundamentals of DDS and the QoS policies, you can try
the application or view the demo video.
Note that DDS for extremely resource-constrained environments is implemented using the OMG DDS-XRCE
protocol. This is a client-to-agent protocol, meaning the DDS-XRCE client node talks to the DDS
network via an external agent node. DDS-XRCE is ideal for developing lightweight DDS applications
for IoT devices, but the agent can become a single point of failure when used in safety-critical
systems. RTI Connext® DDS Micro running on the S32G Cortex-M7, however, talks directly to the
full-fledged DDS network without any bridge or broker, thus eliminating this single point of
failure. RTI Connext DDS Micro can also be built and integrated in ISO 26262 automotive safety
contexts up to ASIL D.
Here are a few DDS QoS policies that are particularly interesting for implementing a redundant
automated driving channel:
Deadline indicates whether the timing requirements for sending and receiving data are met.
DataWriters and DataReaders can notify the application each time a transmission or reception
timing constraint, respectively, is not met.
Liveliness indicates whether a new DataWriter (DDS publisher node) has joined or is still present
on the DDS network.
Exclusive ownership and ownership strength specify that only the DataWriter with the highest
strength value can write to a particular instance.
Transport priority specifies that the data sent by a DataWriter is of a certain priority. To learn
more about how this QoS policy can link DDS topics to TSN streams, please see our webinar on DDS
and TSN integration and our open-source example project of DDS-TSN integration on GitHub.
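The Deadline and Liveliness checks described above boil down to simple timing comparisons. The following Python sketch is a simplified model, not the DDS API: the names `WriterStatus`, `deadline_missed` and `is_alive` are hypothetical, timestamps are plain integers in milliseconds, and the period and lease values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class WriterStatus:
    """Per-DataWriter bookkeeping kept by a monitor (hypothetical model)."""
    last_sample_ms: int   # arrival time of the most recent data sample
    last_assert_ms: int   # time the writer last asserted its liveliness


def deadline_missed(status: WriterStatus, now_ms: int, period_ms: int) -> bool:
    """True if no sample arrived within the agreed Deadline period."""
    return now_ms - status.last_sample_ms > period_ms


def is_alive(status: WriterStatus, now_ms: int, lease_ms: int) -> bool:
    """True if liveliness was asserted within the lease duration."""
    return now_ms - status.last_assert_ms <= lease_ms


# A writer that is still present on the network but has stopped sending data
# in time: the Deadline is missed while Liveliness is still asserted.
status = WriterStatus(last_sample_ms=10_000, last_assert_ms=10_400)
print(deadline_missed(status, now_ms=10_500, period_ms=100))  # True
print(is_alive(status, now_ms=10_500, lease_ms=500))          # True
```

Keeping the two checks separate matters because they point to different faults: a missed Deadline with intact Liveliness suggests a running but degraded node, whereas lost Liveliness suggests a crashed or disconnected one.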
The DDS built-in QoS policies are ready for use once the DDS middleware layer is in place. This
eases the development process and greatly improves the interoperability and reusability of the
software components. There are several variations of DDS distributions that suit the different
system requirements of the distributed AD components. Implementing DDS across the distributed AD
system establishes a common communication and data management framework and also provides increased
system diversity with little effort. In addition, a system built on top of DDS can be modeled and
configured using a single DDS XML file. The XML file format makes system development easier and
helps architects and application developers design the software-defined car at the system level.
Safety Mechanisms Using DDS QoS Policies
When combined properly, DDS QoS policies can be used to enable various fault handling mechanisms
and safety measures against performance limitations. The DDS middleware layer establishes a common
framework for all the AD components running on top of it. Various safety mechanisms at different
scales can be implemented without much engineering effort, such as the fail-over to a complete
redundant AD channel or the seamless takeover of components. Below we elaborate on the safety
mechanisms that are implemented in our proof-of-concept demo setup.
Fail-over is a widely used safety mechanism in safety-critical systems. It often relies on
fail-silent components, which stop producing output when they fail. Typically, when the main AD
channel silently fails, the system should fall back to the redundant safety channel, which
maneuvers the vehicle to a safe state. This mechanism can be implemented using DDS Liveliness and
Ownership QoS policies. If the vehicle control DataWriter in the main channel silently fails or
loses communication with the rest of the system, the samples produced by the safety channel’s
DataWriter with a lower ownership strength will automatically become visible to the vehicle
actuators and will start controlling the vehicle seamlessly. Meanwhile, the change of the DDS
network Liveliness due to the failed DataWriter is monitored by the safety checker. Recovery
mechanisms, such as reboot, can be implemented based on such diagnostic information.
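The fail-over behavior described above can be illustrated with a small Python model. This is only a sketch of the arbitration rule, not DDS code: the `Writer` class, the `current_owner` function and the strength values 10 and 5 are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Writer:
    """Simplified stand-in for a DataWriter with exclusive ownership."""
    name: str
    strength: int       # ownership strength (hypothetical values below)
    alive: bool = True  # outcome of the liveliness check


def current_owner(writers: List[Writer]) -> Optional[Writer]:
    """Return the alive writer with the highest strength, or None."""
    alive = [w for w in writers if w.alive]
    return max(alive, key=lambda w: w.strength, default=None)


main = Writer("main_channel", strength=10)
safety = Writer("safety_channel", strength=5)

print(current_owner([main, safety]).name)  # main_channel

# The main channel fails silently: its liveliness is lost and the
# lower-strength safety channel becomes the owner without any reconfiguration.
main.alive = False
print(current_owner([main, safety]).name)  # safety_channel
```

In this model the actuators never need to know which channel is active; they simply consume whatever the current owner publishes.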
Even when the failing AD component is not fail-silent, a takeover safety mechanism can be
implemented to actively overrule the malfunctioning or unreliable component without compromising
the system availability. The takeover can be realized by using DDS Exclusive Ownership and
Ownership Strength QoS policies. These QoS policies control which DataWriter is allowed to send
data to the DataReader. When the safety checker detects that the primary DataWriter does not
operate properly, for example by missing the Deadline or sending out-of-bounds data, it can trigger
a healthy DataWriter with higher ownership strength to send data to the DataReader.
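As a rough illustration of the takeover, the Python sketch below models a reader that accepts samples only from the strongest writer it has seen, plus a checker that activates a stronger writer once the primary publishes out-of-bounds data. All names and values (`ExclusiveReader`, `simulate`, the steering limit, the strengths and the safe value) are hypothetical, and the reader is a simplification of real DDS ownership arbitration. Note that in this toy model the checker reacts only after observing the faulty sample.

```python
STEERING_LIMIT = 0.5    # assumed plausibility bound for a steering command
PRIMARY_STRENGTH = 10   # assumed ownership strength of the primary writer
TAKEOVER_STRENGTH = 20  # assumed, higher strength of the healthy writer
SAFE_VALUE = 0.0        # assumed safe command published after takeover


class ExclusiveReader:
    """Toy reader that accepts samples only from the strongest writer seen."""

    def __init__(self):
        self.owner_strength = None
        self.delivered = []

    def on_sample(self, writer, strength, value):
        if self.owner_strength is not None and strength < self.owner_strength:
            return False  # a stronger writer owns the instance: drop sample
        self.owner_strength = strength
        self.delivered.append((writer, value))
        return True


def simulate(primary_values):
    """Feed primary samples; the checker triggers takeover on bad data."""
    reader = ExclusiveReader()
    takeover_active = False
    for value in primary_values:
        if takeover_active:
            reader.on_sample("takeover", TAKEOVER_STRENGTH, SAFE_VALUE)
        reader.on_sample("primary", PRIMARY_STRENGTH, value)
        if abs(value) > STEERING_LIMIT:
            takeover_active = True  # checker observed out-of-bounds data
    return reader.delivered


print(simulate([0.1, 9.9, 0.3]))
# [('primary', 0.1), ('primary', 9.9), ('takeover', 0.0)]
```

The primary writer keeps running after the takeover, but its later sample (0.3) is dropped because the healthy writer now has the higher strength.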
A Hybrid Approach Combining Fail-Over and Takeover
DDS Deadline, Liveliness, Exclusive Ownership and Ownership Strength can be combined to implement
a hybrid mechanism that takes advantage of both the fail-over and takeover mechanisms. For example,
by monitoring the DDS network Liveliness, the safety checker can flexibly trigger the fail-over
mechanism when a node fails silently, or activate the takeover mechanism when a running node is
not fail-silent and publishes faulty data or misses the Deadline. Transient faults in the system
can also be handled by seamlessly switching between the main channel and the safety channel,
thanks to the different Ownership Strength QoS values.
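The decision logic of such a hybrid safety checker could look roughly like the Python sketch below. It is an assumption about how the three observations might be combined, not the implementation used in the demo; the `Action` enum and the `decide` function are hypothetical names.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"            # main channel is healthy, nothing to do
    FAIL_OVER = "fail_over"  # node is gone: rely on the lower-strength channel
    TAKEOVER = "takeover"    # node is running but faulty: overrule it


def decide(alive: bool, deadline_missed: bool, data_in_bounds: bool) -> Action:
    """Map the checker's observations of the main channel to a reaction."""
    if not alive:
        # Silent failure: liveliness is lost, so ownership falls back
        # to the safety channel.
        return Action.FAIL_OVER
    if deadline_missed or not data_in_bounds:
        # The node still publishes, so a stronger writer must overrule it.
        return Action.TAKEOVER
    return Action.NONE


print(decide(alive=False, deadline_missed=True, data_in_bounds=True))   # Action.FAIL_OVER
print(decide(alive=True, deadline_missed=False, data_in_bounds=False))  # Action.TAKEOVER
print(decide(alive=True, deadline_missed=False, data_in_bounds=True))   # Action.NONE
```

Checking liveliness first is a deliberate choice in this sketch: a node that has disappeared will also miss its Deadline, and fail-over is the cheaper reaction in that case.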
Evaluation of the Safety Mechanisms
To evaluate our DDS-based safety mechanisms on the S32G in a realistic setup, NXP teamed up with a
team of automotive engineering experts at Real-Time Innovations (RTI). RTI is a leading software
framework provider for autonomous systems, marketing a family of DDS products and tools called
Connext DDS. Together, we integrated the NXP safety checker into an Autonomous Valet Parking (AVP)
demonstration based on Autoware.Auto, an open source project by the Autoware Foundation. The demo
shows how the vehicle drives itself into a valet parking lot. Autoware.Auto is a full-fledged
end-to-end automated driving framework based on ROS2, which uses DDS as its underlying middleware.
Demo Setup Architecture
The architecture of our hardware-in-the-loop evaluation demo setup is shown in the figure below:
The majority of the Autoware.Auto AD stack, such as Localization, Perception, Prediction and Path
Planning, runs on ROS2/DDS on the Layerscape processor of the NXP BlueBox automotive
high-performance compute development platform. The DDS middleware in this case is RTI’s Connext
Pro, integrated with ROS2 via RTI’s RMW layer component.
The S32G in the NXP GoldBox for vehicle networking serves as a zonal controller in our setup, where
the drive-by-wire software interface runs on ROS2/DDS on the S32G Cortex-A53 cores. In a real
vehicle, this interface converts vehicle control commands in Ethernet packets to CAN messages for
the actuators. In our simulation environment, it converts data between the formats used by
Autoware.Auto and the LG SVL end-to-end simulation platform. Our safety checker with the safe
takeover and fail-over mechanisms is based on RTI Connext DDS Micro running on the S32G Cortex-M7
cores.
Road users, ego-vehicle actuators and sensor data are simulated by the LG SVL simulator running on
an external simulation PC.
Demo Video of Fault Handling Using DDS-based Safety Mechanisms
In our evaluation setup, we injected faults similar to real-life issues into the AD system and
observed how our DDS-based safety mechanisms handle the situation. The demo video below shows how
our safety checker monitors, detects and reacts to system faults such as software crashes, power
loss and network connection loss.
To cope with the transition to software-defined cars, automotive system software needs to be
modular, reliable and scalable. As shown in our Autoware.Auto AVP experiments, the NXP S32G ASIL D
Cortex-M7 processor cores are well suited to functioning as a safety checker in automated driving
systems. The RTI Connext DDS middleware contributes by offering a communication framework for both
powerful processors and resource-constrained microcontrollers across the automotive system. With
its rich set of quality-of-service policies, DDS enables safety mechanisms in a software-defined
car with low engineering effort and high interoperability.