Engineering POV: How Snips & NXP offline voice control solutions enable
simple, natural interaction with everyday devices.
Snips and NXP already provide full natural language voice understanding on
MPUs, and now they’re working together to bring voice to every device.
Controlling devices through voice interactions is more natural and
straightforward than fumbling through complex user interfaces, especially on
smaller, lower cost devices that don’t usually have a touch screen.
To help manufacturers easily add voice capabilities to their products, Snips
has combined its expertise in on-device voice interface solutions with NXP’s
i.MX RT crossover processors. This new solution works with an
application-specific model. For example, in
a washing machine, a user may initiate a wash cycle through spoken commands.
The washing machine will then ask appropriate questions to set water
temperature, spin cycle and any other appropriate parameters.
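To make the washing machine dialog concrete, here is a minimal sketch of the
"ask for each missing parameter" logic. It is only an illustration in Python:
the names QUESTIONS and next_question, the slot names and the question wording
are all assumptions, not part of the Snips or NXP software.

```python
from typing import Dict, Optional

# Hypothetical parameters the appliance needs before it can start a cycle.
# Order matters: questions are asked in this order (dicts keep insertion order).
QUESTIONS: Dict[str, str] = {
    "temperature": "What water temperature would you like?",
    "spin": "Which spin cycle would you like?",
}


def next_question(slots: Dict[str, Optional[str]]) -> Optional[str]:
    """Return the question for the first unfilled parameter, or None if done."""
    for name, question in QUESTIONS.items():
        if slots.get(name) is None:
            return question
    return None


# Example: the user said "start a wash" and has only given the temperature.
slots = {"temperature": "40C", "spin": None}
pending = next_question(slots)  # asks about the spin cycle next
```

When next_question returns None, every parameter has a value and the
application can start the cycle.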
Because the implementation is offline, it eliminates cloud connectivity cost
adders such as a Wi-Fi module. Running it on NXP’s low-cost i.MX RT crossover
processor platform enables breakthrough system cost savings, making it
suitable for a broader range of applications such as switches, dimmers,
small appliances and thermostats.
Another key benefit is privacy by design: none of the audio is transmitted
to the cloud, and all processing is done locally on the device itself. This
voice solution incorporates many cutting-edge technologies that are typically
found in high-end hardware and co-processor DSPs. Leveraging the performance
of i.MX RT processors, this solution can provide most, and in many cases all,
of the capabilities that are typically offered in MPU+DSP designs.
The audio processing front end and the Snips local control library are the
unique enabling technologies. The local control library package is easy to use
and features both hot word and command detection. These two features can be
used together or separately to customize the user experience.
From the software perspective, the library is efficient and easy to integrate
into any application. It uses less than 100 KB of RAM for typical models,
leaving plenty of RAM for the rest of the application.
After setting up and initializing the library, the application simply feeds an
input audio stream into the Snips library. As the library detects the hot word
or command, it executes callbacks for the user application to handle them.
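The feed-and-callback pattern described above can be sketched as follows.
This is a toy stand-in written in Python for readability, not the Snips API:
the class ToyVoiceLibrary, its feed method, the classify hook and the
sentinel-based fake classifier are all hypothetical. The sketch uses hot word
and command detection together, so a command is only accepted after the hot
word.

```python
from typing import Callable, List, Optional

Frame = List[int]  # one block of PCM samples


class ToyVoiceLibrary:
    """Hypothetical stand-in for a voice library; NOT the real Snips API."""

    def __init__(
        self,
        on_hotword: Callable[[], None],
        on_command: Callable[[str], None],
        classify: Callable[[Frame], Optional[str]],
    ) -> None:
        self._on_hotword = on_hotword
        self._on_command = on_command
        self._classify = classify  # placeholder for the real detection models
        self._awake = False  # True between the hot word and the next command

    def feed(self, frame: Frame) -> None:
        """Consume one audio frame and fire a callback if something is detected."""
        label = self._classify(frame)
        if label is None:
            return
        if label == "hotword":
            self._awake = True
            self._on_hotword()
        elif self._awake:
            self._awake = False
            self._on_command(label)


def fake_classify(frame: Frame) -> Optional[str]:
    # Assumption for the demo: the first sample is a sentinel, not real audio.
    return {1: "hotword", 2: "start_wash"}.get(frame[0])


events: List[str] = []
lib = ToyVoiceLibrary(
    on_hotword=lambda: events.append("hotword"),
    on_command=lambda name: events.append("command:" + name),
    classify=fake_classify,
)

# The application's only job in the audio loop is to keep feeding frames.
for frame in ([0, 0], [2, 0], [1, 0], [0, 0], [2, 0]):
    lib.feed(frame)
```

In this run the first "start_wash" frame is ignored because no hot word has
been heard yet; the second one, arriving after the hot word, triggers the
command callback.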
Feeding the library is the audio processing front-end. This component
listens to multiple microphones (up to 3 on the voice solution) and cleans up
the audio by applying processing such as beamforming and echo cancellation.
The front-end then chooses the best beam and sends the audio to the library.
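As a rough illustration of the "choose the best beam" step, the sketch below
picks the candidate beam with the highest mean-square energy. The criterion
is an assumption made for the example; the article does not say how the
front-end ranks beams, and production front-ends often use signal-to-noise or
voice-activity measures instead. The function names are hypothetical.

```python
from typing import Sequence


def mean_square(samples: Sequence[float]) -> float:
    """Average signal power of one beam; 0.0 for an empty beam."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def choose_best_beam(beams: Sequence[Sequence[float]]) -> int:
    """Return the index of the beam with the highest mean-square energy."""
    if not beams:
        raise ValueError("at least one beam is required")
    return max(range(len(beams)), key=lambda i: mean_square(beams[i]))


# Three candidate beams for the same audio frame (made-up sample values).
beams = [
    [0.1, -0.1, 0.1],
    [0.5, -0.4, 0.6],
    [0.0, 0.0, 0.0],
]
best = choose_best_beam(beams)
selected_audio = beams[best]  # this is what would be passed on to the library
```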
Together, NXP and Snips are providing a complete, fully tested implementation
of local voice control that can be rapidly integrated into any application.
On the horizon, NXP’s scalable IoT solutions architecture based on i.MX
processors and Snips voice technology can easily be combined with other
leading AI/ML capabilities such as facial recognition, object detection and
anomaly detection to enable a variety of exciting new applications.
Snips and NXP are currently working with select partners on this solution.