Introducing Voice Intelligent Technology

April 6, 2022
by Chris Welsh

From smart homes to automotive infotainment to smart factories, new and innovative use cases for voice control are rapidly emerging. Yet, implementing reliable, on-device voice control can be challenging for developers. NXP has introduced royalty-free voice intelligent technology software and online training tools to reduce the cost and complexity of adding voice to edge devices.

Touch-based human-machine interfaces (HMI) are evolving toward easier, more intuitive ways to control the myriad devices in our lives. Voice – human speech – is the most natural, intuitive HMI and a popular touchless interface for the next generation of intelligent edge devices.

We all want voice control, and more of it. With the explosion of smart speakers and AI-based voice assistants like Alexa, Siri and Google Assistant, we’ve grown accustomed to the convenience of voice control. This is creating strong demand for voice-enabled devices in our homes, workplaces and cars.

We’re seeing more and more use cases for voice enablement beyond smart speakers and remote controls. Tapping into the power of machine learning (ML) and speech modeling, developers can add local voice control capabilities to countless smart home, building automation, wearables, industrial and automotive infotainment applications using custom commands and multiple wake words. All of this capability is possible without the end product needing an internet connection. Running entirely on-device, local voice control minimizes the privacy, security and latency concerns associated with cloud-based voice assistants.

What’s new? Voice intelligent technology offers customers a free, comprehensive voice control software package delivered as a ready-to-use library.

Just Add Voice Intelligent Technology

From IoT developers to system integrators, everyone wants to add voice to smart edge devices. But there’s no one-size-fits-all voice solution. Just as there are different price points, feature sets and performance levels within NXP’s edge processing portfolio, we also provide varying levels of voice technologies, from low-cost, easy-to-deploy local voice control to highly accurate, high-performance voice solutions. Our comprehensive portfolio of silicon- and software-based voice solutions scales across our EdgeVerse™ processing portfolio.

NXP is committed to simplifying voice control deployment, reducing system cost and complexity, making it easier for developers to bring new, on-device voice control innovations to market. Our goal is to help developers everywhere add voice interfaces to almost everything, from consumer electronics and smart home controls to industrial and automotive applications. Imagine the hands-free convenience of controlling a washing machine, pre-heating an oven, opening the trunk of a car, or selecting an elevator floor with a simple voice command. The use cases for voice are endless.

The latest addition to our voice enablement portfolio is voice intelligent technology (VIT), a comprehensive, state-of-the-art voice control software solution available as a ready-to-use library in the MCUXpresso software development kit (SDK). We’ve launched the VIT solution to inspire developers to invent new applications for local voice control and to give them the freedom to train their own commands easily, without the specialty tools or audio recordings that other solutions require. Because VIT software is royalty-free, it can be scaled to mass production on edge device applications at no cost to developers.

Based on state-of-the-art deep learning and speech recognition technologies, VIT software provides a complete far-field audio front end (AFE) supporting up to three microphones, an always-on wake word engine and a voice command engine, along with online tools to generate customer-defined wake word and voice command models.

Simplifying Voice Enablement for Mass Deployment

Implementing reliable, on-device voice control can be challenging for developers who need to select the optimal signal processing hardware platform, as well as speech processing software, which includes an AFE beamformer, a separate wake-word engine and a voice command engine.
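To make the division of labor between the wake-word engine and the voice command engine concrete, here is a minimal sketch of how the two stages can be chained. It is not the VIT API: the `VoicePipeline` class, the "hey nxp" wake word and the `timeout_frames` parameter are hypothetical names chosen for illustration, and already-recognized text tokens stand in for audio frames.

```python
from enum import Enum


class State(Enum):
    WAIT_WAKE_WORD = 1   # always-on stage: only the wake word is accepted
    WAIT_COMMAND = 2     # command stage: open for a limited number of frames


class VoicePipeline:
    """Toy two-stage gate: wake word first, then one command within a window."""

    def __init__(self, wake_word, commands, timeout_frames=3):
        self.wake_word = wake_word
        self.commands = set(commands)
        self.timeout_frames = timeout_frames
        self.state = State.WAIT_WAKE_WORD
        self.frames_left = 0

    def process(self, token):
        """Feed one recognized token; return a command string or None."""
        if self.state is State.WAIT_WAKE_WORD:
            if token == self.wake_word:
                # Wake word heard: open the command window.
                self.state = State.WAIT_COMMAND
                self.frames_left = self.timeout_frames
            return None

        # Command stage.
        if token in self.commands:
            self.state = State.WAIT_WAKE_WORD
            return token
        self.frames_left -= 1
        if self.frames_left <= 0:
            # Window expired without a command: go back to always-on listening.
            self.state = State.WAIT_WAKE_WORD
        return None


# Hypothetical wake word and commands, for illustration only.
pipeline = VoicePipeline("hey nxp", ["open trunk", "preheat oven"], timeout_frames=2)
for token in ["open trunk", "hey nxp", "noise", "open trunk"]:
    result = pipeline.process(token)
    if result is not None:
        print("command:", result)
```

In this sketch a command spoken before the wake word is ignored, which is the behavior the always-on gate is meant to provide.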

VIT simplifies the developer's job and streamlines the development process by providing a comprehensive, flexible software solution that incorporates everything needed to create on-device voice control applications with no need for the complexities of cloud connectivity.

VIT Key Features:

Always-on technology
Custom command and wake word creation
Far-field audio front end supporting different microphone topologies with up to three microphones (no tuning required)
Voice activity detection that helps minimize processing load during silent, non-speech periods
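As a rough illustration of the voice activity detection idea in the list above, the sketch below flags a frame as speech when its RMS energy crosses a fixed threshold, so later stages could be skipped during silence. This is a generic textbook approach, not VIT's detector; the function names, the 160-sample frame length and the 0.02 threshold are assumptions made for the example.

```python
import math


def frame_rms(samples):
    """Root-mean-square level of one frame of float samples in [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def detect_speech(signal, frame_len=160, threshold=0.02):
    """Return one boolean per full frame: True when the frame is loud enough.

    frame_len and threshold are illustrative values, not tuned parameters.
    Trailing samples that do not fill a whole frame are ignored.
    """
    flags = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        flags.append(frame_rms(frame) >= threshold)
    return flags


# One silent frame, one frame of a 440 Hz tone at 16 kHz, one silent frame.
silence = [0.0] * 160
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(160)]
flags = detect_speech(silence + tone + silence)
print(flags)
```

A downstream wake-word engine would then only need to run on the frames flagged `True`.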

VIT uses state-of-the-art deep learning technology to help developers create and program voice command vocabularies. The VIT tool maps user-entered text commands to phoneme sequences and generates a downloadable model file for your target device software. Speech commands are processed using ML and deep learning technologies to create the neural network model.
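The text-to-phoneme step can be pictured with a small sketch. The code below is not how the VIT tool is implemented; it only shows the general idea of looking up each word of a typed command in a pronunciation lexicon. `TOY_LEXICON` and `command_to_phonemes` are hypothetical names, and the three entries use ARPAbet-style symbols purely for illustration.

```python
# Tiny hand-written pronunciation lexicon (illustrative, ARPAbet-style symbols).
TOY_LEXICON = {
    "open": ["OW", "P", "AH", "N"],
    "the": ["DH", "AH"],
    "trunk": ["T", "R", "AH", "NG", "K"],
}


def command_to_phonemes(command, lexicon=TOY_LEXICON):
    """Map a typed command to a flat phoneme sequence, word by word.

    Raises KeyError if a word has no entry in the lexicon.
    """
    phonemes = []
    for word in command.lower().split():
        if word not in lexicon:
            raise KeyError(f"no pronunciation for {word!r}")
        phonemes.extend(lexicon[word])
    return phonemes


print(command_to_phonemes("Open the trunk"))
```

A real tool would need a far larger lexicon or a learned grapheme-to-phoneme model to cover arbitrary commands; the lookup table here only stands in for that step.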

The VIT far-field AFE supports different microphone topologies with no tuning required, as well as local voice command recognition with on-device processing. With VIT’s text-to-model approach, it’s easy to make custom versions of wake words and commands.

Broad Edge Processing Platform Support

VIT software is available on several popular NXP i.MX edge processing platforms based on Arm® Cortex®-M7 and M33 cores and Cadence® Xtensa® HiFi 4 and Fusion F1 cores. VIT is currently supported on the following i.MX crossover MCU platforms, with support for other products to come in the future.

i.MX RT500 MCUs with M33, DSP and GPU cores
i.MX RT600 MCUs with M33 and DSP cores
i.MX RT1060 MCUs with an M7 core
i.MX RT1160 MCUs with M7 and M4 cores
i.MX RT1170 MCUs running at up to 1 GHz with M7 and M4 cores

Tools to Accelerate Time to Market

VIT software is easy to use, removing barriers to entry for developing on-device voice applications on edge devices. To accelerate time to market, we provide a comprehensive development environment including our popular MCUXpresso SDK and fully functional example applications, enabling quick evaluation of voice control on target MCU platforms.

Online training tools for VIT software are available to customers free of charge, regardless of end application production volumes. These online tools enable developers to define custom wake words and voice commands using simple text entry, with no need for voice recordings.

The Intelligent Way to Add Voice

Add affordable, easy-to-use local voice control to your next edge application. VIT is now available at zero cost to NXP customers as a comprehensive, ready-to-use library in the MCUXpresso SDK. For more information and access to our VIT model generation tool, visit www.nxp.com/vit.

NXP offers a range of voice control and communications software and system solutions that provide high-quality, reliable embedded speech processing for human-to-human and human-to-machine voice applications.

Tags: Edge Computing, Technologies

Author

Chris Welsh

Director of Business Development for Voice and Audio, IoT Segment, Edge Processing Business Line

Chris joined NXP in July 2021 as part of the acquisition of Retune DSP, where he was a Partner. Chris is focused on creating customer value through differentiated voice software technology and services. Chris brings to NXP more than 25 years of experience in the Embedded Voice and Audio business as an Engineer, Business Developer, Founder, General Manager and Executive at companies including AT&T, Lucent Technologies, MWM Acoustics, Harman International and Retune DSP. Chris holds a BSME from Purdue University and an MSME with a specialization in Acoustics from Pennsylvania State University.
