Voice Processing

Comprehensive software that enables voice processing at the edge.

NXP embedded voice communication suite.

NXP offers a range of voice control, audio and communications software and solutions that provide high-quality, reliable embedded speech processing for human-to-human and human-to-machine local voice applications. NXP voice communication software offerings are designed for small-footprint, low-power applications running on our portfolio of MCUs, MPUs and DSPs.

Voice Processing Applications

Industrial

Consumer

Design Resources

Development Boards and Designs

EdgeReady Voice Solutions

A complete, production-grade software and hardware platform, certified by NXP, for fast development of turnkey solutions.

Documentation

Software for Voice Processing at the Edge

NXP delivers reliable voice, audio and comms solutions for human and machine speech processing.

Fact Sheet

Sep 19, 2023

Rev 1

FAQs

How do I get started with VIT?

The VIT Wake Word and Voice Command Engines are available through NXP's online tools and the MCUXpresso SDK. For VIT Speech to Intent, please contact us at voice@nxp.com with your specific requirements.

Does NXP have voice software application examples?

Yes, visit our application software pack page or our Application Code Hub. You can also view demo videos showcasing our voice software.

What is the difference between voice UI and voice communications?

Voice UI refers to “voice-first” devices that use voice as a user interface. NXP's Voice UI software technologies are VIT, VoiceSpot and VoiceSeeker.

Voice communications refers to two-way, person-to-person communication using voice, i.e., telephony. NXP's voice communications software technology is Conversa.

What is the difference between VoiceSpot and VIT? When should you use one versus the other?

VoiceSpot is a highly accurate, highly optimized wake word and acoustic event detection engine. It is based on deep learning neural network techniques and requires large datasets for training. VoiceSpot is appropriate for customers who need the highest response rates with the fewest false alarms. It also suits customers who need to run in ultra-low-power states while waiting for a voice or acoustic trigger.

The VIT software suite is built on phoneme-based automatic speech recognition (ASR) technology. This technology maps spoken phonemes (the basic building blocks of speech) into words, which can then be recognized as wake words and commands and transformed into intents and actions. Because VIT is based on phonemes, wake word and command models can be created quickly by typing the desired phrases into NXP's online model creation tools.

The VIT Wake Word and Voice Command Engines are appropriate for customers who want to build custom wake words and voice commands independently, or who want to experiment quickly with voice as a user interface. VIT Speech to Intent is for customers who want to create an experience similar to natural language understanding on edge processors, without cloud connectivity or cloud ASR transcription services.
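The phoneme-to-word-to-intent flow described above can be illustrated with a small toy sketch. This is not the VIT API or its recognition algorithm, neither of which is described here. The lexicon, the ARPAbet-style phoneme spellings, the intent names and the function names are all hypothetical, chosen only to show the idea of mapping a phoneme sequence to words and then to an intent.

```python
# Toy illustration only: NOT the VIT API or its actual algorithm.
# All phoneme spellings, words and intent names below are hypothetical.

# Hypothetical lexicon: phoneme sequence -> word
LEXICON = {
    ("HH", "EY"): "hey",
    ("L", "AY", "T", "S"): "lights",
    ("AA", "N"): "on",
    ("AO", "F"): "off",
}

# Hypothetical command table: word sequence -> intent
INTENTS = {
    ("lights", "on"): "LIGHTS_ON",
    ("lights", "off"): "LIGHTS_OFF",
}

WAKE_WORD = "hey"  # assumed wake word for this sketch


def phonemes_to_words(phonemes):
    """Greedy longest-match segmentation of a phoneme sequence into words."""
    words = []
    i = 0
    max_len = max(len(key) for key in LEXICON)
    while i < len(phonemes):
        # Try the longest candidate first, then shorter ones.
        for n in range(min(max_len, len(phonemes) - i), 0, -1):
            key = tuple(phonemes[i:i + n])
            if key in LEXICON:
                words.append(LEXICON[key])
                i += n
                break
        else:
            i += 1  # no known word starts here; skip this phoneme
    return words


def recognize(phonemes):
    """Return an intent if the wake word is followed by a known command."""
    words = phonemes_to_words(phonemes)
    if not words or words[0] != WAKE_WORD:
        return None  # no wake word, so ignore the utterance
    return INTENTS.get(tuple(words[1:]))


if __name__ == "__main__":
    utterance = ["HH", "EY", "L", "AY", "T", "S", "AA", "N"]
    print(recognize(utterance))
```

Because the models in this sketch are plain text tables, adding a command means adding an entry rather than collecting audio and retraining. That is the property the phoneme-based approach relies on for quick, keyboard-driven model creation.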

What is VoiceSeeker and how do you use it?

VoiceSeeker is a multi-microphone beamforming audio front-end signal processing solution for voice user interfaces. VoiceSeeker discriminates between signal and noise and is especially effective in far-field, reverberant conditions.

VoiceSeeker is offered in a standard free-to-use option and a premium option. The standard option, without an acoustic echo canceler (AEC), is freely available via NXP's MCUXpresso SDK and integrates easily with VoiceSpot or VIT. The premium option includes an AEC and is available via controlled distribution from NXP.

VoiceSeeker is frequently used in far-field voice control applications such as smart speakers and home controllers, but it can also be used in the mid- and near-field where interfering noise needs to be canceled.
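To give a feel for what multi-microphone beamforming does, here is a minimal sketch of delay-and-sum, the simplest textbook beamformer. VoiceSeeker's actual algorithm is not described here and is likely far more sophisticated. The function name, the integer sample delays and the test signal are assumptions made for illustration only.

```python
# Textbook delay-and-sum beamformer; NOT VoiceSeeker's algorithm or API.
# Integer sample delays are assumed to be known for the target direction.

def delay_and_sum(channels, delays):
    """Align each microphone channel by its sample delay, then average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   arrival delay (in samples) of the target at each microphone.
    """
    n = len(channels[0])
    out = []
    for t in range(n):
        acc = 0.0
        for samples, delay in zip(channels, delays):
            idx = t + delay  # advance later-arriving channels to line them up
            if 0 <= idx < n:
                acc += samples[idx]
        out.append(acc / len(channels))
    return out


if __name__ == "__main__":
    # The same pulse reaches mic 1 one sample later than mic 0.
    mic0 = [0, 1, 0, -1, 0, 0]
    mic1 = [0, 0, 1, 0, -1, 0]

    steered = delay_and_sum([mic0, mic1], delays=[0, 1])    # aimed at source
    unsteered = delay_and_sum([mic0, mic1], delays=[0, 0])  # no alignment

    print(max(abs(x) for x in steered))
    print(max(abs(x) for x in unsteered))
```

When the delays match the direction of the talker, the channels add coherently and the pulse keeps its full amplitude. With mismatched delays the same pulse is smeared and attenuated, which is how a beamformer favors sound from one direction over noise from others.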

Voice Processing Products

Voice Processing Software Portfolio

Audio Processing

Audio Front End

Conversational AI

Voice Call

Voice User Interaction

Speech Enhancement