Voice Processing
Comprehensive software that enables voice processing at the edge.
NXP offers a range of voice control, audio and communications software and solutions that provide high-quality, reliable embedded speech processing for human-to-human and human-to-machine local voice applications. NXP voice communication software offerings are designed for small-footprint, low-power applications running on our portfolio of MCUs, MPUs and DSPs.
Advanced audio tools for playback and tuning: equalizers, 3D sound, bass/treble enhancement, limiters, and stereo PCM support.
Optimized for wake word, ASR and AI chat with high-pass filtering, beamforming and acoustic echo cancellation.
End-to-end voice AI: wake-word detection, ASR, RAG-enhanced LLM for context-aware responses, and TTS for natural speech output.
Complete AI pipeline: wake-word detection, ASR, RAG-enhanced LLM for smart responses, TTS output, and chatbot fine-tuning from manuals.
Comprehensive voice solutions: wake word detection, voice commands, speech-to-intent, ASR transcription, and TTS conversion.
Smart noise reduction and echo cancellation for clear speech in one-way or full-duplex communication, with small and large AI models.
Software for Voice Processing at the Edge: NXP delivers reliable voice, audio and comms solutions for human and machine speech processing. | Fact Sheet | Sep 19, 2023 | Rev 1
VIT wake word and Voice Command Engine can be accessed through online tools and our MCUXpresso SDK. For VIT Speech to Intent, please contact us at voice@nxp.com with your specific requests.
Yes, visit our application software pack page or our Application Code Hub. You can also view demo videos showcasing our voice software.
Voice UI refers to “voice-first” devices that use voice as a user interface. NXP's Voice UI software technologies are VIT, VoiceSpot and VoiceSeeker.
Voice communications refer to two-way person-to-person communication using voice; i.e., telephony. NXP's Voice communications software technology is Conversa.
VoiceSpot is a highly accurate, highly optimized wake word and acoustic event detection engine. It is based on deep learning neural network techniques and requires large datasets for training. VoiceSpot suits customers who need the highest response rates with the fewest false alarms, as well as customers who need to run in ultralow power states while waiting for the voice or acoustic trigger.
The VIT software suite is built on phoneme-based automatic speech recognition technology. This technology maps spoken phonemes (the basic building blocks of speech) into words, which can then be recognized as wake words and commands and transformed into intents and actions. Because VIT is based on phonemes, wake word and command models can be created quickly with a keyboard and NXP's online model creation tools. VIT wake word and Voice Command Engines are appropriate for customers who want to build custom wake words and voice commands independently or who want to experiment quickly with voice as a user interface. VIT Speech to Intent is for customers who want to create an experience similar to natural language understanding on edge processors, without cloud connectivity or cloud ASR transcription services.
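To make the phoneme-to-command idea concrete, here is a minimal sketch of matching a stream of recognized phonemes against a typed-in command list. It is an illustration only and does not reflect VIT's model format, tools or API: the lexicon, the ARPAbet-style phoneme spellings and the function name `spot_phrases` are all assumptions made for this example.

```python
# Hypothetical lexicon: phoneme sequences (approximate ARPAbet-style spellings)
# mapped to command labels. This is NOT the VIT model format.
LEXICON = {
    ("L", "AY", "T", "S", "AA", "N"): "LIGHTS ON",
    ("L", "AY", "T", "S", "AO", "F"): "LIGHTS OFF",
}


def spot_phrases(phonemes, lexicon):
    """Scan a phoneme stream and return (start_index, label) for each match.

    At every position the longest matching lexicon entry wins; phonemes
    that start no entry (silence, unknown speech) are skipped.
    """
    max_len = max(len(key) for key in lexicon)
    hits = []
    i = 0
    while i < len(phonemes):
        for n in range(min(max_len, len(phonemes) - i), 0, -1):
            key = tuple(phonemes[i:i + n])
            if key in lexicon:
                hits.append((i, lexicon[key]))
                i += n  # continue after the matched phrase
                break
        else:
            i += 1  # no entry starts here; move on one phoneme
    return hits


# Example stream: silence, "lights on", silence, "lights off".
stream = ["SIL", "L", "AY", "T", "S", "AA", "N",
          "SIL", "L", "AY", "T", "S", "AO", "F"]
print(spot_phrases(stream, LEXICON))
```

Because the commands are just phoneme sequences, adding a new command in this sketch means adding one dictionary entry rather than collecting and training on new audio, which is the property the paragraph above attributes to a phoneme-based approach.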
VoiceSeeker is a multi-microphone beamforming audio front-end signal processing solution for voice user interfaces. VoiceSeeker discriminates between signal and noise and is especially effective in far-field, reverberant conditions. VoiceSeeker is offered in a standard free-to-use option and a premium option. VoiceSeeker without AEC is freely available via NXP's MCUXpresso SDK and integrates easily with VoiceSpot or VIT. The premium VoiceSeeker option includes an acoustic echo canceler (AEC) and is available via controlled distribution from NXP. VoiceSeeker is frequently used in far-field voice control applications such as smart speakers and home controllers, but it can also be used in the mid- and near-field where interfering noise needs to be canceled.
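For readers new to beamforming, the simplest textbook form is delay-and-sum: each microphone signal is time-aligned for a chosen look direction and the aligned signals are averaged, so sound from that direction adds coherently while sound from elsewhere partly cancels. The sketch below shows that generic technique only. It is not VoiceSeeker's algorithm or API, and the function name `delay_and_sum` and the integer-sample delays are assumptions made for illustration.

```python
def delay_and_sum(channels, delays):
    """Generic delay-and-sum beamformer (illustrative, integer delays only).

    channels: list of equal-length sample lists, one per microphone.
    delays:   per-microphone arrival delay in samples (non-negative ints)
              for the look direction, relative to the earliest microphone.
    """
    if len(channels) != len(delays):
        raise ValueError("need exactly one delay per channel")
    n = len(channels[0])
    out = []
    for t in range(n):
        acc = 0.0
        for samples, d in zip(channels, delays):
            idx = t + d  # undo the arrival delay of this microphone
            if idx < n:  # samples past the end are treated as zero
                acc += samples[idx]
        out.append(acc / len(channels))
    return out


# Two-microphone example: the second microphone hears the source one sample later.
mic0 = [0, 1, 0, -1, 0, 0]
mic1 = [0, 0, 1, 0, -1, 0]

steered = delay_and_sum([mic0, mic1], [0, 1])    # aligned: source preserved
unsteered = delay_and_sum([mic0, mic1], [0, 0])  # misaligned: peaks halved
print(steered)
print(unsteered)
```

With the correct delays the output reproduces the source at full amplitude, while the misaligned sum halves its peaks. Production front ends typically add fractional delays, adaptive weighting and noise estimation on top of this basic idea.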