Nowadays, from smart televisions to Bluetooth speakers, a multitude of electronic products can be controlled by voice. New-generation multi-modal human-machine interfaces allow hands-free operation of digital devices through voice control. Voice interfaces tend to be quicker, more reliable and less invasive than touch or mechanical button interfaces because users do not need to take their eyes and hands off their primary task.

Voice control has become ubiquitous as its applications have spread across many types of everyday products. According to a survey by Meticulous Research, the overall voice and speech recognition market is expected to grow at a CAGR of 17.2% from 2019 to 2025, reaching $26.8 billion by 2025.

Voice Control in the automotive sector

The automotive sector is an early beneficiary of voice control through the integration of voice-enabled infotainment and ADAS systems. The basic form of voice control, through a predefined set of commands like ‘increase temperature’ or ‘play next song’, has become an essential requirement in mass market vehicles. Meanwhile, luxury cars are equipped with AI-enabled systems that interpret indirect commands from the natural language utterances of drivers and passengers. For example, when a passenger says “I feel cold”, the multi-zone comfort system increases the temperature setting in the passenger zone, and if the driver says “I need some coffee”, the car’s AI system offers navigation to nearby coffee shops. Certain systems even sense the sirens of approaching emergency vehicles and assist drivers in changing lanes or slowing down.

Voice Control in the healthcare sector

Healthcare is another segment where voice control is emerging rapidly. In an operating room, voice-enabled technical equipment reduces the need for manual intervention by the surgeon and helps keep their hands and eyes on the patient. In sensitive diagnostic environments involving X-ray machines, MRI scanners and so on, operators can control the devices through voice. Speech-to-text conversion even frees doctors from the burden of writing prescriptions by hand.

Voice Control in other sectors

Though still at a nascent stage, several other industries have also started researching and implementing voice control systems. For example:

  • Aerospace – Aircraft cockpits have intricate instrument panels and traditional displays that make a pilot’s job very complex. Adding voice control to these systems in the cockpit will reduce the pilot’s workload significantly.
  • Energy – Automation of IoT devices is a well-known application of voice control. Intelligent power grid control systems are expected to be equipped with voice control in the near future.
  • Consumer Electronics (CE) – For televisions, smart speakers, refrigerators, air conditioners and other consumer electronic devices, the end user’s preferred mode of operation has changed to voice. With voice, users can operate these devices while they are occupied with other tasks such as cooking or watching TV. Hence, CE manufacturers now consider voice control a standard feature.
  • Manufacturing – In addition to enabling hands-free and remote work environments, voice control can also help workers search for information in digital manuals. For example, during troubleshooting, initial machine setup or maintenance activities, a technician could start a Google search with just a voice prompt.
  • Rail – Train drivers can benefit from the ease of voice control just as pilots can in an aircraft cockpit. Voice-controlled ticket vending machines will be a boon to blind and visually impaired passengers.

How voice-controlled solutions work

The fundamental idea of voice control involves capturing voice signals, applying noise cancellation, running speech recognition and natural language processing algorithms on the cleaned signals, and then either generating intents for device controller applications or simply displaying or speaking a response to the user. A voice session can be initiated either by pressing a ‘Push To Talk’ button or by the user speaking a ‘Wakeword’.
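The session flow described above can be illustrated with a minimal sketch. It assumes that audio capture, noise cancellation and speech recognition have already produced a text transcript, and it replaces natural language processing with a simple lookup of predefined commands. The wakeword "hello car", the intent names and the `VoiceSession` class are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical wakeword; real products define their own.
WAKEWORD = "hello car"

# Hypothetical mapping from predefined commands to intents
# consumed by a device controller application.
COMMAND_INTENTS = {
    "increase temperature": "CLIMATE_TEMP_UP",
    "play next song": "MEDIA_NEXT_TRACK",
}


@dataclass
class VoiceSession:
    """Tracks whether a voice session is open and resolves intents."""
    active: bool = False

    def push_to_talk(self) -> None:
        # A 'Push To Talk' button opens the session without a wakeword.
        self.active = True

    def handle_transcript(self, transcript: str) -> Optional[str]:
        """Return an intent for a recognized transcript, or None if the
        utterance is ignored or only opened the session."""
        text = transcript.lower().strip()
        if not self.active:
            if not text.startswith(WAKEWORD):
                return None  # no session open: ignore the utterance
            self.active = True
            # Drop the wakeword and any separator that follows it.
            text = text[len(WAKEWORD):].strip(" ,")
            if not text:
                return None  # wakeword only: wait for the command
        intent = COMMAND_INTENTS.get(text, "UNKNOWN")
        self.active = False  # one command per session in this sketch
        return intent


session = VoiceSession()
print(session.handle_transcript("Play next song"))             # ignored
print(session.handle_transcript("Hello car, play next song"))  # wakeword
session.push_to_talk()
print(session.handle_transcript("Increase temperature"))       # button
```

A production system would replace the dictionary lookup with a speech recognition and natural language understanding engine, but the gating of commands behind a wakeword or button press would follow the same pattern.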


The voice processor software may be embedded locally or deployed on a cloud server. Locally embedded voice apps respond very quickly, but they are highly resource intensive and may affect overall system performance. If the voice-processing solution is deployed on a cloud server instead, internet connectivity issues might interrupt the user’s voice experience.

Therefore, some vendors also consider hybrid solutions. In such systems, a lightweight voice processor engine that supports a minimal set of features is deployed locally as a fallback, to be used when the fully functional, natural-language-capable cloud solution is inaccessible.
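A hybrid arrangement of this kind can be sketched as follows. This is only an illustration under stated assumptions: the cloud engine is passed in as a plain function, the `CloudUnavailable` exception, the stub engines and the intent names are hypothetical, and the local engine understands nothing beyond a small set of fixed commands.

```python
from typing import Callable, Optional, Tuple


class CloudUnavailable(Exception):
    """Raised by the (hypothetical) cloud client when it is unreachable."""


# Minimal feature set supported by the local fallback engine.
LOCAL_COMMANDS = {
    "increase temperature": "CLIMATE_TEMP_UP",
    "play next song": "MEDIA_NEXT_TRACK",
}


def local_engine(transcript: str) -> Optional[str]:
    """Lightweight engine: exact match on predefined commands only."""
    return LOCAL_COMMANDS.get(transcript.lower().strip())


def resolve_intent(
    transcript: str, cloud_engine: Callable[[str], str]
) -> Tuple[Optional[str], str]:
    """Prefer the cloud engine; fall back to the local one on failure.

    Returns the intent (or None) and the engine that produced it.
    """
    try:
        return cloud_engine(transcript), "cloud"
    except CloudUnavailable:
        return local_engine(transcript), "local"


# Stub cloud engines standing in for a real service.
def fake_cloud(transcript: str) -> str:
    # Pretend the cloud understands an indirect, natural language request.
    if "cold" in transcript.lower():
        return "CLIMATE_TEMP_UP"
    return "UNKNOWN"


def offline_cloud(transcript: str) -> str:
    raise CloudUnavailable("no internet connectivity")


print(resolve_intent("I feel cold", fake_cloud))
print(resolve_intent("Play next song", offline_cloud))
print(resolve_intent("I feel cold", offline_cloud))
```

In this sketch an indirect request such as "I feel cold" is only understood while the cloud engine is reachable, whereas the predefined commands keep working offline.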

The implementation approaches for voice-controlled solutions fall broadly into two categories:

  • Develop an independent custom voice processor engine using automatic speech recognition SDKs provided by vendors like Cerence, Sensory, Sestek and LumenVox.
  • Integrate virtual voice assistants (such as Amazon Alexa, Google Assistant and Apple Siri) with their built-in skill sets, and develop additional skills if required.

The challenges

Despite the advantages, the development and testing of a new generation voice control system has several complexities such as:

  • Millions of command combinations are possible in systems that support multiple languages and natural language understanding. The system has to identify all such commands accurately.
  • Support for context-aware multi-turn conversations
  • Support for proactive suggestions to the customer based on system behavior
  • Co-existence of multiple voice assistants
  • Data privacy challenges

However, service providers with relevant development and testing experience in voice control domains can help to address these challenges through custom tailored solutions.

QuEST’s Capabilities in Voice Control

QuEST’s two decades of experience in software include ample exposure to integrating and testing embedded as well as cloud-based voice control solutions. QuEST’s expertise in embedded software, cloud computing and machine learning further strengthens its position as a voice control software solution provider.

In certain customer projects for infotainment, QuEST has handled the integration of a voice assistant (Alexa). The use cases in such projects include voice control of infotainment functions (e.g. temperature adjustment, playback control of local media sources) as well as access to remote media sources (like Amazon Music and Spotify), navigation and weather features through the Alexa app in the infotainment system. They also include control of smart home devices from the car through an Alexa voice skill deployed on a cloud server.

Questians have worked on the development of targeted Virtual Digital Assistance technologies (VDAs) that help businesses enhance transaction activities or optimize call center operations, thereby reducing operational costs across automotive, smart home systems, insurance and robotics.

QuEST also offers development of language translation solutions for public domains such as mass transit systems, signboards and public announcement (PA) systems. These solutions translate text or voice instructions from the local language into the language the user has chosen on their smartphone, smart glasses or other device.

Written by Divya MS

on 15 Dec 2021

Senior Technical Architect

Automotive Vertical