Voice UI: Overhyped or Overdue?




Since HAL in Arthur C. Clarke's "2001: A Space Odyssey" and Star Trek's "Computer", and perhaps before, we have been circling ever closer to voice-UI, often more in aspiration and desire than in reality.  The question is: with the emergence and refinement of Alexa, Siri, Cortana and "OK, Google", has voice-UI finally arrived?

The reality is that we are in the early stages of the 'long nose' of innovation.  But now that voice-UI can migrate out of the curated environments of movies and television, we have a different question: "How useful is voice-UI?"

In an open-plan office, would everyone be dictating their Word documents to their computers?
Would you really expect a coder to say "for open bracket i equals zero, i is less than collection length, i plus plus"?
Can you effectively use a voice-UI in a nightclub?

It is worthwhile looking back on previous UI innovations such as the mouse and the touch screen.  Experience with these suggests that voice is likely to complement existing input devices rather than replace them.  In fact, voice may be the more limited of the three, because the scenarios in which it is useful are narrower.

We need to be cognisant of some of the limitations of voice:
  • There are privacy concerns with the data being sent to 'the cloud' for analysis, recognition and interpretation
  • Voice recognition is slower than other input (for now)
  • Listening is far slower than reading, especially when it comes to scanning through text

However, there are "eyes-busy" and "hands-busy" scenarios where voice-UI is preferable:
  • Driving
  • Cooking
  • Manual labour in 'quiet-ish' environments

Most information sharing today has to be done in the same medium, agreed by all parties:
  • Phone conferences: everyone participates by voice
  • Email and instant messaging: everyone participates by text

Things start to become transformational when people can take part in the same communication exchange while each using a different medium to do so.
