A personal history of voice-recognition software

Speaking to your computer is now normal. I've been doing it for decades.

By Diane Feldman, with additional reporting by Matt Feldman


Most people use voice-recognition apps for productivity or convenience. If they have their hands full, they ask Siri what the weather is or they ask Alexa to set a timer. Others may try the technology simply out of curiosity.

If you are able-bodied, you can most likely perform the following tasks with ease: using the keyboard and mouse on your computer, answering the phone, or walking to the thermostat to lower the temperature. If, however, you have physical impairments, the only way to complete such tasks may be through voice-recognition software. People with spinal cord injuries or chronic diseases, for example, may be unable to type and so rely on popular apps to do those same tasks.

Some of the more common mainstream products are Siri, Google Assistant, Alexa, and, of particular interest to me, Dragon NaturallySpeaking. Through improved accuracy and innovation, each of these products is well on its way to “can’t live without” status, though each comes with tradeoffs.

Siri is on all Apple devices, including my iPhone, and as such is everywhere I am. It is well integrated with other Apple programs but is not as seamless with third-party apps as other voice products are. For instance, it will help me make a grocery list but can’t actually order the groceries for me like Alexa can.

The rather popular Alexa uses its Echo smart speaker to control home devices such as lights, thermostats and entertainment systems.  And if you need anything from Amazon, just tell Alexa to order it.

The Google Assistant, as you would expect, has all of the answers.  Just ask your questions out loud, rather than type them, and the answers with references will follow.  I also use it to control my smart home through integration with my Nest thermostat.

Each of these voice assistants is useful in different ways. Siri lets me type quickly on my phone without touching the screen, Alexa makes it easy to order supplies from the Amazon mothership, and the Google Assistant makes it easy to look up references. Though all three of these technologies are just a few years old, I’ve actually been using voice-recognition software for decades. 

Dragon NaturallySpeaking lets me use voice recognition to do just about everything that anyone else can do on a Windows computer. The marquee feature of Dragon is the MouseGrid, which divides the screen into boxes. A user selects one of the numbered tic-tac-toe boxes with their voice, and that box then becomes further subdivided. Drill down enough times to get the cursor exactly where you want it. Whereas the previously mentioned voice assistants condense a sequence of clicks, taps, swipes, and keystrokes into a single request (saying “play The Beatles,” as opposed to double-clicking iTunes, searching for “The Beatles,” and clicking the play button), Dragon translates the granular acts of clicking, typing, and scrolling for users who rely on voice commands. In some ways this can be more tedious, but in other ways it’s also more flexible. The PC software I use doesn’t need complex integration with Dragon to be compatible with it.
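For readers curious about how that drill-down narrows to a single point, here is a minimal sketch in Python. It is not Dragon's actual code. The function name `mouse_grid_target` is hypothetical, and the numbering of the boxes from 1 to 9, left to right and top to bottom, is an assumption made for illustration.

```python
def mouse_grid_target(width, height, choices):
    """Return the (x, y) center of the box reached after a series of picks.

    Hypothetical illustration of a MouseGrid-style drill-down, not Dragon's
    implementation. Assumes boxes are numbered 1-9, left to right and top
    to bottom, like a tic-tac-toe board.
    """
    left, top = 0.0, 0.0
    box_w, box_h = float(width), float(height)

    for choice in choices:
        if not 1 <= choice <= 9:
            raise ValueError("each choice must be a number from 1 to 9")
        # Convert the spoken number into a row and column of the 3x3 grid.
        row, col = divmod(choice - 1, 3)
        # The chosen box is one third the width and height of the current one.
        box_w /= 3
        box_h /= 3
        left += col * box_w
        top += row * box_h

    # The cursor lands in the middle of the last box selected.
    return (left + box_w / 2, top + box_h / 2)


# Saying "nine", then "one", on a 900-by-900 area:
print(mouse_grid_target(900, 900, [9, 1]))  # → (650.0, 650.0)
```

Each spoken number shrinks the target area to a ninth of its previous size, so a few picks are usually enough to reach a small button.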

Due to a progressive disease, I lost use of my arms and hands in the late 1990s. I was a software architect designing and testing a new product then under development. When I had to create what turned out to be a 40-page testing specification, my boss hired a temporary secretary for me and I spent all day dictating the document to her. It was then that I decided I had to master voice-recognition software or stop working. At that point, Dragon NaturallySpeaking was at version 3 – it is now at version 15.

Dragon allowed me to continue working for several more years. To this day, I say that I am most independent on my computer. 

But Dragon isn’t the only major advance to support people with mobility impairments. When Apple released iOS 7 in 2013, it added Switch Control to its suite of accessibility features. Switch Control interfaces with simple push-button devices that connect to the iPhone by wire or Bluetooth. This lets someone without the dexterity a touchscreen often requires operate an iPhone. (Similar Switch Access functionality was first offered by Android in 2014.)

My husband and children all had smartphones by 2010, but I was unable to join them in this increasingly digital world. At the time, I was using a power wheelchair and controlled it with a head array, a high-tech headrest. My wheelchair has a few different modes: a standard driving mode, one that adjusts the seating position, and one that raises the seat (the last one is very useful for crowded parties). Over five years ago, I read about a device called a TeclaShield, which provides a Bluetooth connection between the head array on my chair and the iPhone. This allows me to put my chair into yet another mode that controls my iPhone. With all these interconnected components, I can make phone calls, field text messages, read books and newspapers, and perform lots of other activities that people do on their phones. Demonstrations of my capabilities often include a hands-free call to my husband – this never disappoints my audience.

Not every app works perfectly – it depends on how thoroughly the developer has incorporated the iPhone accessibility functions. But there’s much more that I can do than I cannot. I have also tried Google’s Pixel phone, but find that I prefer Apple’s Switch Control to Android’s Switch Access. I use a combination of Switch Control and Siri to accomplish tasks on my phone. If it’s noisy, I use Switch Control to type a message. But I will use the more expedient Siri and the hands-free editing capabilities whenever I can.

According to the federal government’s National Council on Disability, “the more reliant society becomes on technology to perform fundamental aspects of everyday living – how we work, communicate, learn, shop, and interact with our environment – the more imperative it is that people with disabilities have access to that same technology.” I agree. In some ways, voice-recognition technology can be a great equalizer. If humans speak about 150 words per minute but can only type 40, technology that allows speaking to a computer makes all who use it more productive. This technology should be available to everyone – those with and those without disabilities – and voice assistants help normalize this mode of interaction.

And yes, this article was written without ever touching the keyboard.


Diane and Matt Feldman are BNet readers from New Jersey. They are also Brian’s parents. Diane’s homework for the summer is to try to get Voice Control working on Matt.