sciencentral news : making sense of science
April 7, 2013

Vocal Joystick


It's a device that could open a whole new world to people who are paralyzed, and simplify some tasks for the rest of us. As this ScienCentral News video explains, researchers are developing a voice-activated alternative to the computer mouse, something they're calling a "vocal joystick."

Open Wide and Say "Aaahh"

With nothing more than a consumer-grade computer microphone and some very special software, Richard Eldridge is able to guide a computer's cursor across a screen. Like someone warming up to sing, Eldridge says "aaaaaa" for a couple of seconds as the cursor travels up, then "eeeeeee" to move the cursor left, and a "ch" sound for a mouse click. The cursor moves as long as Eldridge sustains the sound.
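The behavior described above can be sketched as a simple sound-to-motion loop. The specific vowels and the direction vectors below are assumptions for illustration; the real Vocal Joystick maps vowel quality to direction continuously, not through a fixed lookup table.

```python
# A minimal sketch of the sound-to-direction mapping described above.
# The vowel labels and (dx, dy) vectors are illustrative assumptions.

SOUND_TO_ACTION = {
    "aa": (0, -1),   # "aaaaa" -> cursor moves up (smaller y in screen coords)
    "ee": (-1, 0),   # "eeeee" -> cursor moves left
}
CLICK_SOUND = "ch"   # a short "ch" acts as a mouse click

def step_cursor(pos, sound):
    """Advance the cursor one small step while `sound` is being held."""
    dx, dy = SOUND_TO_ACTION.get(sound, (0, 0))
    return (pos[0] + dx, pos[1] + dy)

pos = (100, 100)
for _ in range(5):          # user holds "aa" across five frames
    pos = step_cursor(pos, "aa")
print(pos)                  # -> (100, 95): cursor moved up five pixels
```

Because movement is tied to how long the sound is held, the user gets continuous control rather than fixed-distance jumps.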

This is important for Eldridge because a car accident left the psychology student mostly paralyzed from the neck down. "It's more efficient," says Eldridge, comparing the vocal joystick with older programs he has used, which rely on spoken commands such as "move left" to move the cursor a set distance.

Jeff Bilmes, associate professor of electrical engineering at the University of Washington, is the creator of the vocal joystick. He says current speech recognition software attempts to replace the keyboard, but that there hadn't been much work "to essentially replace the mouse, using your voice."

The challenge is to make the vocal joystick start and stop instantly, because, as Bilmes explains, "if you are in the middle of drawing, you don't want to see what you're drawing delayed by a couple of seconds."

Current speech recognition programs may take a few seconds to process a voice command, making it impossible to move a cursor to a precise location. Of the vocal joystick, Bilmes says, "about a hundred times a second it takes essentially a snapshot of your voice and figures out what you're saying at the moment."
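The difference Bilmes describes can be pictured as a frame loop: rather than recognizing a whole utterance and then acting, a vowel classifier runs on each short audio frame, roughly 100 per second, and nudges the cursor immediately. The classifier below is a trivial stand-in, not the real system's model.

```python
# A hedged sketch of frame-by-frame processing: classify each ~10 ms
# audio frame and move the cursor at once, with no utterance-level delay.

FRAME_RATE = 100            # "snapshots" per second, per the article

def classify_frame(frame):
    """Stand-in classifier: pretend positive energy means the 'up' vowel."""
    return "aa" if sum(frame) > 0 else None

def process(frames, velocity=1.0):
    """Accumulate vertical cursor motion over a stream of frames."""
    y = 0.0
    for frame in frames:
        if classify_frame(frame) == "aa":   # upward vowel held this frame
            y -= velocity / FRAME_RATE      # small, immediate step
    return y

# One second of sustained "aa": the cursor drifts about one unit up,
# updated a hundred times along the way instead of jumping after a pause.
print(process([[1.0]] * FRAME_RATE))
```

Each frame contributes a tiny step, which is what lets the user steer to a precise location and stop instantly.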

A second advantage is how simple the commands are to learn. Eldridge says it took him five minutes to associate sounds with directions and that now, "I don't even really think about the vowel sounds I'm making."

Other attempts have been made to control cursors without speech commands. One such method uses an eye-tracking device, in which a camera focuses on your eyes and translates eye movement into cursor movement. However, Bilmes notes, "Your eyes are really meant for receiving information, not for specifying information. Oftentimes the mouse cursor can get in the way of what you're looking at, say when you're reading an article on the web."

Jon Malkin, a University of Washington graduate student assisting in the project, has been testing the joystick in other applications. In one demonstration he uses the vocal joystick to control a small robotic arm. Since the arm is more complex than a cursor, additional sounds are needed for functions like turning the robotic arm's wrist or opening and closing its claw. Bilmes notes this is the first time vocal commands have been used to control a three-dimensional object. Bilmes and Malkin presented this novel use of vocal commands at the October 2007 ASSETS Conference on Computers and Accessibility.

Bilmes has allowed several hundred people to try out the joystick. Before actually using the vocal joystick, a new user must spend about two minutes saying the various sounds so that the computer can "learn" the user's voice.
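That brief calibration step can be pictured as fitting a very simple per-user model. The article doesn't describe the system's actual adaptation method, so this sketch assumes a nearest-centroid classifier over made-up two-dimensional acoustic features: a couple of minutes of example sounds is enough to average into a per-vowel template.

```python
# Illustrative only: "learning" a user's voice as per-vowel centroids.
# The feature vectors and the classifier choice are assumptions.

def enroll(samples):
    """samples: {vowel: [feature vectors]} -> per-vowel centroid."""
    centroids = {}
    for vowel, vecs in samples.items():
        n = len(vecs)
        centroids[vowel] = [sum(v[i] for v in vecs) / n
                            for i in range(len(vecs[0]))]
    return centroids

def classify(centroids, vec):
    """Pick the vowel whose centroid is nearest (squared distance)."""
    return min(centroids,
               key=lambda v: sum((a - b) ** 2
                                 for a, b in zip(centroids[v], vec)))

# Two repetitions of each sound during the two-minute calibration:
centroids = enroll({"aa": [[1.0, 0.1], [0.9, 0.2]],
                    "ee": [[0.1, 1.0], [0.2, 0.8]]})
print(classify(centroids, [0.95, 0.1]))   # -> "aa"
```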

While it's working well, Bilmes hopes for improvement, noting, "We're not as accurate, nor as fast as an existing mouse now." The programming challenge, he says, is keeping the joystick both fast and accurate: "There's sort of a trade-off with accuracy and speed. So, with the vocal joystick you can be as accurate as you want if you're willing to be slower." Someday, he hopes, the device will be as accurate as a mouse.
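The trade-off Bilmes describes can be illustrated with a toy model in which cursor speed is a tunable gain: a high gain crosses the screen in fewer frames but can overshoot the target, while a low gain takes longer and stops more precisely. The gain parameter and the overshoot measure here are assumptions for illustration, not the system's actual controls.

```python
# Toy model of the speed/accuracy trade-off: pixels-per-frame gain
# versus frames needed and overshoot past the target.

def frames_to_reach(distance, gain):
    """Return (frames needed, overshoot) to cover `distance` at `gain`."""
    frames, travelled = 0, 0.0
    while travelled < distance:
        travelled += gain
        frames += 1
    return frames, travelled - distance

fast = frames_to_reach(100, gain=8)   # quick, but lands past the target
slow = frames_to_reach(100, gain=1)   # 100 frames, but stops exactly on it
print(fast, slow)                     # -> (13, 4.0) (100, 0.0)
```

Willingness to move slowly buys precision, which matches Bilmes's "as accurate as you want if you're willing to be slower."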

This research was presented at the October 2007 Assets Conference on Computers and Accessibility and was funded by the National Science Foundation.

by Jack Penland

ScienCentral News is a production of ScienCentral, Inc. in collaboration with The Center for Science and the Media, 248 West 35th St., 17th Fl., NY, NY 10001 USA, (212) 244-9577. The contents of these WWW sites © ScienCentral, 2000-2013. All rights reserved. This material is based on work supported by the National Science Foundation under Grant No. ESI-0515449. The views expressed in this website are not necessarily those of the National Science Foundation or any of our other sponsors. Image credits: National Science Foundation.