Li, Y. "VoiceLink: A Speech Interface For Responsive Media"
We developed VoiceLink, a speech interface package for responsive media applications. It contains a set of speech interface modules that connect to various multimedia applications written in Isis, a scripting language created at the MIT Media Laboratory. Specifically, we designed two command-and-control voice interfaces: one for iCom, a multi-point audio/video communication system, and another for HyperSoap, a hyperlinked TV program. The iCom module enables users to control an iCom station using voice commands, while the HyperSoap module allows viewers to select objects and access related information by saying the objects' names. We also built a speech software library for Isis, which allows users to develop speech-aware applications in the Isis programming environment.
We addressed a number of problems when designing VoiceLink. In the iCom module, visual information is used to inform users of the available voice commands and to provide them with instant feedback and instructions, making the speech interface intuitive, flexible, and easy to use for novice users. The major challenge for the HyperSoap module is the open vocabulary problem in object selection: viewers may refer to the same object by many different names. In our design, an item list is displayed on the screen at the viewer's request to show which objects are selectable. We also created an object name index to model how viewers may spontaneously refer to objects. Combining the item list and the name index in the HyperSoap module produced fairly robust performance, making the speech interface a useful alternative to traditional pointing devices. The results of the user evaluation are encouraging: they show that a speech-based interface for responsive media applications is not only useful but also practical.
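As an illustration only, the combination of an item list and an object name index can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the VoiceLink implementation, which was written in Isis: the object names, their aliases, and the functions `normalize`, `build_lookup`, `select_object`, and `item_list` are all hypothetical, and the recognized utterance is assumed to arrive as plain text from a speech recognizer.

```python
from typing import Dict, List, Optional

# Hypothetical name index: each selectable object is listed with the
# names a viewer might spontaneously use for it (illustrative data only).
NAME_INDEX: Dict[str, List[str]] = {
    "necklace": ["necklace", "chain", "jewelry"],
    "lamp": ["lamp", "light", "table lamp"],
    "jacket": ["jacket", "coat", "blazer"],
}


def normalize(phrase: str) -> str:
    """Lowercase the phrase and collapse runs of whitespace."""
    return " ".join(phrase.lower().split())


def build_lookup(index: Dict[str, List[str]]) -> Dict[str, str]:
    """Invert the name index so each spoken name maps to its object."""
    lookup: Dict[str, str] = {}
    for obj, names in index.items():
        for name in names:
            lookup[normalize(name)] = obj
    return lookup


def select_object(utterance: str, lookup: Dict[str, str]) -> Optional[str]:
    """Return the object matching the utterance, or None if unknown."""
    return lookup.get(normalize(utterance))


def item_list(index: Dict[str, List[str]]) -> List[str]:
    """Return the selectable objects to show on screen on request."""
    return sorted(index)


LOOKUP = build_lookup(NAME_INDEX)
```

In this sketch, a viewer who says "table lamp" selects the `lamp` object through the name index, while an unrecognized name returns `None`, at which point an application could fall back to displaying `item_list(NAME_INDEX)` so the viewer can see which objects are selectable.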