Project

Teaching Agents to Recognize Text

Software agents will increasingly need to detect semantically significant information embedded in larger bodies of relatively unstructured text, for applications such as Web page analysis and data mining. Currently, the only way to express patterns of text is through grammars (sets of textual rules), which drive parsers (the recognition procedures). But grammars are difficult to write and error-prone. We are experimenting with an agent that learns recognition rules by example. The agent generates a set of hypotheses about how to interpret the examples and displays them dynamically to the user, who can choose among them and edit them incrementally. This agent puts the power of parsing technology into the hands of non-expert users.
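
To make the idea of learning recognition rules by example more concrete, here is a minimal sketch. It is not the project's actual method. It assumes that regular expressions can stand in for grammars and that each hypothesis is one way of generalizing the example strings. The function names (char_class, class_pattern, hypotheses) and the phone-number examples are hypothetical and chosen only for illustration.

```python
import re
from itertools import groupby

# Regex fragments used when a character is generalized to its class.
DIGIT = r"\d"
LETTER = "[A-Za-z]"


def char_class(ch):
    """Map one character to a regex fragment (assumed, simplified classes)."""
    if ch.isascii() and ch.isdigit():
        return DIGIT
    if ch.isascii() and ch.isalpha():
        return LETTER
    return re.escape(ch)  # punctuation and everything else stay literal


def class_pattern(example, exact_counts):
    """Generalize one example by replacing runs of characters with classes.

    exact_counts=True keeps run lengths (e.g. three digits);
    exact_counts=False allows digit and letter runs of any length >= 1.
    """
    pieces = []
    for frag, run in groupby(char_class(ch) for ch in example):
        n = len(list(run))
        if frag in (DIGIT, LETTER) and not exact_counts:
            pieces.append(frag + "+")
        elif n > 1:
            pieces.append(f"{frag}{{{n}}}")
        else:
            pieces.append(frag)
    return "".join(pieces)


def hypotheses(examples):
    """Return candidate patterns, most specific first, that match every example."""
    # Hypothesis 1: only the literal strings the user supplied.
    candidates = ["|".join(re.escape(e) for e in examples)]
    # Hypotheses 2 and 3: class-based generalizations of each example.
    for exact_counts in (True, False):
        for e in examples:
            candidates.append(class_pattern(e, exact_counts))
    seen, result = set(), []
    for pat in candidates:
        # Keep a candidate only if it is new and consistent with all examples.
        if pat not in seen and all(re.fullmatch(pat, e) for e in examples):
            seen.add(pat)
            result.append(pat)
    return result


if __name__ == "__main__":
    for number, pattern in enumerate(hypotheses(["555-1234", "555-9876"]), 1):
        print(number, pattern)
```

In an interactive setting, the returned list would play the role of the hypotheses shown to the user, who would pick the one with the intended level of generality and then edit it. A real system would need a richer hypothesis space than these three levels.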