Exemplars

Exemplars is a rule-based, object-oriented framework for dynamic generation of text and hypertext. It is designed to flexibly support a wide range of text generation methodologies, while providing high performance, scalability, and ease of integration with other application components.

Text generation applications vary in the depth and granularity of the linguistic models they need to produce satisfactory text. In machine translation, the broad range of possible inputs and outputs is most easily handled with a relatively fine-grained, multi-level linguistic model, which the text generation component can use to generate texts from syntactic structures. In applications such as data summarization, by contrast, the range of both input concepts and output texts is more constrained, which makes a template-based approach more feasible.

The Exemplars framework lets developers mix and match phrasal templates with more sophisticated linguistic models to provide "just enough" textual articulation for a given application. The basis of the framework is the notion of an exemplar: a template-like text planning rule that represents an exemplary (or expert) way of achieving a communicative goal in a given communicative context. Unlike traditional textual templates, which tend to produce rigid "boilerplate" text, exemplars are both recursive and object-oriented, making it easy to generate a wide variety of fluent texts in response to widely varying input conditions. Because applicability constraints are processed intelligently, exemplars scale easily to large numbers of input variables, whereas simple "if-then" logic is subject to combinatorial explosion as the number of inputs increases. Revision rules can also be used to "smooth" textual output in cases where the number of possible phrasal combinations makes pure top-down planning impractical.
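
The idea of an exemplar as a rule with a goal, an applicability constraint, and a text-producing body can be illustrated with a minimal Java sketch. This is not the Exemplars API: every name here (Task, Exemplar, Planner, the goal strings, and the rule classes) is a hypothetical stand-in, and the "first applicable rule wins" selection strategy is an assumption made only to keep the example short. It shows a specialized rule extending a general one (the object-oriented part) and a rule delegating part of its wording to another goal (the recursive part).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Demo entry point. All classes in this file are illustrative, not the Exemplars API. */
class ExemplarSketch {
    /** Registers specialized rules before the general ones they extend. */
    static Planner defaultPlanner() {
        return new Planner()
            .register(new DescribeLateTask()).register(new DescribeTask())
            .register(new StatusComplete()).register(new StatusInProgress());
    }

    public static void main(String[] args) {
        Planner p = defaultPlanner();
        System.out.println(p.plan("describe-task", new Task("Design", 60, 3)));
        System.out.println(p.plan("describe-task", new Task("Testing", 100, 0)));
    }
}

/** Hypothetical input object; a real application would supply its own data. */
final class Task {
    final String name;
    final int percentComplete;
    final int daysLate;

    Task(String name, int percentComplete, int daysLate) {
        this.name = name;
        this.percentComplete = percentComplete;
        this.daysLate = daysLate;
    }
}

/** A text planning rule: a goal, an applicability constraint, and a way to produce text. */
abstract class Exemplar {
    abstract String goal();
    boolean applies(Task t) { return true; }   // general rules apply everywhere
    abstract String generate(Task t, Planner planner);
}

/** Tries the rules for a goal in registration order and uses the first applicable one. */
class Planner {
    private final Map<String, List<Exemplar>> rules = new HashMap<>();

    Planner register(Exemplar e) {
        rules.computeIfAbsent(e.goal(), g -> new ArrayList<>()).add(e);
        return this;
    }

    String plan(String goal, Task t) {
        for (Exemplar e : rules.getOrDefault(goal, Collections.emptyList())) {
            if (e.applies(t)) return e.generate(t, this);
        }
        throw new IllegalStateException("No applicable exemplar for goal: " + goal);
    }
}

/** General rule: describes any task. */
class DescribeTask extends Exemplar {
    String goal() { return "describe-task"; }
    String generate(Task t, Planner p) {
        // Recursive step: the status wording is delegated to another goal.
        return "Task " + t.name + " is " + p.plan("describe-status", t) + ".";
    }
}

/** Specialization: reuses the general wording and adds a sentence for late tasks. */
class DescribeLateTask extends DescribeTask {
    @Override boolean applies(Task t) { return t.daysLate > 0; }
    @Override String generate(Task t, Planner p) {
        String unit = t.daysLate == 1 ? " day" : " days";
        return super.generate(t, p) + " It is " + t.daysLate + unit + " behind schedule.";
    }
}

/** General status wording. */
class StatusInProgress extends Exemplar {
    String goal() { return "describe-status"; }
    String generate(Task t, Planner p) { return t.percentComplete + "% complete"; }
}

/** Specialization: a finished task is simply "complete". */
class StatusComplete extends StatusInProgress {
    @Override boolean applies(Task t) { return t.percentComplete == 100; }
    @Override String generate(Task t, Planner p) { return "complete"; }
}
```

In this sketch, supporting a new input condition means adding one more specialized rule class and registering it, rather than threading another branch through existing conditional code.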


Features

  • Java- and XML-based implementation provides high performance and easy integration with other application components.
  • Framework includes utility classes that make it easy to implement Java servlet-based dynamic hypertext systems.
  • Revision rules for "text smoothing" ensure fluency of text at all levels, from rhetorical structure to punctuation.
  • Integration with RealPro allows authors to use abstract specifications of natural language syntax that are automatically converted to appropriate surface forms.
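
The "text smoothing" feature listed above can be pictured with a small Java sketch of one revision rule. It is an illustration under stated assumptions, not the framework's implementation: Clause, RevisionRule, MergeSameSubject, and RevisionSketch are hypothetical names, and representing planned text as a flat list of subject/predicate pairs is a simplification chosen for brevity. The rule merges adjacent clauses that share a subject, and the smoother reapplies its rules until none of them changes the text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Demo entry point. All classes in this file are illustrative, not the Exemplars API. */
class RevisionSketch {
    /** Hypothetical planner output before any smoothing. */
    static List<Clause> sample() {
        return Arrays.asList(
            new Clause("The design task", "is 60% complete"),
            new Clause("The design task", "is 3 days late"),
            new Clause("The test task", "has not started"));
    }

    /** Applies every rule repeatedly until no rule changes the clause list. */
    static List<Clause> smooth(List<Clause> clauses, List<RevisionRule> rules) {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (RevisionRule rule : rules) {
                List<Clause> revised = rule.revise(clauses);
                if (revised != clauses) {   // a rule returns the same list when it does nothing
                    clauses = revised;
                    changed = true;
                }
            }
        }
        return clauses;
    }

    /** Renders clauses as sentences, optionally smoothing them first. */
    static String render(List<Clause> clauses, boolean smoothFirst) {
        if (smoothFirst) {
            clauses = smooth(clauses, Arrays.<RevisionRule>asList(new MergeSameSubject()));
        }
        StringBuilder sb = new StringBuilder();
        for (Clause c : clauses) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(c.subject).append(' ').append(c.predicate).append('.');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(sample(), false));
        System.out.println(render(sample(), true));
    }
}

/** A minimal clause: a subject and what is said about it. */
final class Clause {
    final String subject;
    final String predicate;

    Clause(String subject, String predicate) {
        this.subject = subject;
        this.predicate = predicate;
    }
}

/** Returns a revised copy of the clauses, or the same list if the rule does not apply. */
interface RevisionRule {
    List<Clause> revise(List<Clause> clauses);
}

/** Merges the first pair of adjacent clauses that share a subject. */
class MergeSameSubject implements RevisionRule {
    public List<Clause> revise(List<Clause> clauses) {
        for (int i = 0; i + 1 < clauses.size(); i++) {
            Clause a = clauses.get(i);
            Clause b = clauses.get(i + 1);
            if (a.subject.equals(b.subject)) {
                List<Clause> out = new ArrayList<>(clauses);
                out.set(i, new Clause(a.subject, a.predicate + " and " + b.predicate));
                out.remove(i + 1);   // drop the clause that was folded into its neighbor
                return out;
            }
        }
        return clauses;
    }
}
```

Because each merge shortens the list by one clause, the loop in smooth always terminates. Revising the planner's output after the fact keeps the planning rules themselves free of special cases for every possible pair of neighboring phrases.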

Applications

CoGenTex has used Exemplars to develop these commercial products, custom applications, and research prototypes:
  • Project Reporter — a web-based project report generator
  • Definition Builder — an innovative data mining tool with a natural language interface
  • EMMA — a tool for managing software requirements and design evolution
  • CogentHelp — a prototype help-authoring system based on text "snippets"

For more information

Contact CoGenTex.


Papers

(c) 2010 CoGenTex, Inc. All Rights Reserved.