SALE MX - Model Extraction from Natural Language Texts
SALE MX aims to extract UML models from natural language (NL) text. To avoid error-prone NLP, SALE MX starts from a (currently manual) annotation of an NL text. The annotation explicitly marks the semantics of the text, thereby documenting a common understanding of the requirements (see Preparing a Text for details).
The basis of the entire process is SENSE, the Software Engineer's Natural Language Semantics Encoding. SENSE describes how semantics can be encoded and used to process NL texts. SALE (the SENSE Annotation Language for English) is one possible realization of the SENSE process and provides a set of thematic roles with which you can explicitly encode the semantics of texts. Although designed for English, SALE can also be used for other languages such as German, French and Hungarian. See the examples section for further information.
SALE also comes with an ANTLR-based compiler that transforms the annotated text into a graph representation, which can be loaded into GrGen.NET. This graph is the internal discourse model of the text and is the central artifact of our process. Graph rewriting rules of varying complexity are then used to evaluate the structure of the semantics. We also use graph rewriting rules to produce an internal graph representation of a UML document, which can be saved as an XMI document for further processing.
Apart from the annotation process, the system works without user interaction and produces UML diagrams. The annotation process can be time-consuming and is currently the bottleneck of our system. We therefore aim to provide a supporting tool for annotators and try to (pre-)annotate texts automatically (see AutoAnnotator for details).
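The idea of deriving a UML graph from a discourse graph by rewriting can be illustrated with a small toy sketch. It does not use SALE syntax, the ANTLR compiler, or GrGen.NET rule syntax. The node kinds (Entity, Action, UmlClass, UmlOperation), the edge labels (agent, patient, derivedFrom, owns, operationFrom), the two rules, and the simplified XML output (not schema-conformant XMI) are all assumptions made up for this illustration.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET


@dataclass
class Node:
    id: int
    kind: str   # hypothetical kinds: "Entity", "Action", "UmlClass", "UmlOperation"
    label: str


@dataclass
class Graph:
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)  # (source_id, label, target_id)

    def add_node(self, kind, label):
        node = Node(len(self.nodes), kind, label)
        self.nodes.append(node)
        return node

    def add_edge(self, source, label, target):
        self.edges.append((source.id, label, target.id))


def rule_entity_to_class(g):
    """Hypothetical rule: every Entity without a class gets a UmlClass node."""
    handled = {dst for (_, label, dst) in g.edges if label == "derivedFrom"}
    applied = 0
    for node in list(g.nodes):
        if node.kind == "Entity" and node.id not in handled:
            cls = g.add_node("UmlClass", node.label.capitalize())
            g.add_edge(cls, "derivedFrom", node)
            applied += 1
    return applied


def rule_action_to_operation(g):
    """Hypothetical rule: an Action becomes an operation of its agent's class."""
    class_of = {dst: src for (src, label, dst) in g.edges if label == "derivedFrom"}
    handled = {dst for (_, label, dst) in g.edges if label == "operationFrom"}
    applied = 0
    for (src, label, dst) in list(g.edges):
        if label == "agent" and dst in class_of and src not in handled:
            op = g.add_node("UmlOperation", g.nodes[src].label)
            g.add_edge(g.nodes[class_of[dst]], "owns", op)
            g.add_edge(op, "operationFrom", g.nodes[src])
            handled.add(src)
            applied += 1
    return applied


def rewrite(g, rules):
    """Apply the rules repeatedly until none of them matches any more."""
    while any(rule(g) for rule in rules):
        pass


def to_xml(g):
    """Serialize the UML part of the graph to a simplified XML string."""
    root = ET.Element("model")
    for node in g.nodes:
        if node.kind == "UmlClass":
            cls = ET.SubElement(root, "class", name=node.label)
            for (src, label, dst) in g.edges:
                if src == node.id and label == "owns":
                    ET.SubElement(cls, "operation", name=g.nodes[dst].label)
    return ET.tostring(root, encoding="unicode")


def build_example():
    # Toy discourse graph for the sentence "The customer orders a book."
    g = Graph()
    customer = g.add_node("Entity", "customer")
    book = g.add_node("Entity", "book")
    order = g.add_node("Action", "orders")
    g.add_edge(order, "agent", customer)
    g.add_edge(order, "patient", book)
    rewrite(g, [rule_entity_to_class, rule_action_to_operation])
    return to_xml(g)


if __name__ == "__main__":
    print(build_example())
```

In this sketch the UML elements are added to the same graph as the discourse model and linked back to their source nodes, so the origin of each class and operation stays traceable. Whether the actual system keeps both representations in one graph is not stated here.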
SALE MX - System overview
Future components are linked with red arrows; implemented components have a blue background.
Base system and subprojects
- SENSE
The Software Engineers’ Natural language Semantics Encoding is the basis of SALE MX
- SALE
- SALE Project-Info built by Maven
- Prepare a Text
How to annotate a document (e.g. a functional requirements specification) using SALE
- Thematic Roles
The full list of supported thematic roles along with examples
- Examples
Overview (I) and related subprojects
- Component SUMOX
A framework that identifies associated SALE constructs and proposes UML elements to which they could be converted
- Component UML
Consists of two subprojects:
- Component GRS2SALE
Disassembles a SALE graph instance back into a SALE document file.
- Component UML Feedback
Synchronizes changes made in the UML tool with the original XMI document.
Overview (II) and related subprojects