
SALE

We developed the SENSE Annotation Language for English (SALE) as one possible implementation of SENSE. SALE is a simple annotation language that structures natural language texts. It labels phrases and subphrases as well as the thematic roles played by words or phrases. SALE also has markers for properties (multiplicities and attributes) of words and can handle sets.

Have a look at the annotation howto for detailed information on how to annotate a text using SALE. You can also have a look at our examples.

The SALE Compiler

The SALE Compiler is a Java application that transforms an annotated text into a graph representation. The ANTLR-based compiler generates commands that can be used in GrGen.NET to build a graph.

Graph Representation

The graph consists of "top-level nodes" for the sentences; these nodes are linked using next edges:

phrase -next-> phrase -next-> ... 

Every sentence is also connected to its elements using typed edges. The type of an edge represents the thematic role the element plays in a given sentence. Note: Edge types are omitted here for clarity.

phrase_1 ---next---> phrase_2 ---next---> ...
 |-> element_1       |-> element_1
 | ...               | ...
 |-> element_2  <----|
 |-> comma
 ... 

As the example above shows, elements can be reused. To do that, we use references. element_1 appears twice, but without a reference; this expresses two entities of the same type (in UML terms: two instances of one class). element_2 also appears twice, but this time with a reference; this expresses that the very same entity is used in both sentences (in UML terms: one instance of a class).
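The difference between reuse with and without a reference can be sketched in a few lines of Python. This is only an illustration of the idea, not the SALE Compiler's actual data structures: the Node and Graph classes and the edge type names role_a and role_b are hypothetical placeholders (the real roles are defined in the graph model).

```python
class Node:
    """A graph node; object identity (not the label) distinguishes entities."""
    def __init__(self, label):
        self.label = label


class Graph:
    """A minimal edge list of (source, edge_type, target) triples."""
    def __init__(self):
        self.edges = []

    def link(self, source, edge_type, target):
        self.edges.append((source, edge_type, target))
        return target

    def elements_of(self, phrase):
        # All targets of typed edges leaving the phrase, ignoring 'next'.
        return [t for s, e, t in self.edges if s is phrase and e != "next"]


g = Graph()
phrase_1, phrase_2 = Node("phrase_1"), Node("phrase_2")
g.link(phrase_1, "next", phrase_2)

# Without a reference: each phrase gets its own element_1 node
# (two instances of one class). 'role_a' is a placeholder edge type.
g.link(phrase_1, "role_a", Node("element_1"))
g.link(phrase_2, "role_a", Node("element_1"))

# With a reference: both phrases point to the very same element_2 node
# (one instance of a class). 'role_b' is a placeholder edge type.
element_2 = Node("element_2")
g.link(phrase_1, "role_b", element_2)
g.link(phrase_2, "role_b", element_2)

own_1, shared_1 = g.elements_of(phrase_1)
own_2, shared_2 = g.elements_of(phrase_2)
```

Here own_1 and own_2 carry the same label but are distinct nodes, while shared_1 and shared_2 are the same node object.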

Elements can be words and (sub-)phrases. There are also elements for punctuation marks and comments (see graphic below), which are mainly used for exporting a SALE graph back into a natural language text.

Properties (attributes and multiplicities) play a special role: they are context-sensitive. A property is at first connected to the element it describes:

phrase -> element -> property

But because a single element can be used in different phrases, we do not know in which context the property holds. To indicate the context of a property, we introduce additional context edges which connect a property node and a phrase node:

phrase_1 ---next---> phrase_2 <------|
 |-> element_1       |-> element_1   |
 | ...               | ...           |
 |-> element_2  <----|               |
       |                             |
       |--> property -----------------

Thus the property holds in phrase_2 but not in phrase_1.
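A lookup that respects context edges might look like the following Python sketch. It assumes, as the diagram suggests, that a context edge runs from the property node to the phrase node; the edge type names "role", "property" and "context" and the helper properties_in are hypothetical and not taken from sale.gm.

```python
class Node:
    """A graph node identified by object identity."""
    def __init__(self, label):
        self.label = label


edges = []  # (source, edge_type, target) triples


def link(source, edge_type, target):
    edges.append((source, edge_type, target))


phrase_1, phrase_2 = Node("phrase_1"), Node("phrase_2")
element_2 = Node("element_2")
prop = Node("property")

link(phrase_1, "next", phrase_2)
link(phrase_1, "role", element_2)     # element_2 is shared by both phrases
link(phrase_2, "role", element_2)
link(element_2, "property", prop)     # the property describes element_2
link(prop, "context", phrase_2)       # ...but only in the context of phrase_2


def properties_in(phrase, element):
    """Return the properties of `element` that hold in `phrase`."""
    props = [t for s, e, t in edges if s is element and e == "property"]
    return [
        p for p in props
        if any(s is p and e == "context" and t is phrase for s, e, t in edges)
    ]


in_phrase_1 = properties_in(phrase_1, element_2)
in_phrase_2 = properties_in(phrase_2, element_2)
```

With this setup, in_phrase_2 contains the property node and in_phrase_1 is empty, which matches the statement that the property holds in phrase_2 only.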

Note: Have a look at the FIDE Laws of Chess Example for a screenshot of a large graph.

The Graph Model

If you are interested in the full graph model of SALE, please have a look at the most recent version of sale.gm in our repository (requires login). If you use the SALE MX Workbench, a recent version of the model is included as a project in Eclipse.

You can extend the SALE model with additional roles if your application requires them.

https://svn.ipd.uni-karlsruhe.de/repos/koerner/mx/public/res/sale/nodehierarchy/nodehierarchy.png https://svn.ipd.uni-karlsruhe.de/repos/koerner/mx/public/res/sale/edgehierarchy/edgehierarchy.png
https://svn.ipd.uni-karlsruhe.de/repos/koerner/mx/public/res/sale/nodehierarchy/nodes.png https://svn.ipd.uni-karlsruhe.de/repos/koerner/mx/public/res/sale/edgehierarchy/edges.png



Last modified on Oct 4, 2011 11:08:23 AM