Commit f9c59f21 authored by Bruno Guillaume

Typos

parent e3fc89c1
@@ -9,7 +9,7 @@ Description = ""
+++
# Graphs definition
The graphs we consider in Grew are defined as usually in mathematics by two sets:
The graphs we consider in **Grew** are defined as usually in mathematics by two sets:
* A set **N** of nodes
* A set **E** of edges
@@ -23,7 +23,7 @@ Since version 1.2 labels are encoded as feature structures (mainly to ease the w
See [here](../complex_edges#complex-edges-in-graphs) for more detail on complex edge labels.
# Graph input formats
To describe a graph in practice, *Grew* offers several input formats: a native `gr` format, the `conll` format (and a few derived formats), the `amr` format.
To describe a graph in practice, **Grew** offers several input formats: a native `gr` format, the `conll` format (and a few derived formats), the `amr` format.
## CoNLL format
@@ -34,7 +34,7 @@ For a sentence, some metadata are given in lines beginning by `#`.
The rest of the lines described the tokens of the structure.
Tokens lines contain 10 fields, separated by tabulations.
The file [`n01118003.conllu`](/graph/n01118003.conllu) is an example of CoNLL-U data taken form the corpus `UD_English-PUD` (version 2.3).
The file [`n01118003.conllu`](/graph/n01118003.conllu) is an example of CoNLL-U data taken form the corpus `UD_English-PUD` (version 2.4).
{{< input file="static/graph/n01118003.conllu" >}}
@@ -42,25 +42,25 @@ The file [`n01118003.conllu`](/graph/n01118003.conllu) is an example of CoNLL-U
We explain here how **Grew** deals with the 10 fields if CoNLL files:
1. **ID**. This field is a number used as an identifier for the corresponding lexical unit (LU).
In Grew, it is available as the feature `position` (most of the times it not useful to use this field, constraints on relative positions can be expressed with the `<` or `<<` syntax).
In **Grew**, it is available as the feature `position` (most of the times it not useful to use this field, constraints on relative positions can be expressed with the `<` or `<<` syntax).
2. **FORM**. The phonological form of the LU.
In Grew, the value of this field is available through a feature named `form`
In **Grew**, the value of this field is available through a feature named `form`
(for backward compatibility, the keyword `phon` can also be used instead of `form`).
3. **LEMMA**. The lemma of the LU. In Grew, this corresponds to the feature `lemma`.
3. **LEMMA**. The lemma of the LU. In **Grew**, this corresponds to the feature `lemma`.
4. **UPOS**. The field `upos` (for backward compatibility, `cat` can also be used to refer to this field).
5. **XPOS**. The field `xpos` (for backward compatibility, `pos` can also be used to refer to this field).
6. **FEATS**. List of morphological features.
7. **HEAD**. Head of the current word, which is either a value of ID or `0` for the root node.
8. **DEPREL**. Dependency relation to the HEAD (root iff HEAD = 0).
9. **DEPS**. Enhanced dependency graph in the form of a list of head-deprel pairs. In Grew, the relation are available with the prefix `E:`
10. **MISC**. Any other annotation. In Grew, annotation of the field are accessible with the prefix `_MISC_`.
9. **DEPS**. Enhanced dependency graph in the form of a list of head-deprel pairs. In **Grew**, the relation are available with the prefix `E:`
10. **MISC**. Any other annotation. In **Grew**, annotation of the field are accessible with the prefix `_MISC_`.
Note that the same format is very often use to describes dependency syntax corpora.
In these cases, a set of sentences is described in the same file using the same convention as above and a blank line as separator between sentences.
It is also requires that the `sent_id` metadata is unique for each sentence in the file.
In practice, it may be useful to deal explicitly with the `root` relation (for instance, if some rewriting rule is designed to change the root of the structure).
To allow this, when reading CoNLL-U format **Grew** also creates a node at position `0` and link it with the `root` relation to the linguistic `root` node of the sentence.
To allow this, when reading CoNLL-U format **Grew** also creates a node at position `0` and link it with the `root` relation to the linguistic root node of the sentence.
The example above then produce the 5 nodes graphs below:
![Dependency structure](/graph/n01118003.svg)
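The field handling described in this hunk can be sketched in Python. This is only an illustration, not how **Grew** is implemented: the function name `parse_sentence`, the `(head, deprel, dependent)` edge tuples and the two-token sample sentence are assumptions made for the example. It reads the 10 tab-separated fields, exposes fields 1 to 5 under the feature names given above (`position`, `form`, `lemma`, `upos`, `xpos`), and adds the extra node at position `0` that carries the `root` relation.

```python
# The 10 CoNLL-U columns, in order.
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]

# Hypothetical two-token sentence, used only for illustration.
SAMPLE = (
    "# sent_id = example-1\n"
    "1\tDogs\tdog\tNOUN\tNNS\tNumber=Plur\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\tVBP\t_\t0\troot\t_\t_\n"
)


def parse_sentence(block):
    """Return (metadata, nodes, edges) for one CoNLL-U sentence block."""
    meta = []
    nodes = {0: {"position": 0}}  # extra node at position 0 for the root relation
    edges = []                    # (head position, relation label, dependent position)
    for line in block.strip().splitlines():
        if not line.strip():
            continue
        if line.startswith("#"):
            meta.append(line[1:].strip())
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 fields, got {len(cols)}: {line!r}")
        tok = dict(zip(FIELDS, cols))
        if not tok["id"].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        pos = int(tok["id"])
        nodes[pos] = {
            "position": pos,
            "form": tok["form"],
            "lemma": tok["lemma"],
            "upos": tok["upos"],
            "xpos": tok["xpos"],
        }
        if tok["head"].isdigit():
            edges.append((int(tok["head"]), tok["deprel"], pos))
    return meta, nodes, edges
```

Calling `parse_sentence(SAMPLE)` yields three nodes (positions 0, 1 and 2), with the `root` edge going from node 0 to the token whose HEAD is `0`. The FEATS, DEPS and MISC columns are read but not expanded in this sketch.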
@@ -68,7 +68,7 @@ The example above then produce the 5 nodes graphs below:
### Note about backward compatibility
In older versions of Grew (before the definition of the CoNLL-U format), the fields 2, 4 and 5 where accessible with the names `phon`, `cat` and `pos` respectively.
In older versions of **Grew** (before the definition of the CoNLL-U format), the fields 2, 4 and 5 where accessible with the names `phon`, `cat` and `pos` respectively.
To have a backward compatibility and uniform handling of these fields, the 3 names `phon`, `cat` and `pos` are replaced at parsing time by `form`, `upos` and `xpos`.
As a consequence, it is impossible to use both `phon` and `form` in the same system.
We highly recommend to use only the `form` feature in this setting. Of course, the same observation applies to `cat` and `upos` (`upos` should be prefered) and to `pos` and `xpos` (`xpos` should be chosen).
@@ -32,7 +32,7 @@ The easiest way to try and test **Grew** is to use one of the two online interfa
## Some of the main features of Grew
* Graph structures can use a build-in notion of **feature structures**.
* Graph structures can use a built-in notion of **feature structures**.
* The left-hand side of a rule is described by a graph called a **pattern**; injective graph morphisms are used in the pattern matching algorithm.
* **Negative pattern** can be used for a finer control on the left-hand side of rules.
* The right-hand side of rules is described by a sequence of **atomic commands** that describe how the graph should be modified during the rule application.
@@ -59,8 +59,8 @@ pattern {
N1[]; N2[]; N1.position = N2.position; N1.user <> N2.user; % N1 and N2 are parallel
M1[]; M2[]; M1.position = M2.position; M1.user <> M2.user; % M1 and M2 are parallel
e1: M1 -> N1; e2: M2 -> N2; % M and N are link by parallel edges
label(e1) <> label(e2); % ask to different labels
id(N1) < id(N2); % avois duplicate (1/2 switching)
label(e1) <> label(e2); % ask two different labels
id(N1) < id(N2); % avoid duplicate (1/2 switching)
}
```
@@ -88,8 +88,8 @@ pattern {
N1[]; N2[]; N1.position = N2.position; N1.user <> N2.user; % N1 and N2 are parallel
e1: G1 -> N1; % G1 is the governor of N1
e2: G2 -> N2; % G2 is the governor of N2
id(G1) < id(G2); % avoid duplicates (1/2 switching)
G1.position <> G2.position; % G1 and G2 are not parallel
id(G1) < id(G2); % avoid duplicates (1/2 switching)
}
```
static/multigraph/multigraph.png (image changed: 354 KB before, 452 KB after)