Commit b14caaff authored by Bruno Guillaume

first online version

parent 073d3be3
selfdoc:
@echo " * make run --> run locally the server"
run:
hugo server -w
talc2:
hugo
scp -r public/* $(stalc2)/www/grew_doc/
languageCode = "en-us"
title = "Grew Manual "
baseURL = "http://example.org/"
baseURL = "http://grew.loria.fr/"
enableEmoji=true
theme = "hyde"
......
+++
Categories = ["Development","GoLang"]
Tags = ["Development","golang"]
Description = ""
date = "2017-05-15T21:43:09+02:00"
title = "Command syntax"
menu = "main"
+++
# Command syntax
Each rule contains a sequence of commands introduced by the keyword `commands`, separated by semicolons (`;`) and surrounded by braces.
## Node deletion
The following command removes the node `A` and all its incident edges.
~~~grew
del_node A
~~~
## Node creation
To create a new node, the command is `add_node`.
The command below creates a new node and gives it the identifier `A` until the end of the rule application.
~~~grew
add_node A
~~~
Moreover, if the node must be placed at a specific position in the linear order of the nodes, two syntaxes are available: the new node `B` (resp. `C`) is placed on the immediate left (resp. right) of the node `N`.
~~~grew
add_node B :< N
add_node C :> N
~~~
## Edge deletion
To delete an edge, the `del_edge` command can refer either to the full description of the edge or to an identifier `e` given in the pattern:
~~~grew
del_edge A -[obj]-> B;
del_edge e;
~~~
**NOTE**: for the first syntax, if the corresponding edge does not exist, an exception is raised and the full rewriting process is stopped.
## Add a new edge
There are two ways to add a new edge: with a given edge label or with an edge label taken from the pattern.
### Add a new edge with a given label
The syntax of the command is:
~~~grew
add_edge N -[suj]-> M
~~~
### Add a new edge with a label taken in the pattern
The command `add_edge e: N -> M` adds a new edge in the current graph from the node matched with identifier `N` to the node matched with identifier `M`, with the same label as the edge that was matched in the pattern with the edge identifier `e`.
#### Example:
`add_edge_pattern.grs`:
~~~grew
module deterministic M {
rule r {
match { A[phon=A]; B[phon=B]; e: B -> A }
commands { del_edge e; add_edge e: A -> B }
}
}
sequences { main { M }}
~~~
`input.gr`:
~~~grew
graph {
A [phon="A"];
B [phon="B"];
B -[x]-> A;
B -[y]-> A;
B -[z]-> A;
}
~~~
With the command `grew -det -grs add_edge_pattern.grs -gr input.gr -o output.gr`, the rewriting will produce the graph `output.gr` below.
| `input.gr` | `output.gr` |
|:---:|:---:|
| ![input.gr](/examples/add_edge_pattern/in.svg) | ![output.gr](/examples/add_edge_pattern/out.svg) |
## Edge redirection
Commands are available to globally redirect the edges incident to a node of the pattern.
The keywords are `shift_in`, `shift_out` and `shift`, respectively for moving in-edges, out-edges and all incident edges.
Brackets can be used to select the set of edges to move according to their labelling.
~~~grew
shift A ==> B
shift_out B =[suj|obj]=> C
shift_in C =[^suj|obj]=> D
~~~
The actions of the three commands above are, respectively:
* modifying all edges which are incident to `A`: any edge in the graph starting in `A` (resp. ending in `A`) is redirected to start in `B` (resp. to end in `B`).
* modifying all out-edges which start in `B` with a `suj` or `obj` label: they are redirected to start in `C`.
* modifying all in-edges which end in `C` with a label different from `suj` and `obj`: they are redirected to end in `D`.
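
The effect of these commands can be modelled on a plain set of labelled edges. The sketch below is not Grew code: it is a small Python illustration in which a graph is assumed to be a set of `(source, label, target)` triples. The helper names (`shift_out`, `shift_in`, `shift`) and their `labels` and `negated` parameters are hypothetical, and the sketch does not treat edges between the two nodes in any special way.

```python
def _selected(label, labels, negated):
    """Return True if an edge carrying `label` is concerned by the command."""
    if labels is None:          # no brackets: every label is selected
        return True
    return (label not in labels) if negated else (label in labels)


def shift_out(edges, old, new, labels=None, negated=False):
    """Make the selected edges starting in `old` start in `new` instead."""
    return {
        (new if src == old and _selected(lab, labels, negated) else src, lab, tgt)
        for (src, lab, tgt) in edges
    }


def shift_in(edges, old, new, labels=None, negated=False):
    """Make the selected edges ending in `old` end in `new` instead."""
    return {
        (src, lab, new if tgt == old and _selected(lab, labels, negated) else tgt)
        for (src, lab, tgt) in edges
    }


def shift(edges, old, new, labels=None, negated=False):
    """Redirect both the out-edges and the in-edges of `old` to `new`."""
    moved = shift_out(edges, old, new, labels, negated)
    return shift_in(moved, old, new, labels, negated)


# Toy graph (made up for this illustration).
graph = {("A", "suj", "X"), ("Y", "obj", "A"), ("B", "mod", "Z"), ("W", "mod", "C")}

g1 = shift(graph, "A", "B")                                  # shift A ==> B
g2 = shift_out(g1, "B", "C", labels={"suj", "obj"})          # shift_out B =[suj|obj]=> C
g3 = shift_in(g2, "C", "D", labels={"suj", "obj"}, negated=True)  # shift_in C =[^suj|obj]=> D
```

Each call returns a new set of edges and leaves its input unchanged, so the three steps can be inspected one by one.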
+++
Categories = ["Development","GoLang"]
Tags = ["Development","golang"]
Description = ""
date = "2017-05-04T21:30:18+02:00"
title = "deep_syntax"
menu = "main"
+++
# Deep syntax
The goal of the deep syntax is to give a linguistic description of the input sentence which is closer to a semantic representation.
More information about deep syntax can be found on the [Deep-sequoia project](http://deep-sequoia.inria.fr).
For the sentence:
- "*La souris a été mangée par le chat.*" ["*The mouse was eaten by the cat.*"].
the deep structure is: ![Deep dependency structure](/img/test.deep.svg)
With __grew__, this representation can be computed from the surface syntax in two steps:
1. A general representation (called __mixed__) encodes both surface and deep syntax in the same structure.
2. A projection from the __mixed__ to the __deep__ structure.
## Building the mixed structure
The GRS used to build the __mixed__ structure can be obtained from InriaGForge with:
```
svn co svn://scm.gforge.inria.fr/svn/semagramme/grew_resources/deep_syntax
```
The input of the GRS which produces the __mixed__ structure is the __surface__ structure.
We recall here the surface structure of our example sentence (see the [Dependency parsing](../parsing) page); we suppose that the file `test.surf.conll` contains the CoNLL description below:
```
1 La le D DET sentid=00000 2 det _ _
2 souris souris N NC det=y|s=c 5 suj _ _
3 a avoir V V m=ind 5 aux.tps _ _
4 été être V VPP m=pastp 5 aux.pass _ _
5 mangée manger V VPP diat=passif|m=pastp _ _ _ _
6 par par P P _ 5 p_obj.agt _ _
7 le le D DET _ 8 det _ _
8 chat chat N NC det=y|s=c 6 obj.p _ _
9 . . PONCT PONCT _ 5 ponct _ _
```
The mixed structure is then computed with the command:
```
grew -det -grs deep_syntax/grs/deep_synt_main.grs -i test.surf.conll -f test.mix.conll
```
This produces the file `test.mix.conll`, whose content is given below and which corresponds to the figure that follows:
```
1 La le D DET sentid=00000 2 det _ _
2 souris souris N NC det=y|s=c 5 suj:obj _ _
3 a avoir V V dl=avoir|m=ind|void=y 5 S:aux.tps _ _
4 été être V VPP dl=être|m=part|t=past|void=y 5 S:aux.pass _ _
5 mangée manger V VPP diat=passif|dl=manger|dm=ind|m=part|t=past _ _ _ _
6 par par P P void=y 5 S:p_obj.agt:suj _ _
7 le le D DET _ 8 det _ _
8 chat chat N NC det=y|s=c 6|5 S:obj.p|D:p_obj.agt:suj _ _
9 . . PONCT PONCT _ 5 ponct _ _
```
![Mixed dependency structure](/img/test.mix.svg)
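
In the mixed file, the head and label columns can hold several values separated by `|` (see the line for *chat*). The Python sketch below splits them into `(head, label)` pairs. It assumes whitespace-separated columns with the head in the 7th column and the label in the 8th, as in the listing above; the function name `governors` is hypothetical, and label prefixes such as `S:` and `D:` are kept untouched.

```python
def governors(conll_line):
    """Return the (head, label) pairs encoded in one line of the mixed file."""
    fields = conll_line.split()
    heads, labels = fields[6], fields[7]
    if heads == "_":            # the token has no governor
        return []
    # The i-th head goes with the i-th label.
    return [(int(h), lab) for h, lab in zip(heads.split("|"), labels.split("|"))]


chat = "8 chat chat N NC det=y|s=c 6|5 S:obj.p|D:p_obj.agt:suj _ _"
print(governors(chat))   # [(6, 'S:obj.p'), (5, 'D:p_obj.agt:suj')]
```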
## Building the deep structure
The deep structure is a projection from the mixed structure.
This projection is performed by a GRS file `sequoia_proj.grs`, which can be downloaded with the commands:
```
wget https://gitlab.inria.fr/sequoia/deep-sequoia/raw/master/tools/sequoia_decl.dom
wget https://gitlab.inria.fr/sequoia/deep-sequoia/raw/master/tools/sequoia_proj.grs
```
The deep structure is then computed with the command:
```
grew -det -grs sequoia_proj.grs -seq deep -i test.mix.conll -f test.deep.conll
```
The output is given below (code and picture):
```
1 La le D DET sentid=00000 2 det _ _
2 souris souris N NC det=y|s=c 5 obj _ _
3 a avoir V V dl=avoir|m=ind|void=y 0 void _ _
4 été être V VPP dl=être|m=part|t=past|void=y 0 void _ _
5 mangée manger V VPP diat=passif|dl=manger|dm=ind|m=part|t=past _ _ _ _
6 par par P P void=y 0 void _ _
7 le le D DET _ 8 det _ _
8 chat chat N NC det=y|s=c 5 suj _ _
9 . . PONCT PONCT _ 5 ponct _ _
```
![Deep dependency structure](/img/test.deep.svg)
+++
date = "2017-05-23T15:32:25+02:00"
title = "grs"
menu = "main"
Categories = ["Development","GoLang"]
Tags = ["Development","golang"]
Description = ""
+++
# GRS syntax
In Grew, rewriting rules are described in a GRS file (GRS stands for Graph Rewriting System).
A GRS file describes a set of modules, each module contains a set of [rules](../rule).
:warning: Files using this format are expected to use the `.grs` file extension.
A Grew Graph Rewriting System (GRS) is defined by:
* an optional domain definition
* a set of modules (each module is introduced by the keyword `module`)
* the definition of sequences of modules (keyword `sequences`)
## Domain definition
The domain is defined as a pair of a feature domain and an edge label domain.
### Feature domain
In graphs and in rules, nodes contain feature structures.
To control these feature structures, a feature domain may be given first.
In the feature domain declaration, feature names are identifiers and are defined as:
* a **closed** feature accepts only an explicitly given set of possible values (like the `cat` feature below);
* an **open** feature accepts any string value (like the `lemma` feature below);
* a **numerical** feature (like the `position` feature below).
In a closed feature definition, feature values can be any strings; double quotes are required for strings that are not lexical identifiers (like the values for `pers`).
~~~grew
features {
cat: n, np, v, adj;
mood: inf, ind, subj, pastp, presp;
lemma: *;
phon: *;
pers: "1","2","3";
position: #;
}
~~~
**REM:** values of pers feature are numerals but the only way to restrict to the finite domain {1, 2, 3} is to declare it as a closed feature and possible values as strings.
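
To make the three kinds of features concrete, here is a Python sketch of how a feature value could be checked against such a domain. It is an illustration only, not Grew's implementation: the dictionary encoding (`"*"` for open, `"#"` for numerical, a set of strings for closed) and the `is_valid` helper are assumptions of this sketch, as is the choice to accept as numerical any value that Python can read as a float.

```python
# Mirror of the `features { ... }` declaration above, as a Python dict
# (an assumption of this sketch, not a Grew data structure).
DOMAIN = {
    "cat": {"n", "np", "v", "adj"},
    "mood": {"inf", "ind", "subj", "pastp", "presp"},
    "lemma": "*",
    "phon": "*",
    "pers": {"1", "2", "3"},
    "position": "#",
}


def is_valid(domain, name, value):
    """Check a (feature name, string value) pair against the domain."""
    if name not in domain:
        return False            # undeclared feature name
    spec = domain[name]
    if spec == "*":             # open feature: any string is accepted
        return True
    if spec == "#":             # numerical feature: the value must be a number
        try:
            float(value)
        except ValueError:
            return False
        return True
    return value in spec        # closed feature: explicit set of values


print(is_valid(DOMAIN, "pers", "3"))   # True
print(is_valid(DOMAIN, "pers", "4"))   # False
```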
### Edge label domain
An explicit set of valid labels for edges may be given after the `labels` keyword.
By default, edges are drawn with a black solid line and above the figure in DEP representation.
To modify the color or the position of the edges, the user can add attributes to a label with suffixes:
* `@bottom` to put the edge and its label below the figure
* `@red`, `@blue`, … to modify the color of the link and the label
* `@dot` or `@dash` to modify the style of the link
Several suffixes can be used simultaneously.
~~~grew
labels { OBJ, SUJ, DE_OBJ, ANT, ANT_REL@red, ANT_REP@blue@bottom@dash }
~~~
## Modules
In Grew, rules are grouped in modules.
A module is defined by a name and a set of [rules](../rule).
Example of module:
~~~grew
module name {
rule r_1 {
...
}
rule r_2 {
...
}
}
~~~
A module can be declared as `deterministic`:
~~~grew
module deterministic mod_name { ... }
~~~
If a module is declared deterministic, then only one normal form is computed.
If a non-confluent module is declared deterministic, some normal forms may be lost!
## Sequences
In the sequences part of a GRS file, each sequence is described by a name and a list of modules.
The same module can be used in several sequences, and it can also be used several times in the same sequence
(mainly useful when a total ordering of the modules is not possible).
## Examples of GRS
A minimal GRS file (without any module) looks like:
~~~grew
features {
cat: v, np;
phon: *;
lemma: *;
}
labels { suj, obj }
sequences { dummy {} }
~~~
A bigger GRS file:
~~~grew
features { ... }
labels { OBJ, SUBJ, DE_OBJ, ANT }
module det {
rule det_1 {
...
}
rule det_2 {
...
}
}
...
module ana {
...
}
sequences {
full {det; normsyn; arg; ana}
dn {det; normsyn}
}
~~~
## Split a GRS description into several files
It is possible to describe a GRS through several text files.
### External domain
The feature domain and label domain declarations can be put in a separate file and included in the main GRS with the keyword `domain`.
### External module definition
It is also possible to put a list of modules in an external file `modules_1_and_2.grs`:
~~~grew
module M1 { ... }
module M2 { ... }
~~~
and to include them in a GRS file with the syntax below:
~~~grew
include "modules_1_and_2.grs";
~~~
The include directive can be used recursively.
\ No newline at end of file
......@@ -10,7 +10,7 @@ Categories = ["Development","GoLang"]
# Grew Documentation
**Grew** is a Graph Rewriting tool dedicated to applications in Natural Language Processing (NLP). It can manipulate many kind of linguistic representation. It has been used on POS-tagged sequence, surface dependency syntax, deep dependency syntax, semantic representation (AMR, DMRS) but it can be used to represent any graph-based structure.
**Grew** is a Graph Rewriting tool dedicated to applications in Natural Language Processing (NLP). It can manipulate many kinds of linguistic representation. It has been used on POS-tagged sequence, surface dependency syntax, deep dependency syntax, semantic representation (AMR, DMRS) but it can be used to represent any graph-based structure.
## A first taste of Grew
The easiest way to try and test **Grew** is to use one of the two online interfaces.
......
......@@ -5,42 +5,45 @@ title = "installation"
# Grew installation
**Grew** is implemented with the [Ocaml](http://ocaml.org) language. The Graphical User Interface is based on [GTK](http://gtk.org), **Grew** is then easy to install on Linux or MAC OSX (installation on Windows should be possible, but this is untested).
**Grew** is implemented with the [Ocaml](http://ocaml.org) language. The Graphical User Interface is based on [GTK](http://gtk.org), **Grew** is then easy to install on Linux or MAC OS&nbsp;X (installation on Windows should be possible, but this is untested).
:warning: If you run into trouble using the instructions on this page, please open an issue on [GitLab](https://gitlab.inria.fr/grew/grew_doc/issues).
## Step 1: Prerequisites, install non-ocaml needed packages
### On Linux
On Debian/Ubuntu based Linux installation, the following command installs the prerequisites.
```
aptitude install graphviz pkg-config libwebkitgtk-dev librsvg2-dev libglade2-dev m4 automake librsvg2-bin libgtk2.0-dev python-software-properties opam
```
* `aptitude install graphviz pkg-config libwebkitgtk-dev librsvg2-dev libglade2-dev m4 automake librsvg2-bin libgtk2.0-dev python-software-properties opam`
If `aptitude` is not installed, you can install it with `apt-get install aptitude`.
### On Mac OSX
### On Mac OS&nbsp;X
1. Install [XCode](https://developer.apple.com/xcode/)
2. Install [XQuartz](http://www.xquartz.org/)
3. Install [MacPorts](http://www.macports.org/)
The following command installs the prerequisites:
`sudo port install graphviz webkit-gtk librsvg libglade2 wget opam`
## Step 2: Initialize OPAM
~~~python
opam init --comp 4.04.0 # Download and install the last version of Ocaml
opam config setup -a
eval `opam config env`
~~~
* `opam init --comp 4.04.0` # Download and install the last version of Ocaml (4.04.0)
* `opam config setup -a` # Update configuration file
* ```eval `opam config env` ``` # Make Ocaml ready to use now
## Step 3: Add the talc local OPAM repository
`opam remote add talc "http://talc2.loria.fr/semagramme/opam"`
* `opam remote add talc "http://talc2.loria.fr/semagramme/opam"`
## Step 4: Install grew
`opam install grew`
* `opam install grew`
# Update to the last Grew version
* update Linux prerequisites: `aptitude update && aptitude upgrade`
* update Mac OSX prerequisites: `sudo port sync && sudo port upgrade`
* update Grew software: `opam update && opam upgrade`
# Update to the last Grew version
1. update prerequisites:
* Linux :arrow_right: `aptitude update && aptitude upgrade`
* Mac OS&nbsp;X :arrow_right: `sudo port sync && sudo port upgrade`
1. update Grew software: `opam update && opam upgrade`
+++
date = "2017-03-15T22:22:20+01:00"
title = "parsing"
menu = "main"
Categories = ["Development","GoLang"]
Tags = ["Development","golang"]
Description = ""
+++
# Dependency parsing
The parsing GRS is described in [IWPT 2015](https://hal.inria.fr/hal-01188694).
It takes as input the POS-tagged representation of a French sentence and returns a surface dependency syntax tree following the FTB/Sequoia format.
## Downloading the GRS system
The GRS can be obtained from InriaGForge with the command:
`svn co svn://scm.gforge.inria.fr/svn/semagramme/grew_resources/parsing`
## Input data for the system
The parsing system expects a POS-tagged input.
One easy way to produce such a pos-tagged French sentence is to use [MElt](https://gforge.inria.fr/frs/?group_id=481).
We consider the sentence:
- "*La souris a été mangée par le chat.*" ["*The mouse was eaten by the cat.*"].
One way to tag the sentence is to use the following command:
`echo "La souris a été mangée par le chat." | MElt -L -T > test.melt`
This produces a file `test.melt` which contains:
```
La/DET/le souris/NC/souris a/V/avoir été/VPP/être mangée/VPP/manger par/P/par le/DET/le chat/NC/chat ./PONCT/.
```
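
The tagged line can be read back token by token. The Python sketch below assumes that every token has the shape `form/POS/lemma` and that tokens are separated by spaces, as in the sample above; `parse_melt` is a hypothetical helper name, not part of MElt or Grew.

```python
def parse_melt(line):
    """Split a tagged line into a list of {form, pos, lemma} dictionaries."""
    tokens = []
    for item in line.split():
        # Split on the last two slashes so that a form containing "/" survives.
        form, pos, lemma = item.rsplit("/", 2)
        tokens.append({"form": form, "pos": pos, "lemma": lemma})
    return tokens


tagged = ("La/DET/le souris/NC/souris a/V/avoir été/VPP/être mangée/VPP/manger "
          "par/P/par le/DET/le chat/NC/chat ./PONCT/.")
tokens = parse_melt(tagged)
print(len(tokens))    # 9
print(tokens[4])      # {'form': 'mangée', 'pos': 'VPP', 'lemma': 'manger'}
```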
## Running the GR parser in GUI
The following command runs a GTK interface in which you can explore step by step rewriting of the input sentence:
`grew -grs parsing/grs/surf_synt_main.grs -seq full -gr test.melt`
## Running the GR parser from command line
The following command produces a CoNLL version of the parsed sentence:
`grew -det -grs parsing/grs/surf_synt_main.grs -seq full -i test.melt -f test.surf.conll`
The produced file contains the CoNLL description:
```
1 La le D DET sentid=00000 2 det _ _
2 souris souris N NC det=y|s=c 5 suj _ _
3 a avoir V V m=ind 5 aux.tps _ _
4 été être V VPP m=pastp 5 aux.pass _ _
5 mangée manger V VPP diat=passif|m=pastp _ _ _ _
6 par par P P _ 5 p_obj.agt _ _
7 le le D DET _ 8 det _ _
8 chat chat N NC det=y|s=c 6 obj.p _ _
9 . . PONCT PONCT _ 5 ponct _ _
```
which encodes the syntactic structure:
![Dependency structure](/img/test.surf.svg)
+++
menu = "main"
Categories = ["Development","GoLang"]
Tags = ["Development","golang"]
Description = ""
date = "2017-05-22T23:01:05+02:00"
title = "pattern"
+++
# Pattern syntax
One way to learn the syntax of patterns in grew is to follow the tutorial part of the [Online Graph Matching](http://grew.loria.fr/demo) tool.
\ No newline at end of file
+++
Description = ""
date = "2017-05-23T15:18:57+02:00"
title = "Rules"
menu = "main"
Categories = ["Development","GoLang"]
Tags = ["Development","golang"]
+++
# Grew basic rules
A **rewrite rule** in grew is defined by:
* One pattern describing the part of the graph we want to match (see [pattern page](../pattern)) and on which the commands will be applied, introduced by the keyword `pattern`
* A set of negative clauses to filter out unwanted occurrences of the pattern, each clause being introduced by the keyword `without`
* One sequence of commands to apply (see [commands page](../commands)), introduced by the keyword `commands`
## Example
~~~grew
rule accuser {
pattern {
V [cat=V, lemma="accuser"];
O [];
D [cat=D, lemma="de"];
DO [cat=V, m = inf | part];
V -[obj]-> O;
V -[de_obj]-> D;
D -[obj]-> DO
}
without {
DO -[suj]-> O
}
commands {
add_edge DO -[suj]-> O
}
}
~~~
# Grew lexical rules
TODO
+++
Description = ""
date = "2017-02-28T14:58:11+01:00"
title = "run"
menu = "main"
Categories = ["Development","GoLang"]
Tags = ["Development","golang"]
+++
# Grew running modes
There are 3 main running modes for **Grew**:
* **GUI** mode (this is the default mode): `grew`
* **Deterministic** mode for one-to-one graph transformation: `grew -det`
* **Grep** mode, which searches for occurrences of a Grew pattern in a corpus: `grew -grep`
Options available in all modes:
```
-grs <grs_file> choose the grs file to load
-seq <seq> set the module sequence to use
-timeout <float> set a timeout on rewriting
-main_feat <feat_name_list> set the list of feature names used in dep format to set the "main word"
```
## Graphical User Interface
The following options are available only in GUI mode:
```
-gr <gr_file> set the graph file (.gr or .conll) to use
-doc force to generate the GRS doc
```
The documentation generation takes some time and it is disabled by default.
The documentation given in the GUI can be outdated if the GRS has changed since the last documentation generation.
## Deterministic corpus rewriting
When the GRS file describes a one-to-one transformation, the `-det` option can be used to transform all graphs of a corpus.
The command below rewrites each graph found in `input_file` (a CoNLL file).
The output is written to `output_file`.
`grew -det -i input_file -f output_file`
**Note**:
In the current version it is possible to use a folder as a corpus (both for input and output), but this will be removed soon; you should avoid using it.
NB: One way to ensure that a GRS is deterministic is to declare all modules as `deterministic`.
## Grep mode
This mode corresponds to the command line version of the [Online graph matching](http://grew.loria.fr/demo) tool.
The command is:
`grew -grep -pattern <pattern_file> -node_id <id> -i <corpus_file>`
where:
* `<pattern_file>` is a file which describes a pattern
* `<id>` is the name of a node identifier declared in the pattern
* `<corpus_file>` is the corpus in which the search is done
The output is a list of lines, one for each occurrence of the pattern in the corpus.
### Example
With the following files:
* The surface sequoia version 7.0: `sequoia.surf.conll` ([Download](https://gitlab.inria.fr/sequoia/deep-sequoia/raw/master/tags/sequoia-7.0/sequoia.surf.conll)),
* A pattern file with the code below: `subcat.pat` ([Download](https://gitlab.inria.fr/grew/grew_doc/raw/master/examples/grep/subcat.pat))
```
match {
V [cat=V];
V -[a_obj]-> A;
V -[de_obj]-> DE;
}
```
The command:
`grew -grep -pattern subcat.pat -node_id V -i sequoia.surf.conll`
produces the following output:
```
annodis.er_00040 41
annodis.er_00240 12
annodis.er_00441 14
emea-fr-test_00438 19
emea-fr-test_00478 31
Europar.550_00496 14
```
This means that the pattern described in the file `subcat.pat` was found 6 times in the corpus; each line gives the sentence identifier and the position of the node matched by the node `V` of the pattern.
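
For further processing, this output can be loaded with a few lines of Python. The sketch assumes that each line holds exactly two whitespace-separated fields (sentence identifier, then an integer position), as in the sample above; `parse_grep_output` is a hypothetical helper name.

```python
def parse_grep_output(text):
    """Turn the grep-mode output into a list of (sentence_id, position) pairs."""
    hits = []
    for line in text.splitlines():
        if not line.strip():    # skip empty lines
            continue
        sent_id, position = line.split()
        hits.append((sent_id, int(position)))
    return hits


sample = """annodis.er_00040 41
annodis.er_00240 12
annodis.er_00441 14
emea-fr-test_00438 19
emea-fr-test_00478 31
Europar.550_00496 14
"""
hits = parse_grep_output(sample)
print(len(hits))   # 6
print(hits[0])     # ('annodis.er_00040', 41)
```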
......@@ -9,3 +9,9 @@ date = "2017-02-27T17:06:34+01:00"
+++
# Page under construction, coming soon.
test of syntax highlighting
~~~grew
% comment
match { N [lemma = "écouter"]; N -[suj]-> M; }
~~~
talc2:
scp ud.dom ${stalc2}/resources
\ No newline at end of file
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"{{with .Site.LanguageCode}} xml:lang="{{.}}" lang="{{.}}"{{end}}>
<head>
<link href="http://gmpg.org/xfn/11" rel="profile">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
{{ .Hugo.Generator }}
<!-- Enable responsiveness on mobile devices-->
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
{{ if .IsHome }}
<title>{{ .Site.Title }}</title>
{{ else }}
<title>{{ .Title }} &middot; {{ .Site.Title }}</title>
{{ end }}
<!-- CSS -->
<link rel="stylesheet" href="{{ .Site.BaseURL }}css/poole.css">
<link rel="stylesheet" href="{{ .Site.BaseURL }}css/syntax.css">
<link rel="stylesheet" href="{{ .Site.BaseURL }}css/hyde.css">
<link rel="stylesheet" href="{{ .Site.BaseURL }}css/main.css">
<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=PT+Sans:400,400italic,700|Abril+Fatface">
<!-- Prism -->
<link rel="stylesheet" href="{{ .Site.BaseURL }}css/prism.css">
<script src="{{ .Site.BaseURL }}js/prism.js"></script>
<script src="{{ .Site.BaseURL }}js/prism_grew.js"></script>
<!-- <link rel="stylesheet" href="//cdnjs.cloudflare.com/ajax/libs/highlight.js/9.6.0/styles/default.min.css">
<script src="//cdnjs.cloudflare.com/ajax/libs/highlight.js/9.6.0/highlight.min.js"></script>
<script>hljs.initHighlightingOnLoad();</script> -->
<!-- Icons -->
<link rel="apple-touch-icon-precomposed" sizes="144x144" href="/apple-touch-icon-144-precomposed.png">
<link rel="shortcut icon" href="/favicon.png">
<!-- RSS -->
<link href="{{ .RSSLink }}" rel="alternate" type="application/rss+xml" title="{{ .Site.Title }}" />
</head>
......@@ -15,21 +15,21 @@
<hr/>
<li class="section">Use Grew</li>
<li><a href="/installation">Installation</a></li>
<li><a href="/todo">Run Grew</a></li>
<li><a href="/run">Run Grew</a></li>
<hr/>
<li class="section">Available GRS</li>
<li><a href="/todo">Dependency parsing</a></li>
<li><a href="/todo">Deep syntax</a></li>
<li><a href="/parsing">Dependency parsing</a></li>
<li><a href="/deep_syntax">Deep syntax</a></li>
<li><a href="/todo">DMRS</a></li>
<li><a href="/todo">Other GRS</a></li>