Commit e6a462dc authored by Bruno Guillaume

update on install & grew_server

parent 21875823
......@@ -48,7 +48,15 @@ This service returns the list of existing projects.
* `()`
:new: [see #2](https://gitlab.inria.fr/grew/grew_server/issues/2) in **dev**, the returned value is a list of dict:
:warning::warning::warning: Output on **prod** version
```json
[ "project_1", "project_2" ]
```
:warning::warning::warning: Output on **dev** version
The returned value is a list of dict ([see #2](https://gitlab.inria.fr/grew/grew_server/issues/2)):
```json
[
......@@ -86,26 +94,25 @@ This service returns the list of existing samples in a given project.
* `(<string> project_id)`
:new: [see #2](https://gitlab.inria.fr/grew/grew_server/issues/2) in **dev**, the returned value is a list of dict:
:warning::warning::warning: Output on **prod** version
```json
[
{ "name": "sample_1", "number_sentences": 5, "number_tokens": 74, "number_trees": 8, "users": [ "alice", "bob"] },
{ "name": "sample_2", "number_sentences": 4, "number_tokens": 54, "number_trees": 9, "users": [ "alice", "charlie"] }
{ "name": "sample_1", "size": 5, "users": [ "alice", "bob" ] },
{ "name": "sample_2", "size": 4, "users": [ "alice", "charlie" ] }
]
```
[
{
"name": "s",
"number_sentences": 1,
"number_tokens": 12,
"number_trees": 3,
"users": [ "denys", "ellie", "fred" ]
}
]
:warning::warning::warning: Output on **dev** version
[see #2](https://gitlab.inria.fr/grew/grew_server/issues/2)
```json
[
{ "name": "sample_1", "number_sentences": 5, "number_tokens": 74, "number_trees": 8, "users": [ "alice", "bob"] },
{ "name": "sample_2", "number_sentences": 4, "number_tokens": 54, "number_trees": 9, "users": [ "alice", "charlie"] }
]
```
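A client that has to talk to both versions could read the two shapes shown above with a small helper. This is only a sketch: the function name `sentence_count` is hypothetical, and it assumes that `size` on **prod** counts sentences, which the examples suggest (5 and 4 in both outputs) but which the text does not state.

```python
import json


def sentence_count(sample: dict) -> int:
    """Return the sentence count of one sample entry, whichever shape it has."""
    if "number_sentences" in sample:
        # dev shape: explicit counters for sentences, tokens and trees
        return sample["number_sentences"]
    # prod shape: a single "size" field (ASSUMED here to count sentences)
    return sample["size"]


prod_output = json.loads(
    '[{ "name": "sample_1", "size": 5, "users": [ "alice", "bob" ] }]'
)
dev_output = json.loads(
    '[{ "name": "sample_1", "number_sentences": 5, "number_tokens": 74,'
    ' "number_trees": 8, "users": [ "alice", "bob" ] }]'
)

prod_counts = {s["name"]: sentence_count(s) for s in prod_output}
dev_counts = {s["name"]: sentence_count(s) for s in dev_output}
```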
### The `eraseSample` service
......@@ -171,10 +178,12 @@ An error is returned either if `sample_id` does not exist or if `new_sample_id`
## Search with Grew patterns
### The `searchPatternInSentences` service
### The `searchPatternInGraphs` service
This service returns occurrences of some pattern in a project.
Each occurrence is described by a dict `{'sample_id':…, 'sent_id':…, 'nodes':…, 'edges':…}`.
This service returns occurrences of some pattern in a project, for a given user.
Each occurrence is described by a dict `{'sample_id':…, 'sent_id':…, 'user_id':…, 'nodes':…, 'edges':…}`.
:warning::warning::warning: In the **dev** version one more field `conll` gives the CoNLL data of the corresponding graph.
* `(<string> project_id, <string> pattern)` returns a list of occurrences.
......@@ -182,7 +191,7 @@ Each occurrence is described by a dict `{'sample_id':…, 'sent_id':…, 'nodes'
where `clusters` is a list of cluster keys, separated by `;`.
This returns nested dictionaries (the depth being equal to the length of the cluster key list).
The set of occurrences of the `pattern` in `project_id` is clustered with the first key of the list;
each clusters is further clustered recursively with the remaining keys.
each cluster is further clustered recursively with the remaining keys.
For instance:
* If the length of the cluster keys list is 1, the behaviour is similar to the *clustering* feature available in **Grew-match**.
......@@ -191,12 +200,33 @@ Each occurrence is described by a dict `{'sample_id':…, 'sent_id':…, 'nodes'
* `pattern`: `pattern { G -[obj]-> D }`
* `clusters`: `G.upos; D.upos`
### The `searchPatternInGraphs` service
---
---
This service returns occurrences of some pattern in a project, for a given user.
Each occurrence is described by a dict `{'sample_id':…, 'sent_id':…, 'user_id':…, 'nodes':…, 'edges':…}`.
### The `searchPatternInSentences` service
:warning::warning::warning:
Service `searchPatternInSentences` is deprecated; it is not available in the **dev** version.
:warning::warning::warning:
This service returns occurrences of some pattern in a project.
Each occurrence is described by a dict `{'sample_id':…, 'sent_id':…, 'nodes':…, 'edges':…}`.
* `(<string> project_id, <string> pattern)` returns a list of occurrences.
* `(<string> project_id, <string> pattern, <string> clusters)`
Nested dictionaries are returned with the same structure as in the case of `searchPatternInSentences` above.
where `clusters` is a list of cluster keys, separated by `;`.
This returns nested dictionaries (the depth being equal to the length of the cluster key list).
The set of occurrences of the `pattern` in `project_id` is clustered with the first key of the list;
each cluster is further clustered recursively with the remaining keys.
For instance:
* If the length of the cluster keys list is 1, the behaviour is similar to the *clustering* feature available in **Grew-match**.
* Data presented in **Relations tables** in **Grew-match** can be obtained (for the `obj` relation in the example) with the arguments:
* `pattern`: `pattern { G -[obj]-> D }`
* `clusters`: `G.upos; D.upos`
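The recursive clustering described above can be sketched in a few lines of Python. This is not the server's implementation: the helper names are hypothetical, each occurrence is assumed to be a flat dict that already carries the value of every cluster key, and the innermost level is assumed to be the list of occurrences (the text does not say what the leaves hold).

```python
def parse_cluster_keys(clusters: str) -> list:
    """Split a `clusters` argument such as "G.upos; D.upos" on `;`."""
    return [key.strip() for key in clusters.split(";") if key.strip()]


def cluster(occurrences: list, keys: list):
    """Group by the first key, then recurse on each group with the other keys."""
    if not keys:
        return occurrences  # leaf: ASSUMED to be the plain list of occurrences
    first, rest = keys[0], keys[1:]
    groups = {}
    for occ in occurrences:
        groups.setdefault(occ[first], []).append(occ)
    return {value: cluster(group, rest) for value, group in groups.items()}


# Hypothetical occurrences with precomputed key values.
occurrences = [
    {"sent_id": "s1", "G.upos": "VERB", "D.upos": "NOUN"},
    {"sent_id": "s2", "G.upos": "VERB", "D.upos": "PRON"},
    {"sent_id": "s3", "G.upos": "VERB", "D.upos": "NOUN"},
]
nested = cluster(occurrences, parse_cluster_keys("G.upos; D.upos"))
```

With two keys, `nested` has depth 2: the outer dict is indexed by the `G.upos` value and each inner dict by the `D.upos` value.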
......@@ -51,11 +51,10 @@ apt-get install wget m4 unzip librsvg2-bin curl bubblewrap
## Step 2: Setup opam
Run:
Run: `opam init` and follow instructions (answer `y` to different questions).
* `opam init` and follow instructions (answer `y` to different questions).
* `opam switch create 4.09.0 4.09.0` installation of Ocaml. Note that it takes some times to download and build the `ocaml` compiler.
* Check that `ocaml` is installed with `ocamlc -v`.
Check that `ocaml` is installed with `ocamlc -v`. This gives you the version of OCaml installed.
This should be (in March 2020) 4.10.0.
## Step 3: Install the Grew software
......
......@@ -94,7 +94,7 @@ This includes:
* typographical or orthographical errors
* tokens linked by a `goeswith` relation
See a few examples in [SUD_French-GSD](http://match.grew.fr/?corpus=SUD_French-GSD@master&custom=5e42842249c10).
See a few examples in [SUD_French-GSD](http://match.grew.fr/?corpus=SUD_French-GSD@latest&custom=5e42842249c10).
## Deprecated `_MISC_` and `_UD_` prefixes
In older versions, features declared in column 10 were accessible with the `_MISC_` prefix and multiword tokens or empty nodes were identified with the `_UD_` prefix. These prefixes are deprecated and are replaced by features `textform` and `wordform` (see above).
......
......@@ -26,12 +26,13 @@ The full matching process is:
* Output a set of matchings; a *matching* being a function from nodes and edges defined in the positive items to nodes and edges of the host graph.
1. If the graph does not satisfy one of the global items, the output is empty.
1. Else the set M is initialised as the set of matchings which satisfies the union of positive items.
1. For each negative item, remove from M the matchings which satisfies it.
1. Else the set M is initialised as the set of matchings which satisfy the union of positive items.
1. For each negative item, remove from M the matchings which satisfy it.
### Remarks
* If there is more than one positive `pattern` item, the union is considered.
* If there is more than one negative `without` item, they are all interpreted independently (and the output is different from the one obtained with a union of negative items).
* The order of items in a pattern is irrelevant.
* If there is no positive item, there is a trivial matching which is the empty function.
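The filtering steps and the remark about independent `without` items can be illustrated with a toy model in Python. This is a sketch under strong assumptions, not Grew's matcher: the candidate matchings are given up front instead of being computed from a graph, and global and negative items are modelled as plain predicates (all names below are hypothetical).

```python
def run_matching(graph, global_items, positive_matchings, negative_items):
    """Toy version of the matching process described above."""
    # Step 1: if one global item fails, the output is empty.
    if not all(check(graph) for check in global_items):
        return []
    # Step 2: M starts as the matchings of the union of positive items.
    matchings = list(positive_matchings)
    # Step 3: each negative item independently removes the matchings satisfying it.
    for negative in negative_items:
        matchings = [m for m in matchings if not negative(m)]
    return matchings


candidates = [{"N": 1}, {"N": 2}, {"N": 3}]
is_odd = lambda m: m["N"] % 2 == 1
is_large = lambda m: m["N"] >= 3

# Two separate negative items: a matching is dropped if it satisfies either one.
independent = run_matching(None, [], candidates, [is_odd, is_large])

# One negative item holding both constraints: dropped only if both hold.
merged = run_matching(None, [], candidates, [lambda m: is_odd(m) and is_large(m)])

# A failing global item empties the output whatever the other items are.
rejected = run_matching(None, [lambda g: False], candidates, [])
```

Here `independent` keeps only `{"N": 2}`, while `merged` keeps both `{"N": 1}` and `{"N": 2}`, which is the difference the second remark points out.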
The syntax of patterns in **Grew** can be learned using the [tutorial part](http://match.grew.fr?tutorial=yes) of the [Grew-match](http://match.grew.fr) tool.
......@@ -48,7 +49,7 @@ In a *node clause*, a node is described by an identifier and some constraints on
N [upos = VERB, Mood = Ind|Imp, Tense <> Fut, Number, !Person, lemma = "être" ]
```
The clause above illustrated the syntax of constraint that can be expressed, in turn:
The clause above illustrates the syntax of the constraints that can be expressed, in turn:
* `upos = VERB` requires that the feature `upos` is defined with the value `VERB`
* `Mood = Ind|Imp` requires that the feature `Mood` is defined with one of the two values `Ind` or `Imp`
......@@ -67,7 +68,7 @@ All *edge clauses* below require the existence of an edge between the node selec
* `N -[^nsubj|obj]-> M`: the edge label is different from `nsubj` and `obj`
* `N -[re".*subj"]-> M`: the edge follows the regular expression (see [here](http://caml.inria.fr/pub/docs/manual-ocaml/libref/Str.html#VALregexp) for regular expressions accepted)
Edges may also be named for future use (in commands or in clustering for instance) with an identifier:
Edges may also be named for usage in commands (in Grew) or in clustering (in Grew-match) with an identifier:
* `e: N -> M`
......@@ -91,7 +92,10 @@ These constrains do not identify new elements in the graph, but must be respecte
* Constraints on features values:
* `N.lemma = M.lemma` two feature values must be equal
* `N.lemma <> M.lemma` two feature values must be different
* `N.lemma = "constant"` the feature `lemma` of node `N` must be the value `constant`
* `N.lemma = re".*ing"` the value of a feature must follow a regular expression (see [here](http://caml.inria.fr/pub/docs/manual-ocaml/libref/Str.html#VALregexp) for regular expressions accepted)
* `N.lemma = lexicon.field` requires that the feature `lemma` of node `N` be present in the `field` of the `lexicon`. **NB**: this also restricts the current lexicon to the items for which `field` is equal to `N.lemma`.
* Constraints on node ordering:
* `N < M` the node `N` immediately precedes the node `M`
* `N << M` the node `N` precedes the node `M`
......@@ -103,7 +107,7 @@ These constrains do not identify new elements in the graph, but must be respecte
* `label(e1) <> label(e2)` the labels of the two edges `e1` and `e2` are different
### Remarks
When two or more nodes are equivalent in a pattern, each occurrence of the pattern in a graph will be found several times (up to permutation in the sets of equivalent nodes).
When two or more nodes are equivalent in a pattern, each occurrence of the pattern in a graph is found several times (up to permutation in the sets of equivalent nodes).
For instance, in the pattern below, the 3 nodes `N1`, `N2` and `N3` are equivalent.
```grew
......@@ -118,6 +122,8 @@ It imposes an ordering on some internal representation of the nodes and so avoid
The pattern below returns the 20 expected occurrences ([Grew-match](http://match.grew.fr/?corpus=Little_Prince&custom=5d4d6bb86ce49))
```grew
pattern {
N1 -[ARG1]-> N; N2 -[ARG1]-> N; N3 -[ARG1]-> N;
......@@ -145,7 +151,7 @@ For each one, its negation is available by changing the `is_` prefix by the `is_
* `is_tree`: a graph is a tree if it is a forest and if it has exactly one root.
* `is_projective`: the usual notion of projectivity defined on tree is generalised by saying the a structure is projective if there are no 4-tuples (`A`, `B`, `C`, `D`) of ordered nodes (i.e. `A << B`, `B << C` and `C << D`) such that `A` and `C` are linked and `B` and `D` are linked (two nodes are linked when there is at least one edge between the two, whatever is the orientation).
* `is_projective`: the usual [notion of projectivity](https://en.wikipedia.org/wiki/Discontinuity_(linguistics)) defined on trees is generalised by saying that a structure is projective if there are no 4-tuples (`A`, `B`, `C`, `D`) of ordered nodes (i.e. `A << B`, `B << C` and `C << D`) such that `A` and `C` are linked and `B` and `D` are linked (two nodes are linked when there is at least one edge between the two, whatever the orientation).
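The generalised projectivity condition can be checked directly from its definition. The Python sketch below is an illustration only, not how Grew computes `is_projective`: it assumes that nodes are identified by their integer position in the linear order and that the graph is given as a list of `(source, target)` pairs.

```python
from itertools import combinations


def is_projective(edges) -> bool:
    """True if no ordered 4-tuple A << B << C << D has A-C linked and B-D linked."""
    # Orientation is irrelevant, so keep each link as a sorted (left, right) pair.
    links = sorted({tuple(sorted(edge)) for edge in edges if edge[0] != edge[1]})
    # In the sorted list, the first link of each pair starts at or before the second.
    for (a, c), (b, d) in combinations(links, 2):
        if a < b < c < d:
            return False  # the two links cross
    return True


crossing = is_projective([(1, 3), (2, 4)])   # links 1-3 and 2-4 cross
nested = is_projective([(1, 4), (2, 3)])     # link 2-3 lies inside 1-4
reversed_crossing = is_projective([(3, 1), (4, 2)])  # same links, other orientation
```

Comparing every pair of links is quadratic in the number of edges, which is acceptable for sentence-sized graphs.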
### Metadata constraints
......