slu-eval and slu-comp are C++ tools that provide utilities for the slot-filling task in Spoken Language Understanding (SLU) systems.
slu-eval provides evaluation metrics:
- precision, recall, and F1 score for each target
- details about errors: insertions, deletions, and substitutions
slu-comp provides a comparison between two systems.
install: builds with g++ >= 13 and clang >= 17
- git clone https://gitlab.inria.fr/craymond/slu-eval.git
- cd slu-eval
- cmake ./
- make
- sudo make install
usage of slu-eval
use: slu-eval [gcevCMLh] <file>
- C #the input file uses the CRF file format
- L #the input file uses the MPML file format
- M #the input file uses the MPM file format
- g #output global metrics
- c #output per-concept metrics
- e #output errors
- v #use concept values when computing the metrics
- h #print this help
When none of the C/L/M flags is given, slu-eval determines the file format from the file extension:
- .crf for a CRF file format
- .mpm for a MPM file format
- .mpml for a MPML file format
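As a sketch of this dispatch (hypothetical code, not slu-eval's actual implementation), extension-based detection can look like:

```cpp
#include <string>

// Hypothetical sketch of extension-based format detection; the actual
// slu-eval implementation may differ.
enum class Format { CRF, MPM, MPML, Unknown };

Format detect_format(const std::string& path) {
    auto dot = path.rfind('.');
    if (dot == std::string::npos) return Format::Unknown;
    std::string ext = path.substr(dot);  // extension, including the dot
    if (ext == ".crf")  return Format::CRF;
    if (ext == ".mpm")  return Format::MPM;
    if (ext == ".mpml") return Format::MPML;
    return Format::Unknown;
}
```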
Flags g and e provide information similar to the NIST sclite tool; flag c provides information similar to conlleval.
Example with a CRF-format input file named test.crf:
yes O O
I'am ERROR-B O
an ERROR-I OK-B
error ERROR-I OK-I
yes O O
I'am OK-B OK-B
a OK-I OK-I
good OK-I OK-I
prediction OK-I OK-I
I'am VALERR-B O
a VALERR-I VALERR-B
value VALERR-I VALERR-I
error VALERR-I VALERR-I
nothing O O
I'am O O
an O O
Insertion O INSERT-B
and O O
a O O
deletion DELET-B O
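The format above appears to hold one token per line, with the reference tag in the second column and the hypothesis tag in the third, and segments marked with -B/-I suffixes. A minimal counting sketch under those assumptions (the column order and tag convention are inferred from the example; this is not slu-eval's code):

```cpp
#include <map>
#include <sstream>
#include <string>

// Sketch of counting concept segments per column in a CRF-format file.
// Assumed layout (inferred from the example): token, reference tag,
// hypothesis tag; a segment starts at each "-B" tag and "-I" continues it.
std::map<std::string, int> count_segments(std::istream& in, int column) {
    std::map<std::string, int> counts;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string tag;
        for (int i = 0; i <= column; ++i) fields >> tag;  // keep the wanted column
        if (tag.size() > 2 && tag.compare(tag.size() - 2, 2, "-B") == 0)
            ++counts[tag.substr(0, tag.size() - 2)];      // strip the "-B" suffix
    }
    return counts;
}
```

With column 1 (reference) and column 2 (hypothesis), the counts match the Hypothesis and Reference columns of the report below.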
The command slu-eval gce test.crf outputs:
Align sentences: [100%] |██████████████████████████████████████████████████| 6/6 [ 00:00<00:00 ?it/s ]
CLASSIFICATION REPORT
┌────────┬────────────┬───────────┬─────────┬────────┬───────────┬────────┬───────────┬────────────┐
│ Label  │ Hypothesis │ Reference │ Correct │ Errors │ Precision │ Recall │ F-measure │ Error rate │
├────────┼────────────┼───────────┼─────────┼────────┼───────────┼────────┼───────────┼────────────┤
│ ERROR │ 0 │ 1 │ 0 │ 1 │ 100.00 │ 0.00 │ 0.00 │ 100.00 │
├────────┼────────────┼───────────┼─────────┼────────┼───────────┼────────┼───────────┼────────────┤
│ OK │ 2 │ 1 │ 1 │ 1 │ 50.00 │ 100.00 │ 66.67 │ 100.00 │
├────────┼────────────┼───────────┼─────────┼────────┼───────────┼────────┼───────────┼────────────┤
│ VALERR │ 1 │ 1 │ 1 │ 0 │ 100.00 │ 100.00 │ 100.00 │ 0.00 │
├────────┼────────────┼───────────┼─────────┼────────┼───────────┼────────┼───────────┼────────────┤
│ INSERT │ 1 │ 0 │ 0 │ 1 │ 0.00 │ 100.00 │ 0.00 │ 100.00 │
├────────┼────────────┼───────────┼─────────┼────────┼───────────┼────────┼───────────┼────────────┤
│ DELET │ 0 │ 1 │ 0 │ 1 │ 100.00 │ 0.00 │ 0.00 │ 100.00 │
├────────┼────────────┼───────────┼─────────┼────────┼───────────┼────────┼───────────┼────────────┤
│ All │ 4 │ 4 │ 2 │ 3 │ 50.00 │ 50.00 │ 50.00 │ 75.00 │
└────────┴────────────┴───────────┴─────────┴────────┴───────────┴────────┴───────────┴────────────┘
| Micro F1 = 50.00% [1.00,99.00] at 95%
| Macro F1 = 33.33%
| Exact Match = 40.00%
| Multi-Label Accuracy = 40.00%
| Hamming Loss = 12.00%
SUMMARY STATISTICS
#seq #ref #hyp %cor %sub %ins %del %WER %SER
6 4 4 50.00 25.00 25.00 25.00 75.00 50.00
ERRORS DETAILS
+-------------+----+---+
| Substituted | by | 1 |
+-------------+----+---+
| ERROR | OK | 1 |
+-------------+----+---+
+-------------+---+
| Insertions | 1 |
+-------------+---+
| INSERT | 1 |
+-------------+---+
+-------------+---+
| Deletions | 1 |
+-------------+---+
| DELET | 1 |
+-------------+---+
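The figures above can be reproduced from the raw counts. The following sketch is inferred from the tables, not taken from slu-eval's code; note that an empty denominator appears to yield 100.00 in the report (ERROR and INSERT rows):

```cpp
#include <cmath>

// Per-label metrics as they appear in the classification report:
// precision = correct/hypothesis, recall = correct/reference,
// F-measure = harmonic mean of the two (all in percent). Empty
// denominators seem to yield 100.00 (ERROR and INSERT rows).
struct Metrics { double precision, recall, f1; };

Metrics score(int hypothesis, int reference, int correct) {
    double p = hypothesis ? 100.0 * correct / hypothesis : 100.0;
    double r = reference  ? 100.0 * correct / reference  : 100.0;
    double f = (p + r > 0.0) ? 2.0 * p * r / (p + r) : 0.0;
    return {p, r, f};
}

// Summary-table conventions (mirroring NIST sclite): error percentages
// are relative to the number of reference concepts, and the sentence
// error rate counts sequences containing at least one error.
double word_error_rate(int subs, int ins, int dels, int refs) {
    return 100.0 * (subs + ins + dels) / refs;
}

double sentence_error_rate(int erroneous_seqs, int seqs) {
    return 100.0 * erroneous_seqs / seqs;
}
```

For the OK row, score(2, 1, 1) gives precision 50.00, recall 100.00, and F-measure 66.67; for the summary, word_error_rate(1, 1, 1, 4) gives the 75.00% WER, and 3 erroneous sequences out of 6 give the 50.00% SER.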
usage of slu-comp
use: slu-comp [gdsh] <filesys1> <filesys2>
- g #output general differences
- d #output details about differences
- s #output t-test/Wilcoxon statistics
- h #print this help
Like slu-eval, slu-comp determines the file format from the file extension:
- .crf for a CRF file format
- .mpm for a MPM file format
- .mpml for a MPML file format
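As an illustration of the kind of paired comparison the s flag reports (hypothetical code, not slu-comp's implementation; slu-comp reports both t-test and Wilcoxon statistics), a paired t statistic over the per-sequence error counts of the two systems:

```cpp
#include <cmath>
#include <vector>

// Illustration of a paired t statistic over per-sequence error counts of
// two systems (hypothetical; slu-comp's own implementation may differ).
double paired_t_statistic(const std::vector<double>& sys1,
                          const std::vector<double>& sys2) {
    const std::size_t n = sys1.size();  // assumes n == sys2.size() and n >= 2
    double mean = 0.0;
    for (std::size_t i = 0; i < n; ++i) mean += sys1[i] - sys2[i];
    mean /= static_cast<double>(n);
    double var = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double d = (sys1[i] - sys2[i]) - mean;
        var += d * d;
    }
    var /= static_cast<double>(n - 1);   // sample variance of the differences
    return mean / std::sqrt(var / n);    // t = mean difference / standard error
}
```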