
slu-eval/slu-comp are C++ tools that provide utilities for the Slot Filling task in Spoken Language Understanding (SLU) systems

  1. slu-eval provides evaluation metrics:
    • precision, recall, and F1 score for each target (see the sketch after this list)
    • details about errors: insertions, deletions, substitutions
  2. slu-comp provides a comparison between two systems:
    • highlights each system's advantages
    • provides statistical tests to assess the significance of the differences between the systems on sentence-based CER and F1:
      1. Wilcoxon signed-rank test
      2. paired Student's t-test
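
For reference, the per-target metrics appear to follow the usual definitions from true-positive, false-positive, and false-negative chunk counts. The sketch below is a generic illustration only; the helper name compute_metrics and the counts passed to it are not part of slu-eval itself:

#include <cstdio>

// Per-label metrics from chunk counts (illustrative sketch only).
struct Metrics { double precision, recall, f1; };

Metrics compute_metrics(int tp, int fp, int fn) {
    double p = (tp + fp) ? static_cast<double>(tp) / (tp + fp) : 0.0;
    double r = (tp + fn) ? static_cast<double>(tp) / (tp + fn) : 0.0;
    double f = (p + r > 0.0) ? 2.0 * p * r / (p + r) : 0.0;
    return {p, r, f};
}

int main() {
    // Example: a label with 1 correct chunk, 1 spurious chunk and no missed
    // chunk gives P=50%, R=100%, F1=66.67% (cf. the OK row in the report below).
    Metrics m = compute_metrics(/*tp=*/1, /*fp=*/1, /*fn=*/0);
    std::printf("P=%.2f R=%.2f F1=%.2f\n", 100 * m.precision, 100 * m.recall, 100 * m.f1);
}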

install: works with g++>=13 and clang>=17

usage of slu-eval

use: slu-eval [gcevCMLh] <file>

        - C     #input file uses the CRF  file format
        - L     #input file uses the MPML file format
        - M     #input file uses the MPM  file format
        - g     #output global metrics
        - c     #output per-concept metrics
        - e     #output errors
        - v     #concept values are used to compute metrics
        - h     #this help

when none of the flags (C/L/M) is given, slu-eval determines the file format from the file extension:

  1. .crf for a CRF file format
  2. .mpm for a MPM file format
  3. .mpml for a MPML file format
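
For illustration, such extension-based detection can be sketched as follows (a minimal sketch, not slu-eval's actual code; the enum and function names are made up for the example):

#include <filesystem>
#include <string>

enum class Format { CRF, MPM, MPML, Unknown };

// Guess the input format from the file extension (illustrative sketch only).
Format guess_format(const std::string& file) {
    std::string ext = std::filesystem::path(file).extension().string();
    if (ext == ".crf")  return Format::CRF;
    if (ext == ".mpm")  return Format::MPM;
    if (ext == ".mpml") return Format::MPML;
    return Format::Unknown;
}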

flags g and e provide information similar to the NIST sclite tool

flag c provides information similar to conlleval

example with a CRF-format input file named test.crf

yes O O
I'am ERROR-B O
an ERROR-I OK-B
error ERROR-I OK-I

yes O O
I'am OK-B OK-B
a OK-I OK-I
good OK-I OK-I
prediction OK-I OK-I

I'am VALERR-B O
a VALERR-I VALERR-B
value VALERR-I VALERR-I
error VALERR-I VALERR-I

nothing O O

I'am O O
an O O
Insertion O INSERT-B

and O O
a O O
deletion DELET-B O
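
Judging from this example and the report below, each non-empty line carries a word, its reference tag, and its hypothesis tag, and blank lines separate sentences. A minimal reader for such a file could look like this (assumed column order; the types and function below are illustrative, not slu-eval's own code):

#include <cstdio>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

struct Token { std::string word, ref, hyp; };
using Sentence = std::vector<Token>;

// Read a CRF-style file: one "word ref hyp" token per line,
// blank lines separating sentences (illustrative sketch only).
std::vector<Sentence> read_crf(const std::string& path) {
    std::vector<Sentence> sentences;
    Sentence current;
    std::ifstream in(path);
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) {
            if (!current.empty()) sentences.push_back(current);
            current.clear();
            continue;
        }
        std::istringstream fields(line);
        Token t;
        fields >> t.word >> t.ref >> t.hyp;
        current.push_back(t);
    }
    if (!current.empty()) sentences.push_back(current);
    return sentences;
}

int main() {
    // With the test.crf above this should report 6 sentences.
    std::printf("%zu sentences\n", read_crf("test.crf").size());
}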

the command slu-eval gce test.crf outputs:

Align sentences: [100%] |██████████████████████████████████████████████████| 6/6 [ 00:00<00:00 ?it/s ] -

CLASSIFICATION REPORT
┌────────┬────────────┬───────────┬─────────┬────────┬───────────┬────────┬───────────┬────────────┐
│  Label │ Hypothesis │ Reference │ Correct │  Error │ Precision │ Recall │ F-measure │ Error rate │
├────────┼────────────┼───────────┼─────────┼────────┼───────────┼────────┼───────────┼────────────┤
│  ERROR │          0 │         1 │       0 │      1 │    100.00 │   0.00 │      0.00 │     100.00 │
├────────┼────────────┼───────────┼─────────┼────────┼───────────┼────────┼───────────┼────────────┤
│     OK │          2 │         1 │       1 │      1 │     50.00 │ 100.00 │     66.67 │     100.00 │
├────────┼────────────┼───────────┼─────────┼────────┼───────────┼────────┼───────────┼────────────┤
│ VALERR │          1 │         1 │       1 │      0 │    100.00 │ 100.00 │    100.00 │       0.00 │
├────────┼────────────┼───────────┼─────────┼────────┼───────────┼────────┼───────────┼────────────┤
│ INSERT │          1 │         0 │       0 │      1 │      0.00 │ 100.00 │      0.00 │     100.00 │
├────────┼────────────┼───────────┼─────────┼────────┼───────────┼────────┼───────────┼────────────┤
│  DELET │          0 │         1 │       0 │      1 │    100.00 │   0.00 │      0.00 │     100.00 │
├────────┼────────────┼───────────┼─────────┼────────┼───────────┼────────┼───────────┼────────────┤
│    All │          4 │         4 │       2 │      3 │     50.00 │  50.00 │     50.00 │      75.00 │
└────────┴────────────┴───────────┴─────────┴────────┴───────────┴────────┴───────────┴────────────┘
| Micro F1             = 50.00% [1.00,99.00] at 95%
| Macro F1             = 33.33%
| Exact Match          = 40.00%
| Multi-Label Accuracy = 40.00%
| Hamming Loss         = 12.00%
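
For reference, these summary figures can be cross-checked against the table above: Micro F1 pools all labels (2 correct chunks out of 4 hypothesis and 4 reference chunks, so precision = recall = F1 = 50%), while Macro F1 is the unweighted mean of the per-label F-measures, (0 + 66.67 + 100 + 0 + 0) / 5 ≈ 33.33%.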



SUMMARY STATISTICS
#seq    #ref    #hyp    cor     %sub    %ins    %del    %WER    %SER
6       4       4       50.00   25.00   25.00   25.00   75.00   50.00
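
Here %WER appears to be the usual (substitutions + insertions + deletions) / #ref = (1 + 1 + 1) / 4 = 75%, and %SER the fraction of sentences containing at least one error, 3 / 6 = 50%.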

ERRORS DETAILS
+-------------+----+---+
| Substituted | by | 1 |
+-------------+----+---+
| ERROR       | OK | 1 |
+-------------+----+---+
+-------------+---+
| Insertions  | 1 |
+-------------+---+
| INSERT      | 1 |
+-------------+---+
+-------------+---+
| Deletions   | 1 |
+-------------+---+
| DELET       | 1 |
+-------------+---+

usage of slu-comp

use: slu-comp [gdsh] <filesys1> <filesys2>

        - g     #output general differences
        - d     #output details about differences
        - s     #output ttest/wilcoxon stats
        - h     #this help
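
As background for the s flag, the paired Student statistic over per-sentence scores (for instance per-sentence CER or F1 for each system) can be sketched as follows; the function name and the choice of score are illustrative, not slu-comp's actual code. The resulting t is compared to a Student distribution with n-1 degrees of freedom to obtain a p-value.

#include <cmath>
#include <cstddef>
#include <vector>

// Paired Student t statistic for two systems' per-sentence scores
// (illustrative sketch; slu-comp's real implementation may differ).
double paired_t(const std::vector<double>& sys1, const std::vector<double>& sys2) {
    const std::size_t n = sys1.size();
    double mean = 0.0;
    for (std::size_t i = 0; i < n; ++i) mean += sys1[i] - sys2[i];
    mean /= n;
    double var = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        double d = (sys1[i] - sys2[i]) - mean;
        var += d * d;
    }
    var /= (n - 1);                       // sample variance of the differences
    return mean / std::sqrt(var / n);     // t with n-1 degrees of freedom
}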

like slu-eval, slu-comp determines the file format from the file extension:

  1. .crf for a CRF file format
  2. .mpm for a MPM file format
  3. .mpml for a MPML file format
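
The Wilcoxon signed-rank statistic reported by the s flag can be sketched in the same spirit (zero differences are dropped and tied absolute differences receive average ranks; again an illustration rather than slu-comp's code). W+ is then compared to the signed-rank distribution, or to a normal approximation for larger samples.

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Wilcoxon signed-rank statistic W+ for paired per-sentence scores
// (illustrative sketch; slu-comp's real implementation may differ).
double wilcoxon_w_plus(const std::vector<double>& sys1, const std::vector<double>& sys2) {
    std::vector<double> d;
    for (std::size_t i = 0; i < sys1.size(); ++i)
        if (sys1[i] != sys2[i]) d.push_back(sys1[i] - sys2[i]);   // drop zero differences

    // Rank the absolute differences, giving tied values their average rank.
    std::vector<std::size_t> order(d.size());
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
        return std::fabs(d[a]) < std::fabs(d[b]);
    });
    std::vector<double> rank(d.size());
    for (std::size_t i = 0; i < order.size();) {
        std::size_t j = i;
        while (j + 1 < order.size() &&
               std::fabs(d[order[j + 1]]) == std::fabs(d[order[i]])) ++j;
        double avg = (i + j) / 2.0 + 1.0;                 // ranks are 1-based
        for (std::size_t k = i; k <= j; ++k) rank[order[k]] = avg;
        i = j + 1;
    }

    // Sum the ranks of the positive differences.
    double w_plus = 0.0;
    for (std::size_t i = 0; i < d.size(); ++i)
        if (d[i] > 0) w_plus += rank[i];
    return w_plus;
}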