Linkex

Linkex is a tool for discovering link key candidates from two RDF datasets. Link keys generalise the combination of keys and ontology alignments for data interlinking. A link key is a set of pairs of properties that uniquely identifies the instances of two classes of two RDF datasets. For example, the link key {(hasCreator, aAuteur), (hasTitle, aTitre)} for (Book, Livre) states that, if an instance of Book has the same values for hasCreator and hasTitle as an instance of Livre has for aAuteur and aTitre, then the two instances are the same.
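To make the (Book, Livre) example concrete, here is a minimal Python sketch of how a link key could generate links between two tiny datasets. It is an illustration only, not Linkex's implementation: the in-memory dictionaries, the instance identifiers and the generate_links helper are hypothetical, and the two modes reflect an assumed reading of the "eq" and "in" key types (equal value sets versus at least one shared value).

```python
# Hypothetical toy data: instance -> property -> set of values.
books = {
    "b1": {"hasCreator": {"Victor Hugo"}, "hasTitle": {"Les Misérables"}},
    "b2": {"hasCreator": {"Jules Verne"}, "hasTitle": {"Michel Strogoff"}},
}
livres = {
    "l1": {"aAuteur": {"Victor Hugo"},
           "aTitre": {"Les Misérables", "Les Miserables"}},
    "l2": {"aAuteur": {"Jules Verne"},
           "aTitre": {"Vingt mille lieues sous les mers"}},
}

# The link key from the example: pairs (property in Book, property in Livre).
link_key = [("hasCreator", "aAuteur"), ("hasTitle", "aTitre")]


def generate_links(data1, data2, key, mode="in"):
    """Return the set of (instance1, instance2) pairs linked by `key`.

    Assumed semantics (not taken from the Linkex code base):
      mode="in": each property pair must share at least one value;
      mode="eq": each property pair must have equal, non-empty value sets.
    """
    links = set()
    for i1, props1 in data1.items():
        for i2, props2 in data2.items():
            matches = True
            for p1, p2 in key:
                v1 = props1.get(p1, set())
                v2 = props2.get(p2, set())
                if mode == "eq":
                    ok = bool(v1) and v1 == v2
                else:
                    ok = bool(v1 & v2)
                if not ok:
                    matches = False
                    break
            if matches:
                links.add((i1, i2))
    return links


in_links = generate_links(books, livres, link_key, mode="in")
eq_links = generate_links(books, livres, link_key, mode="eq")
print("in:", in_links)
print("eq:", eq_links)
```

With this toy data, b1 and l1 are linked in "in" mode because they share an author and one title, but not in "eq" mode because l1 carries an extra title spelling.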

This tool can extract link key candidates and evaluate them using discriminability and coverage. It can also evaluate them against a reference set of links given as input. It is able to extract candidates involving composed properties and inverse properties.
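The measures named above can be pictured on the set of links that a candidate generates. The Python sketch below is a hedged illustration: precision and recall follow their standard definitions, whereas the discriminability and coverage formulas are assumptions written in the spirit of the paper cited in this README, and may differ from what Linkex actually computes. All function names and data are hypothetical.

```python
def precision_recall(found, reference):
    """Standard precision and recall of `found` links against `reference`."""
    correct = found & reference
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


def discriminability(links):
    """Assumed definition: how close the links are to a one-to-one mapping.

    Equals 1.0 when every instance appears in at most one link.
    """
    if not links:
        return 0.0
    left = {a for a, _ in links}
    right = {b for _, b in links}
    return min(len(left), len(right)) / len(links)


def coverage(links, instances1, instances2):
    """Assumed definition: share of all instances that appear in some link.

    Instance identifiers of the two datasets are assumed to be distinct.
    """
    linked = {a for a, _ in links} | {b for _, b in links}
    total = set(instances1) | set(instances2)
    return len(linked) / len(total) if total else 0.0


# Hypothetical links generated by one candidate, and a reference link set.
links = {("b1", "l1"), ("b2", "l1"), ("b3", "l3")}
reference = {("b1", "l1"), ("b3", "l3"), ("b4", "l4")}
instances1 = ["b1", "b2", "b3", "b4"]
instances2 = ["l1", "l2", "l3", "l4"]

dis = discriminability(links)
cov = coverage(links, instances1, instances2)
precision, recall = precision_recall(links, reference)
print(dis, cov, precision, recall)
```

Discriminability and coverage need no reference links, which is why they can rank candidates when no reference set is available; precision and recall require one.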

Linkex is free software distributed under the terms of the GNU Lesser General Public License.

If you use this software and want to give it credit, please cite:

Manuel Atencia, Jérôme David, Jérôme Euzenat, Data interlinking through robust link key extraction, Proc. 21st ECAI, Prague (CZ), pp. 15-20, 2014.

Installation

Use these commands to manually install the Alignment API:

wget ftp://ftp.inrialpes.fr/pub/exmo/software/ontoalign/align-4.10.zip
unzip align-4.10.zip -d alignapi
mvn install:install-file -Dfile=alignapi/lib/procalign.jar -DgroupId=fr.inrialpes.exmo.align -DartifactId=procalign -Dversion=4.10 -Dpackaging=jar
mvn install:install-file -Dfile=alignapi/lib/ontowrap.jar -DgroupId=fr.inrialpes.exmo.ontowrap -DartifactId=ontowrap -Dversion=4.10 -Dpackaging=jar
mvn install:install-file -Dfile=alignapi/lib/align.jar -DgroupId=org.semanticweb.owl.align -DartifactId=align -Dversion=4.10 -Dpackaging=jar
  • Clone this repository:
git clone git@gitlab.inria.fr:moex/linkex.git
  • Move to the linkex directory:
cd linkex
  • Compile and package into a jar:
mvn package

This should create the file target/LinkkeyDiscovery-1.0-SNAPSHOT-jar-with-dependencies.jar

Run the extraction tool

The link key extraction tool can be run from the command line.

From the linkex directory, you can display the following help message:

java -jar  target/LinkkeyDiscovery-1.0-SNAPSHOT-jar-with-dependencies.jar -help
usage: java fr.inrialpes.exmo.linkkey.LinkkeyDiscoveryAlgorithm [options]
            dataset1 dataset2
 -b <b>                     find links between blank nodes (true by
                            default)
 -c <composition length>    compose properties
 -c1 <uri1>                 Uri of the first class (if omitted, all
                            instances are considered)
 -c2 <uri2>                 Uri of the second class (if omitted, all
                            instances are considered)
 -classes                   extracts link keys candidates with classes
 -classesfull               extracts link keys candidates with classes
                            full (may be very expensive)
 -d <d>                     property discriminability threshold
 -e <e>                     use the given reference links for precision
                            and recall evaluation.the links are given in
                            RDF (i.e. a list of triples with predicate
                            owl:sameAs)
 -f,--format <format>       Format of the output: txt (default), edoal,
                            html, bin, dot, txt2 (txt with links)
 -help                      print this message
 -i                         considers inverse of properties (only useful
                            with -c)
 -l                         Lazy mode, data will be loaded when needed
                            (only available for bin)
 -o,--output <outputfile>   output filename. Default files: standard
                            output for txt and edoal, 'result' for html
                            and bin
 -p1 <uriprefix1>           prefix of classes that have to be considered
 -p2 <uriprefix1>           prefix of classes that have to be considered
 -s <s>                     a support threshold between [0;1] for
                            properties (default:0)
 -sparql                    if given the datasets are considered as sparql
                            endpoints
 -t <eq or in>              types of extracted keys: eq or/and in (eq and
                            in by default)

Example of command line:

java -mx5000m -jar  LinkkeyDiscovery-1.0-SNAPSHOT-jar-with-dependencies.jar -s 0.01 -i -c 4 -c1 "http://xmlns.com/foaf/0.1/Person" -c2  "http://xmlns.com/foaf/0.1/Person" -t in -f html -o mycandidates dataset1.ttl dataset2.ttl

This will extract link key candidates between the classes foaf:Person (options -c1 and -c2) of the files dataset1.ttl and dataset2.ttl. The extraction algorithm will extract intersection link key candidates (-t in). It will consider only properties that are instantiated for at least 1% of the instances of foaf:Person (-s 0.01). It will consider inverses of properties (-i) and compositions of properties up to a maximum path length of 4 (-c 4). The result will be rendered as a set of HTML files (-f html) located in the directory "mycandidates" (-o mycandidates). The option -mx5000m gives the Java virtual machine about 5 GB of memory.
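The support threshold (-s) can be pictured as a filter over properties. The Python sketch below assumes that the support of a property is the fraction of instances of the class that have at least one value for it, which matches the description above. The helper names and the toy data are hypothetical, and the tool's actual computation may differ.

```python
def property_support(instances):
    """Map each property to the fraction of instances that have a value for it."""
    counts = {}
    for props in instances.values():
        for prop, values in props.items():
            if values:  # a property with no value does not count as instantiated
                counts[prop] = counts.get(prop, 0) + 1
    n = len(instances)
    return {prop: c / n for prop, c in counts.items()} if n else {}


def frequent_properties(instances, threshold):
    """Keep the properties whose support is at least `threshold` (in [0, 1])."""
    support = property_support(instances)
    return {prop for prop, s in support.items() if s >= threshold}


# Hypothetical instances of foaf:Person: instance -> property -> set of values.
persons = {
    "p1": {"foaf:name": {"Ada"}, "foaf:mbox": {"mailto:ada@example.org"}},
    "p2": {"foaf:name": {"Alan"}},
    "p3": {"foaf:name": {"Grace"}},
    "p4": {"foaf:name": {"Edsger"}, "foaf:nick": set()},
}

support = property_support(persons)
kept = frequent_properties(persons, 0.5)
print(support)
print(kept)
```

In this toy data, foaf:name has support 1.0 and foaf:mbox 0.25, so a threshold of 0.5 keeps only foaf:name, while the default threshold of 0 would keep every instantiated property.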