Embedding the Alignment API

This version:
http://alignapi.gforge.inria.fr/tutorial/tutorial3/embed.html
Author:
Jérôme Euzenat, INRIA & LIG

This tutorial shows how to embed the Alignment API within your own applications.

The Alignment API is not meant to be used only from the command line, useful as that can be. If you are ready for it, you can develop your own Java application that takes advantage of the API.

Starting point

A skeleton program using the Alignment API is provided as Skeleton.java. It can be compiled by invoking:

$ javac -classpath ../../lib/align.jar:../../lib/procalign.jar -d results Skeleton.java

and run by:

$ java -cp ../../lib/procalign.jar:results Skeleton file://$CWD/myOnto.owl file://$CWD/edu.mit.visus.bibtex.owl
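
Both arguments are ontology URIs, which is why the command spells them out as file://$CWD/... rather than as plain file names. If you would rather accept plain file names in your own application, the conversion can be done with the Java standard library alone. The following is a minimal sketch; the class FileUriSketch and its helper toFileUri are hypothetical names and not part of the Alignment API.

```java
import java.net.URI;
import java.nio.file.Paths;

// Hypothetical helper, not part of the Alignment API.
class FileUriSketch {
    // Turn a possibly relative file name into an absolute file: URI.
    static URI toFileUri(String fileName) {
        return Paths.get(fileName).toAbsolutePath().normalize().toUri();
    }

    public static void main(String[] args) {
        // Print the URI form of every file name given on the command line.
        for (String arg : args) {
            System.out.println(toFileUri(arg));
        }
    }
}
```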

Now, considering the API (which can be consulted through its thin Javadoc, for instance), can you modify the Skeleton program so that it performs the steps described in the following sections?

Of course, you can do it progressively.

Call an alignment method

$ javac -classpath ../../lib/align.jar:../../lib/procalign.jar -d results MyApp.java
$ java -cp ../../lib/procalign.jar:results MyApp file://$CWD/myOnto.owl file://$CWD/edu.mit.visus.bibtex.owl > results/MyApp.owl

  // Run two different alignment methods (e.g., ngram distance and smoa)
  AlignmentProcess a1 = new StringDistAlignment();
  params.setParameter("stringFunction","smoaDistance");
  a1.init ( onto1, onto2 );
  a1.align( (Alignment)null, params );
  AlignmentProcess a2 = new StringDistAlignment();
  a2.init ( onto1, onto2 );
  params = new BasicParameters();
  params.setParameter("stringFunction","ngramDistance");
  a2.align( (Alignment)null, params );
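
The stringFunction parameter selects a string measure by name. To give an idea of what such a measure computes, here is a self-contained sketch of a trigram similarity (the Dice coefficient over character trigrams). It is only an illustration under stated assumptions: the ngramDistance of the Alignment API may well be defined differently, and TrigramSketch is a hypothetical class name.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Illustration only; not the Alignment API's ngramDistance.
class TrigramSketch {
    // Collect the distinct lower-cased character trigrams of a string.
    static Set<String> trigrams(String s) {
        Set<String> grams = new HashSet<>();
        String t = s.toLowerCase(Locale.ROOT);
        for (int i = 0; i + 3 <= t.length(); i++) {
            grams.add(t.substring(i, i + 3));
        }
        return grams;
    }

    // Dice coefficient: 2 * |shared| / (|a| + |b|), a value in [0, 1].
    static double similarity(String a, String b) {
        Set<String> ga = trigrams(a);
        Set<String> gb = trigrams(b);
        if (ga.isEmpty() && gb.isEmpty()) {
            // Both strings are shorter than three characters.
            return a.equalsIgnoreCase(b) ? 1.0 : 0.0;
        }
        Set<String> shared = new HashSet<>(ga);
        shared.retainAll(gb);
        return 2.0 * shared.size() / (ga.size() + gb.size());
    }

    public static void main(String[] args) {
        System.out.println(similarity("Article", "Articles"));
        System.out.println(similarity("Article", "Book"));
    }
}
```

A matcher built on such a measure would typically compare the names of entities from both ontologies and keep the pairs that score highest.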

Manipulate alignments (merge and trim)

  // Merge the two results.
  ((BasicAlignment)a1).ingest(a2);

  // Threshold at various thresholds
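
To make the two operations concrete without the library, the sketch below models an alignment as a map from a correspondence label to its confidence. Merging is a union of the two maps, and trimming keeps the correspondences at or above a threshold. Everything here is an assumption made for illustration: MergeTrimSketch is a hypothetical class, and keeping the higher confidence when both alignments contain the same correspondence is a choice of this sketch, not a statement about how ingest behaves.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only; alignments are modelled as label -> confidence maps.
class MergeTrimSketch {
    // Union of two alignments; on a clash keep the higher confidence
    // (an assumption of this sketch).
    static Map<String, Double> merge(Map<String, Double> a, Map<String, Double> b) {
        Map<String, Double> merged = new HashMap<>(a);
        for (Map.Entry<String, Double> e : b.entrySet()) {
            merged.merge(e.getKey(), e.getValue(), Double::max);
        }
        return merged;
    }

    // Keep only the correspondences whose confidence is at least the threshold.
    static Map<String, Double> trim(Map<String, Double> alignment, double threshold) {
        Map<String, Double> kept = new HashMap<>();
        for (Map.Entry<String, Double> e : alignment.entrySet()) {
            if (e.getValue() >= threshold) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Double> a = new HashMap<>();
        a.put("Article=Article", 1.0);
        a.put("Book=Booklet", 0.5);
        Map<String, Double> b = new HashMap<>();
        b.put("Book=Booklet", 0.7);
        b.put("Thesis=PhdThesis", 0.3);
        System.out.println(trim(merge(a, b), 0.5).size() + " correspondences kept");
    }
}
```

Unlike this sketch, which returns new maps, the cut method used in this tutorial modifies the alignment it is called on, so successive cuts at increasing thresholds trim the same object further each time.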

Evaluating alignments

  // Evaluate them against the references
  // and choose the one with the best F-Measure
  AlignmentParser aparser = new AlignmentParser(0);
  Alignment reference = aparser.parse( (new File ( "refalign.rdf" ) ) . toURL() . toString());
  Evaluator evaluator = new PRecEvaluator( reference, a1 );

  double best = 0.;
  Alignment result = null;
  Parameters p = new BasicParameters();
  for ( int i = 0; i <= 10 ; i += 2 ){
    a1.cut( ((double)i)/10 );
    evaluator = new PRecEvaluator( reference, a1 );
    evaluator.eval( p );
    System.err.println("Threshold "+(((double)i)/10)+" : "+((PRecEvaluator)evaluator).getFmeasure()+" over "+a1.nbCells()+" cells");
    if ( ((PRecEvaluator)evaluator).getFmeasure() > best ) {
       result = (BasicAlignment)((BasicAlignment)a1).clone();
       best = ((PRecEvaluator)evaluator).getFmeasure();
    }
  }
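
The quantity compared in the loop is the F-measure, the harmonic mean of precision and recall. The sketch below computes the three values for correspondences represented as plain strings, using the standard textbook definitions. It is a self-contained illustration, not the code of PRecEvaluator; FMeasureSketch is a hypothetical class, and returning 0 for empty sets is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustration only; standard definitions, not PRecEvaluator itself.
class FMeasureSketch {
    // Number of correspondences present in both sets.
    private static int shared(Set<String> found, Set<String> reference) {
        Set<String> common = new HashSet<>(found);
        common.retainAll(reference);
        return common.size();
    }

    // Fraction of found correspondences that are in the reference.
    static double precision(Set<String> found, Set<String> reference) {
        return found.isEmpty() ? 0.0 : (double) shared(found, reference) / found.size();
    }

    // Fraction of reference correspondences that were found.
    static double recall(Set<String> found, Set<String> reference) {
        return reference.isEmpty() ? 0.0 : (double) shared(found, reference) / reference.size();
    }

    // Harmonic mean of precision and recall.
    static double fMeasure(Set<String> found, Set<String> reference) {
        double p = precision(found, reference);
        double r = recall(found, reference);
        return (p + r == 0.0) ? 0.0 : 2.0 * p * r / (p + r);
    }

    public static void main(String[] args) {
        Set<String> found = new HashSet<>(Arrays.asList("a", "b", "c", "d"));
        Set<String> reference = new HashSet<>(Arrays.asList("a", "b", "e"));
        System.out.println("F-measure: " + fMeasure(found, reference));
    }
}
```

Raising the threshold removes correspondences, which tends to increase precision and decrease recall. The F-measure rewards the threshold at which the two are best balanced.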

Displaying an alignment

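  // Display the best alignment as OWL Rules
  PrintWriter writer = new PrintWriter (
                        new BufferedWriter(
                         new OutputStreamWriter( System.out, "UTF-8" )), true);
  AlignmentVisitor renderer = new OWLAxiomsRendererVisitor(writer);
  // a1 has been trimmed down to threshold 1.0 by the loop; render the kept copy instead.
  // result stays null if no threshold gave a positive F-measure.
  if ( result != null ) result.render(renderer);
  writer.flush();
  writer.close();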

Putting them together

In a possible solution, the main piece of code in Skeleton.java is replaced by:

  // Run two different alignment methods (e.g., ngram distance and smoa)
  AlignmentProcess a1 = new StringDistAlignment();
  params.setParameter("stringFunction","smoaDistance");
  a1.init ( onto1, onto2 );
  a1.align( (Alignment)null, params );
  AlignmentProcess a2 = new StringDistAlignment();
  a2.init ( onto1, onto2 );
  params = new BasicParameters();
  params.setParameter("stringFunction","ngramDistance");
  a2.align( (Alignment)null, params );

  // Merge the two results.
  ((BasicAlignment)a1).ingest(a2);

  // Threshold at various thresholds
  // Evaluate them against the references
  // and choose the one with the best F-Measure
  AlignmentParser aparser = new AlignmentParser(0);
  Alignment reference = aparser.parse( (new File ( "refalign.rdf" ) ) . toURL() . toString());
  Evaluator evaluator = new PRecEvaluator( reference, a1 );

  double best = 0.;
  Alignment result = null;
  Parameters p = new BasicParameters();
  for ( int i = 0; i <= 10 ; i += 2 ){
    a1.cut( ((double)i)/10 );
    evaluator = new PRecEvaluator( reference, a1 );
    evaluator.eval( p );
    System.err.println("Threshold "+(((double)i)/10)+" : "+((PRecEvaluator)evaluator).getFmeasure()+" over "+a1.nbCells()+" cells");
    if ( ((PRecEvaluator)evaluator).getFmeasure() > best ) {
       result = (BasicAlignment)((BasicAlignment)a1).clone();
       best = ((PRecEvaluator)evaluator).getFmeasure();
    }
  }
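  // Display the best alignment as OWL Rules
  PrintWriter writer = new PrintWriter (
                        new BufferedWriter(
                         new OutputStreamWriter( System.out, "UTF-8" )), true);
  AlignmentVisitor renderer = new OWLAxiomsRendererVisitor(writer);
  // a1 has been trimmed down to threshold 1.0 by the loop; render the kept copy instead.
  // result stays null if no threshold gave a positive F-measure.
  if ( result != null ) result.render(renderer);
  writer.flush();
  writer.close();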

Running it prints the F-measure obtained at each threshold on the error stream, which gives some insight into the best threshold:

Threshold 0.0 : 0.4693877551020408 over 148 cells
Threshold 0.2 : 0.5227272727272727 over 128 cells
Threshold 0.4 : 0.5476190476190476 over 120 cells
Threshold 0.6 : 0.6478873239436619 over 94 cells
Threshold 0.8 : 0.75 over 72 cells
Threshold 1.0 : 0.5151515151515151 over 18 cells

A full working solution is MyApp.java.

More work: Can you add a switch like the -i switch of Procalign so that the main class of the application can be passed on the command line?
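
One way to approach such a switch is to read a class name from the arguments and instantiate it by reflection. The sketch below shows the mechanism with the standard library only. It makes no claim about how Procalign implements -i: ImplSwitchSketch, Matcher and DefaultMatcher are hypothetical names, and in your application the role of Matcher would be played by the interface you actually use, such as AlignmentProcess.

```java
// Illustration only; not how Procalign handles its -i switch.
class ImplSwitchSketch {
    // Hypothetical stand-in for the interface of the pluggable class.
    interface Matcher {
        String name();
    }

    // Hypothetical default used when no -i switch is given.
    static class DefaultMatcher implements Matcher {
        public String name() { return "default"; }
    }

    // Return the value following "-i" in args, or the fallback if absent.
    static String implClassName(String[] args, String fallback) {
        for (int i = 0; i + 1 < args.length; i++) {
            if (args[i].equals("-i")) {
                return args[i + 1];
            }
        }
        return fallback;
    }

    // Load the named class and call its no-argument constructor.
    static Matcher instantiate(String className) {
        try {
            return Class.forName(className)
                        .asSubclass(Matcher.class)
                        .getDeclaredConstructor()
                        .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot use class " + className, e);
        }
    }

    public static void main(String[] args) {
        String className = implClassName(args, DefaultMatcher.class.getName());
        System.out.println("Using " + instantiate(className).name());
    }
}
```

The named class must be on the classpath and must have a no-argument constructor. A class that is missing or does not implement the expected interface is reported as an IllegalArgumentException.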

Advanced: You can develop a specialized matching algorithm by subclassing the Java classes provided in the Alignment API implementation (like BasicAlignment or DistanceAlignment).

Advanced: What about writing an editor for the alignment API?

Further exercises

More info: http://alignapi.gforge.inria.fr

