5ISS Lab 2

Purpose

During this lab, you will manipulate an ontology in your source code to build a semantic-aware application. Namely, you will reuse the ontology you developed in the previous lab to annotate a dataset produced by the city of Aarhus, Denmark. The data are collected from temperature sensors and stored in CSV files, which places them at 3 stars on the 5-star Linked Open Data scale. You will convert them to 5-star data by using your ontology.
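To make the conversion concrete, here is a minimal sketch in plain Java (no Jena) of how one CSV row can become a handful of RDF triples, printed as N-Triples. Everything specific in it is an assumption made for illustration: the namespace http://example.org/lab#, the class and property names (Observation, madeBy, hasTimestamp, hasValue) and the column layout sensorId,timestamp,value are hypothetical, and are not those of the Aarhus dataset or of the provided codebase. In the lab itself, the Jena wrapper creates the individuals and properties for you.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: turn one CSV row into N-Triples lines. All names are hypothetical. */
class CsvRowToTriples {
    // Hypothetical namespace; replace it with the namespace of your own ontology.
    static final String NS = "http://example.org/lab#";
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
    static final String XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double";

    /** Assumed row layout: sensorId,timestamp,value (not the real dataset layout). */
    static List<String> toTriples(String csvRow, int rowIndex) {
        String[] cells = csvRow.split(",");
        String sensor = "<" + NS + "sensor_" + cells[0].trim() + ">";
        String obs = "<" + NS + "observation_" + rowIndex + ">";
        // Parsing the value checks that it really is a number before typing it.
        double value = Double.parseDouble(cells[2].trim());

        List<String> triples = new ArrayList<>();
        triples.add(obs + " <" + RDF_TYPE + "> <" + NS + "Observation> .");
        triples.add(obs + " <" + NS + "madeBy> " + sensor + " .");
        triples.add(obs + " <" + NS + "hasTimestamp> \"" + cells[1].trim() + "\" .");
        triples.add(obs + " <" + NS + "hasValue> \"" + value + "\"^^<" + XSD_DOUBLE + "> .");
        return triples;
    }

    public static void main(String[] args) {
        for (String triple : toTriples("s1,2014-08-01T00:05:00,18.5", 0)) {
            System.out.println(triple);
        }
    }
}
```

The point of the sketch is that every cell of the row ends up attached to a URI, and that the class and property URIs come from an ontology, which is what makes the data interpretable and linkable by other applications.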

To complete this lab, a codebase is provided. It is a wrapper around Apache Jena, one of the main Java libraries for working with Semantic Web technologies.

Setup

Implementing the interfaces

The codebase that you cloned contains Java interfaces (function specifications), namely IControlFunctions and IModelFunctions. These interfaces are implemented by the stubbed classes DoItYourselfModel and DoItYourselfControl. The functions to implement are ordered by increasing complexity: the first one is the easiest, and as you progress you can reuse the elements you have already developed.

Implementing IModelFunctions

Start by implementing IModelFunctions in DoItYourselfModel. These functions provide operations on the knowledge base. To help you, IConvenienceInterface offers functions that wrap lower-level Jena functions and SPARQL queries.

After writing each function, run the tests. If you are unsure what a function should do, read the code of the corresponding tests.

Implementing IControlFunctions

The controller uses functions from the model to enrich the dataset. Once you have completed the interface implementation, go to the main function in the Controler class. There, you must edit some code snippets depending on your environment.

Remarks

This is a purely pedagogical codebase, and its performance is not great: many optimisations are possible. Moreover, the generated knowledge graph is managed in memory only: when the software stops, the knowledge base disappears. Export it if you want to persist it. The bottleneck is the queries to the knowledge base, so consider caching the results of queries that you repeat.
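As an illustration of the caching remark, here is a minimal sketch of a memoising lookup in plain Java. It assumes that the expensive operation takes a String (for instance a label) and returns a String (for instance a URI). The class CachedLookup and the lookup function passed to it are hypothetical, and are not part of the provided codebase or of Jena.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Sketch: remember the answers of a slow lookup so each key is queried once. */
class CachedLookup {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> slowLookup;
    // Number of calls that actually reached the slow lookup.
    int misses = 0;

    CachedLookup(Function<String, String> slowLookup) {
        this.slowLookup = slowLookup;
    }

    String get(String key) {
        String result = cache.get(key);
        if (result == null) {
            // Not seen yet: run the slow lookup once and store its answer.
            // A null answer is stored too, but it will be looked up again next time.
            misses++;
            result = slowLookup.apply(key);
            cache.put(key, result);
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for a query to the knowledge base (hypothetical).
        CachedLookup uris = new CachedLookup(label -> "http://example.org/lab#" + label);
        uris.get("Temperature");
        uris.get("Temperature");
        System.out.println("Slow lookups performed: " + uris.misses);
    }
}
```

Such a cache is only safe while the cached answers cannot change. If your code adds or removes statements that affect a cached query, the cache must be cleared or bypassed.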

Exploitation in Protégé

Import the generated model into Protégé (it takes a while). We now want to check whether the sensors have been chosen wisely. SSN (the Semantic Sensor Network ontology) contains classes and properties that describe the conditions in which a sensor should be operated. Using the W3C description of SSN, identify these classes and properties. Do you think the sensors are suited to the environment in which they are deployed? Could this deduction have been automated?

Additional question