5ISS Lab 2
Purpose
During this lab, you will manipulate an ontology in your source code to build a semantic-aware application. Namely, you will reuse the ontology you previously developed to annotate a dataset produced by the city of Aarhus, Denmark. The data are collected from temperature sensors and stored in CSV files, which rate 3 stars on the 5-star Linked Data scale. You will convert them to 5-star data by using the ontology you built in the last lab.
To complete this lab, a codebase is provided. It is a wrapper around Apache Jena, one of the main Java libraries for working with Semantic Web technologies.
Setup
- Prerequisites:
  - Java 8
  - Maven
  - Git
- Clone the following Git repository:
git clone https://framagit.org/nseydoux/iss-semantics-lab
- You can import the Maven project in Eclipse:
- Go to the cloned repo, and execute the command
mvn compile
- If all goes well, you can now test your code using the command
mvn test
The tests will fail, because nothing is implemented yet. These are unit tests, each validating one functionality that you have to develop. Try to run them as you go.
- Download the dataset.
Implementing the interfaces
The codebase that you cloned contains Java interfaces (function specifications), namely IControlFunctions and IModelFunctions. These interfaces are implemented by the stubbed classes DoItYourselfModel and DoItYourselfControl. The functions to implement are ordered by increasing complexity: the first one is the easiest, and as you progress you can reuse the elements you previously developed.
Implementing IModelFunctions
Start by implementing IModelFunctions in DoItYourselfModel. These functions provide knowledge-base related operations. To help you, the IConvenienceInterface provides functions that wrap lower-level Jena functions and SPARQL queries.
After writing each function, run the tests. You can read the code of the tests if you are unsure of what you should do.
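Since the model functions ultimately query the knowledge base, and the same lookup is often repeated for many observations, it can help to remember results that were already computed. The following is only a minimal sketch of that idea using the Java standard library: the class `CachedLookup` and the lambda standing in for a query are hypothetical and are not part of the provided codebase.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Remembers the result of a slow lookup so each distinct key is computed only once. */
final class CachedLookup<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private final Function<K, V> slowLookup;
    private int misses = 0;

    CachedLookup(Function<K, V> slowLookup) {
        this.slowLookup = slowLookup;
    }

    V get(K key) {
        // containsKey + put (rather than computeIfAbsent) so that null results are cached too
        if (!cache.containsKey(key)) {
            misses++;
            cache.put(key, slowLookup.apply(key));
        }
        return cache.get(key);
    }

    /** Number of times the slow lookup was actually executed. */
    int misses() {
        return misses;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a query that resolves a label to a URI
        CachedLookup<String, String> uris =
                new CachedLookup<>(label -> "http://example.org/lab#" + label);
        uris.get("temperature");
        uris.get("temperature"); // served from the cache, the lookup is not run again
        System.out.println("lookups executed: " + uris.misses());
    }
}
```

A cache like this is only safe for results that do not change while the program runs. If your code adds new instances to the knowledge base, cached answers that depend on them must be discarded.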
Implementing IControlFunctions
The controller uses functions from the model to enrich the dataset. Once you have completed the interface implementation, go to the main function in the Controler class. You must edit some code snippets there depending on your environment.
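To make the enrichment step concrete, here is a rough sketch of what annotating one CSV row amounts to: each row becomes a small set of triples about an observation. It deliberately uses only the Java standard library and prints N-Triples text, whereas in the lab the same result is obtained through the model functions and Jena. The column layout (`timestamp,sensorId,value`), the namespace `http://example.org/lab#` and the class and property names are all assumptions made for illustration, so replace them with the terms of your own ontology.

```java
import java.util.ArrayList;
import java.util.List;

final class CsvRowToTriples {
    // Hypothetical namespace: replace it with the IRI of your own ontology
    static final String NS = "http://example.org/lab#";
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
    static final String XSD = "http://www.w3.org/2001/XMLSchema#";

    /** Turns one "timestamp,sensorId,value" row (assumed layout) into N-Triples lines. */
    static List<String> toNTriples(String csvRow, int rowIndex) {
        String[] cols = csvRow.split(",");
        if (cols.length != 3) {
            throw new IllegalArgumentException("expected 3 columns, got " + cols.length);
        }
        // Assumes an xsd:dateTime-compatible timestamp and an IRI-safe sensor id
        String timestamp = cols[0].trim();
        String sensorId = cols[1].trim();
        double value = Double.parseDouble(cols[2].trim());

        String obs = "<" + NS + "observation_" + rowIndex + ">";
        String sensor = "<" + NS + "sensor_" + sensorId + ">";

        List<String> triples = new ArrayList<>();
        // The observation is an instance of a class from the ontology
        triples.add(obs + " <" + RDF_TYPE + "> <" + NS + "Observation> .");
        // Link from the observation to another resource (the sensor)
        triples.add(obs + " <" + NS + "madeBy> " + sensor + " .");
        // Typed literal values attached to the observation
        triples.add(obs + " <" + NS + "hasTimestamp> \"" + timestamp + "\"^^<" + XSD + "dateTime> .");
        triples.add(obs + " <" + NS + "hasValue> \"" + value + "\"^^<" + XSD + "double> .");
        return triples;
    }

    public static void main(String[] args) {
        for (String triple : toNTriples("2014-08-01T00:00:00,42,21.5", 0)) {
            System.out.println(triple);
        }
    }
}
```

Giving every observation and sensor its own IRI, and typing it with a class from a shared ontology, is what moves the data from a plain CSV file towards linked data.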
Remarks
This is a purely pedagogical codebase, and its performance is not great: many optimisations are possible. Moreover, the generated knowledge graph is kept in memory only: when the software stops, the knowledge base disappears. Export it if you want to persist it. The bottleneck is the queries to the knowledge base, so think about caching.
Exploitation in Protégé
Import the generated model in Protégé (it takes a while). We now want to check whether the sensors have been chosen wisely. SSN (a sensor ontology) contains classes and properties to describe the conditions in which a sensor should be operated. Using the W3C description of SSN, identify these classes and properties. Do you think the sensors are suited to the environment in which they are deployed? Could this deduction have been automated?
Additional question
- What is the difference between an object property and a data property?