About inspire-foss

This HowTo describes how to transform your local national data to INSPIRE harmonized data and how to embed this transformation in a complete ETL (Extract, Transform, Load) chain. This is illustrated using some of the transformation resources already present within the project. In the remainder of this document the term “ETL” is used for the overall process from local data to harmonized INSPIRE data. The term “Content Transformation”, found in some of the documentation, is used with a similar meaning. The ETL is tightly integrated with the provision of download and view network services via WFS and WMS respectively. These network services are not part of this HowTo.

In order to grasp the steps in the upcoming sections, we advise first reading about the concepts, design and technologies underlying the ETL.

The main document (HTML and PDF) to start with is http://inspire.kademo.nl/doc. Although this document contains references to data specific to the Dutch Kadaster, it has general applicability. In particular, the sections on Concepts and ETL Design provide the basic information on the ETL. The document also gives specific information on the ETL Implementation, with actual code included directly in the text. Although the ETL may look different for each dataset and target INSPIRE Theme, there are underlying “patterns” that not only divide the ETL components into logical units but also make many of these components highly reusable.

All code for the existing ETL can be browsed online via the etl folder within the Subversion repository. As you may notice, the directory layout under the etl folder is organized as <Country>.<Data Provider>/<INSPIRE Theme>, for example NL.RWS/TransportNetworks. Directly under etl is a directory called shared, which contains ETL code shared by multiple data providers and INSPIRE Themes. Not all data providers follow this convention yet; the best examples can be found under NL.Kadaster and NL.RWS.
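
As an illustration, the layout for one data provider could look like the listing below. The NL.RWS/TransportNetworks path is taken from the repository, but the file names shown are only hypothetical examples of the kind of scripts and stylesheets involved.

    etl/
      shared/                      (ETL code shared by providers and themes)
      NL.Kadaster/
      NL.RWS/
        TransportNetworks/
          etl.sh                   (hypothetical driver shell script)
          local-to-inspire.xsl     (hypothetical XSLT schema transformation)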

Further background knowledge that helps in understanding the ETL implementation concerns the technologies used:

  • GDAL/OGR is used for core format and projection transformation. Although GDAL/OGR is an entire suite of libraries/tools, only a single command-line tool is used: ogr2ogr.
  • XSLT is used for model/schema transformation. To learn about XSLT, the best place to start is the XSLT tutorial at W3Schools.com.
  • The ETL chain and the invocation of tools like xsltproc and ogr2ogr are “glued” together using Unix/Linux shell scripts. To learn shell scripting, Steve’s Bourne / Bash shell scripting tutorial is a good starting point (a minimal sketch of such a glue script follows this list).
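
As an illustration, a minimal glue script could look like the sketch below. The file names, the source shapefile, the XSLT stylesheet and the target projection are hypothetical examples; the real scripts in the repository are organized per data provider and INSPIRE Theme.

    #!/bin/sh
    # Minimal ETL sketch; all file names are hypothetical examples.
    set -e

    # Extract: convert the (hypothetical) local source data to GML and
    # reproject it to ETRS89 geographic coordinates (EPSG:4258).
    ogr2ogr -f GML -t_srs EPSG:4258 local-data.gml local-data.shp

    # Transform: map the local schema to the INSPIRE target schema using a
    # (hypothetical) XSLT stylesheet.
    xsltproc local-to-inspire.xsl local-data.gml > inspire-data.gml

    # Load: the harmonized GML can now be loaded into the datastore that
    # feeds the WFS/WMS network services.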