In parts one and two of this series, we introduced the concept of dynamic sourcing pipelines for data, their architectural implementation, and the metadata. Armed with this knowledge, we’ll dive into the pipeline in detail. Part three shows how we start from external sources with very different characteristics and generalize step by step.

This post will include lots of code blocks and queries. Consider the code blocks as “telling the story” while we provide explanations and extra details in the text. As the data travels through the pipeline, it will be generalized. …


Illustrative photo by JJ Ying on Unsplash

In part one of this series, we discussed the background and reasons for creating dynamic sourcing pipelines — including an overview of the pipeline components. In this part, we will describe the architecture and metadata in more detail. Let’s get started!

Architectural Overview

Our extract, transform, and load (ETL) system is written in Python, which GeoPhy’s engineers know well. Airflow is also built in Python, so it is easy for us to extend. For a similar reason, we chose PostgreSQL as the database technology. Postgres also shines for geospatial data, our most common type of data.

A Python/Postgres stack may be relatively…


illustrative image for generic pipelines

Introduction

Airflow is a great tool with endless possibilities for building and scheduling workflows. At GeoPhy we use it to build pipelines dynamically, combining generic and specific components. Our system maximizes reusability and maintainability by creating each of these pipelines from the same code, while keeping flexibility in the specific components where you need it.
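To make the idea of "generic plus specific components from the same code" a bit more concrete, here is a minimal sketch in plain Python. It does not use Airflow and is not GeoPhy's implementation. Every name in it (`SourceConfig`, `build_pipeline`, `GENERIC_STEPS`, the example steps and sources) is hypothetical and only serves to illustrate the pattern of one factory function producing a pipeline per source.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical step type: a function that takes rows and returns rows.
Step = Callable[[List[dict]], List[dict]]


def drop_empty_rows(rows: List[dict]) -> List[dict]:
    """Generic step: remove rows in which every value is None."""
    return [r for r in rows if any(v is not None for v in r.values())]


def lowercase_keys(rows: List[dict]) -> List[dict]:
    """Generic step: normalize column names to lowercase."""
    return [{k.lower(): v for k, v in r.items()} for r in rows]


# Generic steps are shared by every pipeline.
GENERIC_STEPS: List[Step] = [drop_empty_rows, lowercase_keys]


@dataclass
class SourceConfig:
    """Hypothetical per-source metadata: a name plus source-specific steps."""
    name: str
    specific_steps: List[Step] = field(default_factory=list)


def build_pipeline(config: SourceConfig) -> Step:
    """Create a pipeline: source-specific steps first, then the generic ones."""
    steps = list(config.specific_steps) + GENERIC_STEPS

    def run(rows: List[dict]) -> List[dict]:
        for step in steps:
            rows = step(rows)
        return rows

    return run


def rename_zip(rows: List[dict]) -> List[dict]:
    """Specific step for one source: rename its 'ZIP' column."""
    return [
        {("postal_code" if k == "ZIP" else k): v for k, v in r.items()}
        for r in rows
    ]


# Illustrative sources; each gets a pipeline from the same factory.
SOURCES = [SourceConfig("source_a", [rename_zip]), SourceConfig("source_b")]
PIPELINES: Dict[str, Step] = {cfg.name: build_pipeline(cfg) for cfg in SOURCES}
```

In this sketch, adding a new source means adding one `SourceConfig` entry. The generic steps stay untouched, and only the source-specific steps need new code.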

This article is the first of a three-part series describing how GeoPhy uses Airflow to create dynamic sourcing pipelines. In part two we’ll go deep into the metadata framework that we developed. …

Marc Enthoven

Data Engineer @ GeoPhy
