Data Transparency

Research Topics

In a data-driven world, where decisions are based on, or even made automatically from, data analysis powered by Big Data technologies, it has become more important than ever to make data processing transparent. Indeed, data transparency is essential to better understand, retrace, and explain data-driven decisions. This forms the basis for creating accountable decision support systems.

Data provenance

Our research on data transparency focuses on tracing data that is being prepared for data analysis through common data engineering steps such as data collection, data cleaning, data integration, or data distribution. To this end, we research how to model, capture, provision, and use so-called provenance, i.e., metadata about the production process of digital data.
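
To make the idea of tracing data through engineering steps concrete, here is a minimal sketch of row-level provenance capture. It is an illustration under simplifying assumptions, not the group's actual model or tooling: the names `Traced`, `collect`, `clean`, and `integrate` are hypothetical, and provenance is reduced to the set of source record identifiers each output row derives from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Traced:
    """A row together with the identifiers of the source records it derives from."""
    value: dict
    lineage: frozenset = frozenset()


def collect(source_name, rows):
    # Data collection: tag every input row with a unique source identifier.
    return [Traced(row, frozenset({f"{source_name}:{i}"})) for i, row in enumerate(rows)]


def clean(records, is_valid):
    # Data cleaning: keep valid rows; their lineage passes through unchanged.
    return [rec for rec in records if is_valid(rec.value)]


def integrate(left, right, key):
    # Data integration: join on a key; output lineage is the union of both inputs.
    index = {}
    for rec in right:
        index.setdefault(rec.value[key], []).append(rec)
    return [
        Traced({**a.value, **b.value}, a.lineage | b.lineage)
        for a in left
        for b in index.get(a.value[key], [])
    ]


# Hypothetical example data.
customers = collect("customers", [{"id": 1, "name": "Ada"}, {"id": 2, "name": None}])
orders = collect("orders", [{"id": 1, "total": 30}, {"id": 2, "total": 12}])

cleaned = clean(customers, lambda row: row["name"] is not None)
result = integrate(cleaned, orders, key="id")

for rec in result:
    print(rec.value, sorted(rec.lineage))
```

Each surviving row carries the identifiers of the inputs that produced it, so a result can be retraced to its sources. Richer provenance models also record which operations were applied and how, which this sketch deliberately leaves out.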

Data explanation

A second line of research is to study how data sets resulting from some processing may be explained. How did I obtain this result? Why is it structured like this? Why is expected information missing? This includes research on formalisms and languages to express a need for explanation, foundations of what form explanations should take, and algorithms to efficiently and effectively provide data explanations.
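
As a rough illustration of the question "Why is expected information missing?", the sketch below replays a pipeline step by step and reports the first step after which no row matches the user's expectation. This is only one simple form an explanation could take, chosen for brevity; the function `explain_missing`, the step names, and the data are all hypothetical.

```python
def explain_missing(rows, steps, matches):
    """Return the name of the first step after which no row satisfies `matches`.

    Returns "input" if no input row matches, and None if a matching row
    survives the whole pipeline (i.e., nothing is missing).
    """
    if not any(matches(r) for r in rows):
        return "input"
    for name, step in steps:
        rows = step(rows)
        if not any(matches(r) for r in rows):
            return name
    return None


# Hypothetical pipeline: each step is a (name, function) pair over a list of rows.
rows = [
    {"id": 1, "name": "Ada", "total": 30},
    {"id": 2, "name": None, "total": 12},
]
steps = [
    ("drop_missing_names", lambda rs: [r for r in rs if r["name"] is not None]),
    ("keep_large_orders", lambda rs: [r for r in rs if r["total"] > 20]),
]

# "Why is the record with id 2 missing from the result?"
culprit = explain_missing(rows, steps, lambda r: r["id"] == 2)
print(culprit)
```

Here the predicate passed as `matches` plays the role of a very simple language for expressing the need for explanation, and the returned step name is the explanation. Replaying every step is naive; doing this efficiently on large data is exactly where dedicated algorithms are needed.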

Accountable data analytics

We should be able to perform data analytics in an accountable way, for instance to justify data-driven decisions. This requires making the development process, data, and internal processes of data analytics solutions transparent to different stakeholders (end-users, auditors, etc.). We are developing data management solutions to support such accountable data analytics systems.
