- Execution-Plan Optimization for Simulations
-
Systematic cost estimation of alternative query execution plans has a long tradition in the query optimizers of database management systems. In simulations that involve solving partial differential equations (PDEs), we likewise observe alternative schemes and implementations for solving a PDE. These comprise different common steps that need to be well chosen and properly parameterized for a particular setting (defined by the available hardware, time constraints, etc.). We explore how concepts from query optimization can be brought to simulations to enable a more systematic selection of an adequate execution plan and good parameters than the current approach, which typically relies on expert knowledge and experience.
- Runtime Optimization in DISC Systems
-
Data-intensive scalable computing (DISC) systems, such as Apache Hadoop or Spark, make it possible to process large amounts of heterogeneous data. In the context of such DISC systems, we research how to reduce the overall runtime of data processing under a variety of system characteristics (e.g., systems with multiple concurrent jobs, systems with provenance capture enabled).
- Debugging Declarative Data Processing
-
When using declarative languages such as SQL to specify data processing, developers often face the problem that they cannot properly inspect or debug their query or transformation code. All they see is the tip of the iceberg once the result data is computed. If the result does not match the developers’ expectations, they usually perform one or more tedious and mostly manual analyze-fix-test cycles until the expected result occurs. The goal of our research is to support developers in this process with a suite of algorithms and tools.