
Master's Thesis

Automatic Splitting in Data-Parallel Complex Event Processing Systems
Supervisor: Dr. rer. nat. Ruben Mayer
Examiner: Prof. Dr. rer. nat. Dr. h. c. Kurt Rothermel
Description

Complex Event Processing (CEP) is a paradigm applied in many application areas, such as logistics, traffic monitoring, and automated trading, to infer the occurrence of complex situations of interest in the surrounding world from source events such as sensor readings. Such situations can be, for instance, the delayed delivery of a parcel, traffic jams or accidents, and leading market signals. To infer their occurrence stepwise from the sensor streams, a distributed network of interconnected CEP operators, the operator graph, is deployed. Each operator processes incoming event streams and detects a designated part of an event pattern that corresponds to a situation of interest. When such a pattern is detected, a new event is produced and emitted on outgoing event streams to successor operators or to a consumer.

To cope with high and fluctuating incoming event rates, the principle of data parallelization has been applied to CEP operators. To this end, incoming event streams are split into independently processable partitions, which are processed in parallel by a flexible number of operator instances. The challenge is to find partitions that yield consistent results, i.e., that produce neither false negatives nor false positives. Therefore, a partition must contain exactly those events that belong to a sought-after event pattern. State-of-the-art data parallelization frameworks employ a dedicated splitter that builds the partitions on the incoming event streams of an operator and distributes them to the operator instances. However, the splitting rules currently have to be implemented manually by a domain expert.
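The role of the splitter can be illustrated with a minimal sketch. It assumes the simplest kind of splitting rule, key-based partitioning, in which all events sharing one attribute value must reach the same operator instance. The function name `split_by_key`, the event layout, and the `symbol` attribute are hypothetical and are not taken from any specific framework.

```python
import zlib
from collections import defaultdict


def split_by_key(events, key_attr, num_instances):
    """Assign each event to an operator instance based on a partitioning key.

    Hypothetical splitting rule: events with the same value of `key_attr`
    belong to the same partition and go to the same instance.
    """
    assignments = defaultdict(list)
    for event in events:
        key = str(event[key_attr])
        # crc32 is deterministic across runs, unlike Python's built-in hash()
        # for strings, so a key always maps to the same instance.
        instance = zlib.crc32(key.encode("utf-8")) % num_instances
        assignments[instance].append(event)
    return dict(assignments)


# Illustrative input: stock quotes partitioned by ticker symbol.
events = [
    {"symbol": "A", "price": 10},
    {"symbol": "B", "price": 20},
    {"symbol": "A", "price": 11},
]
result = split_by_key(events, "symbol", 4)
```

A rule of this kind yields consistent results only if the sought-after pattern never spans events with different keys. Deciding that is exactly the manual step that currently requires a domain expert.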

Manually implementing these rules is cumbersome and error-prone. Hence, the goal of this thesis is to generate splitting rules automatically by analyzing the CEP query. To this end, state-of-the-art CEP query languages, such as TESLA and Snoop, shall be analyzed first. Based on common patterns of such CEP languages, a pattern abstraction shall be derived that serves as an intermediate model between CEP queries and the splitting logic. Then, a method to automatically transform CEP queries into the intermediate model shall be developed. Finally, methods to automatically derive splitting rules from the model shall be devised.
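The last step, deriving a splitting rule from an intermediate model, might look like the following sketch. The model here is purely an assumption for illustration: a `PatternModel` that records only which event type opens a window and how many events the window spans. The actual pattern abstraction is left open by this description, and `derive_splitter` is a hypothetical name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatternModel:
    """Hypothetical intermediate model of a query.

    A window opens on every event of type `start_type` and spans
    `window_size` consecutive events, including the opening event.
    """
    start_type: str
    window_size: int


def derive_splitter(model):
    """Return a splitting function generated from the model."""
    def split(events):
        partitions = []
        for i, event in enumerate(events):
            if event["type"] == model.start_type:
                # Windows may overlap, so an event can belong to
                # several partitions.
                partitions.append(events[i:i + model.window_size])
        return partitions
    return split


# Illustrative query: "an A followed by a B within 3 events".
model = PatternModel(start_type="A", window_size=3)
splitter = derive_splitter(model)
stream = [{"type": t} for t in ["A", "B", "A", "C", "B"]]
partitions = splitter(stream)
```

In this sketch, each partition holds every event that could complete a pattern started by its opening event, so the partitions can be processed independently by different operator instances.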

The developed concepts shall be evaluated using an existing prototype. The results of the thesis shall be presented in a talk at the VS colloquium.