zur Startseite


Approximating Distributed Graph Processing
Betreuer Dr. rer. nat. Sukanya Bhowmik
Prüfer Prof. Dr. rer. nat. Dr. h. c. Kurt Rothermel

Distributed Graph Processing (DGP) has gained massive popularity fueled by the need to perform complex graph analysis on massive real-world graphs such as digitized social networks or crawled web document
graphs. Recent work argues that the growth rate of data outstrips the growth rate of processors. Hence, the challenges of low-latency large-scale graph processing are likely to become more pronounced in the next decade. To this end, the goal of this thesis is to follow the principle of approximate computing by developing new paradigms for the think-like-a-vertex approach to distributed graph processing where each vertex acts as an independent processing unit that exchanges messages with other vertices and adapts its vertex data according to the messages received by neighboring vertices. Existing research in this direction proposes to drop messages randomly. In this thesis, our goal is to go beyond this idea and to develop new approaches for 'think-like-a-vertex message sampling' in a distributed environment that exploit application knowledge.

For instance, in the shortest path algorithm, vertices exchange messages about the current path length to learn about the shortest paths via neighboring vertices. A first idea to improve the naive approach of randomly dropping messages is to drop messages according to their \textit{information gain} (or entropy). A second idea is to drop messages with higher probability if this results in high saved communication overhead during distributed computation.These approaches promise super-linear runtime reduction with the number of dropped messages as we focus on the redundant and costly messages.

The goals and tasks of this thesis are the following.
-- Develop new methods for application-aware message dropping using concepts such as message information content and costs.
-- Implement these methods on top of existing graph systems and evaluate the runtime latency and correctness in comparison to state-of-the-art.
-- Document results in a written report and present them in the colloquium of the department of Distributed Systems (German or English).

The results need to be documented in written form and have to be presented at the VS-Colloquium.