Abstract
As with general graph processing systems, partitioning data over a cluster of machines improves the scalability of graph database management systems. However, these systems will incur additional network cost during the execution of a query workload, due to inter-partition traversals. Workload-agnostic partitioning algorithms typically minimise the likelihood of any edge crossing partition boundaries. However, these partitioners are sub-optimal with respect to many workloads, especially queries, which may require more frequent traversal of specific subsets of inter-partition edges. Furthermore, they are largely unsuited to operating incrementally on dynamic, growing graphs. We present a new graph partitioning algorithm, Loom, that operates on a stream of graph updates and continuously allocates the new vertices and edges to partitions, taking into account a query workload of graph pattern expressions along with their relative frequencies. First we capture the most common patterns of edge traversals which occur when executing queries. We then compare sub-graphs, which present themselves incrementally in the graph update stream, against these common patterns. Finally we attempt to allocate each match to single partitions, reducing the number of inter-partition edges within frequently traversed sub-graphs and improving average query performance. Loom is extensively evaluated over several large test graphs with realistic query workloads and various orderings of the graph updates. We demonstrate that, given a workload, our prototype produces partitionings of significantly better quality than existing streaming graph partitioning algorithms Fennel & LDG.
Original language | English |
---|---|
Title of host publication | Advances in Database Technology - EDBT 2018 |
Subtitle of host publication | 21st International Conference on Extending Database Technology, Proceedings |
Editors | Michael Bohlen, Reinhard Pichler, Norman May, Erhard Rahm, Shan-Hung Wu, Katja Hose |
Publisher | OpenProceedings.org |
Pages | 337-348 |
Number of pages | 12 |
ISBN (Electronic) | 9783893180783 |
DOIs | |
Publication status | Published - 2018 |
Event | 21st International Conference on Extending Database Technology, EDBT 2018 - Vienna, Austria Duration: 26 Mar 2018 → 29 Mar 2018 |
Publication series
Name | Advances in Database Technology - EDBT |
---|---|
Volume | 2018-March |
ISSN (Electronic) | 2367-2005 |
Conference
Conference | 21st International Conference on Extending Database Technology, EDBT 2018 |
---|---|
Country/Territory | Austria |
City | Vienna |
Period | 26/03/18 → 29/03/18 |
Bibliographical note
Publisher Copyright:© 2018 Copyright held by the owner/author(s)
ASJC Scopus subject areas
- Information Systems
- Software
- Computer Science Applications