Authors | Hamid Saadatfar |
---|---|
Journal | Journal of Big Data |
Page number | 1-17 |
Serial number | 6 |
Volume number | 1 |
Paper Type | Full Paper |
Published At | 2019 |
Journal Grade | ISI |
Journal Type | Typographic |
Journal Country | Iran, Islamic Republic Of |
Journal Index | Scopus |
Abstract
In recent years, the rapid growth of the Internet has led to creation of massively large graphs. Since databases have become very large nowadays, they cannot be processed by a simple machine at an acceptable time anymore; therefore, traditional graph partitioning methods, which are often based on having a complete image of the entire graph, are not applicable to large datasets. This challenge has led to the appearance of a new approach called streaming graph partitioning. In streaming graph partitioning, a stream of input data is received by a partitioner, and partitioner decides which computational machine the data should be transferred to. Often, streaming partitioner does not have any information about the whole graph, and usually distributes the vertices based on some greedy heuristics which may not be optimal for incoming vertices. Hence, partitioner’s decision can be significantly improved if more information about the graph is utilized. In this paper, we present a new vertex-cut streaming graph partitioning approach. The proposed method uses the idea of postponing the decision for some of the edges (by means of an intelligent buffering) and corrects some of the past decisions to improve the quality of the graph partitioning. The proposed approach is evaluated using from real-world graphs. The experimental results show that the performance of the proposed method is superior in comparison with the previous HDRF method.
tags: Graph partitioning, Streaming partitioning, Edge partitioning, Big Data Processing