Language-aware Indexing for Conjunctive Path Queries (2022)


Link: https://arxiv.org/abs/2003.03079

Abstract: Conjunctive path queries (CPQ) are one of the most frequently used queries for complex graph analysis. However, current graph indexes are not tailored to fully support the power of query languages to express CPQs. Consequently, current methods do not take advantage of significant pruning opportunities during CPQ evaluation, resulting in poor query processing performance. We propose the CPQ-aware path index CPQx, the first path index tailored to the expressivity of CPQ. CPQx is built on the partition of the set of source-target vertex pairs of paths in a graph based on the structural notion of path-bisimulation. Path-bisimulation is an equivalence relation on paths such that each partition block induced by the relation consists of paths in the graph indistinguishable with respect to CPQs. This language-aware partitioning of the graph can significantly reduce the cost of query evaluation. We present methods to support the full index life cycle: index construction, maintenance, and query processing with our index. We also develop interest-aware CPQx to reduce index size and index construction overhead while accelerating query evaluation for queries of interest. We demonstrate through extensive experiments on 14 real graphs that our methods accelerate query processing by up to multiple orders of magnitude over the state-of-the-art methods, with smaller index sizes. Our complete C++ codebase is available as open source for further research.