Conjunctive Path Query (CPQ) Cores

The core of a CPQ is a unique CPQ with the same evaluation result as the original CPQ and whose query graph is smallest. The query graph of a CPQ is the transformation of a CPQ into a graph, e.g. $a \circ b$ can be represented with a graph containing two nodes and two both an edge labelled $a$ and an edge labelled $b$ between the two nodes. As an extra definition we have an ie-subgraph, an ie-subgraph of query graph $G_q = (V_q, E_q, L, v_s, v_t)$ (the query graph of query $q$) is a query graph $G’_q = (V’_q, E’_q, L, v’_s, v’_t)$ such that there exists a query graph $G’ = (V’, E’, L, v_s, v_t)$ that is isomorphically equivalent to $G’_q$, where $V’ \subseteq V_q$ and $E’ \subseteq E_q$.

Languages aware indexing for CPQs is based on path equivalence under the notion of k-path-bisimulation. The key takeaway is that k-path-bisimilar paths have the same evaluation result for any CPQ, meaning they partition the space of all paths into blocks of paths that cannot be distinguished by any CPQ.

Query Homomorphism

For a proper definition of query cores we first need to introduce a slightly modified definition of homomorphism. For this we introduce query homomorphism, we need this because the normal definition of homomorphism says nothing about the source and target vertices of the query so this results in the loss of important information. The definition itself is fairly straightforward, a query homomorphism from $q$ to $q′$ is a function: $f : V (G_q) \rightarrow V (G_{q′})$ such that $(v, u, l) \in E(G_q)$ implies $(f(v), f(u), l) \in E(G_{q′})$ and $f(s) = s′$, $f(t) = t′$. In other words, it is a function that maps from vertex to vertex like a normal homomorphism, while placing an extra requirement on the source and target vertex of the query. In addition this definition is transitive. If two CPQs are query homomorphically equivalent then their evaluation result on any graph is equivalent.

Query Core

We define a query core as follows. Let $q, q′ \in CPQ$ be queries, $s, t \in V(G_q)$ be a source vertex and a target vertex, respectively, $s′, t′ \in V(G_{q′})$ be a source vertex and target vertex, respectively. A query core of $q$ is a query $q′$ such that:

  1. There exists a query homomorphism $f : V(G_q) \rightarrow V(G_{q′})$ but no query homomorphism $V(G_q) \rightarrow V(G_{q′′})$ for any proper ie-subgraph $G_{q′′}$ of $G_{q′}$.
  2. There exists a query homomorphism $f′ : V(G_{q′}) \rightarrow V(G_q)$. It follows that the evaluation result of of the query core is the same as the evaluation result of the original CPQ as they are query homomorphically equivalent.

Query Isomorphism

A query isomorphism is a bijective function $f^{iso} : V(G_q) \rightarrow V(G_{q′})$ such that $(v, u, l) \in E(G_q)$ if and only if $(f^{iso}(v), f^{iso}(u), l) \in E(G_{q′})$, and $f^{iso}(s) = s′$, $f^{iso}(t) = t′$. Recall that a bijection is an invertible function, meaning it is both surjective and injective. Subjective meaning every element in the target set has an element in the source set mapping to it and injective meaning that each element from the source set maps to a unique element in the target set.

This leads to some useful properties:

  1. The size of the vertex set of the source and target set of the bijective function is the same.
  2. Let $q_1$ and $q_2$ be query cores of $q$ and $f_1 : V(G_q) \rightarrow V(G_{q_1})$, $f′_1 : V(G_{q_1}) \rightarrow V(G_q)$, $f_2 : V(G_q) \rightarrow V(G_{q_2})$, $f′_2 : V(G_{q_2}) \rightarrow V(G_q)$ be query homomorphisms. There exist query homomorphisms between $q_1$ and $q_2$.
  3. Given two query cores of the same query, query homomorphisms between these query cores are surjective.
  4. Given two query cores of the same query, query homomorphisms between these query cores are injective.

These properties together lead to the observation that the query cores of some query are unique up to isomorphism, i.e. all query cores for a given query are query isomorphic.


Note: so basically, with the exception of a single proof (Proposition 4) the method for obtaining the graph core of a CPQ is completely done.