All-Pairs Shortest Paths Algorithm for High-dimensional Sparse Graphs

This paper considers the all-pairs shortest path problem on weighted undirected sparse graphs. We propose a "disassembly and assembly of a graph" algorithm, which uses the solution of the problem on a small-dimensional graph to obtain the solution for the given graph. The proposed algorithm is compared with one of the fastest classic algorithms on data from an open public source.


Introduction
The all-pairs shortest path problem (APSP) is one of the most popular tasks in graph theory, because the shortest paths between all pairs of vertices are used in solving many discrete optimization problems (the TSP, the transportation problem, etc.). Moreover, the task itself is of great research interest.
Recently this problem has gained new interest due to the growing number of highly detailed graphs that are generated automatically and describe real-world structures. Such graphs have about $10^6$ or more vertices, and this number will inevitably increase. Accelerating the solution of the APSP for high-dimensional graphs is therefore becoming highly important.
Because of its popularity, there are many algorithms for the APSP, but no single method solves it equally fast for all kinds of input data. That is why APSP algorithms can be classified according to the type of graph as follows: directed [3], complete [5], weighted [4], unweighted [1] and sparse [7].
Here we present an algorithm for solving the APSP for weighted, undirected, high-dimensional sparse graphs with non-negative weights.
This paper is organized as follows. In section 1 we introduce the notation and the problem definition, in section 2 we describe the algorithm, and in section 3 we show the results in comparison with one of the most renowned APSP algorithms.

Notation and problem definition

1.1 Terms and definitions
Here we consider a connected, undirected and sparse graph $G = (V, E, w)$, where each edge $e(v_i, v_j)$ has a non-negative weight $w(i, j)$. The given graph $G$ is considered to be simple (it has no loops or multiple edges).
Denote by $|V| = n$ the order of the graph, i.e. the cardinality of its vertex set. Denote by $|E| = m$ the size of the graph, i.e. the cardinality of its edge set.
Denote by $w(i, j)$ the weight of the edge between vertices $v_i$ and $v_j$ ($w(i, j) = \infty$ for non-adjacent vertices).
A path is an alternating sequence of vertices and edges $v_0, e_1, v_1, \ldots, v_{k-1}, e_k, v_k$, beginning and ending with a vertex. In that sequence, each vertex is incident to both the edge that precedes it and the edge that follows it. The length of a path is the sum of the weights of its edges. The distance $m(i, j)$ between $v_i$ and $v_j$ is the length of the shortest path $p^s_{ij} = p^s(v_i, v_j)$ between these vertices. A distance matrix is a matrix in which the element at the intersection of the $i$th row and the $j$th column contains the length of the shortest path between $v_i$ and $v_j$. A graph is said to be connected if every pair of vertices in the graph is connected by some path, i.e. $m(i, j) < \infty$, $\forall i, j$.
Between any pair of vertices there can be more than one shortest path. This is not essential for this paper, so a reference to the shortest path can mean any of them.
A matrix $P$ is called a precedence matrix if each element $p_{ij}$ of the matrix corresponds to the vertex that precedes vertex $v_j$ on the path from $v_i$ to $v_j$. The elements of $P$ are therefore determined by the shortest paths themselves. Using $P$, the shortest path $p^s_{ij}$ from $v_i$ to $v_j$ in a connected graph can be obtained recursively: it is the shortest path from $v_i$ to the vertex that precedes $v_j$, followed by the edge into $v_j$.

Now we give the following supplementary definitions. Let us call a graph sequence $S = \{G_1, G_2, \ldots, G_r\}$ a shrinking sequence if every next graph $G_{p+1}$ of the sequence is obtained from the previous graph $G_p$ by removing $k$ vertices and the edges incident to them, adding new edges, and recalculating the weights of the edges adjacent to the deleted ones.
Consider the graph obtained from $G_p$ by removing the vertices $v^p_1, v^p_2, \ldots, v^p_k$ and the edges incident to them. In this graph the weights of the edges that are not recalculated are unchanged, $w_{p+1}(i, j) = w_p(i, j)$. Denote by $m_p(i, j)$ the distance between $v^p_i$ and $v^p_j$ in $G_p$, and by $M_p = (m^p_{ij})$ the distance matrix of $G_p$. Also denote by $v^p_{i_l}$ the $l$th vertex adjacent to $v^p_i$, and by $A^p_i$ the set of all vertices adjacent to $v^p_i$ in graph $G_p$.
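The recursive use of a precedence matrix can be illustrated with a minimal sketch. It assumes that `P[i][j]` stores the index of the vertex preceding `j` on a shortest path from `i`, and that `P[i][i] == i`; the function name and the example matrix are hypothetical and not taken from the paper.

```python
def reconstruct_path(P, i, j):
    """Return the vertex sequence of a shortest path from i to j.

    Assumption: P[i][j] is the vertex that precedes j on a shortest
    path from i to j, and P[i][i] == i.
    """
    path = [j]
    while j != i:
        j = P[i][j]        # step back to the predecessor
        path.append(j)
    path.reverse()         # collected from j back to i, so flip it
    return path


# Path graph 0 - 1 - 2: the predecessor of 2 on the way from 0 is 1.
P = [
    [0, 0, 1],
    [1, 1, 1],
    [1, 2, 2],
]
print(reconstruct_path(P, 0, 2))  # [0, 1, 2]
```

The loop is the iterative form of the recursion: each step replaces the target by its predecessor until the source is reached.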

Problem definition
We are given a connected, undirected, simple, weighted and sparse graph $G = (V, E, w)$, where each edge has a non-negative weight, $w : E \to [0, \infty)$. The task is to find the shortest paths between every pair of vertices of the graph, i.e. to find the distance matrix $M$ and the precedence matrix $P$ of the graph.
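For reference, the required output can be produced by a straightforward baseline that runs Dijkstra's algorithm from every vertex. The sketch below is not the proposed method; the adjacency-dictionary representation and the function name are assumptions made for illustration.

```python
import heapq
import math


def apsp_dijkstra(adj):
    """Baseline APSP: one Dijkstra run per source vertex.

    adj: dict vertex -> dict neighbour -> non-negative weight (undirected).
    Returns (M, P): distances and predecessors as nested dicts.
    """
    vertices = sorted(adj)
    M = {s: {v: math.inf for v in vertices} for s in vertices}
    P = {s: {v: None for v in vertices} for s in vertices}
    for s in vertices:
        M[s][s] = 0
        P[s][s] = s
        heap = [(0, s)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > M[s][u]:
                continue  # stale heap entry, a shorter path was found already
            for v, w in adj[u].items():
                nd = d + w
                if nd < M[s][v]:
                    M[s][v] = nd
                    P[s][v] = u  # u precedes v on the path from s
                    heapq.heappush(heap, (nd, v))
    return M, P


adj = {
    0: {1: 2, 2: 5},
    1: {0: 2, 2: 1},
    2: {0: 5, 1: 1},
}
M, P = apsp_dijkstra(adj)
print(M[0][2], P[0][2])  # 3 1
```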

Algorithm of the solution

2.1 Main idea
The main idea of the introduced algorithm is to reduce the problem on a large graph to the problem on a smaller graph. The algorithm can be partitioned into 3 stages.

1. Compression. A large initial graph is replaced by a small graph.
2. Microsolution. The APSP for the small graph is solved by using any known method.
3. Restoring. The APSP solution for the small graph is projected onto the initial graph.
While using this method we must satisfy the following conditions: a) validity: the compression must keep the information about the shortest paths of the initial graph; b) efficiency: the introduced method must be quicker than all others.
Algorithms that use similar ideas are considered in [6]. Here we introduce an algorithm of graph disassembly/assembly for large sparse graphs. At the disassembly stage we successively remove vertices, and then solve the APSP for the resulting small graph. At the assembly stage the initial graph is restored, with the calculation of distances and paths.
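The three stages can be sketched end to end for distances only. This is a simplified illustration under stated assumptions, not the authors' implementation: vertices are removed in order of lowest degree up to a hypothetical `d_max` until `n_min` vertices remain, the edge-growth limit and path bookkeeping are omitted, and Floyd-Warshall is used for the small graph. All function names are hypothetical.

```python
import math
from itertools import combinations


def remove_vertex(adj, v):
    """Remove v, keeping distances between its neighbours; return its old edges."""
    nbrs = adj.pop(v)
    for u in nbrs:
        del adj[u][v]
    for a, b in combinations(nbrs, 2):
        detour = nbrs[a] + nbrs[b]           # two-edge route a - v - b
        if detour < adj[a].get(b, math.inf):
            adj[a][b] = adj[b][a] = detour   # add or shorten edge a - b
    return nbrs


def floyd_warshall(adj):
    """Distance matrix of a (small) graph as nested dicts."""
    vs = list(adj)
    M = {a: {b: math.inf for b in vs} for a in vs}
    for a in vs:
        M[a][a] = 0
        for b, w in adj[a].items():
            M[a][b] = w
    for k in vs:
        for a in vs:
            for b in vs:
                if M[a][k] + M[k][b] < M[a][b]:
                    M[a][b] = M[a][k] + M[k][b]
    return M


def restore_vertex(M, v, nbrs):
    """Add v back: its distances go through one of its former neighbours."""
    others = list(M)
    M[v] = {v: 0}
    for t in others:
        d = min((w + M[u][t] for u, w in nbrs.items()), default=math.inf)
        M[v][t] = M[t][v] = d


def apsp_disassembly(adj, d_max=2, n_min=2):
    adj = {v: dict(nb) for v, nb in adj.items()}  # work on a copy
    removed = []
    while len(adj) > n_min:                       # disassembly
        v = min(adj, key=lambda x: len(adj[x]))   # lowest degree first
        if len(adj[v]) > d_max:
            break
        removed.append((v, remove_vertex(adj, v)))
    M = floyd_warshall(adj)                       # microsolution
    for v, nbrs in reversed(removed):             # assembly, reverse order
        restore_vertex(M, v, nbrs)
    return M


adj = {
    0: {1: 1, 2: 4},
    1: {0: 1, 2: 2, 3: 7},
    2: {0: 4, 1: 2, 3: 3},
    3: {1: 7, 2: 3, 4: 1},
    4: {3: 1},
}
M = apsp_disassembly(adj)
print(M[0][4])  # 7
```

Restoring in reverse order guarantees that every former neighbour of a restored vertex is already present in the matrix, so a single minimum over the neighbours suffices for each target vertex.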

Disassembly
The disassembly stage consists of the successive approximation of the initial graph $G_0 = (V_0, E_0, w_0)$ by the graphs of a shrinking sequence $S = \{G_1, G_2, \ldots, G_r\}$. Here we consider the particular case in which every next graph $G_{p+1}$ of the sequence $S$ is obtained by removing only one vertex from $G_p$.

Suppose that vertex $v^p_i$ is to be removed, and let the degree of $v^p_i$ be equal to $k$. If a shortest path contains $v^p_i$ (other than a shortest path that starts or ends at $v^p_i$), then this path contains a subpath $v^p_{i_j}, e_p(v^p_{i_j}, v^p_i), v^p_i, e_p(v^p_i, v^p_{i_l}), v^p_{i_l}$ with $j, l \in \{1, 2, \ldots, k\}$. Therefore, to remove vertex $v^p_i$ properly, we need to preserve the shortest paths only between the vertices adjacent to $v^p_i$.

Denote by
$$w^{mv(1,2,\ldots,h)}_p(i_j, i_l) = \min_{g=1,2,\ldots,h} \left( w_p(i_j, g) + w_p(g, i_l) \right)$$
the minimum sum of the weights of two edges which connect vertices $v^p_{i_j}$ and $v^p_{i_l}$ and are incident to a common vertex that belongs to the set $v^p_1, v^p_2, \ldots, v^p_h$ of $G_p$. To preserve distances it is sufficient to have, for any pair $j, l \in \{1, 2, \ldots, k\}$, $j \ne l$,
$$w_{p+1}(i_j, i_l) = \min \left( w_p(i_j, i_l),\; w_p(i_j, i) + w_p(i, i_l) \right). \qquad (1)$$

At the beginning of the algorithm every element of the matrix $P'$ is equal to infinity, $p'_{ij} = \infty$, $\forall i, j$. To preserve the information about the shortest paths, each element of $P'$ whose pair of vertices receives the two-edge weight through $v^p_i$ in (1) is updated to record the removed vertex (2).

Note: if vertex $v^p_i$, which is to be removed, is adjacent to only one vertex of $G_p$, then no shortest path passes through $v^p_i$, so the vertex and its incident edge are simply removed without shortest path preservation.
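As a small worked instance, assume the preservation rule takes the form $w_{p+1}(a, b) = \min(w_p(a, b),\, w_p(a, v) + w_p(v, b))$ for every pair of neighbours $a, b$ of the removed vertex $v$. The vertex names and weights below are illustrative only and do not come from the paper.

```latex
\begin{align*}
&w_p(v,a)=1,\quad w_p(v,b)=2,\quad w_p(v,c)=5,\quad
 w_p(a,b)=4,\quad w_p(b,c)=3,\quad w_p(a,c)=\infty,\\
&w_{p+1}(a,b)=\min(4,\,1+2)=3,\\
&w_{p+1}(b,c)=\min(3,\,2+5)=3,\\
&w_{p+1}(a,c)=\min(\infty,\,1+5)=6.
\end{align*}
```

Here the edge between $a$ and $b$ is shortened, the edge between $b$ and $c$ is unchanged, and only the pair $a, c$ gains a new edge, so three edges are removed with $v$ and one is added.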
We use three parameters for the disassembly stage: $d_{max}$ is the maximum degree of the vertices to be removed; $n_{min}$ is the order of graph $G_r$, which is the last (smallest) graph of the shrinking sequence; $I_{max}$ is the limit on the increase in the number of edges after the removal of one vertex. The assignment of values to $d_{max}$, $n_{min}$ and $I_{max}$ is a problem in itself, which will be discussed elsewhere; the results shown in section 3 were obtained with a fixed assignment of these values.

Let us remove vertex $v^p_i$ with all of its $k$ incident edges while preserving the shortest paths. Denote by $I(v^p_i)$ the change in the number of graph edges when the vertex is removed. The removal of $v^p_i$ itself decreases the number of edges by $k$, which contributes $-k$. By the shortest path preservation rule (1), every pair of vertices adjacent to $v^p_i$ that is not yet joined by an edge receives a new edge, so we have
$$I(v^p_i) = -k + \left| \{ (j, l) : 1 \le j < l \le k,\; w_p(i_j, i_l) = \infty \} \right|. \qquad (3)$$
This is the change in the size of graph $G_{p+1}$ relative to $G_p$ after the removal of vertex $v^p_i$. If $I(v^p_i) > 0$, the graph size increases; otherwise it decreases or remains the same. We require the increase given by (3) to be bounded above by $I_{max}$, so vertex $v^p_i$ can be removed only if $I(v^p_i) \le I_{max}$.

The vertices to be removed are selected in the following way. Vertices with $d(v^p_i) < 3$ never increase the graph size (for them $I(v^p_i) \le -1$ by (3)) and can always be removed, so vertices are removed in ascending order of their degrees from 1 to $d_{max}$. This speeds up the algorithm because fewer vertices with degrees close to $d_{max}$ have to be processed. After $v^p_i$ is removed, the degrees of the adjacent vertices can change; hence the vertices adjacent to $v^p_i$ are processed recursively. The graph disassembly algorithm and an auxiliary algorithm of vertices inspection and removal are given in Fig. 1 and Fig. 2.
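The removal test can be sketched as follows. The sketch assumes that the edge change is counted as the number of neighbour pairs not yet joined by an edge, minus the degree of the removed vertex; the function names and the adjacency-dictionary representation are hypothetical.

```python
def edge_change(adj, v):
    """Change in edge count if v is removed with shortest paths preserved.

    Assumption: every pair of neighbours of v that is not yet adjacent
    gains one new edge, and the k edges incident to v disappear.
    """
    nbrs = list(adj[v])
    k = len(nbrs)
    new_edges = sum(
        1
        for idx, a in enumerate(nbrs)
        for b in nbrs[idx + 1:]
        if b not in adj[a]
    )
    return new_edges - k


def removable(adj, v, d_max, i_max):
    """A vertex qualifies if its degree and its edge change are within limits."""
    return len(adj[v]) <= d_max and edge_change(adj, v) <= i_max


# Star with centre 0 and three mutually non-adjacent leaves.
star = {0: {1: 1, 2: 1, 3: 1}, 1: {0: 1}, 2: {0: 1}, 3: {0: 1}}
print(edge_change(star, 0))  # 0: three edges removed, three added
print(edge_change(star, 1))  # -1: a leaf only loses its edge
```

For a vertex of degree $k$ the count is at most $k(k-1)/2 - k$, which is why low-degree vertices are the cheapest candidates.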

Vertices inspection and removal
Input: vertex $v^p_i$, number of vertices $n_c$, $I_{max}$, $d_{max}$, $n_{min}$, $p$, $P'$.
Recalculate the weights of the edges between the vertices of $A^p_i$ by (1). Update the elements of the matrix $P'$ by (2). Set $n_c = n_c - 1$, $t = p$, $p = p + 1$.
If $n_c = n_{min}$, end of algorithm. Otherwise, while $n_c > n_{min}$, for each vertex $v^p_{i_l}$ with $d(v^p_{i_l}) < d(v^t_{i_l})$ call Vertices inspection and removal ($v^p_{i_l}$, $n_c$, $I_{max}$, $d_{max}$, $n_{min}$, $p$, $P'$).
Fig. 1: Auxiliary algorithm of vertices inspection and removal.

Algorithm of the graph disassembly
Else end of algorithm.

Step 2. Vertices inspection and removal
Call Vertices inspection and removal ($v^p_i$, $n_c$, $I_{max}$, $d_{max}$, $n_{min}$, $p$, $P'$), go to step 1.
Output: graph $G_r = G_p$.
Fig. 2: Algorithm of the graph disassembly.

Microsolution
Here the APSP for $G_r$ is solved. The result of this stage is the distance matrix $M_r$ of $G_r$. We set $M'_r = M_r$ and recalculate $P'$ from the elements $p^r_{ij}$ of the precedence matrix $P_r = (p^r_{ij})$, which corresponds to $G_r$. The calculated paths are the shortest ones because the disassembly preserves distances; in other words, the distance between any two vertices of $G_r$ is the same in $G_r$ as in the initial graph $G_0$. If $G_r$ has only one vertex, this stage is skipped and the assembly of the graph starts.

Assembly
Before this stage starts, the graph assembly sequence $S = \{G_0, G_1, \ldots, G_r\}$ is defined. Here $G_0$ is the initial graph and $G_r$ is the smallest graph. The shortest paths between all vertices of $G_r$ were found in the previous stage. At the assembly stage we restore the removed vertices in the reverse order of their removal. That is, we move from $G_r$ to $G_0$ through $G_{r-1}, G_{r-2}, \ldots, G_1$, recalculating the shortest paths for vertex $v^{r-p}_i$ in each step $p$.

Suppose vertex $v^{r-1}_i$ is to be restored, i.e. we move from $G_r$ to $G_{r-1}$. Vertex $v^{r-1}_i$ is connected with the vertices $\{v^{r-1}_{i_z}\}_{z=1}^{k}$ by $k$ edges. Matrix $M'_r = M_r$ of $G_r$ was found in the previous step; therefore, to find the matrix $M'_{r-1}$ of $G_{r-1}$, we only need to calculate the shortest paths from vertex $v^{r-1}_i$ to all other vertices of $G_{r-1}$. The other elements of $M'_{r-1}$ are set equal to the corresponding elements of $M'_r$, that is,
$$m'_{r-1}(j, l) = m'_r(j, l), \quad \forall j, l \ne i.$$
The distance from the restored vertex to any other vertex $v^{r-1}_l$ is the minimum over its adjacent vertices,
$$m'_{r-1}(i, l) = \min_{z=1,\ldots,k} \left( w_{r-1}(i, i_z) + m'_r(i_z, l) \right).$$
Denote by $x(l)$ the number $i_z$ at which this minimum is attained; then the respective elements of matrix $P'$ should be changed so that the shortest path from $v^{r-1}_i$ to $v^{r-1}_l$ passes through $v^{r-1}_{x(l)}$.

On average, the proposed algorithm solves the APSP 47 times faster than Dijkstra's algorithm. It is faster than Dijkstra's algorithm on every test graph; the minimum speed-up is 34 times. During the tests, vertex degrees increased to a maximum of 17. This means that the complexity of vertex removal increases only slightly during the disassembly.

Conclusion
The proposed algorithm noticeably accelerates the solution of the APSP for graphs of road networks, which is confirmed by the tests. Topics for further research include the selection of the algorithm parameters based on a fast analysis of graph properties, the modification of the disassembly and assembly order, and the scalability of the algorithm as the graph dimension grows. It would also be interesting to modify the algorithm to solve the problem faster, but within a given error.

Table 2: Test results