Dec 19, 2016
["Algorithm"]

# Original Post of Floyd-Warshall Algorithm
While reading the article in [GeeksForGeeks about the Floyd-Warshall Algorithm](http://www.geeksforgeeks.org/dynamic-programming-set-16-floyd-warshall-algorithm/), I got little confused about the correctness of this algorithm, wondering why it works. After spending some time understanding it and search the proof about this algorithm, now I understand why the `k` loop should be the out-est loop in the algorithm. Here is some explanation for this algorithm where [GeeksForGeeks about the Floyd-Warshall Algorithm](http://www.geeksforgeeks.org/dynamic-programming-set-16-floyd-warshall-algorithm/) didn't explain well.
As it is stated in this [article](http://www.geeksforgeeks.org/dynamic-programming-set-16-floyd-warshall-algorithm/):
>Floyd Warshall Algorithm
>We initialize the solution matrix same as the input graph matrix as a first step. Then we update the solution matrix by considering all vertices as an intermediate vertex. The idea is to one by one pick all vertices and update all shortest paths which include the picked vertex as an intermediate vertex in the shortest path. When we pick vertex number k as an intermediate vertex, we already have considered vertices {0, 1, 2, .. k-1} as intermediate vertices. For every pair (i, j) of source and destination vertices respectively, there are two possible cases.
>1) k is not an intermediate vertex in shortest path from i to j. We keep the value of dist[i][j] as it is.
>2) k is an intermediate vertex in shortest path from i to j. We update the value of dist[i][j] as dist[i][k] + dist[k][j].
>The following figure is taken from the Cormen book. It shows the above optimal substructure property in the all-pairs shortest path problem.
>![Floyd Warshall Algorithm](http://d1hyf4ir1gqw6c.cloudfront.net//wp-content/uploads/Floyd-Warshell1.jpg)
The Python implementation (with a little change) for this code in GeeksForGeeks is as below:
```python
# Python Program for Floyd Warshall Algorithm
# Define infinity as the large enough value. This value will be
# used for vertices not connected to each other
INF = float("inf")
# Solves all pair shortest path via Floyd Warshall Algrorithm
def floydWarshall(graph):
""" dist[][] will be the output matrix that will finally
have the shortest distances between every pair of vertices """
""" initializing the solution matrix same as input graph matrix
OR we can say that the initial values of shortest distances
are based on shortest paths considerting no
intermedidate vertices """
V = len(graph[0])
dist = [[elem for elem in line] for line in graph]
""" Add all vertices one by one to the set of intermediate
vertices.
---> Before start of a iteration, we have shortest distances
between all pairs of vertices such that the shortest
distances consider only the vertices in set
{0, 1, 2, .. k-1} as intermediate vertices.
----> After the end of a iteration, vertex no. k is
added to the set of intermediate vertices and the
set becomes {0, 1, 2, .. k}
"""
for k in range(V):
# pick all vertices as source one by one
for i in range(V):
# Pick all vertices as destination for the
# above picked source
for j in range(V):
# If vertex k is on the shortest path from
# i to j, then update the value of dist[i][j]
dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])
return dist
if __name__ == '__main__':
graph = [ [0, INF, INF, 1, 101],
[INF, 0, 200, INF, INF],
[INF, 200, 0, INF, 103],
[INF, INF, 100, 0, INF],
[INF, INF, 102, INF, 0]
]
floydWarshall(graph)
```
What makes me wonder is that in the code above, if we change the position for the `k` loop, making it as the inner loop instead of the outer loop, the algorithm would give the wrong answer. The sequence of this algorithm really matters! but why?
# My Explanation
Actually, in the post is the space-optimized algorithm for Floyd-Warshall. Since this algorithm take O(V^3) time, but the space complexity is only O(V^2).
The original dynamic programming state would be `dist(i, j, k)` representing the MINIMUM distance from node `i` to node `j` which only allow the first `k` nodes in the path. That is, we only consider the first k nodes when calculating the minimum distance from node `i` to `j`.
Based on the definition of the Dynamic Programming state `dist(i, j, k)` we have the recurrence formula as below
`dist(i, j, k) = min( dist(i, j, k-1), dist(i, k, k-1) + dist(k, j, k-1))`
That is, the MINIMUM distance from node `i` to node `j` which only allow the first `k` nodes in the path. is whether:
1. the MINIMUM distance without the node `k`
2. the MINIMUM distance with the node `k`
Since the current state `dist(i, j, k)` is only dependent on the states with term `k-1`, then we could optimize the space complexity by kicking out the dimension in the `dist` matrix, which would lead us to the algorithm in the article of [GeeksForGeeks](http://www.geeksforgeeks.org/dynamic-programming-set-16-floyd-warshall-algorithm/). And that is also the reason why we have to put the `k` loop as the out-est loop in the algorithm.
This space optimization can also be used in the knapsack problem when we can only take each item once.