Given a digraph, we may wish to find the shortest distance between two nodes, or between a node and all other nodes, etc. Each edge has a weight that corresponds to a distance.
We’ll allow parallel edges and self loops.
Data Structure
First, note that a shortest paths tree exists, where the root is the start node.
Let’s define two lists: One (distTo) will give the distance to a given node. The other (edgeTo) will give the last node before reaching the given node. So if we wish to find the path from node 0 to node \(n\), we trace back from \(n\).
(You could, of course, use two dictionaries with the nodes being the keys).
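For instance, tracing a path back through edgeTo might look like this (a minimal sketch using the dictionary variant; `path_to` and the tiny tree below are my own names, not from the original):

```python
# Sketch: reconstruct the path from source to target by walking edgeTo
# backwards. edgeTo[v] is the node just before v on the shortest path
# (None at the source).
def path_to(edge_to, source, target):
    path = [target]
    while path[-1] != source:
        path.append(edge_to[path[-1]])
    return list(reversed(path))

# Hypothetical shortest-paths tree: 0 -> 2 -> 5
edge_to = {0: None, 2: 0, 5: 2}
print(path_to(edge_to, 0, 5))  # [0, 2, 5]
```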
Properties
Let’s say we’re starting from source node \(s\). Relaxing an edge \(e=v\to w\) means setting distTo[w] to distTo[v] + e.weight (and pointing edgeTo[w] at \(v\)) if that decreases distTo[w], i.e. if the path through \(v\) is shorter than the best one found so far.
distTo will have the shortest distances from \(s\) to the other nodes if and only if the following hold:
- distTo[s]=0.
- For each node \(v\), distTo[v] is the length of some path from \(s\) to \(v\).
- For each edge \(e=v\to w\), \(d(w)\le d(v)+e.weight\), where \(d(v)\) denotes distTo[v].
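These conditions can be checked directly. A sketch (the edge-list representation and the name `is_optimal` are my own choices; condition 2, that each finite entry is the length of some real path, is assumed to hold by construction):

```python
# Sketch: verify the optimality conditions for a candidate distTo.
# edges is a list of (v, w, weight) triples for directed edges v -> w.
def is_optimal(dist_to, edges, s):
    if dist_to[s] != 0:
        return False
    # Condition 3: no edge can still be relaxed.
    return all(dist_to[w] <= dist_to[v] + weight for v, w, weight in edges)

edges = [(0, 1, 5.0), (0, 2, 1.0), (2, 1, 2.0)]
print(is_optimal({0: 0.0, 1: 3.0, 2: 1.0}, edges, 0))  # True
print(is_optimal({0: 0.0, 1: 5.0, 2: 1.0}, edges, 0))  # False: 0->2->1 is shorter
```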
Generic Algorithm
Here I’ll describe the template for these algorithms:
Set distTo[s]=0 and distTo[v]= \(\infty\) for every other node \(v\). Keep relaxing edges until the optimality conditions above hold.
Why does this work? Each successful relaxation strictly decreases some distance, and (absent negative cycles) only finitely many relaxations are possible, so the process terminates with the optimality conditions satisfied.
The algorithms differ in the scheme to pick which edges to relax.
Dijkstra’s Algorithm
Dijkstra’s algorithm works when the weights are all non-negative.
Initialize \(d(s)=0\) and all other ones to \(\infty\).
Choose the unvisited node with the lowest \(d\), and relax all edges from it, updating \(d\) for each neighbor. Mark this node as visited, and repeat until all nodes are visited. Note that we can safely mark a node as visited: its distance cannot get any smaller, since every remaining node is already farther away (by design) and, with non-negative weights, no future path through them can be shorter.
The complexity is \(O(M\log N)\) with a binary heap, where \(M\) is the number of edges and \(N\) the number of nodes.
Note that this algorithm is practically Prim’s algorithm. In that one, at each step, we found the node closest to the tree. Here we find the node closest to \(s\).
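The steps above can be sketched with a binary heap, in the "lazy" style that leaves stale entries in the priority queue (the adjacency-dict representation and names are assumptions, not from the original):

```python
import heapq

# Sketch of lazy Dijkstra. graph maps node -> list of (neighbor, weight);
# all weights are assumed non-negative.
def dijkstra(graph, s):
    dist_to = {v: float('inf') for v in graph}
    dist_to[s] = 0.0
    pq = [(0.0, s)]
    visited = set()
    while pq:
        d, v = heapq.heappop(pq)
        if v in visited:
            continue  # stale entry; this node was already settled
        visited.add(v)
        for w, weight in graph[v]:
            if d + weight < dist_to[w]:  # relax edge v -> w
                dist_to[w] = d + weight
                heapq.heappush(pq, (dist_to[w], w))
    return dist_to

graph = {0: [(1, 4.0), (2, 1.0)], 1: [(3, 1.0)], 2: [(1, 2.0), (3, 5.0)], 3: []}
print(dijkstra(graph, 0))  # {0: 0.0, 1: 3.0, 2: 1.0, 3: 4.0}
```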
Edge Weighted DAGs
This algorithm works for DAGs, and even with negative weights.
Basically, consider nodes in topological order. For each node, relax the edges from it.
The distance to an already considered node can never decrease as there will be no edges to it (because we are traversing in topological order).
The complexity is \(O(M+N)\).
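A sketch of the DAG algorithm, computing the topological order with Kahn's algorithm (the representation and names are my own; note the negative weight is handled fine):

```python
from collections import deque

# Sketch: shortest paths in an edge-weighted DAG.
# graph maps node -> list of (neighbor, weight).
def dag_shortest_paths(graph, s):
    indeg = {v: 0 for v in graph}
    for v in graph:
        for w, _ in graph[v]:
            indeg[w] += 1
    order = deque(v for v in graph if indeg[v] == 0)
    dist_to = {v: float('inf') for v in graph}
    dist_to[s] = 0.0
    while order:
        v = order.popleft()  # next node in topological order
        for w, weight in graph[v]:
            if dist_to[v] + weight < dist_to[w]:  # relax v -> w
                dist_to[w] = dist_to[v] + weight
            indeg[w] -= 1
            if indeg[w] == 0:
                order.append(w)
    return dist_to

dag = {0: [(1, 5.0), (2, 3.0)], 1: [(3, -2.0)], 2: [(1, 1.0)], 3: []}
print(dag_shortest_paths(dag, 0))  # {0: 0.0, 1: 4.0, 2: 3.0, 3: 2.0}
```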
Applications
Seam Carving
This algorithm is used in seam carving (liquid rescale in GIMP, I presume?). Let a pixel be a node, with edges to the 3 pixels below it (below-left, directly below, below-right). The weights are given by an energy function. The shortest path from the top of the image to the bottom can easily be found with this algorithm, since the graph is a DAG. Each row of pixels will have exactly one pixel in this path, so we can just delete these pixels to rescale (horizontally).
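A sketch of the row-by-row relaxation: the rows are already in topological order, so no explicit sort is needed (`min_seam_cost` and the toy energy grid are hypothetical):

```python
# Sketch: cost of the minimum-energy vertical seam. energy is a 2D list
# of per-pixel energies; each pixel's predecessors are the up-to-3 pixels
# above it, so processing row by row relaxes edges in topological order.
def min_seam_cost(energy):
    rows, cols = len(energy), len(energy[0])
    dist = list(energy[0])  # best seam cost ending at each top-row pixel
    for r in range(1, rows):
        dist = [energy[r][c] + min(dist[max(c - 1, 0):min(c + 2, cols)])
                for c in range(cols)]
    return min(dist)

energy = [[1, 4, 3],
          [5, 2, 9],
          [6, 2, 4]]
print(min_seam_cost(energy))  # 5: the seam with energies 1 -> 2 -> 2
```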
Parallel Job Scheduling
You have a set of jobs, each of which takes a known amount of time. Some jobs depend on other jobs. We want to find the schedule that completes all jobs in the minimal amount of time.
To solve this:
- Define imaginary source and sink nodes. Connect these to all jobs (beginnings for source and ends for sink) with 0 weighted edges.
- For each job, make two nodes: Start and stop. Connect them with an edge whose weight is the duration.
- From each job’s end node, connect 0-weighted edges to the start nodes of the jobs that depend on it.
The solution is called the critical path method: Schedule each job at the length of the longest path from the source to its start node.
Question: Why the longest path?
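(Hint: a job can start only after all of its prerequisites finish, so its earliest start time is the maximum, i.e. the longest path, over its predecessors.) The longest path is found by relaxing with max instead of min; a sketch on a made-up two-job instance (all names are hypothetical):

```python
# Sketch: critical path via longest paths in the job DAG.
# graph maps node -> list of (successor, weight); order is a topological
# order of the nodes.
def critical_path_length(graph, order, source):
    longest = {v: float('-inf') for v in graph}
    longest[source] = 0.0
    for v in order:
        for w, weight in graph[v]:
            longest[w] = max(longest[w], longest[v] + weight)
    return longest

# Source S, job A (3h) with start/end nodes A0/A1, job B (5h) with B0/B1,
# B depends on A, sink T. Dependency and source/sink edges weigh 0.
graph = {
    'S': [('A0', 0.0), ('B0', 0.0)],
    'A0': [('A1', 3.0)],
    'A1': [('T', 0.0), ('B0', 0.0)],  # B must wait for A to finish
    'B0': [('B1', 5.0)],
    'B1': [('T', 0.0)],
    'T': [],
}
order = ['S', 'A0', 'A1', 'B0', 'B1', 'T']
print(critical_path_length(graph, order, 'S')['T'])  # 8.0: A then B
```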