Introduction To Networks: 15.053 March 22, 2007
Introduction To Networks: 15.053 March 22, 2007
Introduction To Networks: 15.053 March 22, 2007
053
Introduction to Networks
Announcement: no recitations this week Comment on Excel
Network Models
z
Optimization models that exhibit a very special structure. For special cases, this structure to dramatically reduce computational complexity (running time). First widespread application of LP to problems of industrial logistics Addresses huge number of diverse applications Todays lecture (first of 3): introductory material, Eulerian tours, the Shortest Path Problem
z z
Graph G = (V,E) Vertex set V = {1,2,3,4} Edge set: Arc Set{(1,2), (1,3), (3,2), (3,4), (2,4)} E={1-2,1-3,3-2,3-4,2-4} Network G = (N, A) Node set N = {1, 2, 3, 4}
1
Also Seen
Personally, I find introducing all of the network notation and terminology to be boring. At the same time, it is critical to any further understanding in this area. I use the notation in the text Network Flows by Ahuja, Magnanti, and Orlin. Tom Magnanti is currently Dean of Engineering at MIT. Ravi Ahuja and I have worked together more than 20 years, and have co a - uthored around 50 papers together as well as this book.
3 d An Undirected Graph
3 d A Directed Graph
Networks are universal, and are applied to a wide range of situations. Almost any system can be described in large part by a network that describes relationships between parts of a system. This includes communication systems, electrical systems, manufacturing systems, social networks, and much more.
Physical Networks Road Networks Railway Networks Airline traffic Networks Electrical networks, e.g., the power grid Abstract networks
organizational charts
precedence relationships in projects Others?
Overview:
z z z
Networks and graphs are powerful modeling tools. Most OR models have networks or graphs as a major aspect Next five lectures: we will develop models that are efficiently solvable. Help form a toolkit for those problems that are harder to solve
Next: representations of networks
A great insight from computer scientists: how data is represented is important to how it is used
3 d A Directed Graph
1 2 3 4
0 0 0 0
1 0 1 0 1 0 0 0 0 1 1 0
Have a row for each node Have a column for each node Put a 1 in row i- column j if (i, j) is an arc What would happen if (4, 2) became (2, 4)?
8
A really nice description of a network for class work is a pictorial representation. But algorithms rely on other descriptions of networks. A very simple representation of a network is an adjacency matrix. If a matrix has n nodes, the adjacency matrix is an n x n matrix. There is a 1 in row i and column j if (i, j) is an arc.
4 degree
3 d An Undirected Graph
1 2 3 4
0 1 0 1
1 0 1 1
0 1 0 1
1 1 1 0
2 3 2 3
Have a row for each node Have a column for each node Put a 1 in row i- column j if (i, j) is an arc The degree of a node is the number of incident arcs
9
The only difference between the adjacency matrix for directed and undirected networks is that in an undirected network, arc (i, j) is the same as arc (j, i) and so each arc corresponds to two 1s of the adjacency matrix, one in row i and column j, and one in row j and column i. The degree of a node is the number of incident arcs and is a useful concept. The degree of node j is equal to the row sum of row j of the adjacency matrix.
1 a 4
b c
3 d A Directed Graph
Create a list arcs for each node There are lots of very similar variants of this type of representation
10
The difficulty with adjacency matrices is that they are inefficient for large networks. For example, suppose that a network has a million nodes. (This actually happens in practice.) Then the adjacency matrix has one trillion entries. Just writing the adjacency matrix takes too much time for any algorithm. Instead, one can write the list of nodes (a million of them in our example), and follow it with a list of arcs (perhaps 10, million arcs). This is manageable, and incredibly more efficient as a representation than the adjacency matrix. To give a sense for how inefficient the adjacency matrix is in this example, note that only 20 million entries out of 1 trillion are non z because there are two entries for each - ero, arc. This means that only 1 out of every 50,000 entries is non- z ero. But the adjacency matrix would include all entries. There are different ways of writing the list of arcs that may be more efficient for different applications. But this is too subtle a point to pursue. We just conceptualize this description as a list of nodes and arcs.
On network representations
z
Each representation has its advantages Major purpose of a representation efficiency in algorithms ease of use
11
c a 2
5
b
d 3 e 4
c a 2 2
5
b
d 3 e 4
a 1 d
b e c 4 a 1 d 4 2 b e c 3 3
If you think of nodes as being intersections of a city and think of arcs as roads, then one may also think about paths and cycles. In our definitions, we do not permit paths to repeat nodes. W also do not permit cycles to repeat nodes, except that the first and last node in the representation of a cycle is the same.
Walks
1 a 4 b c d 3 2 e 1 a 4 b c d 3 2 e
Walks are paths that can repeat nodes and arcs Example of a directed walk: 1-2-3-5-4-2-3-5 A walk is closed if its first and last nodes are the same. A closed walk is a cycle except that it can repeat nodes and arcs.
13
A walk is permitted to repeat nodes and arcs. A closed walk starts and ends at the same node.
More terminology
1 2 5 1 2 5
3 4
An undirected network is connected if every node can be reached from every other node by a path
3 4
A directed network is connected if its undirected version is connected. This directed graph is connected, even though there is no directed path between 2 and 5.
14
Again, if we view the network as a road network, we expect to be able to get from any node to any other node via a path in the network. If there is a path between all pairs of nodes, then the network is connected. Note that for the definition of connected, one is permitted to use directed arcs in the wrong direction, sort of like drivers in Boston go down one way streets the wrong way. Of course, not all networks are connected. If one restricts attention to road networks, there is no path from Boston, MA to London, England.
On connectivity
There are simple efficient procedures for determining if a graph is connected.
1 7 6 2 4 5 8 3 10 9
zHere
We will not describe these algorithms, but will do a more general algorithm later in this lecture
15
If a graph is not connected, then the graph can be described as the union of connected subgraphs, called the connected components. In the example above, the blue and the yellow subgraphs are the connected components. Note that there is no arc incident to both a yellow node and a blue node. Finding connected components is important. We will not describe the algorithm, but will describe a more general algorithm later in this lecture.
z z
16
Incidentally, I have a distant connection to Leonard Euler. If you go the math geniology webite starting from me and click on advisors, you will eventually reach the following mathematicians: Lagrange, Euler, Johann Bernoulli, Jacob Bernoulli, and Leibneitz. This means that I am an academic descendent of all of them, through a direct line of advisors. To start go to: http://www.genealogy.ams.org/html/id.phtml?id=68357 Choose Herbert Robbins as the advisor of Cyrus Derman.
17
B
5 6
D Is it possible to start in A, cross over each bridge exactly once, and end up back in A?
18
B
5 6
A
1
B
2 4 6
3
C
In retrospect, it does not seem like a major conceptual leap to associate a single node with each land mass. Nevertheless, it was very important. Because of this, Euler is credited with the founding of the field of graph theory.
A
1
B
2 4 6
3
C
Translation to graphs or networks: Is there a walk starting at A and ending at A and passing through each arc exactly once? Why isnt there such a walk? 21
1
B
2 4 9
D
5 6
Here is the walk. A, 1, B, 5, D, 6, B, 4, C, 8, A, 3, C, 7, D, 9, B, 2, A Note: the number of arcs incident to B is twice the number of times that B appears on the walk.
22
We observe that any closed walk that passes through each arc exactly once must enter and leave each node the same number of times. This means that every node has even degree.
Eulerian cycle: a closed walk that passes through each arc exactly once
z z z
Degree of a node = number of arcs incident to the node Necessary condition: each node has an even degree. Why necessary? The degree of a node j is twice the number of times j appears on the walk (except for the initial and final node of the walk.)
Theorem. A graph has an eulerian cycle if and only if the graph is connected and every node has even degree.
23
Another obvious condition for an Eulerian walk is that the graph is connected. Obviously, one cannot find a closed walk through each arc unless the graph is connected, assuming that we ignore nodes with no incident arcs. Nodes of degree 0 do not need to be visited at all in an Eulerian walk. Euler showed that these two necessary conditions are also sufficient. That is, if every node has even degree and if the graph is connected, then there is an Eulerian cycle.
Eulerian path: a walk that is not closed and passes through each arc exactly once
Theorem. A graph has an eulerian path if and only if exactly two nodes have odd degree and the graph is connected.
24
An Eulerian path is defined similarly. An Eulerian path is a non c - losed walk that passes through every arc exactly once.
Exercise: Does the graph below have an Eulerian path or cycle? If so, find it.
25
Eulerian cycles
z z
Eulerian cycles and extensions are used in practice Mail Carrier routes: visit each city block at least once minimize travel time other constraints in practice?
Trash pickup routes
visit each city block at least once minimize travel time other constraints in practice?
26
The Traveling Salesman Problem : find a tour that visit all cities and minimizes the total distance traveled.
27
Another important type of walk is one that passes through each node exactly once. We call a cycle that passes through every node a Hamiltonian Cycle after the mathematician William Rowan Hamilton. The Traveling Salesman Problem is to find a Hamiltonian Cycle of minimum length. It is a very well studied problem. This picture is taken from a site devoted to the Traveling Salesman Problem. http://www.tsp.gatech.edu/
Mental Break
Top 10 Reasons Why God Can't Get University Tenure. 10. 9. 8. 7. 6. Researchers can't replicate His results. He failed to get permission from the ethics board to use human subjects. After one of His experiments failed, He drowned the subjects. He had His son teach the class. He didnt show up for lectures, and He told students to Read the Book.
28
Mental Break
5. Although He listed only ten requirements, his exam was far too hard and everyone failed. 4. He expelled his first two students for eating in class. 3. His publications did not cite earlier authorities. 2. He referred to his favorite students as the Chosen People. 1. He hasnt published anything in 2000 years?
29
More Definitions
1 2 5
3 4
A network is connected if every node can be reached from every other node by a path
A spanning tree is a connected subset of a network including all nodes, but containing no cycles.
1 2 5
30
3 4
1 2 5
3 4
1 2 5
3 4
Spanning trees turn out to be surprisingly important. They are used widely as data structures in computer software. In addition. they are closely connected to a variety of network optimization problems.
More on Trees
z
An out tree is a spanning tree in which every node has exactly one incoming arc except for the root. Theorem. In an out tree, there is a directed path from the root to all other nodes. (All paths come out of the root). One can find the path by starting at the end and working backwards.
1 1 2 5 3 4 7 12 4 8 2 5 9 13 10 3 6 11
31
2 2 1 4 3 1
4 2 3 2
What is the shortest path from a source node (often denoted as s) to a sink node, (often denoted as t)? What is the shortest path from node 1 to node 6? Assumptions for this lecture: 1. There is a path from the source to all other nodes. 2. All arc lengths are non-negative
32
There are several different algorithms for the shortest path problem. Here we will focus on one due to Dijkstra. For convenience we will assume that the graph is connected. Also, the algorithm relies on the assumption that all arc lengths are non ngative. - e This assumption is widely true in practice. But there are applications of the shortest path problem for which this assumption does not hold.
and less obvious applications, as in the course readings (in documents under lecture notes)
z
33
34
I like this application because it is not at all obvious. If you ever use LaTex, it lays out paragraphs optimally using an algorithm of this type.
On the values of F( )
The better the look of the line, the lower the value of F. The values below are illustrative, but are artificial. TeX 1 TeX 1 TeX 1 optimally 2 optimally 2 decomposes 3 optimally 2 decomposes 3 paragraphs 4 F(1, 3) = 50
F(1, 4) = 25 F(1, 5) = 3
F(1, 6) = 1
The key point is that the values of F are selected carefully so that a beautiful line is given a very low number, and an ugly line is given a high number. The shortest path problem will select a paragraph so as to minimize the sum of the ugliness of the lines.
3 36 F(1, 5) = 3
12
18
22
.3
34
41
48
.5 0
53
Every path from 1 to 53 corresponds to a laying out of the paragraph. The length of the path is the sum of the F values and thus is the sum of the ugliness of the lines of the paragraph. The shortest path corresponds to the least ugly paragraph.
1
37
12
.5
19
.3
28
.3
39
.8
47
53
0
Exercise: find the shortest path from node 1 to all other nodes. Keep track of distances using labels, d(i) and each nodes immediate predecessor, pred(i). d(1)= 0, pred(1)=0; d(2) = 2, pred(2)=1 Find the other distances, in order of increasing distance from node 1. 38
Key observations
z z z
Suppose that d(i) is the length of some path from node 1 to node i. Suppose that there is an arc (i, j) of length cij. Then there is a path from node 1 to node j of length at most d(i) + cij. P
Length(P) = 78
d(j) = 78 j
P
Length(P) = 62
i d(i) = 62
10
In this case, there is a path from 1 to j of length 72. We can reduce d(j) to 72.
39
All shortest path algorithm rely in the following elementary observation. If there is a path from node 1 to node i of length d(i) and if there is an arc (i, j) of length cij, then there is a walk from node 1 to node j of length d(i) + cij. Note that we do not assume that it is a path in that it could possibly repeat node j. (I am assuming that node 1 is the source node). The algorithm will proceed as follows. It will let d(j) be the length of some path from node 1 to node j. Initially, d(1) = 0, and d(j) = infinity for all other nodes. If it discovers a path of length d(i) + cij from node 1 to node j, and if d(i) + cij is less than d(j), then the algorithm replaces d(j) by the value d(i) + cij. There is a little more to the algorithm than this, but this is the essence.
At each iteration d(j) is the length of some path from node 1 to node j. (If no path is known, then d(j) = ) Procedure Update(i) for each (i,j) A(i) do if d(j) > d(i) + cij then d(j) : = d(i) + cij and pred(j) : = i; 62 i 78 j 72
10
Up to this point, the best path from 1 to j had length 78 But P, (i,j) is a path from 1 to j of length 72.
40
Procedure update is used to decrease d(j) whenever a shorter path is discovered from node 1 to node j.
Dijkstras Algorithm
begin d(s) : = 0 and pred(s) : = 0; d(j) : = for each j N - {s}; LIST : = {s}; while LIST do begin let d(i) : = min {d(j) : j LIST}; remove node i from LIST; update(i) if d(j) decreases from , place j in LIST end end
Select the node i on LIST with minimum distance label, and then update(i)
41
Initialize
d(2) = pred(2) =
2 2 1 4 3 3 5 2 4
d(4) = pred(4) =
4 3 2 2 6
d(1) = 0 1 pred(1) =
d(6) = pred(6) =
d(3) = pred(3) =
d(5) = pred(5) =
The algorithm initializes d(1) as 0 and d(j) = infinity otherwise. LIST is a list of all nodes for which d(j) is finite, and for which d(j) has not been labeled as permanent. Once a node j is labeled as permanent, d(j) will never again change. So, we are only permitted to label j as permanent if we know that d(j) is the length of the shortest path from node 1 to node j.
d(2) = pred(2) =
2 1 d(1) = 0 1 pred(1) = 2 1 4 3
d(4) = pred(4) =
4 2 3 2 2
d(6) = pred(6) =
d(3) = pred(3) =
LIST = { {1,
43
d( ) = { {0,
The algorithm finds the node j on LIST for which d( ) is minimum. It turns out that this node j can be made permanent. And the algorithm does make it permanent.
d(4) = pred(4) =
d(6) = pred(6) =
d(3) = 4 pred(3) = 1
d(5) = pred(5) =
d( )) = {{ 2 4 d( = {0, 2,
The algorithm then performs an update step on the permanently labeled node by scanning all arcs out of the node and then updating distance labels. Any node not on LIST that has a finite distance label after this scanning is added to LIST.
d(2) = 2 pred(2) = 1
2 2 1 4 3 3 4
d(4) = pred(4) =
4 2 3 2 5 2
d(1) = 0 1 pred(1) =
d(6) = pred(6) =
d(3) = 4 pred(3) = 1
d( )) = {{ 4 4 d( = {0, 2 2,
The find m operations is repeated by scanning nodes of LIST. A new node is n i made permanent.
d(4) = 6 pred(4) = 2
d(1) = 0 1 pred(1) =
d(6) = pred(6) =
d(3) = 4 3 pred(3) = 1 2
d(5) = 4 pred(5) = 2
LIST = { 3, 4 5 4,
46
d( ) = { 4 6 4 3 6, 3,
d(2) = 2 pred(2) = 1
4 2 2 1 4 3 3 3
d(4) = 6 pred(4) = 2
4 2 3 2 5 2
d(1) = 0 1 pred(1) =
d(6) = pred(6) =
d(3) = 3 pred(3) = 2
d( ) = {6, 4 4 { 3, 6,
The algorithm alternates between the find min operation and the Update operation. It continues until all nodes are made permanent, at which point it stops with the shortest distances from node 1 to all other nodes.
d(4) = 6 pred(4) = 2
d(1) = 0 1 pred(1) =
d(6) = pred(6) =
d(3) = 3 pred(3) = 2
d(5) = 4 pred(5) = 2
LIST = {4, 5
48
d( ) = {6, 4
d(2) = 2 pred(2) = 1
4 2 2 1 4 3 3
d(4) = 6 pred(4) = 2
4 2 3 2 5 5 2
d(1) = 0 1 pred(1) =
d(6) = pred(6) =
d(3) = 3 pred(3) = 2
LIST = {4 5 {4,
49
d( ) = {6 4 {6,
d(4) = 6 pred(4) = 2
d(1) = 0 1 pred(1) =
d(6) = 6 pred(6) = 5
d(3) = 3 pred(3) = 2
d(5) = 4 pred(5) = 2
LIST = {4 6 {4,
50
d( ) = {6 6 {6,
d(2) = 2 pred(2) = 1
4 2 2 1 4 3 3
d(4) = 6 pred(4) = 2
4 4 2 3 2 5 2
d(1) = 0 1 pred(1) =
d(6) = 6 pred(6) = 5
d(3) = 3 pred(3) = 2
LIST = {6 6 {4,
51
d( ) = {6 6 {6,
d(4) = 6 pred(4) = 2
4 4 3 2 2 6
d(1) = 0 1 pred(1) =
d(6) = 6 pred(6) = 5
d(3) = 3 pred(3) = 2
d(5) = 4 pred(5) = 2
LIST = {6
52
d( ) = {6
d(2) = 2 pred(2) = 1
4 2 2 1 4 3 3
d(4) = 6 pred(4) = 2
4 2 3 2 5 2
d(1) = 0 1 pred(1) =
6 6
d(6) = 6 pred(6) = 5
d(3) = 3 pred(3) = 2
LIST = { {6
53
d( ) = { {6
d(4) = 6 pred(4) = 2
d(1) = 0 1 pred(1) =
d(6) = 6 pred(6) = 5
d(3) = 3 pred(3) = 2
d(5) = 4 pred(5) = 2
LIST = { {6
54
d( ) = { {6
d(2) = 2 pred(2) = 1
4 2 2 1 4 3 3
d(4) = 6 pred(4) = 2
4 2 3 2 5 2
d(1) = 0 1 pred(1) =
d(6) = 6 pred(6) = 5
d(3) = 3 pred(3) = 2
d(5) = 4 pred(5) = 2
LIST = { {6
55
d( ) = { {6
d(4) = 6 pred(4) = 2
d(1) = 0 1 pred(1) =
d(6) = 6 pred(6) = 5
d(3) = 3 pred(3) = 2
d(5) = 4 pred(5) = 2 Trace back the path from node 6 to node 1 using the predecessors.
LIST = { {6
56
d( ) = { {6
d(4) = 6 pred(4) = 2
d(1) = 0 1 pred(1) =
d(6) = 6 pred(6) = 5
d(3) = 3 pred(3) = 2
LIST = { {6
57
d( ) = { {6
The algorithm ends with an optimum path from node 1 to each other node. The paths are stored succinctly as an out t ee. The path on the out tree from node 1 to - r node j is the shortest path from node 1 to node j. The arcs of the out tree are the predecessor arcs that were kept track of in running Dijkstras algorithm.
Dijkstras algorithm makes nodes permanent in increasing order of distance from the origin node. Dijkstras algorithm is efficient in its current form. The running time grows as n2, where n is the number of nodes It can be made much more efficient In practice it runs in time linear in the number of arcs (or almost so).
z z
58
59
Summary
z z z
The Eulerian cycle problem The shortest path problem Dijkstras algorithm finds the shortest path from node 1 to all other nodes in increasing order of distance from the source node. The bottleneck operation is identifying the minimum
distance label. One can speed this up, and get an
incredibly efficient algorithm
60