L12 DAG

DAG(Directed Acyclic Graph), DAG's topological ordering and SCC(Strongly Connected Component).

Ref: 算法设计与分析(Algorithm design and analysis) by 黄宇.

Directed Acyclic Graph

-> ref

DAG and Topological Ordering

An DAG is a directed graph that contains no directed cycles.

Topological Ordering for G=(V,E)

A topological order of a directed graph G=(V,E) is an assignment of distinct integer 1,2,,n of its nodes as v1,v2,,vn so that for every edge (vi,vj)E we have i<j.

Lemma1

Lemma1: G=(V,E) is a DAG <==> G has a topological order

Proof:

==> : (by induction on n)

  1. Base case: true if n = 1.
  2. Given DAG on n > 1 nodes, find a node v with no incoming edges.
  3. G - { v } is a DAG, since deleting v cannot create cycles.
  4. By inductive hypothesis, G - { v } has a topological ordering.
  5. Place v first in topological ordering; then append nodes of G - { v } in topological order. This is valid since v has no incoming edges.

<== : (by contradiction)

  1. Suppose that G has a topological order v1, …, vn and that G also has a directed cycle C. Let's see what happens.
  2. Let vi be the lowest-indexed node in C, and let vj be the node just before vi ; thus (vj, vi ) is an edge.
  3. By our choice of i, we have i < j.
  4. On the other hand, since (vj, vi ) is an edge and v1, …, vn is a topological order, we must have j < i, a contradiction.
DAG Lemma1

Lemma2

Lemma2: If G is a DAG, then G has a node with no incoming edges.

Proof: (by contradiction)

  1. Suppose that G is a DAG and every node has at least one incoming edge. Let's see what happens.
  2. Pick any node v, and begin following edges backward from v. Since v has at least one incoming edge (u, v) we can walk backward to u.
  3. Then, since u has at least one incoming edge (x, u), we can walk backward to x.
  4. Repeat until we visit a node, say w, twice.
  5. Let C denote the sequence of nodes encountered between successive visits to w. C is a cycle.
DAG Lemma2

Compute Topologic Ordering

  • 在某个集合A 上的关系R如果是自反的、反对称的和传递的,那么R是一个偏序

  • 偏序集的有向图中没有长度大于一的环

  • 拓扑序要求全序且无环.

    • 如果有向图G=(V,E)有环, 则 G不存在拓扑排序.
    • 成环等价于遍历过程中遇到了灰色节点.
  • "尽头"与DFS

    • DFS就是沿某条路径一直往下走,直到某个“尽头”节点。

    • 假设 i\rarrj 表示任务i的执行依赖任务 j 的完成,则尽头节点不依赖其他任何节点,因而对它的拓扑序号的分配从依赖关系的角度看是自由的。该分配方式不会影响其他节点的执行. 比如,对于逆拓扑序而言,只要分配当前尚未分配的最小序号。

  • 逻辑尽头

    • 当一个节点的所有后续节点均已处理完毕时, 该节点就成为逻辑上的尽头节点。
      • 逆拓扑排序时, 逻辑尽头节点的逆拓扑序号只需要分配当前未分配序号中最小的
    • 分配拓扑序号的过程就成为不断找到逻辑结点的过程,这与DFS适合
      • 在DFS-WRAPPER中,开始遍历之前定义一个全局变量globalNum, 并初始化为 n+1.
      • 在DFS框架的"遍历后处理"处,嵌入对拓扑排序的处理:
        • globalNum := globalNum -1.
        • v.topoNum := globalNum.
    1
    2
    3
    4
    5
    6
    7
    TOPO-WRAPPER(G)

    globalNum = n+1;
    Color all nodes WHITE;
    foreach node v in G do
    if v.color = WHITE then
    TOPO-ORDER(v);
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    TOPO-ORDER(v)

    v.color = GRAY;
    foreach neighbour w of v do
    if w.color = WHITE then
    TOPO-ORDER(w);


    globalNUM := globalNum - 1;
    v.topoNum := globalNum;
    v.color = BLACK;
  • Complexity: O(m+n). m = number of edges, n = number of vertices.

Critical path analysis

Critical path in a Task Graph

  • Earliest start time( est ) for a task v

    • If v has no dependencies, the est is 0

    • If v has dependencies, the est is the maximum of the earliest finish time of its dependencies.

  • Earliest finish time( left ) for a task v

    • For any task: eft = est + duration
  • Critical path in a project is a sequence of tasks: v0,v1,,vk, satisfying:

    • v0 has no dependencies;
    • For any vi ( i = 1,2,...,k), vi1 is a dependency of vi, such that est of vi equals eft of vi1;
    • eft of vk​​ is maximum for all tasks in the project.
  • 在DFS框架中嵌入相应处理

    • 在"遍历前处理"处, 初始化该节点的最早开始时间, 并初始化关键路径相关信息

    • 在结束邻接节点的处理返回的时候,检查是否要更新当前节点目前已知的最早开始时间.以及是否需要更新关键路径的相关信息

    • 在"遍历后处理"处, 当前节点的est已确定, 则可以计算出当前节点的eft

1
2
3
4
5
6
7
8
9
10
11
12
13
14
CRITICAL-PATH(v) //该算法同样需要WRAPPER来调度
//逻辑尽头的est和est在其逻辑的关键路径中最小

v.color := GRAY
c.est := 0; v.CritDep := -1;
foreach neighbour w of v do
if w.color = WHITE then
CRITICAL-PATH(w);
if w.eft >= v.est then
v.est = w.eft //求efs的最大值
v.CritDep := w;

v.eft := v.est + v.l;
v.color := BLACK;

Analysis

  • Complexity
    • Θ(n+m)

Connectivity of DAG

Strongly Connected

  • 强连通(strongly connected): A DAG is strongly connected if every pair of nodes is mutually connected(->"connectivity" relation in graph).

    • TL;DR: x和y是连通的 == 存在一条x -> y的长度未知的路径. (但长度不能为0)
  • Lemma. Let s be any node. G is strongly connected iff every node is reachable from s, and s is reachable from every node.

  • Pf:

    ==> : Follows from definition.

    <== :

    1. Path from u to v: concatenate u-s path with s-v path.
    2. Path from v to u: concatenate v-s path with s-u path.
    DAG Strong Connectivity

Simply Connected

  • 单连通(simply connected):A DAG is simply connected if 对图 G 中任意两个顶点 u 和 v, 存在从 u 到 v 的路径或从 v 到 u 的路径.

Weakly Connected

  • 弱连通(weakly connected): A DAG is weakly connected if 忽略图 G 中每条有向边的方向, 得到的无向图(即有向图的基图)连通.

Detect Cycle in DAG

根据之前介绍的图的三种染色, 判断DAG是否有环就等价于判断在该图的遍历过程中是否遇到了灰色节点.

  • 遇到灰色节点意味着在遍历该图时遇到了当前遍历树的祖先节点, 也就是成环.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
public class Graph<T> {

private List<Vertex<T>> vertices;

public Graph() {
this.vertices = new ArrayList<>();
}

//Some apis...

public boolean hasCycle() {

for(Vertex<T> vertex: vertices) vertex.setColor(Color.WHITE);//Every vertex hasn't been visited.

for(Vertex<T> sourceVertex: vertices) {
if(hasCycle(sourceVertex))
return true;
}
return false;
}

private boolean hasCycle(Vertex<T> sourceVertex)
{
if(sourceVertex.getColor() == Color.GREY)//This vertex is being visited, we encounter a cycle!
return true;
if(sourceVertex.getColor() == Color.BLACK)//This vertex has been visited. So we don't need to traver it again.
return false;

sourceVertex.setColor(Color.GREY);//This vertex is being visited

for (Vertex<T> neighbor : sourceVertex.getAdjacencyList()) {
if( hasCycle(neighbor) )
return true;
}

sourceVertex.setColor(Color.BLACK);//The visiting of this vertex is over. Color it black.
return false;
}

Strongly Connected Component(SCC)

Condensation Graph

  • Condensation Graph: 把G中的每个强连通片收缩成一个点, 强连通片之间的边收缩成一条有向边,则得到G的收缩图G

    • 两个强连通片之间只能是单向可达(或者不可达).

    • DAG的Condensation Graph还是DAG.

Traverse of SCC

1
2
3
4
5
6
7
8
9
10
11
12
13
SCC(G)

Initiate the empty stack nodeStack;

Perform DFS on G. In the postorder processing of each vertex v, insert the statement "nodeStack.push(v)"; //第一轮DFS,标记尽头,并通过栈完成排序

Compute the transpose grapg Gt of G;

Color all nodes WHITE;

while nodeStack != empty do
v := nodeStack.pop();
Conduct DFS from v on Gt;
  • Def:

    For a DFS, the first vertex discovered in a strong component Si​​ is called the leader of Si​​, 记为li

  • 推论:

    The leader of Si is the last vertex to finish among all vertices of Si ( since all of them in the same DFS tree )( 即: 首节点的活动区间包含同一个强连通片中所有其他节点的活动区间 )

  • 引理:

    1. Each DFS tree of a digraph G contains only complete strong components of G, one or more.(即: 不可能一个强连通片中的节点一部分在某棵遍历树中,一部分不在)

    2. li​在第一轮遍历中被发现时(刚刚被处理,即将被染成灰色时), 不可能有路径通向某个灰色节点

      • Proof:

        反证法: 设Si的首节点li刚被发现时, 有一条路径通向某个灰色节点x. 由于li是首节点, 所以x必然处于图的另一个强连通片Sj中(而不可能在Si中). 所以存在一条 $ S_iS_j.l_i,x,xl_iDFSTree. S_jS_i, S_iS_j$​是强连通的, 矛盾.

    3. x(若有的话), 比li​先结束遍历, 即: x.finishTime < l.finishTime

      • x只能为白色或黑色
    4. 在第二轮DFS中,当一个白色节点从栈中被POP出来时,它一定是其所在强连通片的首节点

      • Proof

        第二轮DFS时,一个出栈的节点l​为白色, 则它必然是其所在强连通片Si​的第一个出栈节点( 否则就会在Si​​的其它先出栈的节点进行第二轮DFS时被染成灰色 )。 而在第一轮DFS时,最后一个入栈的节点就是最后结束的节点。 而首节点li​的活动区间包含其他节点的活动区间, 因而必然是最后结束的节点。 所以l必然是li