Suppose $\mA$ is an adjacency matrix for an unweighted graph without self-loops. Such a graph is also known as a simple graph in this class, although other sources may have conflicting definitions.
Theorem. $[\mA^k]_{ij}$ counts the number of paths of length $k$ from vertex $i$ to vertex $j$.
Proof.
If $k = 1$, then the paths of length 1 are exactly the edges.
If $k = 2$,
[\mA^2]_{ij} = \sum_{r}\biggl[A_{ir}A_{rj} =
\begin{cases}
1 & i \rightarrow r \rightarrow j \\
0 & \text{else }
\end{cases}
\biggr].
This equation counts the number of vertices $r$ such that there is a path from $i$ to $r$ to $j$. This is a length-2 path.
For $k=3$, the case is similar:
[\mA^3]_{ij} = \sum_{r,s}\biggl[A_{ir} A_{rs} A_{sj} =
\begin{cases}
1 & i \rightarrow r \rightarrow s \rightarrow j \\
0 & \text{else }
\end{cases}
\biggr].
This equation counts pairs of vertices $r$ and $s$ such that a path from $i$ to $r$ to $s$ to $j$ exists. This is a path of length 3.
For general $k$, we proceed inductively. Assume the result holds for $\ell$; then
[\mA^{(\ell+1)}]_{ij} =
[\mA \mA^\ell]_{ij} =
\sum_{r}
\biggl[
A_{i,r}
[\mA^{\ell}]_{r,j}
= \begin{cases} [\mA^{\ell}]_{r,j} & i \rightarrow r \rightarrow \cdots \rightarrow j \\
0 & \text{otherwise } \end{cases} \biggr].
Now, we are counting the vertices $r$ that prepend a length-$\ell$ path from $r$ to $j$ with the edge from $i$ to $r$, producing a path from $i$ to $j$ in $\ell+1$ steps. Again, this is exactly the number of paths from $i$ to $j$ in $\ell+1$ steps.
End proof.
Note that the above formulation counts the number of paths, which can repeat both vertices and edges.
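As a sanity check, here is a minimal Python sketch comparing $[\mA^k]_{ij}$ against a brute-force enumeration of length-$k$ paths; the small graph and all names are illustrative, not from the lecture.

```python
import numpy as np

# Adjacency matrix of a small illustrative graph (4 vertices, no self-loops).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])

def count_paths(A, i, j, k):
    """Brute-force count of length-k paths from i to j
    (vertices and edges may repeat, as in the note above)."""
    if k == 0:
        return int(i == j)
    return sum(count_paths(A, r, j, k - 1) for r in range(len(A)) if A[i, r])

k = 3
Ak = np.linalg.matrix_power(A, k)
for i in range(len(A)):
    for j in range(len(A)):
        assert Ak[i, j] == count_paths(A, i, j, k)
print(Ak)
```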
Suppose $\mA$ is an adjacency matrix for an undirected simple graph; then
\diag(\mA^3) = \vt
gives twice the number of triangles around the $i$th vertex in the $i$th entry of $\vt$:
t_i = [\mA^3]_{i,i} = \text{ twice the triangles around $i$ }.
This occurs because a triangle at vertex $i$ is exactly a path of length 3 from $i$ back to $i$, and each triangle yields two such paths (one per direction of traversal).
The count of all triangles is given by,
\trace(\mA^3) = \sum_{i=1}^{n}[\mA^3]_{ii}
which overcounts the number of triangles in the graph by a factor of 6: each triangle is counted twice at each of its three vertices.
The trace can be computed by summing the eigenvalues; therefore, six times the number of triangles is given by
\trace(\mA^3) = \sum_{i=1}^{n}\lambda^3_i.
Assuming rapid decay in the eigenvalues such that
\lambda_1^3 \ge \lambda_2^3 \ge \cdots \ge \lambda_s^3 \gg \lambda_{s+1}^3 \ge \cdots \ge \lambda_n^3
then a reasonable estimate of the trace is
\trace(\mA^3) \approx \; \sum_{i=1}^{s}\lambda^3_i.
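To make this concrete, here is an illustrative sketch comparing the exact count $\trace(\mA^3)/6$ with the truncated eigenvalue sum; the small graph and the choice $s = 2$ are assumptions for the example, not from the lecture.

```python
import numpy as np

# Symmetric adjacency matrix of a small undirected simple graph.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 1],
              [0, 1, 1, 0]])

# Exact count: trace(A^3) overcounts each triangle 6 times.
exact = np.trace(np.linalg.matrix_power(A, 3)) // 6

# Eigenvalue form: trace(A^3) = sum_i lambda_i^3.
lam = np.linalg.eigvalsh(A)

# Keep only the s eigenvalues of largest magnitude (illustrative choice s=2).
s = 2
top = lam[np.argsort(-np.abs(lam))[:s]]
estimate = np.sum(top**3) / 6

print(exact, estimate)
```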
Tsourakakis worked on this idea in “Fast Counting of Triangles in Large Real Networks without Counting: Algorithms and Laws.”
The number of triangles can also be estimated,
\trace(\mA^3) = n \, E\biggl[ \frac{\vx^T\mA^3\vx}{\vx^T\vx}\biggr]
where $E$ is the expectation and $\vx$ is a random vector with standard normal entries. Haim Avron and Sivan Toledo worked on some new analysis for these estimators in a recent paper: Randomized algorithms for estimating the trace of an implicit symmetric positive semi-definite matrix. Journal of the ACM, 58:8:1–8:34, April 2011.
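A minimal Monte Carlo sketch of this estimator, assuming a small dense test matrix and an illustrative sample count of 1000:

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
n = A.shape[0]

def quad_form(x):
    # x^T A^3 x using only matrix-vector products (no explicit A^3).
    return x @ (A @ (A @ (A @ x)))

# Average the normalized quadratic form over many Gaussian samples.
samples = [quad_form(x) / (x @ x) for x in rng.standard_normal((1000, n))]
trace_est = n * np.mean(samples)

print(trace_est, np.trace(np.linalg.matrix_power(A, 3)))
```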
The motivation/introduction to Markov chains was presented as slides, which can be found at: http://www.cs.purdue.edu/homes/dgleich/nmcomp/slides/lecture-7.pdf
A Markov chain is an instance of a stochastic process. A stochastic process is a sequence of random variables,
X_0, X_1, X_2, \ldots \; \Leftrightarrow \; (X_n, \, n \geq 0).
This is a discrete-time stochastic process (in contrast to continuous time). The possible values of $X_n$ form the state space of the chain.
Example Consider a sequence of coin flips drawn from a random variable,
X_n \sim
\begin{cases}
H & \text{w/ prob } 0.55 \\
T & \text{w/ prob } 0.45
\end{cases}
This is an instance of an i.i.d. chain (independent, identically distributed). The state space is $\mathcal{S} = \{H, T\}$.
Example Consider the process defined by
X_{n+1} =
\begin{cases}
X_{n} + 1 & \text{w/ prob } 0.5 \\
X_{n} - 1 & \text{w/ prob } 0.5
\end{cases}
The above is known as a random-walk process on the integers, with state space $\mathcal{S} = \mathbb{Z}$.
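A quick simulation sketch of this integer random walk (the seed and step count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
X = 0            # start the walk at the origin
path = [X]
for _ in range(20):
    X += rng.choice([-1, 1])   # step up or down with probability 0.5 each
    path.append(X)
print(path)
```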
More formally, a Markov chain is a stochastic process where
\probof{X_{n+1} = \mathcal{S}_i \mid X_0, X_1, \ldots, X_n} = \probof{X_{n+1} = \mathcal{S}_i \mid X_n}.
We focus mostly on time-homogeneous Markov chains (or stationary Markov chains), where the transition probability
\probof{X_{n+1} = \mathcal{S}_j \mid X_n = \mathcal{S}_i}
is independent of $n$, the current “time-step” of the chain.
A traditional definition of random walks from a well-known probability textbook:
S_n = \sum_{i=1}^n X_i, \text{ where } (X_n, \, n \geq 0) \text{ is i.i.d.}
This definition seems hard to adapt for things like random walks on graphs.
For this class, we will use the terms “random walk” and “Markov chain” almost interchangeably. The distinction between them is mainly one of semantics: a Markov chain is a probabilistic construct, whereas a random walk is a topological construct. For instance, with each Markov chain, we can associate a directed graph:
V = \mathcal{S}
E = \{ (S_i, S_j) \mid \probof{ X_{n+1} = S_j \mid X_{n} = S_i } > 0\}.
For this reason, writing down the transition graph of a random walk is a convenient way to describe the non-zero transition probabilities of a Markov chain.
Need a figure here illustrating the relationship
Uniform random walk Given a graph, a uniform random walk is a Markov chain where the probability of making any transition is uniform over all possible choices. That is, when the chain/walk is at a state/vertex, the next state/vertex is picked uniformly among all edges from that vertex.
Suppose $X_n$ is a Markov chain; index the states in $\mathcal{S}$ (assumed to be a finite set in this class),
1, 2, \ldots, |\mathcal{S}|.
Then the transition matrix for the chain is:
\mP_{ij} = \probof{X_{n+1} = \mathcal{S}_j \mid X_n = \mathcal{S}_i}
The following properties must also hold for stochastic matrices (from probability):
\sum_{j} \mP_{ij} = 1
Formally, a stochastic matrix is non-negative, with rows that sum to 1.
\mP_{ij} \geq 0,
\mP\ones = \ones \; \Leftrightarrow \; \sum_{j} \mP_{ij} = 1
The above product, $\mP\ones$, sums the probabilities in each row, and the result is therefore a vector of all 1’s.
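For a uniform random walk on a graph, one concrete way to build the transition matrix is to divide each row of the adjacency matrix by the vertex degree. A sketch that builds this matrix and checks the stochastic properties (the graph is illustrative, and we assume every vertex has at least one edge, so the degrees are non-zero):

```python
import numpy as np

# Adjacency matrix of a small undirected graph (every vertex has an edge).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)

d = A.sum(axis=1)        # vertex degrees
P = A / d[:, None]       # divide row i by degree i: the uniform random walk

ones = np.ones(A.shape[0])
assert np.allclose(P @ ones, ones)   # rows sum to 1, so P is row stochastic
assert (P >= 0).all()                # and P is non-negative
```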
Besides row stochastic matrices, there are also column stochastic matrices,
\mP_{ij} \geq 0, \mP^T\ones = \ones \; \Leftrightarrow \; \sum_{i}\mP_{ij} = 1
There are also doubly stochastic matrices, where both the rows and the columns sum to 1 (a very special type of mathematical object). They are characterized by the Birkhoff–von Neumann theorem.
A state in a Markov chain can have a few different properties.
A state is called absorbing if it is impossible to leave this state. Therefore, state $\mathcal{S}_i$ is absorbing if $\mP_{ii} = 1$ and $\mP_{ij} = 0$ for $j \neq i$.
For instance, consider the two-state diagram given by the transition matrix (a row-stochastic matrix),
\mP = \bmat{ 0 & 1 \\
0 & 1 }.
Thus, a random walk on this state diagram would always end up at state 2 (acting as a sink; absorbing), with 0 probability of transitioning back to state 1.
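A tiny numerical check of this behavior: every power of this $\mP$ equals $\mP$ itself, so after one step all probability mass sits on the second state forever.

```python
import numpy as np

P = np.array([[0.0, 1.0],
              [0.0, 1.0]])

# All powers of P equal P: after one step, every walk is absorbed.
print(np.linalg.matrix_power(P, 10))   # -> [[0. 1.], [0. 1.]]
```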
In the previous example, state 1 is a perfect example of a transient state. Informally, a transient state is one to which the Markov chain may not return. That is, given that the chain was at state $\mathcal{S}_i$, the probability that the chain is ever at state $\mathcal{S}_i$ again is less than 1. This is a little tricky to formalize in terms of probability. We’ll formalize it shortly in terms of the strongly connected component structure of the Markov chain as a random walk on a graph.
The opposite of a transient state is a recurrent state! This is a state that the chain will always revisit. A very simple example is an absorbing state. More precisely, a recurrent state is simply one that the chain will always revisit in the future. Again, this concept is easiest to formalize in terms of the strong component structure, so we’ll delay the formality.
A periodic state is a special type of recurrent state that can only be revisited at multiples of a specific period. Consider a simple directed cycle between three vertices:
Insert a picture here
Then every state on the cycle is periodic: the walk can only return to a state after a multiple of 3 steps.
A recurrent state that isn’t periodic is called ergodic. These are also called “aperiodic”.
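For small chains, the period of a state can be sketched numerically as the gcd of the step counts $k$ with $[\mP^k]_{ii} > 0$; the cutoff max_k below is a heuristic assumption for the sketch, not part of the definition.

```python
import numpy as np
from math import gcd
from functools import reduce

def period(P, i, max_k=50):
    """gcd of the return times to state i, checked up to max_k steps
    (a heuristic cutoff; exact for small chains like this one)."""
    Pk = np.eye(len(P))
    returns = []
    for k in range(1, max_k + 1):
        Pk = Pk @ P
        if Pk[i, i] > 0:
            returns.append(k)
    return reduce(gcd, returns) if returns else 0

# Directed 3-cycle: the walk returns only after multiples of 3 steps.
C = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]], dtype=float)
print(period(C, 0))   # -> 3
```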
In the section on random walks, we mentioned that every Markov chain can be considered as a directed graph where the states are vertices and the non-zero transition probabilities are the edges. By analyzing the structure of the strongly connected components of this graph, we can easily formalize the definitions of the types of states.
Recall that a strongly connected component of a directed graph is a set of vertices where there are directed paths between all pairs of vertices. That is, if $u$ and $v$ are in a strong component, then there is a directed path from $u$ to $v$ and from $v$ to $u$.
Insert a picture of a graph and its strong components
We can also define a component graph, which is a new graph where each strong component becomes a single vertex, and the edges reflect ways of moving between the strong components. Note that this graph must be acyclic: any cycle would have produced a larger strong component. Consequently, the component graph is a directed acyclic graph!
The terminal nodes in this dag (those with no outgoing edges) correspond to the recurrent components, and thus identify the recurrent states. The other nodes in the dag correspond to transient components, and thus identify the transient states. To see why this is the case, consider that for any of these transient components, there is a non-zero probability of leaving the component. Once a walk leaves the component, the walk cannot return. Hence, the probability that the walk will ever visit that transient vertex again is less than 1.
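This classification is straightforward to sketch with scipy’s strongly connected components routine; the two-state chain from the earlier example is reused here, with 0-based state indices in code.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

# Transition matrix from the earlier example: state 0 is transient,
# state 1 is absorbing (and hence recurrent).
P = csr_matrix(np.array([[0.0, 1.0],
                         [0.0, 1.0]]))

ncomp, label = connected_components(P, connection='strong')

# A component is terminal (recurrent) if no edge leaves it in the dag.
terminal = set(range(ncomp))
for u, v in zip(*P.nonzero()):
    if label[u] != label[v]:
        terminal.discard(label[u])

for state in range(P.shape[0]):
    print(state, 'recurrent' if label[state] in terminal else 'transient')
```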
The absorbing states are those terminal components with exactly one state. The periodic states are those where the greatest common divisor of the lengths of all closed walks through the state is greater than 1.
The strongly connected component structure of the Markov chain means that we can permute the stochastic matrix for the Markov chain into a particular form.
Let $T_1, \ldots, T_k$ be the transient components. Let $R_1, \ldots, R_m$ be the recurrent components that aren’t absorbing, and let $A$ represent the set of absorbing vertices.
\mP = \bmat{\mP_{T_1} & \mP_{T_1,T_2} & \cdots & \mP_{T_1,T_k} & \mP_{T_1,R_1} & \mP_{T_1,R_2} & \cdots & \mP_{T_1,R_m} & \mP_{T_1,A} \\
& \mP_{T_2} & \cdots & \mP_{T_2,T_k} & \mP_{T_2,R_1} & \mP_{T_2,R_2} & \cdots & \mP_{T_2,R_m} & \mP_{T_2,A} \\
& & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
& & & \mP_{T_k} & \mP_{T_k,R_1} & \mP_{T_k,R_2} & \cdots & \mP_{T_k,R_m} & \mP_{T_k,A} \\
& & & & \mP_{R_1} & 0 & \cdots & 0 & 0 \\
& & & & & \mP_{R_2} & \ddots & \vdots & \vdots \\
& & & & & & \ddots & 0 & 0 \\
& & & & & & & \mP_{R_m} & 0 \\
& & & & & & & & \mI \\
}
Note that there are no transitions out of the recurrent components.