Modifying the one state ground process

We modify the computation of the array

to compute a related array

We consider further implications of the reversibility of the ground process. The parameters and are a priori two degrees of freedom in the ground process. If the lengths of ${\bf A}$ and ${\bf B}$ are informative, setting and to maximize the probability to observe sequences of the given lengths gives a relationship between and . We choose and derive .

If the sequences ${\bf A}$ and ${\bf B}$ are subsequences of very long genomes, the lengths of ${\bf A}$ and ${\bf B}$ may be artifacts of truncation. We express this in the ground process by , making insertions and deletions equally likely. In this case, extension of ${\bf A}$ is a zero-information event, expressed by $\alpha = 1$. Other transition probabilities are computed in terms of

$\begin{displaymath}B = l / (1 + l).\end{displaymath}$

Gaps at the ends of alignments could be artifacts of truncation. We model this by replacing by $\alpha$ at the edges of the array. This models extending the sequences in each direction infinitely with bases selected from distribution $\pi$.

We normalize by dividing by the probability of observing sequences ${\bf A}$ and ${\bf B}$ separately given and $\pi$ and omitting the factor for the initial state . We report separately the log likelihood of observing the given sequences,

$\begin{displaymath}\log \Pi_{\bf A}+ \log \Pi_{\bf B}.\end{displaymath}$
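As a minimal sketch of this reported quantity, assume that with $\alpha = 1$ and the initial-state factor omitted, the probability of observing a sequence on its own reduces to the product of the frequencies $\pi$ of its bases. That reduction, the function name `log_separate_likelihood`, and the uniform $\pi$ used in the example call are assumptions made for illustration, not taken from the text.

```python
import math

def log_separate_likelihood(seq_a, seq_b, pi):
    """Return log Pi_A + log Pi_B.

    Assumption: Pi_X is the product of pi[x] over the bases x of X,
    i.e. extension contributes a factor alpha = 1 and the
    initial-state factor is omitted.
    """
    def log_pi(seq):
        # Sum of log base frequencies equals the log of their product.
        return sum(math.log(pi[base]) for base in seq)

    return log_pi(seq_a) + log_pi(seq_b)

# Hypothetical uniform base distribution, for illustration only.
uniform = {base: 0.25 for base in "ACGT"}
total = log_separate_likelihood("ACGT", "AC", uniform)
```

Under these assumptions the value depends only on the base composition of the two sequences, so it can be reported once, independently of the array computation.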

Assembling the above insights, we compute

$\begin{eqnarray*} Q(0,0) & = & 1, \\ Q(i,0) & = & Q(i-1, 0), \\ Q(i,j) & = & Q... ... \pi_{B[l_{\bf B}]}) - \\ & & Q( i-1, l_{{\bf B}}-1) \cdot E^2. \end{eqnarray*}$

We do not multiply the final entry by $1 - \alpha$, because we are not asserting that the sequences end.

The array provides the same kind of information as the . The essential difference is that the gap cost for leading and trailing gaps is canceled. It is possible to treat gaps differently along each edge of the array.
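The effect on the array edges can be sketched as follows. Only edge entries are computed, since the interior recurrence is not reproduced here. The names `edge_column` and `gap_factor` and the value 0.1 are hypothetical. The sketch assumes each edge entry is the previous edge entry times a single per-step factor, which matches $Q(i,0) = Q(i-1,0)$ when that factor is $\alpha = 1$.

```python
def edge_column(length, step_factor):
    """Edge entries with Q(0,0) = 1 and Q(i,0) = step_factor * Q(i-1,0)."""
    column = [1.0]
    for _ in range(length):
        column.append(column[-1] * step_factor)
    return column

alpha = 1.0       # extension is a zero-information event
gap_factor = 0.1  # hypothetical per-base gap factor, for contrast only

free_edges = edge_column(3, alpha)            # leading gap costs nothing
penalized_edges = edge_column(3, gap_factor)  # leading gap is penalized
```

With $\alpha = 1$ every edge entry stays at 1, so a leading gap of any length carries no cost, whereas any factor below 1 makes the edge entries decay with the gap length. Using a different factor for each edge would correspond to treating gaps differently along each edge of the array.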

Lawren Smithline 2003-11-13