Next: Comparison of three rodent Up: Probabalistic pairwise sequence alignment Previous: New constructions for visualization

Application of the array

Computing the array $W$ requires the sequences ${\bf A}$ and ${\bf B}$, the insertion rate, deletion rate, substitution rate, and jump odds parameters, and the distribution $\pi$.

In the examples below, we fix , and take $\pi$ to be the distribution observed in the sequences ${\bf A}$ and ${\bf B}$. We handle the nucleotide symbol as a match for each nucleotide with distribution $\pi$.
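Taking $\pi$ from the observed sequences can be sketched as follows. This is a minimal illustration, not the author's code: the function name `empirical_distribution` is hypothetical, pooling the counts of both sequences is an assumption, and so is the choice to skip any symbol outside A, C, G, T when counting.

```python
from collections import Counter

NUCLEOTIDES = "ACGT"

def empirical_distribution(seq_a, seq_b):
    """Observed nucleotide frequencies, pooled over both sequences.

    Assumption of this sketch: symbols other than A, C, G, T
    (for example ambiguity codes) are ignored when counting.
    """
    pooled = (seq_a + seq_b).upper()
    counts = Counter(c for c in pooled if c in NUCLEOTIDES)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no A, C, G or T symbols found")
    # Counter returns 0 for a nucleotide that never occurs.
    return {n: counts[n] / total for n in NUCLEOTIDES}

# Toy sequences, invented for illustration only.
pi = empirical_distribution("ACGTAC", "ACGGTN")
```

The result is a dictionary of four frequencies that sum to one; the trailing `N` in the second toy sequence does not contribute to the counts.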

In principle, these parameters can be dynamically reestimated for different points in the array. Also, a global maximum likelihood reestimation can be done in the manner of the [TKF] sum approach.

We compute, but do not apply, the noise value $\nu$.

We apply a simple digital image technique to find the local extreme contours. We compute and plot, on a grayscale, the difference

$$\Delta W(i,j) = W(i+1,j) - W(i,j+1).$$
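The difference can be computed directly from a stored array. The sketch below assumes $W$ is held as a rectangular list of lists indexed `w[i][j]`; the helper names `delta_w` and `to_grayscale` are hypothetical, and the linear mapping onto 0..255 is one possible grayscale convention, not necessarily the one used for the plots.

```python
def delta_w(w):
    """Return d with d[i][j] = w[i+1][j] - w[i][j+1].

    The result has one fewer row and one fewer column than w.
    """
    rows, cols = len(w), len(w[0])
    return [[w[i + 1][j] - w[i][j + 1] for j in range(cols - 1)]
            for i in range(rows - 1)]

def to_grayscale(d):
    """Map values linearly onto 0..255 (an assumed display convention)."""
    flat = [v for row in d for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on a constant array
    return [[round(255 * (v - lo) / span) for v in row] for row in d]

# Toy array, invented for illustration only.
w = [[0.0, 1.0, 2.0],
     [1.0, 5.0, 3.0],
     [2.0, 3.0, 9.0]]
d = delta_w(w)
gray = to_grayscale(d)
```

For the toy array, `d` is `[[0.0, 3.0], [-3.0, 0.0]]`: the sign of the difference flips across the diagonal, which is the kind of sharp contrast the grayscale plot is meant to expose.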

In Section 4.1, we show an example of finding evolutionary distance by using $r = \infty$ and simple reestimation of . We also show an example of curating sequences with repeats and duplications.

In the following examples, note the black and white diamonds in the plots of $\Delta W$. These occur where segments of the two sequences, represented on the horizontal and vertical axes, align with high identity. The sharp contrast along the diagonal of the diamond indicates a local extreme contour in the array. In the $W$ array, the intensity of a sequence identity feature is proportional to its length, so the longest identity feature has maximum intensity and depresses the intensity of the other identity features. In the $\Delta W$ array, by contrast, the width of an identity feature is proportional to its length, and its intensity is not affected by its length. Thus, $\Delta W$ shows identity features of different lengths simultaneously.



Lawren Smithline 2003-11-13