4. DNA chains consists of 4 bases, A for adenine; C for cytosine, G for guanine; and T for thymine. A typical sequence of DNA bases may look like e.g., ACCTGGAAGTCCTGAATG ... In the file DNA.txt, a sequence of 51 consecutive bases for part of a particular strand is coded as 1 for A, 2 for C, 3 for G and 4 for T. Assume that the sequence follows a Markov chain model. Either by counting manually, or using R or any other software to assist in the counting,
(a) Estimate the transition probabilities ˆpij of the chain going from state i to state j in the next position of the sequence. (Here, “state” is the type of DNA base.)
(b) From the very last base of the sequence, what would be the most probable base to be found in the next 3 positions of the sequence?
(c) What are the proportions of each type of base occurring if the entire sequence is
assumed to be infinitely long?
Students succeed in their courses by connecting and communicating with an expert until they receive help on their questions
Consult our trusted tutors.