4. You wish to develop a simple prototype for your own speech recognition software. This means that given n time-samples of a speech signal S = {s_1, ..., s_n} (n fixed), you need to be able to determine the series of phonemes (the perceptually distinct units of sound in a language) H = {h_1, ..., h_n} spoken in the recording. Fortunately, you already have available both a language model, which encodes the probability of a phoneme h_i given the previous phoneme h_{i-1} as P(h_i | h_{i-1}), and a generative probabilistic model of the speech signal samples given the underlying phonemes, P(s_i | h_i).
(a) Assuming that the recording consists of only n = 4 samples, draw a model of this problem as both (i) a factor graph and (ii) a Markov network, labeling all relevant parts. Use only the Markov network in subsequent parts of this problem.
(b) Derive an algorithm that determines the most probable series of 4 phonemes given the recorded data. Write everything in terms of (conditional and/or unconditional) probabilities, though feel free to create new terms defined in terms of those probabilities to reduce the amount of writing required. You can receive partial credit by writing the solution without the derivation.
(c) To determine which of the n phoneme estimates are more uncertain, derive how to efficiently calculate the marginal probability distribution of the phoneme at each sample time, conditioned on the recorded data. You can receive partial credit by writing the solution without the derivation.
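
As a way to check derivations for parts (b) and (c) numerically, the following is a minimal sketch, not a reference solution. It assumes the chain factorizes as P(H, S) = P(h_1) * prod_i P(h_i | h_{i-1}) * prod_i P(s_i | h_i), where the prior P(h_1) is an assumption, since the problem statement does not supply it. The function names, the two-phoneme alphabet, and all numbers are hypothetical. Part (b) is sketched as a max-product (Viterbi-style) recursion and part (c) as a sum-product (forward-backward) recursion.

```python
import numpy as np

def most_probable_sequence(prior, trans, lik):
    """Max-product recursion on the chain (sketch for part (b)).

    prior: (K,)   assumed P(h_1); not given in the problem statement
    trans: (K, K) trans[a, b] = P(h_i = b | h_{i-1} = a)
    lik:   (n, K) lik[i, k] = P(s_i | h_i = k) at the observed sample
    """
    n, K = lik.shape
    delta = np.zeros((n, K))            # best joint score of a path ending in k at step i
    back = np.zeros((n, K), dtype=int)  # best predecessor of k at step i
    delta[0] = prior * lik[0]
    for i in range(1, n):
        scores = delta[i - 1][:, None] * trans   # scores[a, b]
        back[i] = scores.argmax(axis=0)
        delta[i] = scores.max(axis=0) * lik[i]
    # Backtrack from the best final state.
    path = [int(delta[-1].argmax())]
    for i in range(n - 1, 0, -1):
        path.append(int(back[i][path[-1]]))
    return path[::-1], float(delta[-1].max())

def posterior_marginals(prior, trans, lik):
    """Sum-product recursion giving P(h_i | s_1..s_n) for every i (sketch for part (c))."""
    n, K = lik.shape
    alpha = np.zeros((n, K))  # proportional to P(h_i = k, s_1..s_i)
    beta = np.ones((n, K))    # proportional to P(s_{i+1}..s_n | h_i = k)
    alpha[0] = prior * lik[0]
    alpha[0] /= alpha[0].sum()
    for i in range(1, n):
        alpha[i] = (alpha[i - 1] @ trans) * lik[i]
        alpha[i] /= alpha[i].sum()      # rescale to avoid underflow
    for i in range(n - 2, -1, -1):
        beta[i] = trans @ (lik[i + 1] * beta[i + 1])
        beta[i] /= beta[i].sum()
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)

# Hypothetical toy problem: K = 2 phonemes, n = 4 samples.
prior = np.array([0.6, 0.4])
trans = np.array([[0.7, 0.3],
                  [0.4, 0.6]])
lik = np.array([[0.9, 0.2],
                [0.8, 0.3],
                [0.1, 0.7],
                [0.2, 0.9]])

path, score = most_probable_sequence(prior, trans, lik)
post = posterior_marginals(prior, trans, lik)
print("most probable phoneme indices:", path)
print("posterior marginals per sample time:")
print(post)
```

Both recursions cost O(n K^2) for K phonemes, compared with O(K^n) for enumerating every sequence. Rows of the marginal table that are close to uniform correspond to the sample times where the estimate is most uncertain.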