Question 2
Tweets have started to appear from unknown sources, using an alien language. The three most recent tweets are:
• do da da da do
• di di di do do
• da da da da da da
(a) Should we perform stop word removal and/or stemming on these three tweets?
(b) Construct the document term frequency matrix.
(c) Construct the cosine similarity score of each document to the query "da di" by using term frequencies.
(d) Which tweet is most similar to the query? Justify your answer.
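
One way to check the hand calculations for parts (b) and (c) is the short sketch below. It assumes whitespace tokenization, raw term counts with no stop word removal or stemming, and an alphabetically sorted vocabulary; the names tf_vector and cosine are illustrative helpers, not part of the question.

```python
import math
from collections import Counter

tweets = [
    "do da da da do",
    "di di di do do",
    "da da da da da da",
]
query = "da di"

# Vocabulary built from the tweets, sorted so the column order is fixed.
vocab = sorted({term for doc in tweets for term in doc.split()})


def tf_vector(text):
    """Return raw term counts for text, ordered by vocab."""
    counts = Counter(text.split())
    return [counts[term] for term in vocab]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Part (b): one row per tweet, one column per vocabulary term.
tf_matrix = [tf_vector(doc) for doc in tweets]

# Part (c): similarity of each tweet to the query.
query_vec = tf_vector(query)
scores = [cosine(row, query_vec) for row in tf_matrix]

print("terms:", vocab)
for i, (row, score) in enumerate(zip(tf_matrix, scores), start=1):
    print(f"tweet {i}: tf={row} cosine={score:.4f}")
```

Under these assumptions the first two tweets tie, because each shares exactly one query term with the same count and has the same vector length, while the third tweet's score depends only on the direction of its vector, not on how many times "da" is repeated.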