Author attribution.
Suppose that the N feature n-vectors x (1) , . . . , x (N) are word count histograms, and the labels y (1) , . . . , y (N) give the document authors (as one of 1, . . . , K). A classifier guesses which of the K authors wrote an unseen document, which is called author attribution. A least squares classifier using regression is fit to the data, resulting in the classifier
For each author (i.e., k = 1, . . . , K) we find the ten largest (most positive) entries in the n-vector ßk and the ten smallest (most negative) entries. These correspond to two sets of ten words in the dictionary, for each author. Interpret these words, briefly, in English.
Students succeed in their courses by connecting and communicating with an expert until they receive help on their questions
Consult our trusted tutors.