2. Loss Functions for Classification:
Again, assume that the model is f(w, b, x) = w^T x + b. For the Perceptron and the SVM we saw the following two loss functions: L_i(w, b) = max(0, −y_i f(w, b, x_i)) for the Perceptron, and L_i(w, b) = max(0, 1 − y_i f(w, b, x_i)) for the hinge loss (SVM). As in question 1, let us see whether the following loss functions are good choices:
(a) L_i(w, b) = [max(0, 1 − y_i f(w, b, x_i))]^2
(b) L_i(w, b) = [y_i − f(w, b, x_i)]^4
(c) L_i(w, b) = exp[f(w, b, x_i) − y_i]
(d) L_i(w, b) = exp[−y_i f(w, b, x_i)]
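To get a feel for how the four candidates behave, here is a minimal sketch (function names are our own, not part of the assignment) that evaluates each per-example loss for the linear model f(w, b, x) = w^T x + b with labels y in {−1, +1}:

```python
import numpy as np

def f(w, b, x):
    # Linear model: f(w, b, x) = w^T x + b
    return np.dot(w, x) + b

def loss_a(w, b, x, y):
    # (a) squared hinge: max(0, 1 - y f)^2
    return max(0.0, 1.0 - y * f(w, b, x)) ** 2

def loss_b(w, b, x, y):
    # (b) quartic regression-style loss: (y - f)^4
    return (y - f(w, b, x)) ** 4

def loss_c(w, b, x, y):
    # (c) asymmetric exponential: exp(f - y)
    return np.exp(f(w, b, x) - y)

def loss_d(w, b, x, y):
    # (d) exponential loss: exp(-y f)
    return np.exp(-y * f(w, b, x))

# Correctly classified example with margin y*f = 1:
w = np.array([1.0, -2.0]); b = 0.5
x = np.array([0.5, 1.0]);  y = -1   # f = 0.5 - 2 + 0.5 = -1
print(loss_a(w, b, x, y))  # 0.0: hinge term vanishes at margin 1
print(loss_d(w, b, x, y))  # exp(-1) ~ 0.368: never exactly zero
```

Note how (a) assigns zero loss once the margin y_i f reaches 1, while (d) keeps rewarding ever-larger margins; this kind of numerical probing is a useful check on the arguments asked for in Part 1.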
Part 1: Explain precisely why each of these loss functions may or may not be a good choice for classification.
Part 2: Compute the gradient (with respect to w and b) of the total loss in each of the cases above.
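Once you have derived a gradient by hand for Part 2, it is easy to sanity-check it numerically with central finite differences. The sketch below (helper names are our own) does this for loss (d), whose gradient with respect to w is −y_i exp(−y_i f) x_i by the chain rule:

```python
import numpy as np

def numerical_grad_w(loss, w, b, x, y, eps=1e-6):
    # Central-difference approximation of d(loss)/dw, one coordinate at a time.
    g = np.zeros_like(w)
    for j in range(len(w)):
        wp, wm = w.copy(), w.copy()
        wp[j] += eps
        wm[j] -= eps
        g[j] = (loss(wp, b, x, y) - loss(wm, b, x, y)) / (2 * eps)
    return g

def f(w, b, x):
    return np.dot(w, x) + b

def loss_d(w, b, x, y):
    # (d) exponential loss: exp(-y f)
    return np.exp(-y * f(w, b, x))

def grad_d_w(w, b, x, y):
    # Chain rule: d/dw exp(-y (w^T x + b)) = -y * exp(-y f) * x
    return -y * np.exp(-y * f(w, b, x)) * x

w = np.array([0.3, -0.7]); b = 0.1
x = np.array([1.0, 2.0]);  y = 1
print(np.allclose(numerical_grad_w(loss_d, w, b, x, y),
                  grad_d_w(w, b, x, y), atol=1e-5))  # True
```

The same `numerical_grad_w` helper works unchanged for losses (a)-(c); only the analytic gradient you compare against differs.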