Task 3.
Select attributes that produce the highest accuracy: Use the dataset that produced maximal accuracy in the previous experiment and apply Information Gain (InfoGainAttributeEval) and instance-based (ReliefFAttributeEval) attribute evaluation. This can be done through Weka’s “Select Attributes” panel (to examine the attribute ranking) or in the Preprocess panel through filters (to actually reorder attributes according to their rank).
Following the steps below, apply each of the two filters and then run the IBk algorithm with an increasing number of attributes taken from the top of the ranked attribute list, e.g., 1, 5, 10, 50, 100, 200, 300, ...
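To make the idea of ranking by Information Gain concrete, here is a minimal Python sketch of what such a ranking computes. It is not Weka's implementation: it handles nominal attributes only, and the function names and the toy data are invented purely for illustration.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(values, labels):
    """H(class) - H(class | attribute) for one nominal attribute."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    total = len(labels)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_attributes(columns, labels):
    """Return attribute names sorted by decreasing information gain."""
    scores = {name: information_gain(vals, labels) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data, invented for illustration only (not from the assignment's dataset).
toy_labels = ["yes", "yes", "no", "no"]
toy_columns = {
    "perfect": ["a", "a", "b", "b"],  # separates the two classes exactly
    "useless": ["x", "y", "x", "y"],  # carries no information about the class
}
ranking = rank_attributes(toy_columns, toy_labels)
top_k = ranking[:1]  # keep the first k attributes of the ranked list
```

Here the attribute that splits the classes exactly is ranked first, and taking a prefix of the ranking corresponds to choosing the top 1, 5, 10, ... attributes in the experiment.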
3.1 For each filter:
a. Apply the filter to the data set.
• If you use Weka’s “Select Attributes” panel to examine the attribute ranking produced by InfoGainAttributeEval and ReliefFAttributeEval: choose Ranker as the Search Method. Set numToSelect to the number of attributes to retain; the default value (-1) retains all attributes. Use either this option or a threshold to reduce the attribute set.
b. Choose the top ranked attributes (copy them from the output window).
c. Go to the Preprocess panel and discard the rest of the attributes.
• Choose the Remove filter and paste the indices of the selected attributes, together with the index of the class attribute, into the “attributeIndices” field; then set invertSelection to “True”, so that the listed attributes are kept and all remaining attributes are removed.
d. Now run the IBk algorithm and evaluate it with 10-fold cross-validation.
e. Repeat the previous steps with a different number of attributes: first Undo the removal, then change the value of the numToSelect parameter in the Ranker settings, keep only the top-ranked attributes (removing the rest), and run IBk again. For example, run with 1, 5, 10, 50, 100, 200, 300, … attributes. Plot the accuracy of each run to obtain a graph of the number of selected attributes vs. accuracy.
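Typing the index list for the Remove filter by hand is error-prone when many attributes are kept. The following Python sketch builds the comma-separated string from the attribute indices copied from the ranking output. The helper name and the example indices are hypothetical, and it assumes the indices are 1-based, as they appear in Weka's ranking output.

```python
def remove_filter_indices(top_ranked, class_index):
    """Build a comma-separated, 1-based index list: top-ranked attributes plus the class."""
    keep = sorted(set(top_ranked) | {class_index})
    return ",".join(str(i) for i in keep)


# Hypothetical example: attributes 17, 4 and 250 are ranked highest,
# and the class is attribute 301.
indices = remove_filter_indices([17, 4, 250], class_index=301)
print(indices)  # 4,17,250,301
```

The resulting string is what goes into the filter's index field, with invertSelection set to True so that exactly these attributes are kept.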
3.2 Compare the graphs produced with the two attribute selection methods (InfoGainAttributeEval and ReliefFAttributeEval) and analyze the results.
• Determine the optimal number of attributes for classification for each attribute selection method.
• Which attribute selection method works better? Comment on this.
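One possible way to read the optimal number of attributes off the recorded runs is sketched below in Python. The function name is hypothetical and the accuracy values are placeholders, not experimental results; substitute the numbers from your own runs. Ties are broken in favor of the smaller attribute set, which is an assumption of this sketch rather than a requirement of the task.

```python
def best_attribute_count(results):
    """Return the (num_attributes, accuracy) pair with the highest accuracy.

    Ties are broken in favor of the smaller number of attributes.
    """
    if not results:
        raise ValueError("no runs recorded")
    return max(results, key=lambda run: (run[1], -run[0]))


# Placeholder values only: replace with the accuracies measured in your own runs.
info_gain_runs = [(1, 50.0), (5, 70.0), (10, 70.0), (50, 65.0)]
relieff_runs = [(1, 55.0), (5, 60.0), (10, 72.0), (50, 68.0)]

best_info_gain = best_attribute_count(info_gain_runs)
best_relieff = best_attribute_count(relieff_runs)
print("InfoGain:", best_info_gain)
print("ReliefF:", best_relieff)
```

Comparing the two returned pairs gives a starting point for the discussion, but the shape of each curve (how quickly accuracy rises and whether it degrades with many attributes) should also be part of the analysis.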