Describe the data preprocessing tasks (including data encoding) that are required

Part 1: Classification with Neural Networks

This part involves predicting the Class attribute in the following file: hepatitis.arff in the directory: /KDrive/SEH/SCSIT/Students/Courses/COSC2111/DataMining/data/arff/UCI/

The main goal is to achieve the lowest classification error with the least overfitting. It is recommended that you use ‘MultilayerPerceptron’ in Weka to complete this task, although it is also possible to use ‘Javanns’.

For the neural network training runs build a table with the following headings:

1. Describe the data preprocessing tasks (including data encoding) that are required. How many outputs and how many inputs will there be? How do you handle numeric and nominal attributes? What normalizations are required? How do you deal with missing values (if present)? Include your data preprocessing scripts (e.g., if you choose the Javanns option) as an appendix (not part of the page count).

2. Elaborate on the preprocessing procedure (in Weka) used to generate the necessary training, validation and test data files. How do you determine when to stop training a neural network? Include your data preparation script (e.g., if you choose Javanns) as an appendix (not part of the page count).

3. Describe how a trained neural network determines the class label of an unseen test data instance. If Javanns is chosen, describe how to use the ‘analyze’ program to do this.

4. Assuming that no hidden layer is used, carry out 5 training and test runs for a network. Comment on the limitations of this single-layer “perceptron” network, as opposed to a network where one or more hidden layers are employed.

5. Assuming that one hidden layer is used, carry out 10 training and test runs for a network with different numbers of hidden nodes. What would be a good strategy to figure out the right number of hidden nodes? From your runs, what seems to be the right number of hidden nodes for this problem? Comment on the variation in the training runs and the degree of overfitting.

6. For a network with 5 hidden nodes, explore at least 5 different combinations of learning rate and momentum. What do you conclude?

7. Compare the classification accuracy of ‘MultilayerPerceptron’ in Weka (or Javanns) with the classification accuracy of Weka J48. Comment on the pros and cons of employing these two different types of classifiers for classification tasks.

8. Having experimented with both Javanns and Weka MultilayerPerceptron, what are the pros and cons of these two programs for neural network training? Which would you choose to use, Javanns or Weka? Provide your reasoning.
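The preprocessing steps asked about in question 1 can be sketched outside Weka as follows. This is a minimal illustration in Python, not the code that Weka or Javanns runs. The three toy rows and their attribute names are made up for the example and are not taken from hepatitis.arff. The choices shown are assumptions, and other choices are possible: `?` marks a missing value (as in ARFF), missing numeric values are replaced by the column mean, missing nominal values by the most frequent value, numeric attributes are min-max scaled to [0, 1], and nominal attributes are one-hot encoded.

```python
from collections import Counter

MISSING = "?"  # ARFF marks missing values with a question mark


def impute(column, numeric):
    """Replace missing entries by the mean (numeric) or the mode (nominal)."""
    present = [v for v in column if v != MISSING]
    if numeric:
        fill = sum(float(v) for v in present) / len(present)
        return [fill if v == MISSING else float(v) for v in column]
    fill = Counter(present).most_common(1)[0][0]
    return [fill if v == MISSING else v for v in column]


def min_max(column):
    """Scale a numeric column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def one_hot(column):
    """Encode a nominal column as one 0/1 input per distinct value."""
    values = sorted(set(column))
    return [[1.0 if v == value else 0.0 for value in values] for v in column]


def preprocess(rows, numeric_flags):
    """Turn raw string rows into numeric input vectors for a network."""
    columns = list(zip(*rows))
    encoded = []
    for column, numeric in zip(columns, numeric_flags):
        filled = impute(list(column), numeric)
        if numeric:
            encoded.append([[v] for v in min_max(filled)])
        else:
            encoded.append(one_hot(filled))
    # Concatenate the per-attribute encodings row by row.
    return [sum((col[i] for col in encoded), []) for i in range(len(rows))]


# Hypothetical toy data with columns (age, sex, bilirubin).
rows = [
    ["30", "male", "1.0"],
    ["50", "female", "?"],
    ["?", "male", "3.0"],
]
inputs = preprocess(rows, numeric_flags=[True, False, True])
```

Under this encoding, the number of network inputs equals the number of numeric attributes plus the total number of distinct values across the nominal attributes: here 1 + 2 + 1 = 4 inputs per row.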

Hint

Data encoding is the process of converting information into digital signals, such as ones and zeros. This lesson aims to explain how data is encoded before being transferred over a network. It will place more emphasis on the various data encoding methods. Additionally, analog and digital data will be discussed. Encoding is the technique of representing the 1s and 0s of the digital signals ...
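As a small illustration of "converting information into ones and zeros", the sketch below turns a short string into a bit string and back. It assumes UTF-8 as the character encoding and covers only the step from characters to bits, not the signal-level encoding methods the lesson goes on to discuss. The function names are hypothetical.

```python
def to_bits(text):
    """Encode text as UTF-8 bytes, then write each byte as 8 binary digits."""
    return " ".join(format(byte, "08b") for byte in text.encode("utf-8"))


def from_bits(bits):
    """Reverse to_bits: parse each 8-digit group back into a byte."""
    return bytes(int(group, 2) for group in bits.split()).decode("utf-8")


encoded = to_bits("Hi")      # "01001000 01101001"
decoded = from_bits(encoded)  # "Hi"
```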
