Assessment 4 Detail

For this assessment, you are required to use Weka software and a text editor such as WordPad or Notepad++ on Windows, or TextEdit on Mac.

You can download Weka from https://www.cs.waikato.ac.nz/ml/weka/downloading.html.

Task 1: Create and explore the Weka data file of type ARFF

Download the text file dataset.csv from the subject site (Canvas) and open it in a text editor such as WordPad or Notepad++ on Windows, or TextEdit on Mac. You need to explore this file and convert it into an ARFF file for Weka. The file contains a sample of real-life customer data. It is not fully formatted as a Weka (ARFF) file: it contains some formatting errors, and your task is to find and fix them to produce a valid ARFF file. Save the valid file as dataset.arff.
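As a rough illustration of what a valid ARFF file needs, the sketch below checks a small ARFF text for a few common formatting errors: a missing @relation line, a missing @data line, an incomplete @attribute declaration, and data rows whose number of values does not match the number of declared attributes. The attribute names and values in the sample are hypothetical and are not taken from dataset.csv. The checker is deliberately simplified and assumes dense rows with no quoted values containing commas.

```python
def check_arff(text):
    """Return a list of problems found in a simple (dense, unquoted) ARFF text."""
    problems = []
    attributes = []
    has_relation = False
    in_data = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        lower = line.lower()
        if in_data:
            # Every data row needs one value per declared attribute.
            fields = line.split(",")
            if len(fields) != len(attributes):
                problems.append(
                    f"line {number}: expected {len(attributes)} values, "
                    f"found {len(fields)}"
                )
        elif lower.startswith("@relation"):
            has_relation = True
        elif lower.startswith("@attribute"):
            parts = line.split(None, 2)  # keyword, name, type
            if len(parts) < 3:
                problems.append(f"line {number}: attribute needs a name and a type")
            else:
                attributes.append(parts[1])
        elif lower.startswith("@data"):
            in_data = True
        else:
            problems.append(f"line {number}: unexpected text before @data")
    if not has_relation:
        problems.append("missing @relation line")
    if not in_data:
        problems.append("missing @data line")
    return problems


# Hypothetical sample: the last row is missing a value.
SAMPLE = """\
@relation customers
@attribute age numeric
@attribute region {INNER_CITY,TOWN,RURAL,SUBURBAN}
@attribute mortgage {YES,NO}
@data
48,INNER_CITY,NO
40,TOWN
"""

print(check_arff(SAMPLE))
```

Weka itself reports the line of the first error when it fails to load an ARFF file, so a script like this is only a convenience for spotting several problems at once.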

Explore the dataset.arff dataset using Weka Explorer and answer the following questions.

Make sure to include screenshots of the visualisations to support your answers.

1. Take a screenshot of your corrected ARFF file.

2. Which attribute in the dataset do you think is useless and does not provide useful information for prediction?

3. How many attributes does the dataset have?

4. How many instances does the dataset have?

5. What is the class attribute in the dataset.arff dataset?

6. What proportion of customers have a mortgage and live in the Inner City?

7. What proportion of customers have a mortgage and do not live in the Inner City?

8. What proportion of customers have a mortgage and an income between $1000 and $10000?

9. How many customers are married and have no mortgage?

10. How many customers do not own a car and have a mortgage?
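Questions 6 to 10 come down to counting the instances that satisfy a combination of conditions and, for proportions, dividing by a chosen total. In Weka Explorer these counts can be read from the attribute visualisations; the sketch below shows the same arithmetic as a possible cross-check. The column names and rows are invented for illustration and may not match dataset.csv. The proportions here are taken over all customers, which is an assumption; if a question is read as a share of mortgage holders only, the denominator changes accordingly.

```python
import csv
import io

# Hypothetical sample; real column names and values in dataset.csv may differ.
SAMPLE_CSV = """\
region,married,car,mortgage,income
INNER_CITY,YES,NO,YES,17546.0
TOWN,NO,YES,NO,30085.1
INNER_CITY,YES,YES,NO,16575.4
RURAL,YES,NO,YES,8020.5
"""


def count_where(rows, **conditions):
    """Count rows whose columns equal every given value."""
    return sum(
        all(row[column] == value for column, value in conditions.items())
        for row in rows
    )


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
total = len(rows)

mortgage_inner = count_where(rows, mortgage="YES", region="INNER_CITY")
married_no_mortgage = count_where(rows, married="YES", mortgage="NO")

# Numeric range conditions need an explicit comparison.
mortgage_income_band = sum(
    row["mortgage"] == "YES" and 1000 <= float(row["income"]) <= 10000
    for row in rows
)

print(mortgage_inner / total)        # proportion of all customers
print(married_no_mortgage)           # plain count
print(mortgage_income_band / total)  # proportion of all customers
```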

Task 2: Practical Analysis

Use the dataset from Task 1 to perform the data mining tasks for Task 2, and compare the performance of the following classification algorithms on this dataset:

• Naive Bayes

• HoeffdingTree

• SVM (or SMO)

• J48

Write a summary report that compares the performance of these algorithms. Make sure to comment on each algorithm's performance and accuracy using the performance metrics shown in the classifier output, such as the confusion matrix. In your report, state whether there is a difference in performance between these algorithms and which algorithm performs best.

Make sure to include the necessary tables, graphs, screenshots etc., to make your report understandable to the person who reads it.
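To make the comparison concrete, the sketch below shows how the usual summary metrics follow from a two-class confusion matrix. In Weka's classifier output, the rows of the confusion matrix are the actual classes and the columns are the predicted classes. The counts used here are invented for illustration and are not results from any of the four classifiers; the function name is hypothetical.

```python
def metrics_from_confusion(tp, fn, fp, tn):
    """Compute summary metrics for the positive class from a 2x2 confusion matrix.

    tp: actual positive, predicted positive
    fn: actual positive, predicted negative
    fp: actual negative, predicted positive
    tn: actual negative, predicted negative
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented counts for illustration only.
example = metrics_from_confusion(tp=40, fn=10, fp=20, tn=30)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Two classifiers with similar accuracy can differ noticeably in precision and recall, which is one reason to report more than a single number per algorithm.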

Hint

Naive Bayes classifiers: These are a family of simple probabilistic classifiers based on applying Bayes' theorem with strong independence assumptions between the features. They are among the simplest Bayesian network models, and they can achieve higher accuracy levels when coupled with kernel density estimation. Basically, they a...
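The independence assumption can be illustrated with a minimal sketch of a Naive Bayes classifier for categorical features. This is not Weka's implementation: it handles nominal values only, uses add-one (Laplace) smoothing as an assumed choice, and does not cover numeric attributes or kernel density estimation. The training rows are invented, with two hypothetical features (region, married) and a hypothetical class (mortgage).

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    values_seen = defaultdict(set)       # feature index -> distinct values
    for row, label in zip(rows, labels):
        for index, value in enumerate(row):
            value_counts[(index, label)][value] += 1
            values_seen[index].add(value)
    return class_counts, value_counts, values_seen


def predict(model, row):
    """Return the class with the highest unnormalised posterior, plus all scores."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for index, value in enumerate(row):
            # Smoothed P(value | class); features are multiplied as if independent.
            numerator = value_counts[(index, label)][value] + 1
            denominator = count + len(values_seen[index])
            score *= numerator / denominator
        scores[label] = score
    return max(scores, key=scores.get), scores


# Invented training data: (region, married) -> mortgage
rows = [
    ("INNER_CITY", "YES"),
    ("INNER_CITY", "NO"),
    ("TOWN", "YES"),
    ("TOWN", "NO"),
    ("INNER_CITY", "YES"),
]
labels = ["YES", "YES", "NO", "NO", "NO"]

model = train_naive_bayes(rows, labels)
label, scores = predict(model, ("TOWN", "NO"))
print(label, scores)
```

The scores are products of a class prior and one conditional probability per feature, which is exactly where the "naive" independence assumption enters.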
