Your algorithm should be based on the classification algorithms learned during

Introduction

This project requires you to explore classification algorithms on a real-world dataset and write a report explaining your experimental results. The language of implementation is up to you; the only requirement is that your program can interpret the data format specified below, classify instances, and produce informative statistics such as accuracy, false positive rate, and false negative rate. You are free to construct whatever user interface you like for your program, but you must fully document it.
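As an illustration of the statistics named above, here is a minimal sketch that derives accuracy, false positive rate, and false negative rate from a binary confusion matrix. The names `BinaryStats` and `binary_stats` are hypothetical, and the positive label string ">50K" is an assumption; substitute whatever label the data files actually use.

```python
from dataclasses import dataclass


@dataclass
class BinaryStats:
    """Counts of a binary confusion matrix."""
    tp: int  # actual positive, predicted positive
    fp: int  # actual negative, predicted positive
    tn: int  # actual negative, predicted negative
    fn: int  # actual positive, predicted negative

    @property
    def accuracy(self) -> float:
        total = self.tp + self.fp + self.tn + self.fn
        return (self.tp + self.tn) / total if total else 0.0

    @property
    def false_positive_rate(self) -> float:
        # FP / (FP + TN): share of actual negatives predicted positive.
        negatives = self.fp + self.tn
        return self.fp / negatives if negatives else 0.0

    @property
    def false_negative_rate(self) -> float:
        # FN / (FN + TP): share of actual positives predicted negative.
        positives = self.fn + self.tp
        return self.fn / positives if positives else 0.0


def binary_stats(y_true, y_pred, positive=">50K"):
    """Tally the confusion matrix; `positive` is an assumed label string."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return BinaryStats(tp, fp, tn, fn)
```

Reporting the two error rates alongside accuracy matters when one class dominates, because a classifier that always predicts the majority class can still score a high accuracy.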

Algorithm

• Your algorithm should be based on the classification algorithms learned during the course. A straightforward implementation of a single method will usually not lead to satisfactory performance. Your algorithm can be a combination of methods and should incorporate one or more data mining techniques where the situation calls for them. These techniques include (but are certainly not limited to):

– Handling an imbalanced dataset

– Proper imputation methods for missing values

– Different treatment of various types of features: continuous, discrete, categorical, etc.
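The three techniques listed above can each be sketched in a few lines of plain Python. This is only one possible approach under stated assumptions: the missing-value marker "?" is assumed rather than taken from the assignment, the helper names are hypothetical, and median/mode imputation, one-hot encoding, and inverse-frequency class weights are simple baseline choices, not the required method.

```python
from collections import Counter
from statistics import median

MISSING = "?"  # assumed missing-value marker; check the .names file


def impute_column(values, is_continuous):
    """Fill missing entries with the median (continuous) or mode (categorical)."""
    present = [v for v in values if v != MISSING]
    if not present:
        return list(values)  # nothing to estimate a fill value from
    if is_continuous:
        fill = median(float(v) for v in present)
        return [fill if v == MISSING else float(v) for v in values]
    fill = Counter(present).most_common(1)[0][0]
    return [fill if v == MISSING else v for v in values]


def one_hot(values):
    """Encode a categorical column as 0/1 indicator vectors."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories


def balanced_class_weights(labels):
    """Weight each class by n / (k * count) so rare classes count more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {label: n / (k * count) for label, count in counts.items()}
```

For example, `balanced_class_weights(["neg", "neg", "neg", "pos"])` gives the minority class a weight of 2.0, which a learner that supports per-instance weights can use to offset the imbalance.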

Data

You’ll be examining the behavior of your model on a dataset from the UCI machine learning lab. The dataset is represented in a standard format, consisting of 3 files. The first file, census-income.names, describes the categories and features of the dataset. It also has some empirical results for your reference. The other two files are census-income.data and census-income.test, containing the actual data instances, formatted at one instance per line, as follows: 


The data you will be examining was extracted from the Census Bureau database. Each instance contains an individual’s educational, demographic, and family information. The prediction task is to determine whether a person makes over 50K a year. You should use census-income.data to train your classifier and census-income.test to evaluate the performance of your learning algorithm.
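A minimal sketch of the train/evaluate split follows, assuming each line is a comma-separated instance whose last field is the class label. That layout is an assumption, not something stated here, so confirm it against census-income.names before relying on it. The helpers `read_instances` and `majority_baseline_accuracy` are hypothetical names; the baseline simply predicts the most frequent training label and gives a reference point that any real classifier should beat.

```python
import csv
import io
from collections import Counter


def read_instances(stream, delimiter=","):
    """Yield (features, label) pairs, one per non-empty line.

    Assumes the label is the last field on each line.
    """
    reader = csv.reader(stream, delimiter=delimiter, skipinitialspace=True)
    for row in reader:
        fields = [f.strip() for f in row]
        if not any(fields):
            continue  # skip blank lines
        yield fields[:-1], fields[-1]


def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy on the test labels of always predicting the majority training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    hits = sum(1 for y in test_labels if y == majority)
    return hits / len(test_labels)


# Usage with an in-memory stand-in for a data file (values are made up).
sample = io.StringIO("39, Private, Bachelors, <=50K\n\n52, Self-emp, Masters, >50K\n")
instances = list(read_instances(sample))
```

With real files, the same reader would be applied to `open("census-income.data")` for training and `open("census-income.test")` for evaluation, keeping the test file out of every fitting step.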

Your Mission...

Deliverables for this project are:

• Code to implement the classification algorithm for the data file formats given above

• A README file, with simple, clear instructions on how to compile and run your code

• Testing statistics for the application of your learning algorithm. At a minimum, you should provide training set accuracy and test set accuracy

• A discussion of data mining techniques employed in your algorithm

• A report analyzing the behavior of your algorithm on the dataset, including any unusual or anomalous (in your opinion) behavior

