SCENARIO
Following your initial consultation with N00BIoT, the software development team has extracted data sets based upon your recommendations.
N00BIoT intends to launch a new version of Email Sentry at the end of the year. It will be marketed as N00BIoT ES2 (Powered By AI).
The software team is scrambling to produce a reliable email detector and has turned to you to provide the machine learning expertise and analysis to deliver a product with the following goals:
A very low false-positive rate in malware detection.
A high level of sensitivity in detecting malware.
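The two goals correspond to standard confusion-matrix quantities: sensitivity is TP / (TP + FN) and the false-positive rate is FP / (FP + TN). A minimal sketch of how they might be computed in R follows. The function name detection_metrics, the "Yes"/"No" label coding, and the toy vectors are assumptions made for illustration only; they are not taken from the provided data sets.

```r
# Compute sensitivity and false-positive rate from two label vectors.
# `positive` names the malware class; "Yes" is an assumed coding.
detection_metrics <- function(actual, predicted, positive = "Yes") {
  tp <- sum(actual == positive & predicted == positive)  # malware caught
  fn <- sum(actual == positive & predicted != positive)  # malware missed
  fp <- sum(actual != positive & predicted == positive)  # clean flagged
  tn <- sum(actual != positive & predicted != positive)  # clean passed
  c(sensitivity = tp / (tp + fn),
    false_positive_rate = fp / (fp + tn))
}

# Illustrative toy labels only (not real results).
actual    <- c("Yes", "Yes", "Yes", "No", "No", "No", "No", "No")
predicted <- c("Yes", "Yes", "No",  "No", "No", "No", "Yes", "No")
print(detection_metrics(actual, predicted))
```

A model that meets the stated goals would push sensitivity toward 1 while keeping the false-positive rate close to 0.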
TASK
You are to apply supervised machine learning algorithms to the data provided. You will train your ML model using the MalwareSamples data set, and then test it against the EmailSamples data set.
All analyses are to be done using R. You will report on your findings.
Part 1 – Preparing your data for constructing a supervised learning model using MalwareSamples10000.csv
You will need to write the appropriate code to:
i. Import the dataset MalwareSamples10000.csv into RStudio.
ii. Set the random seed using your student ID.
iii. Partition the data into training and test sets using an 80/20 split.
The variable isMalware is the classification label and the outcome variable.
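Steps i to iii can be sketched in base R as follows. This is one possible approach under stated assumptions, not a required solution: the student ID value is a hypothetical placeholder, the helper partition_data is a name invented here, and the CSV is assumed to sit in the working directory (the import step is skipped if it is not found).

```r
# Hypothetical placeholder: replace with your own student ID.
student_id <- 12345678

# Split a data frame into training and test sets (default 80/20).
# The seed is fixed before sampling, so the same seed always
# reproduces the same partition.
partition_data <- function(df, seed, train_frac = 0.8) {
  set.seed(seed)
  n_train <- floor(train_frac * nrow(df))
  train_idx <- sample(seq_len(nrow(df)), size = n_train)
  list(train = df[train_idx, , drop = FALSE],
       test  = df[-train_idx, , drop = FALSE])
}

# Assumes the CSV is in the working directory; skipped otherwise.
csv_path <- "MalwareSamples10000.csv"
if (file.exists(csv_path)) {
  malware <- read.csv(csv_path, stringsAsFactors = TRUE)
  # Treat the label as a factor so classifiers see it as a class.
  malware$isMalware <- as.factor(malware$isMalware)
  parts <- partition_data(malware, seed = student_id)
  print(dim(parts$train))
  print(dim(parts$test))
  # Check that the class balance in the training set looks sensible.
  print(prop.table(table(parts$train$isMalware)))
}
```

Simple random sampling is used here for brevity. If the classes turn out to be heavily imbalanced, a stratified split (for example, sampling within each isMalware level) may be a better choice.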