Your task for this assignment is to use AWS Elastic MapReduce (EMR) to process and analyze health data stored in your AWS S3 bucket on a big data framework cluster. You will launch a cluster running Spark and run a simple PySpark script stored in your AWS S3 bucket.
1. Accept the invitation from AWS Academy to set up an Academic account.
2. Run the AWS EMR tutorial using your AWS Academy account: https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-gs.html
3. Obtain a screenshot (#1) showing completion of the EMR job (Amazon EMR page > Steps tab … status = completed for all steps).
4. Copy and paste the contents of the stderr log file (Amazon EMR page > Steps tab > view logs > stderr) into a file named cis362_prog2_stderr_lastname.txt.
5. Locate the result report in Amazon S3. Use the AWS console or the AWS CLI to transfer the result report from your S3 bucket to your local host, and rename it to cis362_prog2_results_lastname.txt. Obtain a screenshot (#2) of the AWS console transfer or the AWS CLI command being run.
6. Create a README.txt file with the following information:
a. course ID and section
b. your full name
c. the program assignment number and due date
d. the program purpose
e. the contents of the zip file
7. Paste your two screenshots into a Word-type document named cis362_prog2_screenshots_lastname.docx. Store the Word-type document, stderr log file, result report file, and README.txt file in a zip file named CIS362_prog2_lastname.zip.
8. Submit your zip file as an attachment to an email message to [email protected] using a subject in this form: “csc362_prog2_lastname”. Do your own work. Students submitting copies of the same program will receive grades of zero for the assignment.
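For step 5, the AWS CLI transfer can be sketched in Python as shown here. This is only an illustration under stated assumptions: the helper names, the bucket name, and the object key are hypothetical placeholders, and the actual location of your result report depends on the output folder you chose in the tutorial. The only real command used is `aws s3 cp`, which requires the AWS CLI to be installed and configured with your credentials.

```python
import subprocess


def build_s3_copy_command(bucket, key, local_name):
    """Return the argument list for: aws s3 cp s3://<bucket>/<key> <local_name>."""
    return ["aws", "s3", "cp", f"s3://{bucket}/{key}", local_name]


def download_result_report(bucket, key, local_name="cis362_prog2_results_lastname.txt"):
    """Print and run the copy command; raises CalledProcessError if the CLI fails."""
    command = build_s3_copy_command(bucket, key, local_name)
    # Echo the command so it is visible in the terminal for the screenshot.
    print(" ".join(command))
    # Copying to a different local file name performs the rename in the same step.
    subprocess.run(command, check=True)
    return local_name
```

A call such as download_result_report("my-example-bucket", "my-output-folder/my-result-file.csv") would copy that object to the local file name required by step 5. Both arguments here are made-up examples; substitute your own bucket and object key.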
Visit https://www.7-zip.org/ for an open-source tool to create a zip archive.
Here is a sample command: 7za a myzip.zip mydir/*
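If 7-Zip is not available, the same archive could also be built with Python's standard zipfile module. The sketch below is one possible approach, not a required method; the function name make_submission_zip is illustrative, and "lastname" in the file names stands for your own last name.

```python
import zipfile
from pathlib import Path


def make_submission_zip(zip_path, file_paths):
    """Store each file in zip_path under its bare file name; return the stored names."""
    paths = [Path(p) for p in file_paths]
    # Check every file first so a partial archive is never written.
    for path in paths:
        if not path.is_file():
            raise FileNotFoundError(f"missing submission file: {path}")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            # arcname drops any directory prefix so the files sit at the zip root.
            archive.write(path, arcname=path.name)
        return archive.namelist()
```

For this assignment, the call would pass CIS362_prog2_lastname.zip as zip_path and the four files from step 7 (the screenshots document, the stderr log file, the result report file, and README.txt) as file_paths.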