Compare the layout of each of the two data frames

Part 1 
The spreadsheet titled ‘censusdata.xlsx’ contains information about the number of bedrooms in occupied private dwellings for local government areas in Melbourne for the years 2011 and 2016. It is far from ready for analysis and needs to be ‘wrangled’. Additionally, a few errors have been deliberately introduced into the first two columns, so these will need to be found and corrected through initial analysis.
1. Explain why the data in its current form is not considered to be in ‘tidy’ format. 
2. Write R code to read in the data (readxl package), manipulate it, and output it to a single csv file with the following header row:
region,year,br_count_0,br_count_1,br_count_2,br_count_3,br_count_4_or_more,br_count_unstated,av_per_dwelling,av_per_household
Your code will have the following sections (not necessarily in the order given and the process may be iterative as you find more things to do). Please include comments in the code to separate each segment and explain your steps.
• Read the data sets into two data frames, df2011 and df2016.
• Compare the layout of each of the two data frames, then remove appropriate rows of one data frame to match the format of the other.
• Write a function that takes in a table of the original form and outputs a table in the desired form with columns specified above. 
  o Remove unwanted rows or columns
  o Split values into multiple columns to make them atomic
  o Appropriately transform the data into the desired form
  o Rename columns
• Apply the function to each table to create two tables in the desired format.
• Run a summary of each table to look for unusual values.
• Correct those values until the two tables have the same dimensions and format.
• Merge the two tables into a single table so that we see data in the form:
Banyule,2011,78,1287,8457,21865,11366,645,3.1,2.6
Banyule,2016,...
Bayside,2011,...
Bayside,2016,...
Victoria,2011,...
Victoria,2016,...
Australia,2011,...
Australia,2016,...
(listed alphabetically by region, then by year, with Victoria and Australia at the end) 
• Write the result to a csv file (it should have 65 rows including the header). 
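The merge, ordering, and output steps above can be sketched in base R as follows. This is only an illustration under stated assumptions: the two small tables `tidy2011` and `tidy2016` are invented stand-ins for the cleaned data (with a single count column instead of the full set), and the output path is hypothetical. The point of interest is the sort key that pushes Victoria and Australia to the end.

```r
# Toy stand-ins for the two cleaned tables (invented values, reduced columns)
tidy2011 <- data.frame(
  region = c("Australia", "Banyule", "Victoria", "Bayside"),
  year = 2011,
  br_count_3 = c(400, 10, 100, 20),
  stringsAsFactors = FALSE
)
tidy2016 <- transform(tidy2011, year = 2016, br_count_3 = br_count_3 + 5)

# Stack the two years into one table
combined <- rbind(tidy2011, tidy2016)

# Sort key: 0 for ordinary regions, 1 for Victoria, 2 for Australia
totals <- c("Victoria", "Australia")
total_rank <- match(combined$region, totals, nomatch = 0)

# Order by the key, then alphabetically by region, then by year
combined <- combined[order(total_rank, combined$region, combined$year), ]
rownames(combined) <- NULL

# Write without row names or quotes so the rows look like the sample above
out_file <- file.path(tempdir(), "census_tidy_demo.csv")  # hypothetical path
write.csv(combined, out_file, row.names = FALSE, quote = FALSE)
```

With the real data, counting the lines of the written file is a quick check against the expected 65 rows including the header. Note that `quote = FALSE` is only safe if no region name contains a comma.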
3. Which region(s) (ignoring Victoria and Australia) had the largest increase in the number of occupied dwellings with 3 or more bedrooms between 2011 and 2016? (Ignore the unstated counts.)
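One possible way to approach question 3, once the combined table exists, is sketched below. The table here is a toy with invented values, and the sketch assumes "increase" means the absolute change in the sum of `br_count_3` and `br_count_4_or_more`; a percentage change would need a different final step.

```r
# Toy combined table in the target layout (all values invented)
combined <- data.frame(
  region = rep(c("Banyule", "Bayside", "Victoria", "Australia"), each = 2),
  year = rep(c(2011, 2016), times = 4),
  br_count_3 = c(100, 130, 200, 210, 1000, 1200, 5000, 5600),
  br_count_4_or_more = c(50, 60, 80, 120, 500, 600, 2500, 2900),
  stringsAsFactors = FALSE
)

# Drop the state and national totals
lga <- combined[!combined$region %in% c("Victoria", "Australia"), ]

# Dwellings with 3 or more bedrooms (unstated counts are ignored)
lga$br3plus <- lga$br_count_3 + lga$br_count_4_or_more

# One row per region, with the 2011 and 2016 totals side by side
wide <- merge(
  lga[lga$year == 2011, c("region", "br3plus")],
  lga[lga$year == 2016, c("region", "br3plus")],
  by = "region", suffixes = c("_2011", "_2016")
)
wide$increase <- wide$br3plus_2016 - wide$br3plus_2011

# Keep every region tied for the maximum, hence "region(s)"
top <- wide[wide$increase == max(wide$increase), ]
print(top)
```

Filtering on equality with the maximum, rather than taking the first maximum, keeps any tied regions in the answer.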
Hint
1) In its current form, the data has character values for the number of bedrooms in occupied private dwellings for local government areas. The data is not considered tidy because the town names are not held in a column of their own....
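The hint's point about names lacking their own column can be illustrated with a small sketch. The layout below is hypothetical, not the actual structure of ‘censusdata.xlsx’: it assumes region names appear as heading rows in the same column as the bedroom categories, and that counts are stored as text. The category labels and the Bayside values are invented.

```r
# Hypothetical untidy layout: region names share a column with the
# bedroom categories, and counts are stored as character values
raw <- data.frame(
  label = c("Banyule", "None (includes bedsitters)", "1 bedroom",
            "Bayside", "None (includes bedsitters)", "1 bedroom"),
  value = c(NA, "78", "1287", NA, "60", "900"),
  stringsAsFactors = FALSE
)

# A row with no value is taken to be a region heading (an assumption here)
is_region <- is.na(raw$value)

# Carry each region name down to the rows beneath it
region <- raw$label[is_region][cumsum(is_region)]

# One observation per row: region, category, and a numeric count
tidy_long <- data.frame(
  region = region[!is_region],
  category = raw$label[!is_region],
  count = as.integer(raw$value[!is_region]),
  stringsAsFactors = FALSE
)
print(tidy_long)
```

After this step every variable has its own column and every count is numeric, which is the property the untidy form lacks.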
