Task
In this task, you'll be using two CSVs from the CIA's world factbook dataset. The file data/c2119.csv maps country names to population. The file data/c2228.csv maps country names to adult obesity rates.
Your task is to fill out the analyze(factbook_pop, factbook_obesity) function in analyze.py. This function should:
1. Read the two CSV files (from the two path-string parameters) into a single dataframe with Country, Population and Obesity Rate columns
2. Select countries with obesity rates higher than 20 percent and populations larger than 10,000,000 (10^7)
3. Sort the data by Obesity Rate descending
4. Select the top 10 countries
5. Index the result from 1 to 10
6. Return the dataframe.
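The six steps above can be sketched roughly as follows. This is a minimal sketch, not a reference solution: the task text does not show the CSV headers, so the column names `Name` and `Value` are assumptions, as is the guess that population values may contain thousands separators. Adjust both to match the real files.

```python
import pandas as pd


def analyze(factbook_pop, factbook_obesity):
    # ASSUMPTION: both CSVs have "Name" and "Value" columns; the real
    # headers are not shown in the task and may differ.
    # ASSUMPTION: population values may look like "1,234,567", hence
    # thousands=","; this is harmless if they are plain integers.
    pop = pd.read_csv(
        factbook_pop, usecols=["Name", "Value"], thousands=","
    ).rename(columns={"Name": "Country", "Value": "Population"})
    obesity = pd.read_csv(
        factbook_obesity, usecols=["Name", "Value"]
    ).rename(columns={"Name": "Country", "Value": "Obesity Rate"})

    # Step 1: one dataframe; the inner join keeps countries present in both files.
    df = pop.merge(obesity, on="Country")

    # Step 2: filter on both conditions.
    df = df[(df["Obesity Rate"] > 20) & (df["Population"] > 10_000_000)]

    # Steps 3 and 4: sort descending, keep at most the first 10 rows.
    df = df.sort_values("Obesity Rate", ascending=False).head(10)

    # Step 5: replace the index with 1..n (n is 10 when enough rows survive).
    df.index = pd.RangeIndex(1, len(df) + 1)

    # Step 6
    return df
```

Chaining `merge`, a boolean mask, `sort_values` and `head` keeps each step of the task visible as one line, which is one way to aim for the idiomatic style the rubric mentions.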
The expected result is available in data/result.csv.
Notes
Paths to CSVs are always relative to the current directory, so you don't need to worry about handling special cases.
Your code will be tested on submission with random values to ensure the result wasn't hardcoded.
Note that Pandas' testing library offers excellent diffs upon test failure, but it uses the somewhat confusing names left and right rather than actual and expected. Keep in mind that left == actual and right == expected. The call will be pd.testing.assert_frame_equal(actual, expected) throughout the challenge.
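To see how the left/right naming shows up in practice, here is a small illustration with made-up frames. The data is purely hypothetical and unrelated to the factbook files. It only shows that the first argument is reported as left and the second as right.

```python
import pandas as pd

# Hypothetical frames, for illustration only.
expected = pd.DataFrame(
    {"Country": ["A", "B"], "Obesity Rate": [30.0, 25.0]}, index=[1, 2]
)
actual = expected.copy()

# Identical frames: the call returns None and raises nothing.
pd.testing.assert_frame_equal(actual, expected)

# Change one value so the comparison fails.
actual_wrong = expected.copy()
actual_wrong.loc[2, "Obesity Rate"] = 24.0

try:
    pd.testing.assert_frame_equal(actual_wrong, expected)
except AssertionError as err:
    # The message labels the first argument (actual) as left and the
    # second argument (expected) as right.
    message = str(err)
    print(message)
```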
Rubric
The solution will be evaluated mainly on correctness but use of idiomatic, maintainable Pandas code is taken into consideration as well.
Resources
You may consult Pandas and Python documentation.