Question 1
Suppose the dataset “adult.data.csv” has been uploaded as open data by a multinational tech company, after removing obvious identifiers, e.g., names and addresses. The dataset contains information about all its employees worldwide. You know one of the employees in the dataset, Alice. You further know the following facts about Alice (as background knowledge):
age: 60, workclass: Private, marital-status: Divorced, race: Black, sex: Female
You seek to uniquely re-identify Alice in the dataset to learn (infer) more about her. (Hint: When doing comparisons in Python, it may be handy to convert all values to strings and use the strip() method to remove any leading or trailing whitespace.)
(a) Write a program that matches each row in the adult.data.csv dataset against your background knowledge, and returns all matching rows.
(b) How many rows match your background knowledge?
(c) You recall that you know an additional piece of information about Alice, that her “education” is 10th. How many rows match your background knowledge now?
(d) What are the retrieved rows from part (c)?
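One possible approach to parts (a) through (d) is sketched below. It assumes that the first line of adult.data.csv is a header row with column names matching the attribute names above ("age", "workclass", "marital-status", "race", "sex", "education"); if the file has no header, the column names would have to be supplied by hand. The names ALICE, load_rows, matching_rows, and answer_parts are made up for this illustration. Following the hint, every value is converted to a string and stripped before comparison.

```python
import csv

# Background knowledge about Alice, taken from the question.
ALICE = {
    "age": 60,
    "workclass": "Private",
    "marital-status": "Divorced",
    "race": "Black",
    "sex": "Female",
}


def load_rows(path):
    """Read the CSV into a list of dicts, one per row.

    Assumption: the first line of the file is a header naming the columns.
    """
    with open(path, newline="") as f:
        return [
            {key.strip(): value for key, value in row.items() if key is not None}
            for row in csv.DictReader(f)
        ]


def matching_rows(rows, background):
    """Return every row that agrees with all facts in `background`."""
    matches = []
    for row in rows:
        # Compare as stripped strings so " Private" matches "Private"
        # and the integer 60 matches the text "60".
        if all(
            str(row.get(column, "")).strip() == str(value).strip()
            for column, value in background.items()
        ):
            matches.append(row)
    return matches


def answer_parts(path="adult.data.csv"):
    """Print the match counts for (b) and (c) and the rows for (d)."""
    rows = load_rows(path)
    part_b = matching_rows(rows, ALICE)
    print("(b) rows matching:", len(part_b))
    part_c = matching_rows(rows, {**ALICE, "education": "10th"})
    print("(c) rows matching:", len(part_c))
    for row in part_c:  # (d) the retrieved rows
        print(row)
```

Calling answer_parts() from the directory containing the file prints the two counts and the rows retrieved in part (c). The fewer rows that remain after each extra attribute is added, the closer the background knowledge comes to singling Alice out.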