A large supermarket has released a dataset of all purchases made by its 10,000 customers

Question 2

A large supermarket has released a dataset of all purchases made by its 10,000 customers, with names removed, as “purchase-rest.csv”. The purchases are categorised into 40 items (numbered 1 to 40, inclusive). If a customer has bought an item, the corresponding cell is 1; otherwise it is 0. You know that Bob is one of the customers, and you know his past buying habits (“bob.csv”) as background knowledge. You would like to identify which row in the released dataset is likely to be Bob’s data, thus re-identifying Bob. Note that some items marked as 1 in bob.csv may be marked as 0 in Bob’s row of purchase-rest.csv, and vice versa. Hence you cannot do a direct (subset) match as in Question 1. Instead, you use a similarity metric, namely the Jaccard index (see Appendix).
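
For reference, the standard definition of the Jaccard index of two sets is the size of their intersection divided by the size of their union. The symbols A and B and the small worked example below are illustrative and are not taken from the assignment.

```latex
J(A, B) = \frac{|A \cap B|}{|A \cup B|}, \qquad 0 \le J(A, B) \le 1

% Worked example: A = \{1, 2, 3\}, B = \{2, 3, 4\}
J(A, B) = \frac{|\{2, 3\}|}{|\{1, 2, 3, 4\}|} = \frac{2}{4} = 0.5
```

A score of 1 means the two item sets are identical and a score of 0 means they share no items, so a few mismatched cells lower the score without ruling a row out.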

(a) Write a program in Python that takes a list of binary values, and converts it into a set of indexes which are labelled 1 in the list. 

(b) Write a program in Python that takes two sets and computes their Jaccard index.

(c) Write a program that computes the Jaccard index of each row in the purchase-rest.csv dataset and your background knowledge about Bob, and outputs the top two rows with the highest Jaccard index.

(d) What are the row indexes, the set of items bought and the Jaccard index of the top two rows?
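
One possible way to approach parts (a) to (c) is sketched below. It is a minimal sketch under stated assumptions, not a definitive solution: it assumes the CSV files have no header row, that each row holds 40 cells of 0 or 1, that bob.csv is a single row, and that row indexes are counted from 0. The function names (to_index_set, jaccard, load_rows, top_matches) are hypothetical, and the toy data uses 5 items instead of 40 so the result can be checked by hand.

```python
import csv
import heapq


def to_index_set(bits):
    """Part (a): 1-based positions of the entries equal to 1."""
    return {i for i, bit in enumerate(bits, start=1) if int(bit) == 1}


def jaccard(a, b):
    """Part (b): |a & b| / |a | b|. Two empty sets score 0.0 (a convention chosen here)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def load_rows(path):
    """Read a CSV of 0/1 cells into lists of ints. Assumes no header row."""
    with open(path, newline="") as f:
        return [[int(cell) for cell in row] for row in csv.reader(f) if row]


def top_matches(rows, target, k=2):
    """Part (c): the k rows most similar to target, as (row_index, items, score)."""
    scored = []
    for index, row in enumerate(rows):  # row_index is 0-based (an assumption)
        items = to_index_set(row)
        scored.append((index, items, jaccard(items, target)))
    return heapq.nlargest(k, scored, key=lambda t: t[2])


# Toy data (hypothetical, 5 items instead of 40) standing in for the CSV files.
bob = to_index_set([1, 1, 0, 1, 0])   # {1, 2, 4}
rows = [
    [1, 0, 0, 0, 1],                  # {1, 5}
    [1, 1, 0, 1, 1],                  # {1, 2, 4, 5}
    [0, 1, 0, 1, 0],                  # {2, 4}
]
best = top_matches(rows, bob)
for index, items, score in best:
    print(index, sorted(items), round(score, 3))
```

With the real files, and if the layout assumptions hold, the same call would be top_matches(load_rows("purchase-rest.csv"), to_index_set(load_rows("bob.csv")[0])). The printed tuples then give the row indexes, item sets and Jaccard scores asked for in part (d).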

Hint
A data set is a collection of related, discrete items of information that may be accessed individually, in combination, or managed as a single entity. A data set is organized into some type of data structure...
