Implement the algorithm of Savasere, Omiecinski, and Navathe (SON algorithm)

Exercise 2 - Frequent Itemsets

For this exercise, read Sections 6.4 through 6.4.3 of Mining of Massive Datasets (3rd edition).

1. Implement the simple, randomized algorithm given in Section 6.4.1.

2. Implement the algorithm of Savasere, Omiecinski, and Navathe (SON algorithm) given in Section 6.4.3.

3. Compare the two algorithms on the datasets T10I4D100K, T40I10D100K, chess, connect, mushroom, pumsb, pumsb_star provided at http://fimi.ua.ac.be/data/ and report the outcomes.

4. Experiment with different sample sizes in the simple randomized algorithm, such as 1%, 2%, 5%, and 10%, and compare your results (including the result produced by the SON algorithm).
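
As a starting point for items 1 and 4, here is a minimal sketch of the sampling approach. It is not a reference solution. It assumes the dataset files store one basket per line with whitespace-separated items, and it uses a plain Apriori as the in-memory miner, which is a choice made here rather than something the exercise prescribes. The names `parse_transactions`, `apriori`, `simple_randomized` and the `slack` parameter are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations
from math import ceil


def parse_transactions(lines):
    """Parse lines assumed to hold one basket each, items split on whitespace."""
    return [frozenset(line.split()) for line in lines if line.strip()]


def apriori(baskets, min_count):
    """Return {itemset: count} for every itemset with count >= min_count."""
    counts = Counter(item for b in baskets for item in b)
    current = {frozenset([i]): c for i, c in counts.items() if c >= min_count}
    frequent = dict(current)
    k = 2
    while current:
        prev = set(current)
        # Join frequent (k-1)-itemsets into k-itemsets, then prune every
        # candidate that has an infrequent (k-1)-subset.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        counts = Counter(c for b in baskets for c in candidates if c <= b)
        current = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(current)
        k += 1
    return frequent


def simple_randomized(baskets, support, fraction, slack=0.9, seed=None):
    """Mine a random sample of the baskets with a scaled threshold.

    support is a fraction of the baskets (assumption: relative, not absolute).
    Each basket is kept with probability `fraction`. slack < 1 lowers the
    sample threshold, trading more false positives for fewer false negatives.
    The result is NOT verified against the full data.
    """
    rng = random.Random(seed)
    sample = [b for b in baskets if rng.random() < fraction]
    min_count = max(1, ceil(slack * support * len(sample)))
    return set(apriori(sample, min_count))
```

Calling `simple_randomized(baskets, support, fraction)` with `fraction` set to 0.01, 0.02, 0.05 and 0.10 gives the runs asked for in item 4. Comparing each result with the exact answer shows how many itemsets the sample missed or wrongly reported.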

Your approach should be as efficient as possible in terms of runtime and memory requirements. Report any challenges you encountered during implementation and while running the experiments.
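
For item 2, the two-pass structure of SON can be sketched as follows. This is a sketch under stated assumptions: the support threshold is given as an absolute count, the baskets fit in a Python list of frozensets, and the per-chunk miner is passed in as a parameter. The default `naive_miner` is a hypothetical stand-in that enumerates subsets only up to `max_size` items, so it is suitable for small tests but not for the dense datasets listed above.

```python
from collections import Counter
from itertools import combinations
from math import ceil


def naive_miner(baskets, min_count, max_size=3):
    """Stand-in in-memory miner: count every subset of up to max_size items."""
    counts = Counter(
        frozenset(c)
        for b in baskets
        for k in range(1, max_size + 1)
        for c in combinations(b, k)
    )
    return {s for s, n in counts.items() if n >= min_count}


def son(baskets, min_count, n_chunks, miner=naive_miner):
    """Two passes: local candidates per chunk, then one exact global count."""
    if not baskets:
        return {}
    size = ceil(len(baskets) / n_chunks)
    chunks = [baskets[i:i + size] for i in range(0, len(baskets), size)]

    # Pass 1: an itemset that is frequent overall must reach the scaled
    # threshold in at least one chunk, so the union of the local results
    # contains no false negatives. Rounding the local threshold down keeps
    # that guarantee (it can only add candidates).
    candidates = set()
    for chunk in chunks:
        local_min = max(1, min_count * len(chunk) // len(baskets))
        candidates |= miner(chunk, local_min)

    # Pass 2: count every candidate over all baskets to drop false positives.
    totals = Counter(c for b in baskets for c in candidates if c <= b)
    return {c: n for c, n in totals.items() if n >= min_count}
```

Because the second pass counts the candidates exactly, the output can serve as the ground truth against which the sampled results are compared. Passing a faster miner through the `miner` argument leaves the two-pass logic unchanged.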
