In machine learning, clustering is used for analyzing and grouping data

Pass Task 7.1P: K-Means and Hierarchical Clustering

Task description:

In machine learning, clustering is used to analyze and group data that has no pre-labelled classes, or no class attribute at all. K-Means clustering and hierarchical clustering are both unsupervised learning algorithms.

A cluster is a collection of objects that are “similar” to one another and “dissimilar” to the objects belonging to other clusters. K-Means divides the objects into clusters such that each object is in exactly one cluster, not several.
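The "exactly one cluster" property can be seen in a minimal sketch, assuming scikit-learn is installed. The toy points, `n_init=10`, and `random_state=0` are illustrative choices, not part of the task.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical toy data: two tight groups of 2-D points (not from the task).
points = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                   [8.0, 8.0], [8.1, 7.9], [7.9, 8.2]])

model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)

# labels_ holds exactly one cluster index per point: a hard partition.
labels = model.labels_
print(labels.shape)       # one label per input point
print(np.unique(labels))  # the cluster indices in use
```

Because every point receives a single integer label, no point can belong to two clusters at once.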

In hierarchical clustering, clusters form a tree-like structure with parent-child relationships. In the agglomerative variant, the two most similar clusters are merged, and merging continues until all objects are in the same cluster.
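The repeated merging can be inspected directly, as a rough sketch assuming SciPy is available. The five toy points are made up for illustration, and average linkage is only one possible similarity rule.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Hypothetical toy data (not from the task): two close pairs and one outlier.
points = np.array([[0.0, 0.0], [0.0, 1.0],
                   [5.0, 5.0], [5.0, 6.0],
                   [20.0, 20.0]])

# Each row of Z records one merge: [cluster_a, cluster_b, distance, new_size].
Z = linkage(points, method="average")

n = len(points)
print(Z.shape)        # (n - 1, 4): n - 1 merges join n points into one tree
print(int(Z[-1, 3]))  # size of the final cluster, i.e. all n points
```

The rows of `Z` are ordered by merge distance, so the closest pairs are combined first and the last row describes the single cluster containing every object.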

In this task, you will use the K-Means and Agglomerative Hierarchical algorithms to cluster a synthetic dataset and compare their results.

You are given:

• np.random.seed(0)

• make_blobs function with input:

o n_samples: 200

o centers: [3, 2], [6, 4], [10, 5]

o cluster_std: 0.9

• KMeans() class with settings: init = "k-means++", n_clusters = 3, n_init = 12

• AgglomerativeClustering() class with settings: n_clusters = 3, linkage = 'average'

• Other settings of your choice
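The given settings could be wired together roughly as follows. This is a sketch assuming scikit-learn and NumPy; the variable names (`X`, `k_means`, `agglom`) are illustrative, and any setting not listed above is left at its library default.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs

# Seed NumPy's global generator; make_blobs uses it when random_state is unset.
np.random.seed(0)

# Synthetic dataset with the settings listed above.
X, y = make_blobs(n_samples=200,
                  centers=[[3, 2], [6, 4], [10, 5]],
                  cluster_std=0.9)

# K-Means with k-means++ initialisation, 3 clusters, 12 restarts.
k_means = KMeans(init="k-means++", n_clusters=3, n_init=12)
k_means.fit(X)

# Agglomerative clustering with average linkage, cut at 3 clusters.
agglom = AgglomerativeClustering(n_clusters=3, linkage="average")
agglom.fit(X)

print(X.shape)
print(k_means.cluster_centers_)
print(np.bincount(agglom.labels_))
```

After fitting, `k_means.labels_` and `agglom.labels_` each hold one cluster index per sample. The two models number their clusters independently, so the label values themselves are not directly comparable.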

You are asked to:

• plot your created dataset

• plot the two clustering models for your created dataset

• set the K-Means plot with title “KMeans”

• set the Agglomerative Hierarchical plot with title “Agglomerative Hierarchical”

• calculate the distance matrix for Agglomerative Clustering from the input feature matrix (linkage = complete)

• display dendrogram
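One possible way to cover the plotting and dendrogram steps is sketched below, assuming matplotlib, SciPy, and scikit-learn are installed. The three-panel layout, figure sizes, and markers are arbitrary choices, and Euclidean distance is an assumption, since the task does not name a metric.

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import pdist, squareform
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs

np.random.seed(0)
X, _ = make_blobs(n_samples=200,
                  centers=[[3, 2], [6, 4], [10, 5]],
                  cluster_std=0.9)

km_labels = KMeans(init="k-means++", n_clusters=3, n_init=12).fit_predict(X)
agg_labels = AgglomerativeClustering(n_clusters=3,
                                     linkage="average").fit_predict(X)

# Raw dataset and the two clusterings, side by side.
fig, axes = plt.subplots(1, 3, figsize=(15, 4))
axes[0].scatter(X[:, 0], X[:, 1], marker=".")
axes[0].set_title("Dataset")
axes[1].scatter(X[:, 0], X[:, 1], c=km_labels, marker=".")
axes[1].set_title("KMeans")
axes[2].scatter(X[:, 0], X[:, 1], c=agg_labels, marker=".")
axes[2].set_title("Agglomerative Hierarchical")

# Pairwise Euclidean distances computed from the feature matrix.
condensed = pdist(X)                 # condensed form expected by linkage()
dist_matrix = squareform(condensed)  # full 200 x 200 matrix, for inspection

# Complete linkage on the distances, then the dendrogram of the merges.
Z = linkage(condensed, method="complete")
fig_dendro, ax_dendro = plt.subplots(figsize=(12, 5))
dendro = dendrogram(Z, ax=ax_dendro, no_labels=True)
```

Calling `plt.show()` afterwards displays both figures. The condensed vector from `pdist` is passed to `linkage` because SciPy treats a 2-D array as raw observations rather than as a precomputed distance matrix.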

The sample output shown in the following figure is for demonstration purposes only. Your output may differ from it.


Hint
K-Means Clustering: An unsupervised learning algorithm that groups an unlabeled dataset into different clusters. Here, K is the number of pre-defined clusters to be created in the process. For example, if K=2, there will be 2 clusters, and...
