Advent of Kotlin: Week 3

As artificial intelligence gets more and more important, this week we will implement an important AI algorithm: k-means clustering.

k-means clustering

This algorithm is used to find a clusters of similar points in our data. For instance, imagine that you have data about how your users act on your website. You could use k-means clustering to find a typical kinds of users.

I've recorded a presentation of the algorithm, together with a simple explanation:

Your task is to implement a generic solver of the algorithm in the KMeansSolver. It is used to find clusters in different kinds spaces:

Points in 2D space

Double numbers

List of Double numbers

Vectors

The functions you need to implement are described in comments. Notice, that the position of the values representing each categories are called a dictionary. In other sources they are called centroids.

/** * Solves k-means problem. * * The implementation should : * 1. Find the initial dictionary with `createInitialMeans`. * 2. Group points by the current state of our dictionary. * 3. For each group, find the new mean position of the element in * dictionary representing them. * 4. If the new error is better then at the previous iteration * (by more than `threshold`), go back to the step 2. * Otherwise return the final dictionary. * * @param points the points in our dataset. * @param dictSize how many elements do we expect in our dictionary (the number of means). * @param threshold the expected minimal improvement with each iteration. */ fun solve( points: List, dictSize: Int, threshold: Double = 0.0001 ): List = TODO() // Group points by the dictionary points representing categories fun grouped(dictionary: List, means: List): Map<P, List> = TODO() // Calculates the error of the dataset, that is the sum of distances to the closest dictionary points protected open fun calculateError(points: List, means: List): Double = TODO() // Calculates the next positions of the dictionary points protected open fun nextMeans(points: List, means: List): List = TODO() protected fun closestMean(point: P, means: List): P = means.minByOrNull { mean -> distanceBetween(point, mean) }!!

Ending

I wish you the best of luck in this and future challenges and can't wait to see your solutions :)

Advent of Kotlin: Week 3

k-means clustering

Ending

The author: