Advent of Kotlin: Week 3
As artificial intelligence gets more and more important, this week we will implement an important AI algorithm: k-means clustering.
This algorithm is used to find a clusters of similar points in our data. For instance, imagine that you have data about how your users act on your website. You could use k-means clustering to find a typical kinds of users.
I've recorded a presentation of the algorithm, together with a simple explanation:
Your task is to implement a generic solver of the algorithm in the KMeansSolver. It is used to find clusters in different kinds spaces:
The functions you need to implement are described in comments. Notice, that the position of the values representing each categories are called a dictionary. In other sources they are called centroids.
/**
* Solves k-means problem.
*
* The implementation should :
* 1. Find the initial dictionary with `createInitialMeans`.
* 2. Group points by the current state of our dictionary.
* 3. For each group, find the new mean position of the element in
* dictionary representing them.
* 4. If the new error is better then at the previous iteration
* (by more than `threshold`), go back to the step 2.
* Otherwise return the final dictionary.
*
* @param points the points in our dataset.
* @param dictSize how many elements do we expect in our dictionary (the number of means).
* @param threshold the expected minimal improvement with each iteration.
*/
fun solve(
points: List<P>,
dictSize: Int,
threshold: Double = 0.0001
): List<P> = TODO()
// Group points by the dictionary points representing categories
fun grouped(dictionary: List<P>, means: List<P>): Map<P, List<P>> = TODO()
// Calculates the error of the dataset, that is the sum of distances to the closest dictionary points
protected open fun calculateError(points: List<P>, means: List<P>): Double = TODO()
// Calculates the next positions of the dictionary points
protected open fun nextMeans(points: List<P>, means: List<P>): List<P> = TODO()
protected fun closestMean(point: P, means: List<P>): P =
means.minByOrNull { mean -> distanceBetween(point, mean) }!!
More videos about k-means clustering:
I wish you the best of luck in this and future challenges and can't wait to see your solutions :)
Marcin Moskala is a highly experienced developer and Kotlin instructor as the founder of Kt. Academy, an official JetBrains partner specializing in Kotlin training, Google Developers Expert, known for his significant contributions to the Kotlin community. Moskala is the author of several widely recognized books, including "Effective Kotlin," "Kotlin Coroutines," "Functional Kotlin," "Advanced Kotlin," "Kotlin Essentials," and "Android Development with Kotlin."
Beyond his literary achievements, Moskala is the author of the largest Medium publication dedicated to Kotlin. As a respected speaker, he has been invited to share his insights at numerous programming conferences, including events such as Droidcon and the prestigious Kotlin Conf, the premier conference dedicated to the Kotlin programming language.