k-means clustering
Mix.install([
  {:scholar, "~> 0.2.0"},
  {:exla, "~> 0.6.0"},
  {:nx, "~> 0.6.0", override: true},
  {:explorer, "~> 0.6.1"},
  {:stb_image, "~> 0.6.1"},
  {:scidata, "~> 0.1.10"},
  {:req, "~> 0.3.9"},
  {:kino, "~> 0.10.0"},
  {:kino_vega_lite, "~> 0.1.9"},
  {:tucan, "~> 0.3.0"}
])
Setup
This notebook introduces the k-means clustering algorithm, which we will explore through three different use cases. Let's set up some aliases:
alias Scholar.Cluster.KMeans
require Explorer.DataFrame, as: DF
And let's configure EXLA as our default backend (where our tensors are stored) and compiler (which compiles Scholar code) across the notebook and all branched sections:
Nx.global_default_backend(EXLA.Backend)
Nx.Defn.global_default_options(compiler: EXLA)
key = Nx.Random.key(42)
#Nx.Tensor<
u32[2]
EXLA.Backend
[0, 42]
>
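The key created above is a stateless pseudo-random key: functions in Nx.Random take a key and return their result together with a new key, so the same key always reproduces the same values. The sketch below illustrates this with Nx.Random.uniform/2. It assumes the setup cells above have been run, and the demo_* names are introduced here for illustration only.

```elixir
# Illustrative only: the demo_* names are not used elsewhere in the notebook.
demo_key = Nx.Random.key(42)

# Each call returns the sampled tensor together with a new key.
{demo_a, demo_next_key} = Nx.Random.uniform(demo_key, shape: {3})

# Reusing the same key reproduces the same values.
{demo_b, _} = Nx.Random.uniform(demo_key, shape: {3})

# Passing the returned key gives an independent draw.
{demo_c, _} = Nx.Random.uniform(demo_next_key, shape: {3})

{Nx.to_list(demo_a) == Nx.to_list(demo_b), Nx.to_list(demo_a) == Nx.to_list(demo_c)}
```

Threading the returned key through successive calls is what keeps results reproducible from a single seed.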
Iris Dataset
In the first example, we focus on the Iris dataset, one of the best-known datasets. It consists of 150 records describing three iris species: Iris Setosa, Iris Virginica, and Iris Versicolor. Our task will be to predict the species of given flowers.
First, we load the data, split it into training data (x) and target (y), and cast both into Nx tensors.
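# Load the Iris dataset as an Explorer data frame.
df = Explorer.Datasets.iris()

# Features: every column except the label, stacked into a {150, 4} tensor.
x = df |> DF.discard(["species"]) |> Nx.stack(axis: 1)

# Target: one-hot encode the species, then take the index of the 1 in each row.
y =
  df[["species"]]
  |> DF.dummies(["species"])
  |> Nx.stack(axis: 1)
  |> Nx.argmax(axis: 1)

{x, y}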
{#Nx.Tensor<
f64[150][4]
EXLA.Backend
[
[5.1, 3.5, 1.4, 0.2],
[4.9, 3.0, 1.4, 0.2],
[4.7, 3.2, 1.3, 0.2],
[4.6, 3.1, 1.5, 0.2],
[5.0, 3.6, 1.4, 0.2],
[5.4, 3.9, 1.7, 0.4],
[4.6, 3.4, 1.4, 0.3],
[5.0, 3.4, 1.5, 0.2],
[4.4, 2.9, 1.4, 0.2],
[4.9, 3.1, 1.5, 0.1],
[5.4, 3.7, 1.5, 0.2],
[4.8, 3.4, 1.6, 0.2],
[4.8, ...],
...
]
>,
#Nx.Tensor<
s32[150]
EXLA.Backend
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...]
>}
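The target y is built by one-hot encoding the species column with DF.dummies/2 and then taking, for each row, the index of the largest entry with Nx.argmax/2. This yields one integer label per flower. A minimal sketch of that last step on a small hand-written tensor follows; the demo_one_hot values are made up for illustration and are not taken from the dataset.

```elixir
# Hand-written one-hot rows standing in for the dummy-encoded species column.
demo_one_hot =
  Nx.tensor([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0]
  ])

# argmax along axis 1 returns the column index of the 1 in each row.
demo_labels = Nx.argmax(demo_one_hot, axis: 1)

Nx.to_list(demo_labels)
# => [0, 1, 2, 1]
```

Which integer ends up representing which species depends on the order of the dummy columns, so the labels are only meaningful as group identifiers.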
Exploratory Data Analysis
An important part of the data science workflow is exploratory data analysis (EDA). EDA helps us understand the data better and suggests efficient strategies for solving the problem. No single course of action defines good EDA, but it should include tabular summaries and plots showing relations between features.
We start our EDA by finding the mean values of each feature by species.
grouped_data = DF.group_by(df, "species")

DF.summarise(
  grouped_data,
  petal_length: mean(petal_length),
  petal_width: mean(petal_width),
  sepal_width: mean(sepal_width),
  sepal_length: mean(sepal_length)
)
#Explorer.DataFrame<
Polars[3 x 5]
species string ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]
petal_length float [1.464, 4.26, 5.552]
petal_width float [0.2439999999999999, 1.3259999999999998, 2.026]
sepal_width float [3.4180000000000006, 2.7700000000000005, 2.9739999999999998]
sepal_length float [5.005999999999999, 5.936, 6.587999999999998]
>
We see that petal_length and petal_width are the most distinguishing features. Let's explore them a little more.
Tucan.histogram(df, "petal_length", color_by: "species")
|> Tucan.facet_by(:column, "species")
|> Tucan.Scale.set_y_domain(0, 55)
|> Tucan.set_size(200, 200)
|> Tucan.set_title("Histograms of petal_length column by species", offset: 25, anchor: :middle)
[Plot: "Histograms of petal_length column by species". One panel per species (Iris-setosa, Iris-versicolor, Iris-virginica); x-axis: petal_length (binned), y-axis: count, domain 0 to 55.]
Tucan.scatter(df, "petal_length", "petal_width", filled: true, color_by: "species")
|> Tucan.set_size(300, 300)
|> Tucan.set_title(
  "Scatterplot of data samples projected on plane petal_width x petal_length",
  offset: 25
)
[Plot: "Scatterplot of data samples projected on plane petal_width x petal_length". x-axis: petal_length, y-axis: petal_width, points colored by species.]
Tucan.scatter(df, "petal_length", "petal_width")
|> Tucan.facet_by(:column, "species")
|> Tucan.set_title(
  "Scatterplot of data samples projected on plane petal_width x petal_length by species",
  offset: 25
)
[Plot: "Scatterplot of data samples projected on plane petal_width x petal_length by species", with petal_length on the x-axis, petal_width on the y-axis, and one column (facet) per species]
Now we have a better understanding of the data. The iris species differ in petal width and petal length: Iris Setosa has the smallest petals, Versicolor is medium-sized, and Virginica has the largest. To check this analysis, we can draw the so-called Elbow plot, which shows inertia (the sum of squared distances from each sample to its closest centroid) against the number of clusters. A characteristic elbow, the point after which adding more clusters reduces inertia only slightly, is a strong suggestion of the right number of clusters. Let’s train KMeans models with the number of clusters ranging from 1 to 11.
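clusterings = 1..11

# Fit one model per candidate number of clusters, reusing the same random key
models =
  for num_clusters <- clusterings do
    KMeans.fit(x, num_clusters: num_clusters, key: key)
  end

# Convert each scalar inertia tensor into a plain Elixir number
inertias = for model <- models, do: Nx.to_number(model.inertia)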
[680.8244, 152.36870647733906, 78.94084142614602, 57.44028021295475, 46.56163015873016,
38.95701115711985, 35.15943976939724, 30.324232174688056, 27.927083333333336, 26.371291306519566,
24.004956137000256]
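To make the inertia values above more concrete, here is a minimal sketch in plain Elixir (no Nx) of how such a value could be computed by hand: for every sample, take the squared Euclidean distance to its nearest centroid and sum the results. The module name `InertiaSketch` and the toy points are hypothetical; this illustrates the definition and is not Scholar's implementation.

```elixir
defmodule InertiaSketch do
  # Squared Euclidean distance between two points given as lists of numbers.
  def squared_distance(a, b) do
    Enum.zip(a, b)
    |> Enum.map(fn {p, q} -> (p - q) * (p - q) end)
    |> Enum.sum()
  end

  # Inertia: sum over samples of the squared distance to the closest centroid.
  def inertia(samples, centroids) do
    samples
    |> Enum.map(fn sample ->
      centroids
      |> Enum.map(&squared_distance(sample, &1))
      |> Enum.min()
    end)
    |> Enum.sum()
  end
end

# Hypothetical toy data: two tight groups on a line and one centroid per group.
samples = [[0.0, 0.0], [2.0, 0.0], [10.0, 0.0], [12.0, 0.0]]
centroids = [[1.0, 0.0], [11.0, 0.0]]

InertiaSketch.inertia(samples, centroids)
# every sample is at distance 1.0 from its nearest centroid, so the result is 4.0
```

With a single centroid at `[6.0, 0.0]` the same samples give 104.0, which mirrors the large gap between the first entries of the list above.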
Tucan.lineplot([num_clusters: clusterings, inertia: inertias], "num_clusters", "inertia",
  x: [type: :nominal, axis: [label_angle: 0]],
  title: "Elbow Plot"
)
|> Tucan.Axes.set_xy_titles("Number of Clusters", "Inertia")
|> Tucan.set_size(600, 300)
[Plot: "Elbow Plot", a line chart of Inertia (y-axis) against Number of Clusters (x-axis, 1 to 11)]
As you can see, the elbow appears at three clusters: inertia drops sharply up to that point and only gradually afterwards. Three therefore seems to be the best value for num_clusters.
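One way to read the elbow off the numbers rather than the picture is to look at how much inertia drops with each added cluster. The sketch below is plain Elixir and uses the inertia values printed earlier, rounded to two decimals; the variable names are illustrative.

```elixir
# Inertia values for 1..11 clusters, copied from the earlier output (rounded).
inertias = [680.82, 152.37, 78.94, 57.44, 46.56, 38.96, 35.16, 30.32, 27.93, 26.37, 24.00]

# Pair each value with its successor and compute the drop from k to k + 1 clusters.
drops =
  inertias
  |> Enum.chunk_every(2, 1, :discard)
  |> Enum.with_index(1)
  |> Enum.map(fn {[current, next], k} -> {k, k + 1, Float.round(current - next, 2)} end)

Enum.take(drops, 4)
# the first drops are roughly 528.45 (1 -> 2), 73.43 (2 -> 3), 21.5 (3 -> 4), 10.88 (4 -> 5)
```

The drop shrinks from about 73 when going from two to three clusters to about 21 for the fourth cluster, which is the flattening that shows up as an elbow in the plot.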
KMeans numbers its clusters arbitrarily, so before comparing our clustering with the target labels we need to put the clusters in a matching order.
defmodule Iris.Clusters do
  import Nx.Defn

  defn sort_clusters(model) do
    # Sort clusters by their first coordinate (sepal length)
    order = Nx.argsort(model.clusters[[.., 0]])
    # Inverse permutation: maps each old cluster index to its new position
    labels_mapping = Nx.argsort(order)

    %{
      model
      | labels: Nx.take(labels_mapping, model.labels),
        clusters: Nx.take(model.clusters, order)
    }
  end
end
{:module, Iris.Clusters, <<70, 79, 82, 49, 0, 0, 10, ...>>, true}
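The relabeling step may be easier to follow on a small example. The sketch below re-creates it in plain Elixir with lists instead of tensors: `argsort` is a hypothetical helper that mimics `Nx.argsort`, and the centroid coordinates and labels are made up. Sorting gives, for each new position, the old cluster index; applying `argsort` a second time inverts that permutation, so it maps each old label to its new one.

```elixir
defmodule RelabelSketch do
  # Indices that would sort the list in ascending order (a stand-in for Nx.argsort).
  def argsort(values) do
    values
    |> Enum.with_index()
    |> Enum.sort_by(fn {value, _index} -> value end)
    |> Enum.map(fn {_value, index} -> index end)
  end
end

# Hypothetical first coordinates of three centroids, in the order KMeans returned them.
first_coordinates = [6.8, 5.0, 5.9]
# Hypothetical labels assigned to five samples, using the original cluster indices.
labels = [1, 1, 2, 0, 2]

# order[new] = old: cluster 1 comes first, then cluster 2, then cluster 0.
order = RelabelSketch.argsort(first_coordinates)
# mapping[old] = new: the inverse permutation of `order`.
mapping = RelabelSketch.argsort(order)

# Look up the new label of every sample, as Nx.take(mapping, labels) does.
new_labels = Enum.map(labels, &Enum.at(mapping, &1))

{order, mapping, new_labels}
# => {[1, 2, 0], [2, 0, 1], [0, 0, 1, 2, 1]}
```

Sorting by the first coordinate works here only under the assumption that the species happen to be ordered the same way along that feature; it is a convenience for this dataset, not a general matching procedure.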
# `models` was built for 1..11 clusters, so index 2 holds the 3-cluster model
best_model = Enum.at(models, 2)
best_model = Iris.Clusters.sort_clusters(best_model)
accuracy = Scholar.Metrics.Classification.accuracy(best_model.labels, y)
#Nx.Tensor<
f32
EXLA.Backend
0.8933333158493042
>
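The accuracy above is the fraction of samples whose reordered cluster label equals the target label; a value of 0.8933 on 150 samples corresponds to 134 matches. A minimal plain-Elixir sketch of the same computation on made-up labels follows; `AccuracySketch` is a hypothetical module and not part of Scholar.

```elixir
defmodule AccuracySketch do
  # Fraction of positions where the predicted label equals the true label.
  def accuracy(predicted, actual) when length(predicted) == length(actual) do
    matches =
      Enum.zip(predicted, actual)
      |> Enum.count(fn {p, a} -> p == a end)

    matches / length(actual)
  end
end

# Hypothetical labels: 4 of the 5 predictions agree with the targets.
predicted = [0, 0, 1, 2, 1]
actual = [0, 0, 1, 2, 2]

AccuracySketch.accuracy(predicted, actual)
# => 0.8
```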
Accuracy is nearly 90%, which is pretty decent! Let’s overlay the computed centroids on one of the previous plots.
# Columns 2 and 3 of each centroid are petal length and petal width
coords = [
  cluster_petal_length: best_model.clusters[[.., 2]] |> Nx.to_flat_list(),
  cluster_petal_width: best_model.clusters[[.., 3]] |> Nx.to_flat_list()
]

Tucan.layers([
  Tucan.scatter(df, "petal_length", "petal_width", color_by: "species", filled: true),
  Tucan.scatter(coords, "cluster_petal_length", "cluster_petal_width",
    filled: true,
    point_size: 100,
    point_color: "green"
  )
])
|> Tucan.set_size(300, 300)
|> Tucan.set_title(
  "Scatterplot of data samples projected on plane petal_width x petal_length with calculated centroids",
  offset: 25
)
[Plot: "Scatterplot of data samples projected on plane petal_width x petal_length with calculated centroids", with petal_length on the x-axis, petal_width on the y-axis, points colored by species, and the cluster centroids overlaid as larger green points]
rginica"},{"petal_length":5.6,"petal_width":2.4,"sepal_length":6.3,"sepal_width":3.4,"species":"Iris-virginica"},{"petal_length":5.5,"petal_width":1.8,"sepal_length":6.4,"sepal_width":3.1,"species":"Iris-virginica"},{"petal_length":4.8,"petal_width":1.8,"sepal_length":6.0,"sepal_width":3.0,"species":"Iris-virginica"},{"petal_length":5.4,"petal_width":2.1,"sepal_length":6.9,"sepal_width":3.1,"species":"Iris-virginica"},{"petal_length":5.6,"petal_width":2.4,"sepal_length":6.7,"sepal_width":3.1,"species":"Iris-virginica"},{"petal_length":5.1,"petal_width":2.3,"sepal_length":6.9,"sepal_width":3.1,"species":"Iris-virginica"},{"petal_length":5.1,"petal_width":1.9,"sepal_length":5.8,"sepal_width":2.7,"species":"Iris-virginica"},{"petal_length":5.9,"petal_width":2.3,"sepal_length":6.8,"sepal_width":3.2,"species":"Iris-virginica"},{"petal_length":5.7,"petal_width":2.5,"sepal_length":6.7,"sepal_width":3.3,"species":"Iris-virginica"},{"petal_length":5.2,"petal_width":2.3,"sepal_length":6.7,"sepal_width":3.0,"species":"Iris-virginica"},{"petal_length":5.0,"petal_width":1.9,"sepal_length":6.3,"sepal_width":2.5,"species":"Iris-virginica"},{"petal_length":5.2,"petal_width":2.0,"sepal_length":6.5,"sepal_width":3.0,"species":"Iris-virginica"},{"petal_length":5.4,"petal_width":2.3,"sepal_length":6.2,"sepal_width":3.4,"species":"Iris-virginica"},{"petal_length":5.1,"petal_width":1.8,"sepal_length":5.9,"sepal_width":3.0,"species":"Iris-virginica"}]},"encoding":{"color":{"field":"species","type":"nominal"},"x":{"field":"petal_length","scale":{"zero":false},"type":"quantitative"},"y":{"field":"petal_width","scale":{"zero":false},"type":"quantitative"}},"mark":{"fillOpacity":1,"filled":true,"type":"point"}},{"data":{"values":[{"cluster_petal_length":1.464,"cluster_petal_width":0.24400000000000005},{"cluster_petal_length":4.393548387096775,"cluster_petal_width":1.4338709677419355},{"cluster_petal_length":5.742105263157896,"cluster_petal_width":2.0710526315789473}]},"encoding":{"x":{"field"
:"cluster_petal_length","scale":{"zero":false},"type":"quantitative"},"y":{"field":"cluster_petal_width","scale":{"zero":false},"type":"quantitative"}},"mark":{"color":"green","fillOpacity":1,"filled":true,"size":100,"type":"point"}}],"title":{"offset":25,"text":"Scatterplot of data samples projected on plane petal_width x petal_length with calculated centroids"},"width":300}
Just as we expected 😎
Clustering of pixel colors
Another interesting use case of KMeans clustering is pixel clustering. This technique replaces every pixel with the centroid of the cluster it belongs to, so pixels with similar colors (similar in terms of Euclidean distance between their RGB values) end up sharing one color.
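As a rough illustration of the replacement step described above, the sketch below assigns each pixel to its closest centroid in plain Elixir, without Nx. The `PixelSketch` module, the three centroids and the three pixels are made-up values for illustration only; in the notebook the centroids come from `KMeans.fit/2`.

```elixir
defmodule PixelSketch do
  # Squared Euclidean distance between two RGB triples.
  def sq_dist({r1, g1, b1}, {r2, g2, b2}) do
    dr = r1 - r2
    dg = g1 - g2
    db = b1 - b2
    dr * dr + dg * dg + db * db
  end

  # Return the centroid closest to the given pixel.
  def repaint(pixel, centroids) do
    Enum.min_by(centroids, fn centroid -> sq_dist(pixel, centroid) end)
  end
end

# Hypothetical palette and pixels, chosen only for this example.
centroids = [{0, 0, 0}, {255, 255, 255}, {200, 30, 30}]
pixels = [{10, 12, 8}, {250, 240, 245}, {180, 40, 20}]

repainted = Enum.map(pixels, &PixelSketch.repaint(&1, centroids))
IO.inspect(repainted)
```

Comparing squared distances is enough here, because taking the square root would not change which centroid is closest.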
Let us start by loading the reference image.
url =
"https://pix4free.org/assets/library/2021-01-12/originals/san_francisco_california_golden_gate_bridge_water.jpg"
%{body: raw_image} = Req.get!(url)
image = StbImage.read_binary!(raw_image)
{height, width, _num_channels} = image.shape
image = StbImage.resize(image, div(height, 3), div(width, 3))
shape = image.shape
image_kino = image |> StbImage.to_binary(:jpg) |> Kino.Image.new(:jpeg)
Now we will try to use only ten colors to represent the same picture.
x = image |> StbImage.to_nx() |> Nx.reshape({:auto, 3})
model =
KMeans.fit(x,
num_clusters: 10,
num_runs: 10,
max_iterations: 200,
key: key
)
repainted_x = Nx.take(model.clusters, model.labels)
tensor_to_image = fn x ->
x
|> Nx.reshape(shape)
|> Nx.round()
|> Nx.as_type({:u, 8})
|> StbImage.from_nx()
|> StbImage.to_binary(:jpg)
|> Kino.Image.new(:jpeg)
end
repainted_x = tensor_to_image.(repainted_x)
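The repainting step, `Nx.take(model.clusters, model.labels)`, works as a table lookup: every label is an index into the rows of centroids. A minimal plain-Elixir sketch of the same idea, using a hypothetical three-color palette and made-up labels, might look like this:

```elixir
# Hypothetical centroids (one RGB row per cluster) and per-pixel labels.
clusters = [[0, 0, 0], [255, 255, 255], [200, 30, 30]]
labels = [2, 0, 0, 1]

# Look up the centroid row for every label, like a take along the first axis.
repainted = Enum.map(labels, fn label -> Enum.at(clusters, label) end)
IO.inspect(repainted)
```

The result has one row per pixel, which is why the notebook can reshape it back to the original image shape afterwards.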
Notice that even though we use only ten colors, the image is still clearly recognizable as the same picture. Let’s experiment further. We will now try 5, 10, 15, 20 and 40 colors and compare the processed images with the original one.
clusterings = [5, 10, 15, 20, 40]
models =
for num_clusters <- clusterings do
KMeans.fit(x, num_clusters: num_clusters, key: key)
end
[
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
9
>,
clusters: #Nx.Tensor<
f32[5][3]
EXLA.Backend
[
[61.85893249511719, 60.239295959472656, 58.86780548095703],
[4.204394340515137, 87.57450866699219, 99.723876953125],
[136.80120849609375, 136.31080627441406, 133.2758026123047],
[8.59874153137207, 138.21804809570312, 150.6464385986328],
[212.19105529785156, 194.9630126953125, 186.4161376953125]
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
609477056.0
>,
labels: #Nx.Tensor<
s32[426400]
EXLA.Backend
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...]
>
},
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
17
>,
clusters: #Nx.Tensor<
f32[10][3]
EXLA.Backend
[
[43.8863410949707, 51.81157302856445, 52.23817825317383],
[217.59837341308594, 208.80609130859375, 205.50840759277344],
[151.6479949951172, 163.68496704101562, 166.06430053710938],
[6.749063968658447, 148.70291137695312, 160.81790161132812],
[87.08784484863281, 72.5816879272461, 68.8161849975586],
[111.68388366699219, 126.14994049072266, 125.45569610595703],
[1.9535311460494995, 76.27875518798828, 88.1100845336914],
[193.17453002929688, 100.84982299804688, 74.338134765625],
[5.854827404022217, 107.62191009521484, 120.50939178466797],
[225.98037719726562, 179.45010375976562, 155.50418090820312]
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
271123392.0
>,
labels: #Nx.Tensor<
s32[426400]
EXLA.Backend
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, ...]
>
},
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
25
>,
clusters: #Nx.Tensor<
f32[15][3]
EXLA.Backend
[
[37.49226379394531, 49.20851516723633, 49.95775604248047],
[68.16342163085938, 63.4951057434082, 62.18951416015625],
[222.685546875, 213.00254821777344, 209.28672790527344],
[1.7072257995605469, 74.72100830078125, 86.64070892333984],
[211.2851104736328, 104.6119613647461, 69.65288543701172],
[8.460535049438477, 161.30209350585938, 172.6630859375],
[168.0978240966797, 155.88272094726562, 149.5994873046875],
[80.14752960205078, 115.78617095947266, 120.41868591308594],
[107.40535736083984, 78.1260757446289, 71.10165405273438],
[233.2224578857422, 185.8236541748047, 160.95843505859375],
[1.716829776763916, 128.2496795654297, 141.5491180419922],
[118.86845397949219, 156.38539123535156, 165.07196044921875],
[175.82142639160156, 184.47825622558594, 189.05616760253906],
[132.0281982421875, 125.65474700927734, 118.93122863769531],
[4.0000224113464355, 101.1719741821289, 113.9442367553711]
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
177419888.0
>,
labels: #Nx.Tensor<
s32[426400]
EXLA.Backend
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ...]
>
},
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
22
>,
clusters: #Nx.Tensor<
f32[20][3]
EXLA.Backend
[
[1.4399189949035645, 68.7367935180664, 80.45700073242188],
[212.9929962158203, 166.43873596191406, 144.96112060546875],
[7.347901821136475, 162.13064575195312, 173.42539978027344],
[2.1622705459594727, 107.44270324707031, 120.4988784790039],
[134.57351684570312, 77.13945770263672, 64.61695098876953],
[220.81675720214844, 107.48382568359375, 68.0926742553711],
[221.37522888183594, 214.39996337890625, 212.31607055664062],
[180.4493408203125, 186.41665649414062, 190.1817169189453],
[61.370914459228516, 112.97090148925781, 120.70513916015625],
[2.7440736293792725, 87.0137710571289, 99.2638931274414],
[108.83963012695312, 138.17112731933594, 144.1614227294922],
[147.15538024902344, 131.68154907226562, 123.84530639648438],
[108.10203552246094, 109.16707611083984, 103.55908203125],
[155.11891174316406, 159.10829162597656, 159.69961547851562],
[239.6630096435547, 196.75611877441406, ...],
...
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
131876120.0
>,
labels: #Nx.Tensor<
s32[426400]
EXLA.Backend
[15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...]
>
},
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
26
>,
clusters: #Nx.Tensor<
f32[40][3]
EXLA.Backend
[
[2.26080584526062, 85.23534393310547, 97.10236358642578],
[217.07537841796875, 176.8404083251953, 158.10992431640625],
[72.31938934326172, 168.48968505859375, 179.66793823242188],
[126.88947296142578, 86.30879211425781, 77.26168823242188],
[60.02167892456055, 54.381187438964844, 52.95188903808594],
[181.08859252929688, 173.72731018066406, 169.16676330566406],
[246.27549743652344, 195.74496459960938, 167.2862548828125],
[197.31752014160156, 200.42007446289062, 203.8845672607422],
[167.11375427246094, 153.47061157226562, 146.3658905029297],
[1.486596703529358, 126.65245056152344, 140.52223205566406],
[97.79022979736328, 103.88321685791016, 99.8821029663086],
[226.1820831298828, 220.26646423339844, 219.80264282226562],
[168.97100830078125, 110.16779327392578, 101.84797668457031],
[230.60598754882812, 206.01318359375, 192.64845275878906],
[5.403233528137207, ...],
...
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
64511020.0
>,
labels: #Nx.Tensor<
s32[426400]
EXLA.Backend
[23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, ...]
>
}
]
image_boxes =
for {model, num_clusters} <- Enum.zip(models, clusterings) do
repainted_x = Nx.take(model.clusters, model.labels)
image_kino = tensor_to_image.(repainted_x)
Kino.Layout.grid(
[Kino.Markdown.new("### Number of colors: #{num_clusters}"), image_kino],
boxed: true
)
end
image_box =
Kino.Layout.grid(
[Kino.Markdown.new("### Original image"), image_kino],
boxed: true
)
Kino.Layout.grid(image_boxes ++ [image_box], columns: 2)
Notice that even with only five colors we can still recognize the Golden Gate Bridge in the image. On the other hand, with just 40 colors we keep almost all details except in the sky and on the water surface. Sky and water do not map well because their colors change in a smooth gradient, which a small palette can only approximate with a few flat tones. Pixel clustering is a great way to compress images drastically with only a small change in their appearance.
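To get a feeling for the compression claim, here is a back-of-the-envelope estimate in plain Elixir. It assumes the image is stored uncompressed, with one byte per RGB channel in the original and a packed palette index per pixel in the clustered version. Real formats such as JPEG or PNG add their own compression, so treat the numbers as an illustration only. The `PaletteSize` module is hypothetical, and 426400 is the pixel count visible in the `labels` shape above.

```elixir
defmodule PaletteSize do
  # Smallest number of bits that can index `k` palette entries.
  def bits_per_index(k) when k >= 1, do: count_bits(k - 1, 0)

  defp count_bits(0, acc), do: max(acc, 1)
  defp count_bits(n, acc), do: count_bits(div(n, 2), acc + 1)

  # Uncompressed size in bytes: 3 bytes per pixel.
  def raw_bytes(num_pixels), do: num_pixels * 3

  # Palette (3 bytes per color) plus packed indices, rounded up to whole bytes.
  def palette_bytes(num_pixels, k) do
    k * 3 + div(num_pixels * bits_per_index(k) + 7, 8)
  end
end

num_pixels = 426_400

for k <- [5, 10, 15, 20, 40] do
  ratio = PaletteSize.raw_bytes(num_pixels) / PaletteSize.palette_bytes(num_pixels, k)
  IO.puts("#{k} colors: about #{Float.round(ratio, 1)}x smaller")
end
```

The index width grows only with the logarithm of the palette size, so going from 5 to 40 colors costs three extra bits per pixel under these assumptions.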
Clustering images from Fashion-MNIST
The last example is a clustering problem on the Fashion-MNIST dataset. The dataset consists of 60000 grayscale images, each 28 by 28 pixels, showing ten different types of clothing. Let’s dive into this clustering problem.
Before we start, we define the StratifiedSplit module. It trims the input data so that every class is represented by the same number of samples.
defmodule StratifiedSplit do
import Nx.Defn
defn trim_samples(x, labels, opts \\ []) do
opts = keyword!(opts, [:num_classes, :samples_per_class])
num_classes = opts[:num_classes]
samples_per_class = opts[:samples_per_class]
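# Boolean mask of shape {num_samples, num_classes}: entry {i, c} is 1 when sample i has label c
membership_mask = Nx.iota({1, num_classes}) == Nx.reshape(labels, {:auto, 1})
# Sort each class column so that the members of the class come first, then keep
# the first samples_per_class row indices per class (this assumes every class
# has at least samples_per_class samples) and flatten them into one index list
indices =
membership_mask
|> Nx.argsort(axis: 0, direction: :desc)
|> Nx.slice_along_axis(0, samples_per_class, axis: 0)
|> Nx.flatten()
# Gather the selected samples together with their labels
{Nx.take(x, indices), Nx.take(labels, indices)}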
end
end
{:module, StratifiedSplit, <<70, 79, 82, 49, 0, 0, 13, ...>>, true}
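For readers less familiar with tensor operations, the same trimming idea can be sketched with plain Elixir lists. This is not the `defn` implementation above, only an assumed equivalent in spirit: it groups samples by label and keeps the first few of each class. The order of the result differs from the tensor version, and `StratifiedSketch` is a hypothetical name.

```elixir
defmodule StratifiedSketch do
  # Keep at most `samples_per_class` samples for every label.
  def trim(samples, labels, samples_per_class) do
    samples
    |> Enum.zip(labels)
    |> Enum.group_by(fn {_sample, label} -> label end)
    |> Enum.flat_map(fn {_label, pairs} -> Enum.take(pairs, samples_per_class) end)
    |> Enum.unzip()
  end
end

# Toy data: six samples with two alternating labels.
samples = [:a, :b, :c, :d, :e, :f]
labels = [0, 1, 0, 1, 0, 1]

{kept_samples, kept_labels} = StratifiedSketch.trim(samples, labels, 2)
IO.inspect({kept_samples, Enum.frequencies(kept_labels)})
```

Every class ends up with the same number of samples, which keeps a later comparison between clusters and true classes from being dominated by one frequent class.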
First, we load the data and cast it into Nx tensors.
{image_data, labels_data} = Scidata.FashionMNIST.download()
{images_binary, images_type, images_shape} = image_data
{num_samples, _num_channels = 1, image_height, image_width} = images_shape
images =
images_binary
|> Nx.from_binary(images_type)
|> Nx.reshape({num_samples, :auto})
|> Nx.divide(255)
{labels_binary, labels_type, _shape} = labels_data
target = Nx.from_binary(labels_binary, labels_type)
num_classes = 10
samples_per_class = 20
{images, target} =
StratifiedSplit.trim_samples(images, target,
num_classes: num_classes,
samples_per_class: samples_per_class
)
num_images = num_classes * samples_per_class
200
Let’s also define a function that will visualize an image in the tensor format for us.
tensor_to_kino = fn x ->
x
|> Nx.reshape({image_height, image_width, 1})
# Replicate the value into 3 channels for PNG
|> Nx.broadcast({image_height, image_width, 3})
|> Nx.multiply(255)
|> Nx.as_type({:u, 8})
|> StbImage.from_nx()
|> StbImage.resize(112, 112)
|> StbImage.to_binary(:png)
|> Kino.Image.new(:png)
end
#Function<42.3316493/1 in :erl_eval.expr/6>
Here is one of the images.
tensor_to_kino.(images[0])
We will try some different numbers of clusters and then measure the quality of clustering.
nums_clusters = 2..20
models =
for num_clusters <- nums_clusters do
KMeans.fit(images, num_clusters: num_clusters, key: key)
end
[
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
4
>,
clusters: #Nx.Tensor<
f32[2][784]
EXLA.Backend
[
[0.0, 3.501400715322234e-5, 1.4005602861288935e-4, 3.851540677715093e-4, 3.501400933600962e-4, 7.002801285125315e-4, 4.901961074210703e-4, 0.009278712794184685, 0.04635854437947273, 0.09737396240234375, 0.22310924530029297, 0.29975488781929016, 0.32121849060058594, 0.3055672347545624, 0.3146008551120758, 0.36162465810775757, 0.32324934005737305, 0.3010154068470001, 0.20304621756076813, 0.06995797902345657, 0.019502801820635796, 0.003641456598415971, 0.003641456598415971, 0.0034313725773245096, 0.0025560224894434214, 0.0010504202218726277, 3.501400715322234e-5, 0.0, 0.0, 0.0, 3.501400715322234e-5, 4.901961074210703e-4, 5.252101109363139e-4, 0.006232493091374636, 0.05105042830109596, 0.13872550427913666, 0.2501050531864166, 0.37622547149658203, 0.533753514289856, 0.6370097994804382, 0.7304272055625916, 0.7347339391708374, 0.7232843637466431, 0.7482843399047852, 0.7121148109436035, 0.6053571105003357, 0.5261555314064026, ...],
...
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
10950.6201171875
>,
labels: #Nx.Tensor<
s32[200]
EXLA.Backend
[0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, ...]
>
},
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
3
>,
clusters: #Nx.Tensor<
f32[3][784]
EXLA.Backend
[
[0.0, 5.9417710872367024e-5, 1.1883542174473405e-4, 2.376708434894681e-4, 2.376708434894681e-4, 4.753416869789362e-4, 2.376708434894681e-4, 0.013071895577013493, 0.05971479415893555, 0.12269756942987442, 0.27730244398117065, 0.3171122968196869, 0.2941770851612091, 0.2795603275299072, 0.28009507060050964, 0.31200236082077026, 0.2995246648788452, 0.3170528709888458, 0.25864526629447937, 0.07664884626865387, 0.02192513458430767, 1.7825313261710107e-4, 4.1592397610656917e-4, 3.565062361303717e-4, 1.1883542174473405e-4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.376708434894681e-4, 5.9417710872367024e-5, 0.009566251188516617, 0.05246583744883537, 0.11509210616350174, 0.23559121787548065, 0.3770647943019867, 0.5828877091407776, 0.6433154940605164, 0.7103387117385864, 0.7102198004722595, 0.6955437064170837, 0.7320261001586914, 0.675638735294342, 0.6002377271652222, ...],
...
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
9246.3125
>,
labels: #Nx.Tensor<
s32[200]
EXLA.Backend
[0, 0, 2, 0, 1, 1, 2, 1, 1, 1, 0, 0, 2, 0, 0, 1, 2, 1, 1, 2, 0, 0, 2, 0, 2, 1, 1, 1, 0, 1, 0, 0, 1, 0, 2, 1, 2, 1, 1, 1, 0, 0, 2, 0, ...]
>
},
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
5
>,
clusters: #Nx.Tensor<
f32[4][784]
EXLA.Backend
[
[0.0, 0.0, 1.8239855126012117e-4, 6.383949075825512e-4, 5.471956683322787e-4, 0.0010943913366645575, 7.295942050404847e-4, 0.004012768156826496, 0.02891017124056816, 0.06675787270069122, 0.19589604437351227, 0.31500229239463806, 0.3976288139820099, 0.38148659467697144, 0.3805745542049408, 0.4316461682319641, 0.3498404026031494, 0.2810761630535126, 0.13807569444179535, 0.05654355138540268, 0.008755129761993885, 5.471956101246178e-4, 8.20793560706079e-4, 0.0014591884100809693, 6.383949657902122e-4, 0.0, 9.119927563006058e-5, 0.0, 0.0, 0.0, 9.119927563006058e-5, 9.119927417486906e-4, 0.0012767899315804243, 0.0015503877075389028, 0.05243958532810211, 0.18467853963375092, 0.29092568159103394, 0.4202462434768677, 0.5493844747543335, 0.6998631954193115, 0.8229821920394897, 0.810123085975647, 0.7939808964729309, 0.8111263513565063, 0.7670770883560181, ...],
...
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
8447.5419921875
>,
labels: #Nx.Tensor<
s32[200]
EXLA.Backend
[3, 3, 0, 3, 1, 1, 0, 1, 2, 2, 3, 3, 0, 3, 3, 1, 0, 1, 1, 2, 3, 3, 0, 3, 0, 1, 1, 1, 3, 2, 3, 3, 1, 3, 0, 1, 0, 1, 2, 2, 3, 3, 0, ...]
>
},
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
6
>,
clusters: #Nx.Tensor<
f32[5][784]
EXLA.Backend
[
[0.0, 1.9607844296842813e-4, 3.9215688593685627e-4, 7.843137718737125e-4, 7.843137718737125e-4, 3.9215688593685627e-4, 3.9215688593685627e-4, 0.036274511367082596, 0.14215686917304993, 0.21843135356903076, 0.31882351636886597, 0.3727450966835022, 0.29549020528793335, 0.2123529464006424, 0.22921571135520935, 0.29098039865493774, 0.3682352900505066, 0.40980392694473267, 0.2998039126396179, 0.1456862837076187, 0.05725490301847458, 1.9607844296842813e-4, 0.001372549100778997, 5.882353289052844e-4, 5.882353289052844e-4, 0.0, 1.9607844296842813e-4, 0.0, 0.0, 0.0, 0.0, 7.843137718737125e-4, 1.9607844296842813e-4, 0.0313725508749485, 0.16862747073173523, 0.32960787415504456, 0.522549033164978, 0.6184313893318176, 0.6737255454063416, 0.7147058844566345, 0.7619606852531433, 0.7511764764785767, 0.6862744688987732, 0.7696077823638916, ...],
...
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
7935.5498046875
>,
labels: #Nx.Tensor<
s32[200]
EXLA.Backend
[0, 3, 2, 3, 4, 4, 2, 4, 1, 1, 3, 3, 2, 3, 3, 4, 2, 4, 4, 2, 3, 3, 2, 3, 2, 4, 4, 4, 3, 1, 0, 3, 4, 3, 2, 4, 2, 4, 1, 1, 0, 3, ...]
>
},
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
5
>,
clusters: #Nx.Tensor<
f32[6][784]
EXLA.Backend
[
[0.0, 1.9607844296842813e-4, 3.9215688593685627e-4, 7.843137718737125e-4, 7.843137718737125e-4, 3.9215688593685627e-4, 3.9215688593685627e-4, 0.036274511367082596, 0.14215686917304993, 0.21843135356903076, 0.31882351636886597, 0.3727450966835022, 0.29549020528793335, 0.2123529464006424, 0.22921571135520935, 0.29098039865493774, 0.3682352900505066, 0.40980392694473267, 0.2998039126396179, 0.1456862837076187, 0.05725490301847458, 1.9607844296842813e-4, 0.001372549100778997, 5.882353289052844e-4, 5.882353289052844e-4, 0.0, 1.9607844296842813e-4, 0.0, 0.0, 0.0, 0.0, 7.843137718737125e-4, 1.9607844296842813e-4, 0.0313725508749485, 0.16862747073173523, 0.32960787415504456, 0.522549033164978, 0.6184313893318176, 0.6737255454063416, 0.7147058844566345, 0.7619606852531433, 0.7511764764785767, 0.6862744688987732, ...],
...
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
7484.12109375
>,
labels: #Nx.Tensor<
s32[200]
EXLA.Backend
[0, 3, 2, 3, 4, 4, 2, 5, 1, 1, 3, 3, 2, 3, 3, 4, 2, 4, 4, 1, 3, 3, 2, 3, 2, 5, 4, 5, 3, 1, 0, 3, 4, 3, 2, 4, 2, 5, 1, 1, 0, ...]
>
},
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
5
>,
clusters: #Nx.Tensor<
f32[7][784]
EXLA.Backend
[
[0.0, 0.0, 1.9129604334011674e-4, 6.695361225865781e-4, 5.738881300203502e-4, 0.0011477762600407004, 5.738881300203502e-4, 0.004208512604236603, 0.03032042272388935, 0.0700143575668335, 0.20545193552970886, 0.3070301413536072, 0.3960784375667572, 0.3846963346004486, 0.38326162099838257, 0.43538981676101685, 0.3438546359539032, 0.2696317434310913, 0.12692491710186005, 0.043424203991889954, 0.009182210080325603, 5.738881300203502e-4, 7.65184173360467e-4, 0.001434720354154706, 6.69536180794239e-4, 0.0, 9.564802167005837e-5, 0.0, 0.0, 0.0, 9.564802167005837e-5, 9.564801584929228e-4, 0.001339072361588478, 0.0016260163392871618, 0.05499761179089546, 0.19368726015090942, 0.298995703458786, 0.41300809383392334, 0.5291248559951782, 0.6889526844024658, 0.8161645531654358, 0.8013391494750977, ...],
...
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
7091.55810546875
>,
labels: #Nx.Tensor<
s32[200]
EXLA.Backend
[2, 1, 0, 1, 6, 6, 0, 3, 4, 5, 1, 1, 0, 2, 1, 6, 0, 3, 6, 5, 2, 1, 0, 2, 0, 3, 6, 3, 1, 4, 2, 1, 6, 1, 0, 3, 0, 3, 4, 4, ...]
>
},
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
5
>,
clusters: #Nx.Tensor<
f32[8][784]
EXLA.Backend
[
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...],
...
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
6868.54296875
>,
labels: #Nx.Tensor<
s32[200]
EXLA.Backend
[3, 4, 1, 6, 6, 6, 1, 7, 0, 5, 6, 4, 1, 4, 2, 6, 2, 7, 6, 5, 4, 4, 1, 4, 2, 7, 6, 7, 6, 0, 3, 4, 6, 6, 2, 7, 1, 7, 0, ...]
>
},
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
6
>,
clusters: #Nx.Tensor<
f32[9][784]
EXLA.Backend
[
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...],
...
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
6582.2734375
>,
labels: #Nx.Tensor<
s32[200]
EXLA.Backend
[3, 4, 1, 6, 6, 8, 1, 7, 0, 5, 6, 4, 1, 4, 2, 8, 2, 7, 6, 5, 4, 4, 1, 4, 2, 8, 8, 7, 6, 0, 3, 4, 6, 6, 2, 8, 1, 7, ...]
>
},
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
6
>,
clusters: #Nx.Tensor<
f32[10][784]
EXLA.Backend
[
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...],
...
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
6426.0517578125
>,
labels: #Nx.Tensor<
s32[200]
EXLA.Backend
[3, 4, 9, 6, 6, 8, 9, 7, 0, 5, 6, 4, 1, 4, 2, 8, 9, 7, 2, 5, 4, 4, 9, 4, 9, 8, 6, 7, 2, 0, 3, 4, 6, 6, 2, 8, 9, ...]
>
},
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
7
>,
clusters: #Nx.Tensor<
f32[11][784]
EXLA.Backend
[
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...],
...
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
6238.46923828125
>,
labels: #Nx.Tensor<
s32[200]
EXLA.Backend
[3, 4, 9, 6, 6, 8, 9, 7, 0, 5, 6, 4, 1, 4, 2, 8, 9, 7, 2, 5, 4, 10, 9, 4, 9, 8, 6, 7, 2, 0, 3, 10, 6, 6, 2, 8, ...]
>
},
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
5
>,
clusters: #Nx.Tensor<
f32[12][784]
EXLA.Backend
[
[0.0, 0.0, 0.0, 0.0, 0.0, 2.3068052541930228e-4, 0.0, 2.3068052541930228e-4, 0.06805074959993362, 0.1568627506494522, 0.17716263234615326, 0.17923875153064728, 0.18846596777439117, 0.18685120344161987, 0.18362168967723846, 0.22260668873786926, 0.21245676279067993, 0.20484431087970734, 0.21453288197517395, 0.07381777465343475, 0.017070358619093895, 2.3068052541930228e-4, 6.920415908098221e-4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.3068052541930228e-4, 6.920415326021612e-4, 0.04013840854167938, 0.30542102456092834, ...],
...
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
6081.9736328125
>,
labels: #Nx.Tensor<
s32[200]
EXLA.Backend
[7, 3, 2, 3, 5, 5, 6, 1, 8, 4, 5, 0, 2, 0, 3, 10, 6, 1, 5, 4, 0, 11, 2, 0, 2, 1, 5, 1, 9, 10, 7, 11, 3, 9, 3, ...]
>
},
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
5
>,
clusters: #Nx.Tensor<
f32[13][784]
EXLA.Backend
[
[0.0, 4.6136105083860457e-4, 0.011995386332273483, 0.012687427923083305, 0.012456747703254223, 0.012456747703254223, 0.012456747703254223, 0.011072664521634579, 0.013148789294064045, 0.01361014973372221, 0.012687427923083305, 0.014994233846664429, 0.06528258323669434, 0.058362167328596115, 0.059284891933202744, 0.06920415163040161, 0.044290658086538315, 0.013840830884873867, 0.01407151110470295, 0.015224914066493511, 0.0117647061124444, 0.013840830884873867, 0.01568627543747425, 0.014532871544361115, 0.014763553626835346, 0.003921568859368563, 0.0, 0.0, 0.0, 0.013148789294064045, 0.024221453815698624, 0.02283737063407898, 0.025836216285824776, 0.022606689482927322, 0.021453287452459335, 0.018223760649561882, ...],
...
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
5876.912109375
>,
labels: #Nx.Tensor<
s32[200]
EXLA.Backend
[8, 10, 2, 1, 1, 5, 2, 3, 4, 12, 1, 10, 2, 10, 11, 5, 11, 0, 6, 7, 10, 10, 2, 10, 2, 0, 0, 3, 1, 12, 8, 10, 1, 1, ...]
>
},
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
5
>,
clusters: #Nx.Tensor<
f32[14][784]
EXLA.Backend
[
[0.0, 4.6136105083860457e-4, 0.011995386332273483, 0.012687427923083305, 0.012456747703254223, 0.012456747703254223, 0.012456747703254223, 0.011072664521634579, 0.013148789294064045, 0.01361014973372221, 0.012687427923083305, 0.014994233846664429, 0.06528258323669434, 0.058362167328596115, 0.059284891933202744, 0.06920415163040161, 0.044290658086538315, 0.013840830884873867, 0.01407151110470295, 0.015224914066493511, 0.0117647061124444, 0.013840830884873867, 0.01568627543747425, 0.014532871544361115, 0.014763553626835346, 0.003921568859368563, 0.0, 0.0, 0.0, 0.013148789294064045, 0.024221453815698624, 0.02283737063407898, 0.025836216285824776, 0.022606689482927322, 0.021453287452459335, ...],
...
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
5797.251953125
>,
labels: #Nx.Tensor<
s32[200]
EXLA.Backend
[8, 10, 13, 1, 1, 5, 2, 3, 4, 12, 1, 10, 13, 10, 11, 5, 11, 0, 6, 7, 10, 10, 13, 10, 13, 0, 0, 3, 1, 12, 8, 10, 1, ...]
>
},
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
4
>,
clusters: #Nx.Tensor<
f32[15][784]
EXLA.Backend
[
[0.0, 0.0, 2.4509805371053517e-4, 2.4509805371053517e-4, 0.0, 2.4509805371053517e-4, 4.901961074210703e-4, 0.001470588380470872, 4.901961074210703e-4, 0.02401961013674736, 0.05441176891326904, 0.1200980469584465, 0.23112745583057404, 0.2237745076417923, 0.26225489377975464, 0.2404411882162094, 0.20392157137393951, 0.15514707565307617, 0.037254903465509415, 0.030637256801128387, 0.02549019642174244, 0.035784315317869186, 0.0416666679084301, 0.0365196093916893, 0.008333333767950535, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.4509805371053517e-4, 2.4509805371053517e-4, 2.4509805371053517e-4, ...],
...
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
5662.34521484375
>,
labels: #Nx.Tensor<
s32[200]
EXLA.Backend
[8, 11, 2, 5, 0, 0, 2, 10, 13, 6, 5, 11, 14, 12, 5, 9, 2, 10, 0, 4, 12, 11, 14, 12, 1, 7, 0, 10, 5, 6, 8, 11, ...]
>
},
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
5
>,
clusters: #Nx.Tensor<
f32[16][784]
EXLA.Backend
[
[0.0, 0.0, 2.4509805371053517e-4, 2.4509805371053517e-4, 0.0, 2.4509805371053517e-4, 4.901961074210703e-4, 0.001470588380470872, 4.901961074210703e-4, 0.02401961013674736, 0.05441176891326904, 0.1200980469584465, 0.23112745583057404, 0.2237745076417923, 0.26225489377975464, 0.2404411882162094, 0.20392157137393951, 0.15514707565307617, 0.037254903465509415, 0.030637256801128387, 0.02549019642174244, 0.035784315317869186, 0.0416666679084301, 0.0365196093916893, 0.008333333767950535, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.4509805371053517e-4, 2.4509805371053517e-4, ...],
...
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
5568.4345703125
>,
labels: #Nx.Tensor<
s32[200]
EXLA.Backend
[8, 11, 2, 5, 0, 0, 2, 10, 13, 6, 5, 11, 14, 12, 5, 9, 2, 10, 0, 4, 12, 11, 14, 15, 1, 7, 0, 10, 5, 6, 8, ...]
>
},
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
5
>,
clusters: #Nx.Tensor<
f32[17][784]
EXLA.Backend
[
[0.0, 0.0, 2.801120572257787e-4, 2.801120572257787e-4, 0.0, 2.801120572257787e-4, 5.602241144515574e-4, 0.001680672401562333, 5.602241144515574e-4, 0.027450982481241226, 0.06218487769365311, 0.13613446056842804, 0.2641456723213196, 0.24537815153598785, 0.24425771832466125, 0.2731092572212219, 0.23305322229862213, 0.17731094360351562, 0.04257703199982643, 0.020168067887425423, 5.602241144515574e-4, 5.602241144515574e-4, 2.801120572257787e-4, 0.0, 2.801120572257787e-4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.801120572257787e-4, ...],
...
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
5398.72412109375
>,
labels: #Nx.Tensor<
s32[200]
EXLA.Backend
[8, 11, 2, 5, 0, 0, 2, 10, 13, 6, 5, 11, 14, 12, 5, 9, 2, 10, 16, 4, 12, 11, 14, 15, 1, 7, 0, 10, 5, 6, ...]
>
},
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
5
>,
clusters: #Nx.Tensor<
f32[18][784]
EXLA.Backend
[
[0.0, 0.0, 2.801120572257787e-4, 2.801120572257787e-4, 0.0, 2.801120572257787e-4, 5.602241144515574e-4, 0.001680672401562333, 5.602241144515574e-4, 0.027450982481241226, 0.06218487769365311, 0.13613446056842804, 0.2641456723213196, 0.24537815153598785, 0.24425771832466125, 0.2731092572212219, 0.23305322229862213, 0.17731094360351562, 0.04257703199982643, 0.020168067887425423, 5.602241144515574e-4, 5.602241144515574e-4, 2.801120572257787e-4, 0.0, 2.801120572257787e-4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...],
...
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
5336.95263671875
>,
labels: #Nx.Tensor<
s32[200]
EXLA.Backend
[8, 11, 2, 5, 0, 0, 2, 10, 17, 6, 5, 11, 14, 12, 5, 9, 2, 10, 16, 4, 12, 11, 14, 15, 1, 7, 0, 10, 5, ...]
>
},
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
5
>,
clusters: #Nx.Tensor<
f32[19][784]
EXLA.Backend
[
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.22875867318362e-4, 0.001568627660162747, 5.22875867318362e-4, 0.0313725508749485, 0.08000000566244125, 0.10849674046039581, 0.24653595685958862, 0.22901961207389832, 0.22797387838363647, 0.2549019753932953, 0.20392157137393951, 0.15973857045173645, 0.062745101749897, 0.033986929804086685, 0.02718954347074032, 0.0376470610499382, 0.04418300837278366, 0.038954250514507294, 0.00862745102494955, 0.0, 0.0, 0.0, 0.0, 0.0, ...],
...
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
5233.1396484375
>,
labels: #Nx.Tensor<
s32[200]
EXLA.Backend
[8, 11, 2, 5, 0, 0, 2, 10, 17, 6, 0, 11, 14, 12, 18, 9, 2, 10, 18, 4, 12, 11, 14, 15, 1, 7, 0, 10, ...]
>
},
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
5
>,
clusters: #Nx.Tensor<
f32[20][784]
EXLA.Backend
[
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.22875867318362e-4, 0.001568627660162747, 5.22875867318362e-4, 0.0313725508749485, 0.08000000566244125, 0.10849674046039581, 0.24653595685958862, 0.22901961207389832, 0.22797387838363647, 0.2549019753932953, 0.20392157137393951, 0.15973857045173645, 0.062745101749897, 0.020130719989538193, 5.22875867318362e-4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...],
...
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
5173.603515625
>,
labels: #Nx.Tensor<
s32[200]
EXLA.Backend
[8, 19, 2, 5, 0, 0, 2, 10, 17, 6, 0, 11, 14, 12, 18, 9, 2, 10, 18, 4, 12, 11, 14, 15, 1, 7, 0, ...]
>
}
]
data = [
num_clusters: nums_clusters,
inertia: for(model <- models, do: Nx.to_number(model.inertia))
]
Tucan.lineplot(data, "num_clusters", "inertia",
x: [type: :ordinal, axis: [label_angle: 0]],
width: 600,
height: 300
)
|> Tucan.Axes.set_xy_titles("Number of Clusters", "Inertia")
|> Tucan.Scale.set_y_domain(4800, 11500)
|> Tucan.set_title("Elbow Plot")
{"$schema":"https://vega.github.io/schema/vega-lite/v5.json","data":{"values":[{"inertia":10950.62109375,"num_clusters":2},{"inertia":9246.3134765625,"num_clusters":3},{"inertia":8447.5419921875,"num_clusters":4},{"inertia":7935.54931640625,"num_clusters":5},{"inertia":7484.1201171875,"num_clusters":6},{"inertia":7091.55810546875,"num_clusters":7},{"inertia":6868.54248046875,"num_clusters":8},{"inertia":6582.2734375,"num_clusters":9},{"inertia":6426.0517578125,"num_clusters":10},{"inertia":6238.46923828125,"num_clusters":11},{"inertia":6081.97265625,"num_clusters":12},{"inertia":5876.912109375,"num_clusters":13},{"inertia":5797.251953125,"num_clusters":14},{"inertia":5662.3447265625,"num_clusters":15},{"inertia":5568.43408203125,"num_clusters":16},{"inertia":5398.724609375,"num_clusters":17},{"inertia":5336.9521484375,"num_clusters":18},{"inertia":5233.13916015625,"num_clusters":19},{"inertia":5173.60400390625,"num_clusters":20}]},"encoding":{"x":{"axis":{"labelAngle":0,"title":"Number of Clusters"},"field":"num_clusters","type":"ordinal"},"y":{"axis":{"title":"Inertia"},"field":"inertia","scale":{"domain":[4800,11500]},"type":"quantitative"}},"height":300,"mark":{"fillOpacity":1,"type":"line"},"title":{"text":"Elbow Plot"},"width":600}
Notice that this time the plot has no clear elbow, so we need a different method to choose the number of clusters. We will use the Silhouette Score, a metric that indicates the quality of a clustering: the higher the score, the better the clustering. Keep in mind, however, that the Silhouette Score is only a heuristic and does not always work.
silhouette_scores =
for {model, num_clusters} <- Enum.zip(models, nums_clusters) do
Scholar.Metrics.Clustering.silhouette_score(images, model.labels, num_clusters: num_clusters)
|> Nx.to_number()
end
[0.1867797076702118, 0.19426067173480988, 0.18798942863941193, 0.16196762025356293,
0.14662104845046997, 0.15014168620109558, 0.1334874927997589, 0.12096332758665085,
0.12907366454601288, 0.12029680609703064, 0.12559780478477478, 0.12784500420093536,
0.12478214502334595, 0.1155780702829361, 0.11121585220098495, 0.11069852113723755,
0.10738851875066757, 0.10977344214916229, 0.10331624001264572]
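To make the metric more concrete, here is a minimal sketch of the usual silhouette definition in plain Elixir. It is not how Scholar computes the score internally. The toy points, the labels, and the `distance` helper are all made up for illustration. For each point, `a` is the mean distance to the other points of its own cluster, `b` is the smallest mean distance to the points of any other cluster, and the point's score is `(b - a) / max(a, b)`.

```elixir
# Hypothetical toy data: two well-separated groups of 1-D points.
points = [[0.0], [1.0], [10.0], [11.0]]
labels = [0, 0, 1, 1]

# Euclidean distance between two points given as lists of floats.
distance = fn p, q ->
  Enum.zip(p, q)
  |> Enum.map(fn {x, y} -> (x - y) * (x - y) end)
  |> Enum.sum()
  |> :math.sqrt()
end

indexed = points |> Enum.zip(labels) |> Enum.with_index()

per_point_scores =
  for {{p, label}, i} <- indexed do
    # Mean distance from p to all other points, grouped by their cluster label.
    mean_distance_by_label =
      indexed
      |> Enum.reject(fn {_point_and_label, j} -> j == i end)
      |> Enum.group_by(
        fn {{_q, l}, _j} -> l end,
        fn {{q, _l}, _j} -> distance.(p, q) end
      )
      |> Map.new(fn {l, ds} -> {l, Enum.sum(ds) / length(ds)} end)

    case Map.pop(mean_distance_by_label, label) do
      # p is alone in its cluster: by convention its score is 0.
      {nil, _others} ->
        0.0

      # a: cohesion within p's own cluster, b: distance to the nearest other cluster.
      {a, others} ->
        b = others |> Map.values() |> Enum.min()
        (b - a) / max(a, b)
    end
  end

# The overall score is the mean of the per-point scores.
toy_silhouette = Enum.sum(per_point_scores) / length(per_point_scores)
```

For these two tight, distant groups the result is close to 1. The scores listed above stay below 0.2, which suggests that the image clusters overlap heavily.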
data = [num_clusters: nums_clusters, silhouette_scores: silhouette_scores]
Tucan.lineplot(data, "num_clusters", "silhouette_scores",
points: true,
point_color: "darkBlue",
x: [type: :ordinal, axis: [label_angle: 0]]
)
|> Tucan.Axes.set_xy_titles("Number of Clusters", "Silhouette score")
|> Tucan.Scale.set_y_domain(0.088, 0.205)
|> Tucan.set_size(600, 300)
|> Tucan.set_title("Silhouette score vs Number of Clusters")
{"$schema":"https://vega.github.io/schema/vega-lite/v5.json","data":{"values":[{"num_clusters":2,"silhouette_scores":0.18677975237369537},{"num_clusters":3,"silhouette_scores":0.1942606419324875},{"num_clusters":4,"silhouette_scores":0.18798941373825073},{"num_clusters":5,"silhouette_scores":0.16196760535240173},{"num_clusters":6,"silhouette_scores":0.14662104845046997},{"num_clusters":7,"silhouette_scores":0.15014170110225677},{"num_clusters":8,"silhouette_scores":0.13348747789859772},{"num_clusters":9,"silhouette_scores":0.12096334248781204},{"num_clusters":10,"silhouette_scores":0.12907366454601288},{"num_clusters":11,"silhouette_scores":0.12029680609703064},{"num_clusters":12,"silhouette_scores":0.1255977749824524},{"num_clusters":13,"silhouette_scores":0.12784500420093536},{"num_clusters":14,"silhouette_scores":0.12478211522102356},{"num_clusters":15,"silhouette_scores":0.1155780628323555},{"num_clusters":16,"silhouette_scores":0.11121582984924316},{"num_clusters":17,"silhouette_scores":0.11069852113723755},{"num_clusters":18,"silhouette_scores":0.10738851875066757},{"num_clusters":19,"silhouette_scores":0.10977346450090408},{"num_clusters":20,"silhouette_scores":0.10331626981496811}]},"encoding":{"x":{"axis":{"labelAngle":0,"title":"Number of Clusters"},"field":"num_clusters","type":"ordinal"},"y":{"axis":{"title":"Silhouette score"},"field":"silhouette_scores","scale":{"domain":[0.088,0.205]},"type":"quantitative"}},"height":300,"mark":{"fillOpacity":1,"point":{"color":"darkBlue"},"type":"line"},"title":{"text":"Silhouette score vs Number of Clusters"},"width":600}
As we can see, the model with num_clusters equal to 3 has the highest Silhouette Score. Now we will visualize this clustering.
best_num_clusters = 3
best_model = Enum.at(models, 1)
%Scholar.Cluster.KMeans{
num_iterations: #Nx.Tensor<
s64
EXLA.Backend
3
>,
clusters: #Nx.Tensor<
f32[3][784]
EXLA.Backend
[
[0.0, 5.9417710872367024e-5, 1.1883542174473405e-4, 2.376708434894681e-4, 2.376708434894681e-4, 4.753416869789362e-4, 2.376708434894681e-4, 0.013071895577013493, 0.05971479415893555, 0.12269756942987442, 0.27730244398117065, 0.3171122968196869, 0.2941770851612091, 0.2795603275299072, 0.28009507060050964, 0.31200236082077026, 0.2995246648788452, 0.3170528709888458, 0.25864526629447937, 0.07664884626865387, 0.02192513458430767, 1.7825313261710107e-4, 4.1592397610656917e-4, 3.565062361303717e-4, 1.1883542174473405e-4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.376708434894681e-4, 5.9417710872367024e-5, 0.009566251188516617, 0.05246583744883537, 0.11509210616350174, 0.23559121787548065, 0.3770647943019867, 0.5828877091407776, 0.6433154940605164, 0.7103387117385864, 0.7102198004722595, 0.6955437064170837, 0.7320261001586914, 0.675638735294342, 0.6002377271652222, 0.5515151619911194, 0.35151517391204834, ...],
...
]
>,
inertia: #Nx.Tensor<
f32
EXLA.Backend
9246.3125
>,
labels: #Nx.Tensor<
s32[200]
EXLA.Backend
[0, 0, 2, 0, 1, 1, 2, 1, 1, 1, 0, 0, 2, 0, 0, 1, 2, 1, 1, 2, 0, 0, 2, 0, 2, 1, 1, 1, 0, 1, 0, 0, 1, 0, 2, 1, 2, 1, 1, 1, 0, 0, 2, 0, 2, 1, ...]
>
}
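The index passed to `Enum.at(models, 1)` is hard-coded from reading the plot. As a sketch, the same choice can be derived from the scores themselves. The two lists below are hypothetical stand-ins (the first five candidates, with scores rounded from the output above) so that the snippet runs on its own. In the notebook you would use `nums_clusters` and `silhouette_scores` directly.

```elixir
# Stand-in data: candidate cluster counts and their rounded silhouette scores.
candidate_ks = [2, 3, 4, 5, 6]
scores = [0.1868, 0.1943, 0.1880, 0.1620, 0.1466]

# Pair each candidate with its score, remember its position, keep the best one.
{{best_k, best_score}, best_index} =
  candidate_ks
  |> Enum.zip(scores)
  |> Enum.with_index()
  |> Enum.max_by(fn {{_k, score}, _index} -> score end)

# In the notebook, the matching model would then be Enum.at(models, best_index).
{best_k, best_score, best_index}
```

Since the model list and the candidate list are in the same order, the position of the best score is also the position of the best model.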
predicted_cluster_with_indices =
best_model.labels
|> Nx.to_flat_list()
|> Enum.with_index()
|> Enum.group_by(&elem(&1, 0), &elem(&1, 1))
for cluster <- 0..(best_num_clusters - 1) do
indices = predicted_cluster_with_indices[cluster]
boxes =
for index <- indices do
original_cluster = Nx.to_number(target[index])
Kino.Layout.grid([
Kino.Markdown.new("Original cluster: #{original_cluster}"),
tensor_to_kino.(images[index])
])
end
Kino.Layout.grid(
[
Kino.Markdown.new("## Cluster #{cluster}"),
Kino.Layout.grid(boxes, columns: 5)
],
boxed: true
)
end
|> Kino.Layout.grid()
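A numeric complement to the grid above is to count which original labels end up in each predicted cluster. The sketch below uses short, made-up label lists so it is self-contained. In the notebook, `predicted` and `original` would presumably come from `Nx.to_flat_list(best_model.labels)` and `Nx.to_flat_list(target)`, assuming `target` is a one-dimensional tensor.

```elixir
# Hypothetical toy labels, not taken from the dataset.
predicted = [0, 0, 2, 0, 1, 1, 2, 1]
original = [9, 0, 0, 3, 0, 2, 7, 2]

# For every predicted cluster, count how often each original label occurs in it.
composition =
  predicted
  |> Enum.zip(original)
  |> Enum.group_by(&elem(&1, 0), &elem(&1, 1))
  |> Map.new(fn {cluster, labels} -> {cluster, Enum.frequencies(labels)} end)
```

A cluster whose counts are spread over many original labels mixes several kinds of items, which is what the grid shows visually.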
Oops, it doesn’t look right! That’s because, with three clusters, our algorithm groups images by color rather than by shape. To see this, let’s plot the average image of each cluster.
for cluster <- 0..(best_num_clusters - 1) do
indices = predicted_cluster_with_indices[cluster]
mean_image =
indices
|> Enum.map(&images[&1])
|> Nx.stack()
|> Nx.mean(axes: [0])
tensor_to_kino.(mean_image)
end
|> Kino.Layout.grid(columns: 3)
One of the images has a vertical line (something like trousers), the next is almost all white (similar to a jumper), and the last one is mostly black. This time the Silhouette Score turns out not to be the best indicator. To get a better clustering, try rerunning the code with a higher number of clusters.