Intro to ML
Preparing data for training
Mix.install([
  {:axon, "~> 0.5"},
  {:nx, "~> 0.5"},
  {:explorer, "~> 0.5"},
  {:kino, "~> 0.8"}
])
require Explorer.DataFrame, as: DF
iris = Explorer.Datasets.iris()
Normalize data
There are several reasons why you should always normalize data; to mention a few (the standardization formula we use is shown after the list):
- Improved Model Performance: bringing all features to the same scale lets optimization algorithms converge faster, leading to quicker training and often better models.
- Prevention of Biased Features: features with larger scales can dominate those with smaller scales during training, biasing the model toward them. Normalization ensures that each feature contributes proportionally to the learning process.
- Stability of Algorithms: some algorithms are sensitive to the scale of the input data. Distance-based methods such as k-nearest neighbors (KNN) and support vector machines (SVM) are influenced by feature scale; normalization keeps them consistent across datasets.
- Interpretability and Convergence: when features are on different scales, it is hard to compare their contributions to the model. Normalization makes relative feature importance easier to interpret and also speeds up the convergence of optimization algorithms, reducing training time.
- Handling Outliers: scaling brings feature values into a similar range, which can mitigate the exaggerated effect outliers have on some algorithms.
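The mutation below implements z-score standardization: each value is centered on its column's mean $\mu$ and scaled by the column's standard deviation $\sigma$, so every feature ends up with mean 0 and variance 1:

$$
z = \frac{x - \mu}{\sigma}
$$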
cols = ~w(sepal_width sepal_length petal_length petal_width)
# Z-score standardization: note we divide by the standard deviation,
# not the variance, so each feature has unit variance afterwards
normalized_iris =
  DF.mutate(
    iris,
    for col <- across(^cols) do
      {col.name, (col - mean(col)) / standard_deviation(col)}
    end
  )
# Cast the species labels from strings to a categorical series,
# which encodes each of the three classes as an integer (0, 1, 2)
normalized_iris =
  DF.mutate(
    normalized_iris,
    species: Explorer.Series.cast(species, :category)
  )
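As a quick sanity check, the column's dtype should now be `:category` instead of `:string` (purely optional, nothing below depends on it):

# Inspect the dtype of the species column after the cast
Explorer.Series.dtype(normalized_iris["species"])
#=> :category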
# Shuffle before splitting so both sets contain all three species
shuffled_normalized_iris = DF.shuffle(normalized_iris)

# 80/20 split of the 150 rows: 120 for training, 30 for testing
train_df = DF.slice(shuffled_normalized_iris, 0..119)
test_df = DF.slice(shuffled_normalized_iris, 120..149)
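If you want to verify the split, `DF.n_rows/1` reports the row counts; a minimal sanity check:

# Expect {120, 30} given the 150-row iris dataset
{DF.n_rows(train_df), DF.n_rows(test_df)}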
feature_columns = [
  "sepal_length",
  "sepal_width",
  "petal_length",
  "petal_width"
]
# Stack the four feature series into a {120, 4} tensor
x_train = Nx.stack(train_df[feature_columns], axis: 1)

# One-hot encode the labels: comparing the {120, 1} class ids against
# [[0, 1, 2]] broadcasts into a {120, 3} matrix of 0s and 1s
y_train =
  train_df["species"]
  |> Nx.stack(axis: -1)
  |> Nx.equal(Nx.iota({1, 3}, axis: -1))

# Same transformation for the 30-row test set
x_test = Nx.stack(test_df[feature_columns], axis: 1)

y_test =
  test_df["species"]
  |> Nx.stack(axis: -1)
  |> Nx.equal(Nx.iota({1, 3}, axis: -1))
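The `Nx.equal`/`Nx.iota` one-hot trick is easiest to see on a tiny input: comparing a column of class ids against the row `[[0, 1, 2]]` broadcasts element-wise equality into one-hot rows (a minimal illustration with made-up ids):

# Three made-up class ids, shaped {3, 1}
Nx.tensor([[0], [2], [1]])
|> Nx.equal(Nx.iota({1, 3}, axis: -1))
#=> [[1, 0, 0],
#    [0, 0, 1],
#    [0, 1, 0]]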
Multinomial Logistic Regression (MLR) with Axon
# A single dense layer with a softmax activation over the three classes
# is exactly multinomial logistic regression
model =
  Axon.input("iris_features", shape: {nil, 4})
  |> Axon.dense(3, activation: :softmax)

# Render the model's computation graph (uses Kino under the hood)
Axon.Display.as_graph(model, Nx.template({1, 4}, :f32))
# Repeatedly yield the full training set; each training iteration
# consumes one {features, labels} pair from this stream
data_stream =
  Stream.repeatedly(fn ->
    {x_train, y_train}
  end)
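Feeding all 120 rows on every iteration is fine at this size. For larger datasets you would typically stream mini-batches instead; a sketch using `Nx.to_batched/2` (the batch size of 15 is an arbitrary choice that divides 120 evenly):

# Alternative: zip pre-batched feature and label tensors into {x, y} pairs
batched_stream =
  Stream.zip(
    Nx.to_batched(x_train, 15),
    Nx.to_batched(y_train, 15)
  )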
# Train with categorical cross-entropy loss and plain SGD,
# reporting accuracy; 500 iterations per epoch for 10 epochs
trained_model_state =
  model
  |> Axon.Loop.trainer(:categorical_cross_entropy, :sgd)
  |> Axon.Loop.metric(:accuracy)
  |> Axon.Loop.run(data_stream, %{}, iterations: 500, epochs: 10)
# Measure accuracy on the held-out test set
data = [{x_test, y_test}]

model
|> Axon.Loop.evaluator()
|> Axon.Loop.metric(:accuracy)
|> Axon.Loop.run(data, trained_model_state)
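You can also run raw predictions with the trained state; a minimal sketch using `Axon.predict/3`, converting the softmax probabilities back to class ids with `Nx.argmax`:

# Predicted class id (0, 1, or 2) for each test row
model
|> Axon.predict(trained_model_state, x_test)
|> Nx.argmax(axis: -1)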
What is the difference between a regression and a classification problem?
- Classification: The goal is to predict which class a data point belongs to (here: one of the three iris species).
- Regression: The goal is to predict a continuous numerical value.