Hands On: Thinking About Testing

thinking_about_testing.livemd

@dgigafox

programming_machine_learn...

Problem & Solution

Situation:

You have a large dataset of 1,000,000 handwritten characters and split it into three sets: 900,000 samples for training, 50,000 for validation, and 50,000 for testing.

Problem:

The data could be ordered, for example sorted by letter. If you split ordered data as is, some letters are likely to end up only in the validation or test set and be missing from the training set, or the other way around. The model is then tested on letters it never saw during training, which it will struggle to classify.
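To make the problem concrete, here is a minimal sketch on a toy dataset. It assumes 26 letters with 100 samples each, stored in label order, and the same 90/5/5 proportions as above. The dataset and the helper name `split_in_order` are hypothetical and chosen only for illustration.

```python
import string

# Hypothetical toy dataset: 100 samples per letter, stored in label order
# (all the a's first, then all the b's, and so on).
ordered_labels = [letter for letter in string.ascii_lowercase for _ in range(100)]


def split_in_order(items, n_train, n_val):
    """Split a list into three contiguous chunks without shuffling."""
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test


# 2,600 samples split 90% / 5% / 5%, mirroring the 900,000 / 50,000 / 50,000 split.
train, val, test = split_in_order(ordered_labels, n_train=2340, n_val=130)

# Letters that occur in the validation or test set but never in the training set.
unseen_in_training = (set(val) | set(test)) - set(train)

print("letters never seen in training:", sorted(unseen_in_training))
print("letters in the test set:", sorted(set(test)))
```

In this toy case the test set contains only the last letters of the alphabet, and the training set contains none of the final ones, so any accuracy measured on it says little about how the model handles the other letters.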

Solution:

Shuffle the data before splitting it into training, validation, and test sets. MNIST, for example, comes pre-shuffled. Shuffling makes the distribution of letters approximately the same across the three sets.
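The shuffle-then-split step could be sketched as follows, using only the Python standard library. The function name `shuffle_and_split`, the toy dataset, and the fixed seed are assumptions made for illustration. The sketch shuffles a list of indices rather than the data itself, so that each sample stays paired with its label.

```python
import random


def shuffle_and_split(samples, labels, n_train, n_val, seed=0):
    """Shuffle samples and labels together, then split them into three sets."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")

    # Shuffle indices, not the data, so sample i keeps label i.
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)

    def take(chunk):
        return [samples[i] for i in chunk], [labels[i] for i in chunk]

    train = take(indices[:n_train])
    val = take(indices[n_train:n_train + n_val])
    test = take(indices[n_train + n_val:])
    return train, val, test


# Hypothetical toy dataset: 2,600 samples sorted by letter, 100 per letter.
samples = list(range(2600))
labels = [chr(ord("a") + i // 100) for i in samples]

(train_x, train_y), (val_x, val_y), (test_x, test_y) = shuffle_and_split(
    samples, labels, n_train=2340, n_val=130
)

print("distinct letters in training set:", len(set(train_y)))
print("distinct letters in test set:", len(set(test_y)))
```

A fixed seed keeps the split reproducible between runs. With small validation and test sets, a purely random shuffle can still leave a rare class underrepresented, which is why the distributions are only approximately equal.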
