Process pupil attainment and progress
Mix.install(
[
{:kino, "~> 0.14"},
{:vega_lite, "~> 0.1.11"},
{:kino_vega_lite, "~> 0.1.13"},
# {:school_kit, github: "elliotblackburn/school_kit"}
{:school_kit, path: Path.join(__DIR__, ".."), env: :dev},
],
# config_path: :school_kit,
lockfile: :school_kit
)
alias VegaLite, as: Vl
Select input data
To get started, we need to select both the source results data and the results year of the cohort.
It’s important to select the correct cohort year, as this will determine which DfE national estimates are selected. If you’re analysing mock data, you may wish to run against different cohort years to get a feel for the ball-park range of your results.
Your results data CSV must contain the appropriately named column headers, otherwise subjects may be missed. You can find a template CSV in the files attached to this notebook.
import Kino.Shorts
cohort_year = read_select("Cohort year", [{nil, "Select cohort year"}, {"2023", "2023"}])
data_file_input = read_file("Results data", accept: ~w(.csv))
if cohort_year == nil do
Kino.interrupt!(:error, "You must select a cohort year")
else
:ok
end
data_file = if data_file_input == nil do
Kino.interrupt!(:error, "You must select an input file")
else
Kino.Input.file_path(data_file_input.file_ref)
end
:ok
Load and parse data
The data comes to us in a CSV format with a row per student, and a column per facet. These facets include:
- Attributes about the student, such as SEND status
- The school the student is from (for processing multi-academy trusts as a whole)
- Each grade from each subject they sat
SchoolKit
comes with a parser for a particular CSV format, although it would be possible to write your own. This parser reads in the data, and parses each student record to create an elixir map we can work with.
The parser will also normalise all qualification grades over to the GCSE reformed system, this allows us to compare various subjects, and calcuate Attainment and Progress 8.
The resulting student record is then represented as an elixir map, with atom keys. Each record has only the subjects the student received a grade for, each grade is a floating point number normalised to the GCSE Reformed spec.
parsed_results =
data_file
|> SchoolKit.Parser.from_csv()
Attainment 8 - How well a student performed in their exams
A students Attainment 8 shows what they managed to achieve in their final GCSE exam results. It is based exclusively on their exam results.
The attainment 8 isn’t all that useful by itself, but it feeds into the Progress 8 score which we will calculate after. Attainment 8 is also not something students will think about for themselves, but it helps us to compare cohorts of students, and understands a schools performance.
Attainment 8 is calculated via a series of weighted buckets. Particular subjects fit into different buckets, and some buckets are weighted higher. This calculation and the subject bucket assignments are dictated by the DfE.
Bucket 1 - English and Maths
This is for English and Maths, and the end result is double weighted. Bucket 1 is calculated by taking the best grade between English Literature and English Language, and adding it to the grade for Maths. For example, a student with English Literature of 4, English Language of 5, and Maths of 4 will get a final Bucket 1 grade of 18. The English Language grade was higher, so that was doubled and added to Maths which was also doubled.
The unused English grade can be re-used in Bucket 3 later on.
Bucket 2 - English Baccalaureate (EBacc)
Sciences, Computing, Geography, History, and Modern Foreign Languages make up the majority of the English Baccalaureate which is Bucket 2. The top 3 grades from across these subjects are selected, and added together to create the Bucket 2 result. This bucket is single weighted.
Bucket 3 - Open Subjects (Vocationals)
Vocational qualifications such as BTEC’s and anything else fall into Bucket 3. The unused English subject from Bucket 1 is also included here. Like Bucket 2, we take the top three grades and add them together to get the final Bucket 3 result. This bucket is also single weighted.
Total Attainment 8
The total attainment 8 is simply calculated by adding together the total value from all three buckets. We also calculate a 10 subject average which is used in other calculations later on.
results_with_attainment_8 =
parsed_results
|> Enum.map(fn student_record ->
SchoolKit.Attainment8.calculate_attainment_8(student_record)
end)
Kino.Markdown.new("**Attainment 8 calculated, displaying first student record as an example**")
|> Kino.render()
Kino.Tree.new(List.first(results_with_attainment_8))
Progress 8 - The amount of progress a student made during KS4
A students Progress 8 is a representation of how much progress a student made between KS2 and the end of KS4. It is calculated using a lookup data set published by the DfE for each cohort after they’ve taken their final GCSE exams. This data set defines an expected progress of a student in a subject given a particular KS2 result.
results_with_progress_8 =
results_with_attainment_8
|> Enum.map(fn student_record ->
SchoolKit.Progress8.calculate_progress_8(student_record, cohort_year)
end)
Kino.Markdown.new("**Progress 8 calculated, displaying first student record as an example**")
|> Kino.render()
Kino.Tree.new(List.first(results_with_progress_8))
|> Kino.render()
Kino.Download.new(
fn ->
Jason.encode!(results_with_progress_8, pretty: true)
end,
filename: "results_data.json",
label: "Download result data"
)
Subject performance report
performance_by_subject =
data_file
|> SchoolKit.generate_subject_progress_report_from_csv(cohort_year)
|> Kino.DataTable.new(
name: "Summary per Subject",
keys: [
"Subject",
"Attainment Avg",
"Progress Avg",
"Male Attainment Avg",
"Male Progress Avg",
"Female Attainment Avg",
"Female Progress Avg",
"Disadvantaged Attainment Avg",
"Disadvantaged Progress Avg",
"SEND Attainment Avg",
"SEND Progress Avg"
]
)
reduced_student_record =
results_with_progress_8
|> Enum.reject(fn student -> student.ks2 == nil or student.progress_8 == nil end)
|> Enum.reject(fn student -> student.attendance_band == "" end)
|> Enum.map(fn student ->
%{
attendance_band: student.attendance_band,
school: student.school,
attainment_8: student.attainment_8.total.total_score,
progress_8: student.progress_8.total.progress_8,
}
end)
progress_by_attendance =
Vl.new(title: "Median, Min, and Max values of Students Progress 8 by Attendance", width: 600, height: 400)
|> Vl.data_from_values(reduced_student_record)
|> Vl.mark(:boxplot, extent: "min-max")
|> Vl.encode_field(:y, "progress_8", title: "Progress 8 Score", type: :quantitative)
|> Vl.encode_field(:x, "attendance_band", title: "Attendance Band", type: :nominal)
|> Kino.VegaLite.new()
attainment_by_school =
Vl.new(title: "Median, Min, and Max values of Students Attainment 8 by School", width: 600, height: 400)
|> Vl.data_from_values(reduced_student_record)
|> Vl.mark(:boxplot, extent: "min-max")
|> Vl.encode_field(:y, "attainment_8", title: "Attainment 8 Score", type: :quantitative)
|> Vl.encode_field(:x, "school", title: "School", type: :nominal)
|> Kino.VegaLite.new()
progress_by_school =
Vl.new(title: "Median, Min, and Max values of Students Progress 8 by School", width: 600, height: 400)
|> Vl.data_from_values(reduced_student_record)
|> Vl.mark(:boxplot, extent: "min-max")
|> Vl.encode_field(:y, "progress_8", title: "Progress 8 Score", type: :quantitative)
|> Vl.encode_field(:x, "school", title: "School", type: :nominal)
|> Kino.VegaLite.new()
Kino.Layout.tabs([
{"Headlines Per-Subject", performance_by_subject},
{"Attainment Spread by School", attainment_by_school},
{"Progress Spread by School", progress_by_school},
{"Progress Spread by Attendance", progress_by_attendance}
])