RTO auto-grading assistant

The problem

A Registered Training Organisation delivering Certificate III, IV and Diploma-level qualifications was running written-assessment grading by hand. Trainers spent roughly 18 minutes per submission: reading the learner response, matching it against the rubric and unit-of-competency criteria, drafting feedback, and lodging the grade in the LMS.

Submission volume had grown 31% over two years. Trainer headcount had grown 9%. Overtime was absorbing the gap. Trainer-satisfaction scores, the leading indicator on the CEO's dashboard, were declining.

The CEO didn't want to "grade with AI". The ASQA standards are clear that competency decisions are professional judgements made by qualified trainers. What he wanted was for his trainers to spend their time on the judgement part of grading, not the rubric-reading and feedback-typing part.

What we did

Three weeks of scoping (including a thorough review of the AQF mapping for each qualification), eight weeks of build, three weeks of pilot. The deployed system:

Read the learner's submitted response
Mapped the response against the unit-of-competency criteria and the rubric
Produced a draft grade with paragraph-level evidence pointers ("this section of the learner's submission addresses criterion 2.1.b")
Drafted feedback in the trainer's voice — calibrated on five sample-graded submissions from each trainer
Routed everything to a trainer-finaliser queue. The trainer was the competency-decision authority. Always.
Wrote every draft, every trainer adjustment, every confidence score and every model version into an ASQA-aligned audit log

The system never made the competency decision. The trainer did. The system did the typing.
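The flow above can be sketched in a few lines. This is a minimal illustration rather than the deployed code: every name here (`Draft`, `EvidencePointer`, `AuditLog`, `route_to_trainer`, `finalise`) and the example grade labels are assumptions made for the sketch. The point it shows is structural: the system can only produce a draft, and a final grade cannot be recorded without a named trainer.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class EvidencePointer:
    paragraph_index: int  # which paragraph of the learner's submission
    criterion: str        # e.g. "2.1.b"


@dataclass(frozen=True)
class Draft:
    submission_id: str
    draft_grade: str      # a suggestion only, never a competency decision
    feedback: str
    evidence: tuple       # tuple of EvidencePointer
    confidence: float
    model_version: str


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    def write(self, event: str, **details) -> None:
        # Append-only: every draft and every trainer action becomes one entry.
        stamp = datetime.now(timezone.utc).isoformat()
        self.entries.append({"event": event, "at": stamp, **details})


def route_to_trainer(draft: Draft, queue: list, log: AuditLog) -> None:
    """Log the draft (including confidence and model version), then queue it."""
    log.write("draft_created", **asdict(draft))
    queue.append(draft)


def finalise(draft: Draft, trainer_id: str, final_grade: str,
             final_feedback: str, log: AuditLog) -> dict:
    """Record the trainer's decision. There is no code path without a trainer."""
    if not trainer_id:
        raise ValueError("a competency decision requires a named trainer")
    adjusted = (final_grade != draft.draft_grade
                or final_feedback != draft.feedback)
    decision = {
        "submission_id": draft.submission_id,
        "trainer_id": trainer_id,
        "final_grade": final_grade,
        "adjusted": adjusted,
    }
    log.write("trainer_finalised", **decision)
    return decision
```

In this sketch, `adjusted` is derived by comparing the trainer's final grade and feedback with the draft, so the log records when a trainer changed something without anyone having to flag it by hand.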

The outcome — at 6 months across all qualifications

Metric	Before (FY24 baseline)	After (6 months in production)
Submissions graded per week	~3,400	~3,600 (within enrolment growth)
Trainer time per submission	~18 min	~5 min
Trainers added in period	n/a	0
Trainer-satisfaction score (internal survey)	6.2 / 10	8.4 / 10
Trainer-adjustment-to-draft rate	n/a	23% (system surfaces a starting point, trainer adjusts)
Cost per assessment graded (model + infra)	n/a	A$0.06
ASQA audit findings on AI-assisted grading	n/a	0 (audit conducted month 5)
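As a back-of-envelope check on what the time-per-submission row means in aggregate, the weekly grading load can be computed from the table's own figures. The function name is an assumption for illustration, and the inputs are the table's approximate values, so the results are indicative only.

```python
def weekly_trainer_hours(submissions_per_week: int, minutes_per_submission: float) -> float:
    """Total trainer hours spent grading in one week."""
    return submissions_per_week * minutes_per_submission / 60


before = weekly_trainer_hours(3400, 18)  # FY24 baseline row
after = weekly_trainer_hours(3600, 5)    # six months in production

print(f"before: {before:.0f} h/week, after: {after:.0f} h/week")
```

On these figures the grading load drops from roughly 1,020 to roughly 300 trainer-hours per week, even with slightly higher volume.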

The 23% trainer-adjustment rate is the metric the CEO most often cites. The trainers were not rubber-stamping the AI. They changed the system's draft on roughly one in four submissions, which is roughly how often their judgement departed from a first-pass read of the rubric before, but now without the typing overhead.
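A rate like this can be read straight off an audit log. The sketch below assumes a hypothetical log format in which each finalised submission is one entry with an `event` of `"trainer_finalised"` and a boolean `adjusted` field; neither name comes from the deployed system.

```python
from typing import Optional


def adjustment_rate(entries: list) -> Optional[float]:
    """Share of finalised submissions where the trainer changed the draft."""
    finalised = [e for e in entries if e.get("event") == "trainer_finalised"]
    if not finalised:
        return None  # no finalised submissions yet, so no rate to report
    adjusted = sum(1 for e in finalised if e["adjusted"])
    return adjusted / len(finalised)


# Illustrative data only: 100 finalised submissions, 23 of them adjusted.
sample = (
    [{"event": "trainer_finalised", "adjusted": True}] * 23
    + [{"event": "trainer_finalised", "adjusted": False}] * 77
    + [{"event": "draft_created"}] * 100  # ignored by the calculation
)
print(adjustment_rate(sample))
```

Returning `None` for an empty log, rather than zero, keeps "no data yet" distinct from "trainers never adjust", which matters if the rate is used as evidence that trainers are not rubber-stamping.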

The thing my trainers told me they wanted back was time to write good feedback. They had been writing rubric-shaped feedback because that was all they had time for. Now they write useful feedback.

— CEO, Registered Training Organisation

What we'd do differently

Per-trainer voice calibration earlier. We calibrated trainer voice in week ten using five sample submissions per trainer. We should have done this in week one. The early-pilot trainers found the feedback drafts impersonal until calibration landed.

Map AQF before unit-of-competency. We mapped unit-of-competency first, AQF level second. AQF should have been the framing — it's the scaffolding that ASQA audits against, and it would have made the assessment-criteria mapping cleaner.

What we didn't do

We didn't make any competency decision. We didn't process any submission without the trainer-finaliser step. We didn't deploy the system on summative assessments before the formative-assessment pilot had completed.

The most consequential design decision was the routing-queue UI: the trainer sees the system-drafted grade and feedback alongside the raw learner submission, with side-by-side scrolling. The trainer never sees the AI draft alone. This was the artefact ASQA spent the longest reviewing during the audit. They signed it off in writing.
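One way to make "never the AI draft alone" hard to violate is to build it into the data the UI receives. The sketch below is an assumption about how that could look, not the audited implementation; `ReviewItem` and `render_side_by_side` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewItem:
    submission_text: str  # the raw learner submission
    draft_grade: str      # system-drafted, pending trainer decision
    draft_feedback: str

    def __post_init__(self) -> None:
        # Refuse to construct a review item that has a draft but no submission.
        if not self.submission_text.strip():
            raise ValueError("a draft cannot be shown without the learner submission")


def render_side_by_side(item: ReviewItem) -> dict:
    """Return both panes together; there is no function that returns only the draft."""
    return {
        "left": item.submission_text,
        "right": {"grade": item.draft_grade, "feedback": item.draft_feedback},
    }
```

Because the check sits in the constructor, any screen built on `ReviewItem` inherits the constraint without having to re-implement it.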
