title: "Appointment-intake triage for a national GP clinic chain"
dek: "A 2024 implementation that classified incoming patient calls into 6 urgency tiers in real time. Reduced 'urgent-but-routed-as-routine' incidents by 87%."
sector: "Healthcare & Allied Health"
client: "National GP clinic chain · 28 clinics · ~400,000 patients"
engagement: "Pilot"
duration: "11 weeks"
year: "2024"
outcome: "Mis-routed urgent calls: -87% · receptionist time per call: 4 min → 90 sec"
solution: "Real-time voice-to-text intake classifier with AHPRA-aligned urgency tiers and human escalation on uncertainty."
timeSaved: "~2.5 minutes per intake call · ~A$0.08 cost per call processed"
visual: "none"
cardFigure: "workflow"
timeMetric: "2.5 min"
timeMetricLabel: "saved / call"
costMetric: "A$0.08"
costMetricLabel: "cost per call"
speedMetric: "2.7×"
speedMetricLabel: "faster triage"
publishedAt: "2024-09-30"
keywords:
- GP clinic AI Australia
- appointment triage automation
- AHPRA compliant AI
- healthcare voice AI
The problem
A national GP clinic chain — 28 clinics across four states, around 400,000 patients — was running its appointment intake through a centralised reception team. Receptionists handled roughly 8,400 calls per day. They were asked to classify each call into one of six urgency tiers: emergency, urgent-same-day, urgent-next-day, routine, telehealth-eligible, administrative.
The misclassification rate was the worry. Around 3.2% of calls were classified as "routine" when they should have been "urgent-same-day". Of those, a small but non-trivial number were patients whose conditions deteriorated overnight. The Chief Medical Officer wasn't worried about average performance; she was worried about the long-tail safety event.
She had been advised by two vendors to deploy a fully-automated AI triage bot. She rejected both. The clinical risk profile of a fully-automated patient-triage agent wasn't acceptable to her or to the medical director council, and it did not fit AHPRA's then-current draft guidance on clinical AI.
What we did
Eight weeks of build, three weeks of staged pilot at three lighter-volume clinics. The deployed system:
- Transcribed patient calls in real time using AU-resident speech-to-text
- Classified the call into the six urgency tiers using GPT-4 with structured-output schemas
- Surfaced the classification to the receptionist, alongside the receptionist's own draft classification — they remained the decision-maker
- Flagged disagreements between the AI and the receptionist for medical-director review at end-of-day
- Logged every decision, transcription, classification, and disagreement for AHPRA-aligned audit
The receptionist was never bypassed. The system functioned as a second opinion — the receptionist remained the routing authority. This was the binding clinical-governance constraint and it shaped every architectural choice.
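A minimal sketch of the second-opinion logic described above, assuming the model returns JSON with a single `tier` field. The names `Tier`, `TriageRecord`, `parse_model_tier`, `reconcile`, and `audit_line` are hypothetical, and so is the choice to flag unparseable model output for review; the production schema is not described here.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Tier(str, Enum):
    """The six urgency tiers used by the reception team."""
    EMERGENCY = "emergency"
    URGENT_SAME_DAY = "urgent-same-day"
    URGENT_NEXT_DAY = "urgent-next-day"
    ROUTINE = "routine"
    TELEHEALTH_ELIGIBLE = "telehealth-eligible"
    ADMINISTRATIVE = "administrative"


@dataclass
class TriageRecord:
    call_id: str
    ai_tier: Optional[str]      # None when the model output failed validation
    receptionist_tier: str
    routed_tier: str            # always the receptionist's decision
    needs_md_review: bool       # queued for end-of-day medical-director review
    logged_at: str


def parse_model_tier(raw_json: str) -> Optional[Tier]:
    """Validate the model's structured output; return None if it is unusable."""
    try:
        return Tier(json.loads(raw_json)["tier"])
    except (ValueError, KeyError, TypeError):
        return None


def reconcile(call_id: str, raw_model_output: str, receptionist_tier: Tier) -> TriageRecord:
    """Compare the AI suggestion with the receptionist's tier without overriding it."""
    ai_tier = parse_model_tier(raw_model_output)
    return TriageRecord(
        call_id=call_id,
        ai_tier=ai_tier.value if ai_tier is not None else None,
        receptionist_tier=receptionist_tier.value,
        routed_tier=receptionist_tier.value,
        # Assumption: a missing or invalid AI tier is treated as a disagreement.
        needs_md_review=(ai_tier != receptionist_tier),
        logged_at=datetime.now(timezone.utc).isoformat(),
    )


def audit_line(record: TriageRecord) -> str:
    """Serialise one decision as a JSON line for an append-only audit log."""
    return json.dumps(asdict(record), sort_keys=True)
```

In this sketch the AI tier never reaches `routed_tier`: the function can only record a disagreement, which mirrors the constraint that the receptionist remains the routing authority.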
The outcome — at 6 months in production across all 28 clinics
| Metric | Before (FY23 baseline) | After (6 months in production) |
|---|---|---|
| Calls per day handled by reception | ~8,400 | ~8,400 (no volume change) |
| Average receptionist time per intake call | 4 min | 90 seconds |
| 'Urgent-but-routed-as-routine' incidents | ~270/year (3.2%) | ~35/year (0.4%) |
| Reduction in mis-routed urgent calls | n/a | -87% |
| Receptionist disagreement-with-AI rate | n/a | 4.1% (disagreement flags caught 22% of routing issues the receptionist alone would have missed; the model also flagged calls the receptionist had classified correctly) |
| Cost per call processed (model + infra) | n/a | A$0.08 |
| AHPRA / clinical-governance findings | n/a | 0 |
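The headline figures can be re-derived from the table values with a few lines of arithmetic. The constants below are copied from the table; the daily cost line is an illustration that assumes every one of the ~8,400 daily calls incurs the A$0.08 processing cost.

```python
# Figures taken from the before/after table.
SECONDS_BEFORE = 4 * 60      # 4 min per intake call
SECONDS_AFTER = 90           # 90 seconds per intake call
INCIDENTS_BEFORE = 270       # mis-routed urgent calls per year (approx.)
INCIDENTS_AFTER = 35
CALLS_PER_DAY = 8_400
COST_PER_CALL_AUD = 0.08


def pct_reduction(before: float, after: float) -> float:
    """Percentage drop from `before` to `after`."""
    return (before - after) / before * 100


minutes_saved = (SECONDS_BEFORE - SECONDS_AFTER) / 60
speedup = SECONDS_BEFORE / SECONDS_AFTER
incident_reduction = pct_reduction(INCIDENTS_BEFORE, INCIDENTS_AFTER)
daily_cost_aud = CALLS_PER_DAY * COST_PER_CALL_AUD  # assumes all calls are processed

print(f"{minutes_saved:.1f} min saved per call")          # 2.5 min saved per call
print(f"{speedup:.1f}x faster triage")                    # 2.7x faster triage
print(f"{incident_reduction:.0f}% fewer mis-routed")      # 87% fewer mis-routed
print(f"A${daily_cost_aud:.2f} per day")                  # A$672.00 per day
```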
The CMO has not formally measured downstream safety-event outcomes because, in her words, "we're seeing fewer of them at the clinic level, and that's exactly what we wanted".
"The thing we needed was a system that didn't take responsibility away from our receptionists. We needed it to make them better. That's what this is."
— Chief Medical Officer, national GP clinic chain
What we'd do differently
Build the medical-director review workflow first. We treated it as a downstream artefact and built it in week eight. It should have been week one — it's the artefact the clinical governance committee actually relies on.
Test transcription on accented English earlier. AU-resident transcription handled most calls well but struggled with strong regional accents in the first month. We retrained the acoustic models in week four; that should have been week one.
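One way to run that kind of early accent check is to compare word error rate (WER) across accent groups on a small hand-transcribed sample. The sketch below is an assumption about method, not a description of what was run on this project: the metric, the function names, and the grouping are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1] / len(ref)


def wer_by_group(samples):
    """Mean WER per accent group; `samples` holds (group, reference, hypothesis)."""
    scores = defaultdict(list)
    for group, reference, hypothesis in samples:
        scores[group].append(word_error_rate(reference, hypothesis))
    return {group: mean(values) for group, values in scores.items()}
```

A group whose mean WER sits well above the others is a candidate for targeted acoustic-model retraining before the pilot starts.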
What we didn't do
We didn't replace any receptionist. We didn't deploy any fully-automated triage system that bypassed a human. We didn't process any call without consent (every call begins with the standard "this call may be recorded" advice). We didn't store call recordings beyond the 90-day clinical-governance retention period.
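A retention rule like the 90-day limit can be expressed as a small check that a scheduled clean-up job applies to each recording. This is a sketch under stated assumptions: the retention clock is assumed to start at the recording timestamp, and `RETENTION` and `is_due_for_deletion` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=90)  # clinical-governance retention period


def is_due_for_deletion(recorded_at: datetime, now: datetime) -> bool:
    """True once a recording is older than the retention period."""
    return now - recorded_at > RETENTION


# Example with fixed, timezone-aware timestamps.
recorded = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(is_due_for_deletion(recorded, datetime(2024, 3, 31, tzinfo=timezone.utc)))  # False (day 90)
print(is_due_for_deletion(recorded, datetime(2024, 4, 1, tzinfo=timezone.utc)))   # True (day 91)
```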
The interesting work was not the AI. It was the governance scaffolding — the disagreement-review workflow, the audit log, the consent-and-retention design. That's the work most clinical-AI vendors skip. It's the work we lead with.
