MSc Dissertation · University of Leicester

PRAXIS
AI Clinical Documentation

Pipeline for Real-time Analysis and eXtraction of clInical documentation from Speech

An end-to-end AI system that listens to British GP consultations and automatically generates structured clinical notes — reducing documentation time while maintaining clinical accuracy.

50.1% Best WER on British GP Audio
0.98 SOAP Quality Score
6 Clinical Note Templates
100% Offline / On-device

NHS GPs Spend More Time on Paperwork Than Patients

British GPs spend up to 11 minutes per consultation on clinical documentation. This administrative burden reduces patient face-time, contributes to burnout, and is a leading cause of GP workforce attrition in the NHS.

11 min · Average documentation time per consultation

46% · Of GP time spent on administrative tasks

1 in 4 · GPs plan to leave the profession within 5 years

End-to-End Clinical Documentation Pipeline

PRAXIS processes GP consultation audio through four intelligent stages, producing structured clinical notes ready for GP review.

01 · Speech-to-Text

Fine-tuned Whisper and MedASR models transcribe British English GP audio with medical vocabulary awareness.

Whisper Small + LoRA · MedASR Conformer

02 · Speaker Diarisation

Pyannote neural diarisation identifies who said what, separating Doctor and Patient speech with timestamps.

Pyannote 3.1 · Rule-based fallback
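
One common way to combine diarisation output with a transcript is to give each transcribed segment the speaker whose turn overlaps it most in time. The sketch below illustrates that idea only; the `Turn` and `Segment` classes, the `label_segments` function and the sample timings are hypothetical and are not taken from the PRAXIS code.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """A diarisation turn: who spoke between start and end (seconds)."""
    start: float
    end: float
    speaker: str


@dataclass
class Segment:
    """A transcribed chunk of speech with its timestamps (seconds)."""
    start: float
    end: float
    text: str


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(segments, turns, unknown="Unknown"):
    """Assign each segment the speaker of the turn it overlaps most."""
    labelled = []
    for seg in segments:
        best_speaker, best_overlap = unknown, 0.0
        for turn in turns:
            ov = overlap(seg.start, seg.end, turn.start, turn.end)
            if ov > best_overlap:
                best_speaker, best_overlap = turn.speaker, ov
        labelled.append((best_speaker, seg.text))
    return labelled


# Illustrative data only (not from PriMock57).
turns = [Turn(0.0, 4.0, "Doctor"), Turn(4.0, 9.5, "Patient")]
segments = [
    Segment(0.2, 3.8, "What brings you in today?"),
    Segment(4.1, 9.0, "I've had a cough for two weeks."),
]
labelled = label_segments(segments, turns)
```

A segment that overlaps no turn keeps the `unknown` label, which is one place where a rule-based fallback could take over.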

03 · Clinical NLP

MedCAT extracts symptoms, medications, and conditions with SNOMED CT linking. medspaCy detects negations.

MedCAT · medspaCy · SNOMED CT
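
To show why negation detection matters ("denies chest pain" must not be recorded as chest pain), here is a deliberately simplified, NegEx-style sketch: a concept counts as negated if a trigger word appears within a few tokens before it. This is not medspaCy's implementation; the trigger list, the window size and the `is_negated` function are assumptions made for illustration.

```python
import re

# Hypothetical, minimal trigger list; real rule sets are far larger.
NEGATION_TRIGGERS = ("no", "not", "denies", "without")
WINDOW = 4  # how many tokens before the concept to inspect (assumed value)


def is_negated(sentence: str, concept: str) -> bool:
    """Return True if `concept` occurs with a negation trigger shortly before it."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    concept_tokens = re.findall(r"[a-z']+", concept.lower())
    n = len(concept_tokens)
    if n == 0:
        return False
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == concept_tokens:
            preceding = tokens[max(0, i - WINDOW):i]
            if any(tok in NEGATION_TRIGGERS for tok in preceding):
                return True
    return False
```

With these rules, `is_negated("Patient denies chest pain.", "chest pain")` is true, while the same call on "Patient reports chest pain." is false. A fixed window is crude; rule-based systems usually also stop the negation scope at words such as "but".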

04 · SOAP Generation

QLoRA fine-tuned Med42-8B generates structured clinical notes with UK English, SNOMED codes, and safety-netting.

Med42-8B · QLoRA · Ollama

Built for NHS Clinical Standards

6 Clinical Templates

SOAP, SBAR, Clinical Summary, Referral Letter, Discharge Summary, and EMIS-style consultation records.

SNOMED CT Coding

Automatic clinical terminology coding validated against 73 reference SNOMED CT codes commonly used in UK primary care.
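
The page does not say how generated codes are scored against the reference list. One plausible reading, sketched below, is the percentage of emitted codes that appear in the reference set. The three codes shown are an illustrative subset chosen for this example, not the project's 73-code list, and `snomed_accuracy` is a hypothetical name.

```python
# Illustrative subset of SNOMED CT concept IDs (assumption: not the real reference list).
REFERENCE_CODES = {
    "38341003": "Hypertensive disorder",
    "73211009": "Diabetes mellitus",
    "195967001": "Asthma",
}


def snomed_accuracy(emitted_codes):
    """Percentage of emitted codes that are present in the reference set."""
    if not emitted_codes:
        return 0.0
    valid = sum(1 for code in emitted_codes if code in REFERENCE_CODES)
    return 100.0 * valid / len(emitted_codes)
```

Under this definition a note that emits one reference code and one unknown code scores 50%.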

Safety-Netting

Every generated note includes safety-netting advice — red flag symptoms, when to return, emergency contacts.

UK English

Enforced British medical terminology — paracetamol not acetaminophen, A&E not ER, physiotherapy not physical therapy.
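
A post-processing pass is one simple way to enforce such terminology. The sketch below covers only the three pairs named above and is an assumption about how enforcement could work, not a description of the PRAXIS implementation; `RULES` and `to_uk_english` are hypothetical names.

```python
import re

# Each rule is (compiled pattern, UK replacement).
RULES = [
    (re.compile(r"\bacetaminophen\b", re.IGNORECASE), "paracetamol"),
    (re.compile(r"\bphysical therapy\b", re.IGNORECASE), "physiotherapy"),
    # Case-sensitive on purpose, so the spoken filler word "er" is left alone.
    (re.compile(r"\bER\b"), "A&E"),
]


def to_uk_english(text: str) -> str:
    """Replace US medical terms with their UK equivalents."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


example = to_uk_english(
    "Advised acetaminophen; referred for physical therapy; attend ER if worse."
)
```

Replacements are inserted in lower case, so a term at the start of a sentence would need its capital restored separately.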

Fully Offline

Runs entirely on-device with no cloud dependency. All models run locally via Ollama — compliant with NHS data governance.
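
Because Ollama serves models over a local HTTP endpoint, note generation can stay on the machine. The sketch below builds a request for Ollama's `/api/generate` endpoint on its default port. The model tag `praxis-soap-v1` and the prompt wording are assumptions for illustration; the page does not give the actual tag or prompt.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address


def build_request(transcript: str, model: str = "praxis-soap-v1"):
    """Build (but do not send) a non-streaming generate request."""
    payload = {
        "model": model,  # hypothetical model tag
        "prompt": "Write a SOAP note in UK English for this GP consultation:\n\n"
        + transcript,
        "stream": False,  # ask for one JSON object instead of a token stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_note(transcript: str) -> str:
    """Send the request to the local Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(transcript), timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

`generate_note` needs a running Ollama server with the model already pulled; nothing in the request leaves `localhost`.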

Hallucination Grounding

Post-generation validation checks that diagnoses and medications in the SOAP note are grounded in the original transcript.
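
A minimal form of such a check is to flag any diagnosis or medication in the note that never appears in the transcript. The sketch below uses plain whole-word matching and assumes the terms have already been extracted from the note; `ungrounded_terms` is a hypothetical name, and a real system would likely also need synonym and brand-name handling.

```python
import re


def _normalise(text: str) -> str:
    """Lower-case, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]+", " ", text.lower()).split())


def ungrounded_terms(note_terms, transcript):
    """Return the note terms that do not occur as whole words in the transcript."""
    haystack = f" {_normalise(transcript)} "
    missing = []
    for term in note_terms:
        if f" {_normalise(term)} " not in haystack:
            missing.append(term)
    return missing


# Illustrative data only.
transcript = "Doctor: I'll start you on amoxicillin for the chest infection."
flagged = ungrounded_terms(
    ["amoxicillin", "chest infection", "prednisolone"], transcript
)
```

Here `flagged` contains only "prednisolone", which the note mentions but the consultation never did.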

Benchmark Results

Evaluated on the PriMock57 dataset — 57 British GP mock consultations (17.27 hours of audio).

Speech-to-Text (WER on 14 Test Clips)

Model                 WER     Time/Clip
Base Whisper Small    50.1%   0.71s       (best)
Base Whisper Medium   51.3%   1.56s
LoRA Whisper v2       57.4%   0.54s
Base MedASR           93.1%   0.45s
Fine-tuned MedASR     82.4%   0.46s

SOAP Generation (8 Consultations)

Model                    Quality Score   SNOMED
PRAXIS-SOAP-v1           0.98            100%     (best)
Med42-8B (Baseline)      0.97            100%
Mistral-7B (Zero-shot)   0.98            100%
Meditron3-8B             0.77            12%

MedASR Domain Mismatch

First published evaluation of Google MedASR on British GP audio. The 93.1% WER reveals a critical accent and domain mismatch: the model was trained on US radiology dictation.

LoRA Hallucination Discovery

LoRA fine-tuning on a small dataset caused catastrophic hallucination in Whisper (514% WER; WER can exceed 100% because every inserted word counts as an error). Freezing the encoder fixed it.
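
A WER above 100% looks odd until the formula is written out: WER is (substitutions + deletions + insertions) divided by the number of reference words, so a model that hallucinates extra text can make more errors than there are words to get right. The project reports using jiwer; the pure-Python sketch below is a stand-in for illustration, with made-up example sentences.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion: ref word missing from hyp
                curr[j - 1] + 1,         # insertion: extra word in hyp
                prev[j - 1] + (r != h),  # match (cost 0) or substitution (cost 1)
            ))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative hallucination: 2 correct words followed by 8 invented ones.
hallucinated = word_error_rate(
    "chest pain",
    "chest pain thank you for watching thank you for watching",
)
```

Eight insertions against a two-word reference give a WER of 4.0, that is 400%.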

Model Size Paradox

Whisper Small (244M params) outperforms Whisper Medium (769M) on British GP audio (50.1% vs 51.3% WER), challenging the larger-is-better assumption.

Built With

Speech-to-Text

OpenAI Whisper, Google MedASR, LoRA/PEFT fine-tuning

Diarisation

Pyannote Audio 3.1 neural speaker segmentation

Clinical NLP

MedCAT, medspaCy, SNOMED CT ontology

SOAP Generation

Med42-8B, Meditron3-8B, Mistral-7B via Ollama

Fine-tuning

QLoRA (4-bit NF4), HuggingFace Transformers, PEFT

Dataset

PriMock57: 57 British GP consultations, 17.27 hours

Evaluation

jiwer (WER/CER), automated quality scoring, GP blind scoring

Infrastructure

Python 3.12, FastAPI, Streamlit, Apple M4 Pro (fully local)

Research Team

Satyawan Singh

MSc AI for Business Intelligence

University of Leicester, 2025-2026. Research focus: clinical NLP, speech recognition, and medical LLM fine-tuning for NHS primary care.

Eliyas

Dissertation Supervisor

University of Leicester, School of Computing and Mathematical Sciences. Supervising the technical direction and academic rigour of the PRAXIS project.

AB Analytics Ltd

Industry Partner

Healthcare analytics company providing real-world clinical requirements and deployment context for the Patient Health Data Management System (PHDMS).