Pipeline for Real-time Analysis and eXtraction of clInical documentation from Speech
An end-to-end AI system that listens to British GP consultations and automatically generates structured clinical notes — reducing documentation time while maintaining clinical accuracy.
British GPs spend up to 11 minutes per consultation on clinical documentation. This administrative burden reduces patient face-time, contributes to burnout, and is a leading cause of GP workforce attrition in the NHS.
PRAXIS processes GP consultation audio through four intelligent stages, producing structured clinical notes ready for GP review.
Fine-tuned Whisper and MedASR models transcribe British English GP audio with medical vocabulary awareness.
Pyannote neural diarisation identifies who said what — separating Doctor and Patient speech with timestamps.
MedCAT extracts symptoms, medications, and conditions with SNOMED CT linking. medspaCy detects negations.
QLoRA fine-tuned Med42-8B generates structured clinical notes with UK English, SNOMED codes, and safety-netting.
SOAP, SBAR, Clinical Summary, Referral Letter, Discharge Summary, and EMIS-style consultation records.
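The four stages above can be chained end-to-end. In this sketch every function is an illustrative stub, not the project's real API; each stage merely stands in for the component named in its comment.

```python
# High-level sketch of the four-stage PRAXIS pipeline. All names,
# signatures, and return shapes here are illustrative assumptions.
def transcribe(audio_path):
    # Stage 1: ASR (fine-tuned Whisper / MedASR in the real system).
    return "how can I help? I have a cough."

def diarise(audio_path, transcript):
    # Stage 2: speaker attribution (pyannote neural diarisation).
    return [("Doctor", "how can I help?"), ("Patient", "I have a cough.")]

def extract_entities(turns):
    # Stage 3: clinical NER with SNOMED CT linking (MedCAT + medspaCy).
    return [{"term": "cough", "snomed": "49727002", "negated": False}]

def draft_note(turns, entities, note_format):
    # Stage 4: note generation (QLoRA fine-tuned Med42-8B via Ollama).
    return {"format": note_format, "turns": turns, "entities": entities}

def generate_note(audio_path, note_format="SOAP"):
    transcript = transcribe(audio_path)
    turns = diarise(audio_path, transcript)
    entities = extract_entities(turns)
    return draft_note(turns, entities, note_format)
```

The output of `generate_note("consultation.wav")` would then go to the GP for review before entering the record.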
Automatic clinical terminology coding validated against 73 reference SNOMED CT codes commonly used in UK primary care.
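A minimal sketch of that validation step, assuming a simple set-membership check. The reference set and function name are illustrative (the project's set has 73 codes); the three example codes are genuine SNOMED CT concepts.

```python
# Illustrative reference set; the real project uses 73 codes common
# in UK primary care.
REFERENCE_CODES = {
    "386661006",  # Fever (finding)
    "49727002",   # Cough (finding)
    "25064002",   # Headache (finding)
}

def validate_codes(extracted):
    """Partition extracted codes into recognised and unrecognised sets."""
    extracted = set(extracted)
    return extracted & REFERENCE_CODES, extracted - REFERENCE_CODES

valid, unknown = validate_codes(["386661006", "12345678"])
# valid -> {"386661006"}, unknown -> {"12345678"}
```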
Every generated note includes safety-netting advice — red flag symptoms, when to return, emergency contacts.
Enforced British medical terminology — paracetamol not acetaminophen, A&E not ER, physiotherapy not physical therapy.
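One way such enforcement could work is a post-processing substitution pass. This sketch uses only the three term pairs cited above; the mapping and function name are assumptions, not the project's implementation.

```python
import re

# Illustrative US-to-UK term mapping (assumption; the real list
# would be much larger).
US_TO_UK = {
    "acetaminophen": "paracetamol",
    "ER": "A&E",
    "physical therapy": "physiotherapy",
}

def britishise(text):
    for us, uk in US_TO_UK.items():
        # Match whole terms; keep abbreviations like "ER" case-sensitive
        # so words containing "er" are untouched.
        flags = 0 if us.isupper() else re.IGNORECASE
        text = re.sub(rf"\b{re.escape(us)}\b", uk, text, flags=flags)
    return text

britishise("Take acetaminophen and attend the ER")
# -> "Take paracetamol and attend the A&E"
```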
Runs entirely on-device with no cloud dependency. All models run locally via Ollama — compliant with NHS data governance.
Post-generation validation checks that diagnoses and medications in the SOAP note are grounded in the original transcript.
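A minimal illustration of such a grounding check, assuming simple case-insensitive substring matching; the function name and matching strategy are assumptions, not the project's method.

```python
# Flag note terms (e.g. medications) that never appear in the source
# transcript -- a simple form of post-generation grounding validation.
def ungrounded_terms(note_terms, transcript):
    haystack = transcript.lower()
    return [t for t in note_terms if t.lower() not in haystack]

ungrounded_terms(["amoxicillin", "ibuprofen"],
                 "GP: I'll prescribe amoxicillin 500 mg three times daily.")
# -> ["ibuprofen"]  (never mentioned in the transcript, so flagged)
```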
Evaluated on the PriMock57 dataset — 57 British GP mock consultations (17.27 hours of audio).
| Model | WER | Time/Clip |
|---|---|---|
| Base Whisper Small (best) | 50.1% | 0.71s |
| Base Whisper Medium | 51.3% | 1.56s |
| LoRA Whisper v2 | 57.4% | 0.54s |
| Base MedASR | 93.1% | 0.45s |
| Fine-tuned MedASR | 82.4% | 0.46s |
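The WER figures above are computed with jiwer in the project. As a reference for the metric itself, here is a stdlib-only sketch: word error rate is the word-level edit distance divided by the number of reference words.

```python
# Stdlib-only WER sketch (the project uses jiwer for its reported figures).
def wer(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    d = list(range(len(hyp) + 1))       # DP row: distances for 0 ref words
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i            # prev holds the diagonal cell
        for j, h in enumerate(hyp, 1):
            prev, d[j] = d[j], min(d[j] + 1,         # deletion
                                   d[j - 1] + 1,     # insertion
                                   prev + (r != h))  # substitution / match
    return d[len(hyp)] / len(ref)

wer("the cat sat on the mat", "the cat sat mat")  # two deletions -> 1/3
```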
| Model | Quality Score | SNOMED Coverage |
|---|---|---|
| PRAXIS-SOAP-v1 (best) | 0.98 | 100% |
| Med42-8B (Baseline) | 0.97 | 100% |
| Mistral-7B (Zero-shot) | 0.98 | 100% |
| Meditron3-8B | 0.77 | 12% |
First published evaluation of Google MedASR on British GP audio. 93.1% WER reveals critical accent/domain mismatch — the model was trained on US radiology dictation.
Naive LoRA fine-tuning of Whisper on a small dataset caused catastrophic hallucination (514% WER); freezing the encoder fixed it.
Whisper Small (244M params) outperforms Whisper Medium (769M) on British GP audio — challenging the larger-is-better assumption.
OpenAI Whisper, Google MedASR, LoRA/PEFT fine-tuning
Pyannote Audio 3.1 neural speaker segmentation
MedCAT, medspaCy, SNOMED CT ontology
Med42-8B, Meditron3-8B, Mistral-7B via Ollama
QLoRA (4-bit NF4), HuggingFace Transformers, PEFT
PriMock57: 57 British GP consultations, 17.27 hours
jiwer (WER/CER), automated quality scoring, blinded GP scoring
Python 3.12, FastAPI, Streamlit, Apple M4 Pro (fully local)
MSc AI for Business Intelligence
University of Leicester, 2025-2026. Research focus: clinical NLP, speech recognition, and medical LLM fine-tuning for NHS primary care.
Dissertation Supervisor
University of Leicester, School of Computing and Mathematical Sciences. Supervising the technical direction and academic rigour of the PRAXIS project.
Industry Partner
Healthcare analytics company providing real-world clinical requirements and deployment context for the Patient Health Data Management System (PHDMS).