Triaging Gastric Biopsies with Computer-vision-based Artificial Intelligence System

This abstract has open access

Abstract Description

Submission ID :

HAC1196

Submission Type

HA Staff

Authors (including presenting author) :

Chan CKR (1), Palmerston JB (2), Garifullin A (2), Wan JCH (2), So B (2), Chan MHH (3)

Affiliation :

(1) Department of Pathology, Alice Ho Miu Ling Nethersole Hospital; (2) Artificial Intelligence Systems Team 1: AI Lab and Innovation, Information Technology and Health Informatics Division; (3) Department of Chemical Pathology, Prince of Wales Hospital

Keyword 1 :

histopathology

Keyword 2 :

artificial intelligence

Keyword 3 :

image analysis

Keyword 4 :

gastric biopsy

Introduction :

Gastric biopsies are among the most common samples processed in a histology laboratory, accounting for close to half of all biopsies received. Interpreting these specimens is a resource-intensive manual process performed by pathologists. More than half of gastric biopsies are normal, and only a minority show clinically significant abnormalities, i.e. cancer, intestinal metaplasia and Helicobacter infection. An automatic triage system has the potential to improve the diagnostic process by reducing pathologists' burden and turnaround time. This study presents a computer-vision-based artificial intelligence (AI) system designed to detect abnormalities in whole-slide images (WSIs) of gastric biopsies.

Objectives :

To develop and evaluate an AI system that reduces the workload of expert pathologists by detecting abnormalities in gastric WSIs in three categories: abnormal tissue, Helicobacter infection (HPACG), and intestinal metaplasia (IM).

Methodology :

The study used a multi-stage AI pipeline. WSIs were first divided into image patches using the clustering-constrained-attention multiple-instance learning (CLAM) method. MedSigLIP, a large pre-trained medical imaging foundation model, was then used to encode features from these patches. Finally, the CLAM multiple-instance learning (MIL) framework aggregated the patch features to classify each WSI in three independent binary classification tasks: normal vs abnormal, Helicobacter infection (HPACG), and intestinal metaplasia (IM). The system was trained on 2,542 de-identified WSIs and evaluated on a subset via 10-fold cross-validation, with independent groups for each classification task. Classification metrics, including sensitivity, true negative rate (TNR), precision (positive predictive value, PPV), and negative predictive value (NPV), were computed at a standard model confidence threshold of 0.5, along with the threshold-independent area under the ROC curve (AUC).
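For readers less familiar with the reported metrics, the following is a minimal Python sketch of how they could be computed for one binary task from slide-level labels and model confidence scores. It is an illustration only, not the authors' evaluation code: the function names (`binary_metrics`, `roc_auc`) and the toy labels and scores are hypothetical, and treating a score equal to the threshold as positive is an assumption.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_score: Sequence[float],
                   threshold: float = 0.5) -> Dict[str, float]:
    """Threshold-dependent metrics for one binary task (1 = positive class)."""
    # Assumption: a score exactly at the threshold is called positive.
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num: int, den: int) -> float:
        # Return NaN instead of raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "tnr": ratio(tn, tn + fp),          # true negative rate (specificity)
        "precision": ratio(tp, tp + fp),    # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (threshold-independent)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 positive slides and 5 negative slides.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1, 0.45]

toy_metrics = binary_metrics(toy_labels, toy_scores, threshold=0.5)
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_metrics, toy_auc)
```

For triage, NPV and sensitivity are the quantities most directly tied to the risk of a missed abnormality, while precision is strongly affected by how rare the positive class is in the evaluated slides.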

Result & Outcome :

Evaluation of the trained system on the three binary classification tasks yielded the following macro-averaged performance metrics: sensitivity ranged from 0.69 (normal vs abnormal) to 0.76 (intestinal metaplasia); TNR ranged from 0.85 (normal vs abnormal) to 0.96 (HPACG); NPV ranged from 0.88 (normal vs abnormal) to 0.97 (HPACG and intestinal metaplasia); precision (PPV) ranged from 0.55 (intestinal metaplasia) to 0.63 (normal vs abnormal); and AUC ranged from 0.84 (normal vs abnormal) to 0.92 (HPACG). These results indicate promising proof-of-concept performance.