WOFAPS 2025 8th World Congress of Pediatric Surgery

Oral Presentation - 154

ChatGPT-4o® in Pediatric Burn Care: Expert Review of Its Role in Initial Clinical Decision-Making

Asya Eylem Boztaş, İncinur Genişol, Ayşe Demet Payza, Özkan Okur, Arzu Şencan
Behçet Uz Çocuk Hastalıkları Eğitim ve Araştırma Hastanesi Çocuk Cerrahisi Kliniği, İzmir, Türkiye

Introduction: Artificial intelligence (AI) is playing an increasingly prominent role in advancing the field of medicine. Among AI technologies, ChatGPT stands out as a potential tool for clinical support and education. This study aims to evaluate the accuracy and quality of responses generated by ChatGPT-4o® to frequently asked questions (FAQs) posed by practicing physicians regarding the initial assessment of pediatric burn injuries, as judged by pediatric burn specialists.

Methods: Thirty-four FAQs about pediatric burn care were posed to ChatGPT-4o twice, one week apart, in a blinded manner by four experienced pediatric surgeons working at a national tertiary referral burn center. Questions were divided into five subgroups: initial assessment and triage, fluid resuscitation and hemodynamic management, wound care and infection prevention, pain management and sedation, and special situations and follow-up. The chatbot's responses were evaluated by the pediatric surgeons using the modified five-point DISCERN tool (mDISCERN) for reliability and the Global Quality Scale (GQS) for overall quality. Inter-rater reliability was measured using intraclass correlation coefficients (ICC).

Results: ChatGPT-4o® demonstrated high-quality and reliable responses to pediatric burn care questions. The median GQS was 4.75 (3.50–5.00), with 67.7% of responses scoring ≥4.75 and 41.2% receiving a perfect score of 5.00. The median mDISCERN score was 9.25 (7.00–10.00), and 74% of responses scored ≥9.25, reflecting strong informational reliability. There was a very strong correlation between GQS and mDISCERN scores (r = 0.858, p < .001), indicating consistent alignment between content quality and reliability. Inter-rater reliability analysis showed good agreement for individual scores (ICC = 0.63) and excellent consistency for average scores (ICC = 0.87, p < .001), supporting the robustness of the reviewers' assessments.

Conclusions: ChatGPT-4o is a high-quality and reliable source of information for the initial evaluation of pediatric burn patients, providing substantial support for healthcare professionals in clinical decision-making. Its consistent accuracy and relevance position it as a promising adjunct tool in pediatric burn care.
