Applicability of online chat-based artificial intelligence models to colorectal cancer screening

Affiliations

Aurora Medical Center, Oshkosh

Abstract

Background: Over the past year, studies have shown the potential applicability of ChatGPT in various medical specialties, including cardiology and oncology. However, the application of ChatGPT and other online chat-based AI models to patient education and patient-physician communication on colorectal cancer screening has not been critically evaluated, which is what we aimed to do in this study.

Methods: We posed 15 questions on important colorectal cancer screening concepts and 5 common questions asked by patients to the 3 most commonly used freely available artificial intelligence (AI) models. The responses provided by the AI models were graded for appropriateness and reliability using American College of Gastroenterology guidelines. The responses to each question provided by an AI model were graded as reliably appropriate (RA), reliably inappropriate (RI), or unreliable. Grader assessments were validated by the joint probability of agreement for two raters.

Results: ChatGPT and YouChat™ provided RA responses to the questions posed more often than BingChat. There were two questions to which more than one AI model provided unreliable responses. ChatGPT did not provide references. BingChat misinterpreted some of the information it referenced. The recommended age for CRC screening provided by YouChat™ was not consistently up-to-date. Inter-rater reliability for the 2 raters was 89.2%.

Conclusion: Most responses provided by AI models on CRC screening were appropriate. Some limitations exist in their ability to correctly interpret medical literature and provide updated information in answering queries. Patients should consult their physicians for context on the recommendations made by these AI models.

Document Type

Article

PubMed ID

38267726

