Development and validation of a non-invasive, chairside oral cavity cancer risk assessment prototype using machine learning approach


Oral cavity cancer (OCC) is associated with high morbidity and mortality when diagnosed at a late stage. Identifying increased risk early creates an opportunity to implement prevention strategies that target modifiable risk factors and to screen patients, promoting early detection and intervention. Prior evidence has identified a gap in the training of primary care providers (PCPs) in examining the oral cavity. The absence of clinically applicable analytical tools for identifying patients with high-risk OCC phenotypes at the point of care (POC) leads to missed opportunities for patient-specific interventional strategies. This study developed an OCC risk assessment tool prototype by applying machine learning (ML) approaches to a rich, retrospectively collected data set abstracted from a clinical enterprise data warehouse. We compared the performance of six ML classifiers using 10-fold cross-validation. For the derived voting algorithm, accuracy, recall, precision, specificity, area under the receiver operating characteristic curve, and area under the precision-recall curve were 78%, 64%, 88%, 92%, 0.83, and 0.81, respectively. The performance of two classifiers, multilayer perceptron and AdaBoost, closely mirrored that of the voting algorithm. Integrating the OCC risk assessment tool, developed through the application of clinical informatics, into an electronic health record as a clinical decision support tool can assist PCPs in targeting at-risk patients for personalized interventional care.
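To make the reported threshold-based metrics concrete, the minimal sketch below shows how accuracy, recall, precision, and specificity follow from the four cells of a binary confusion matrix. The counts are hypothetical and chosen only for illustration; they are not the study's confusion matrix, and the names `ConfusionCounts` and `classification_metrics` are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (positive class = high OCC risk)."""
    tp: int  # high-risk patients flagged as high risk
    fp: int  # low-risk patients flagged as high risk
    tn: int  # low-risk patients flagged as low risk
    fn: int  # high-risk patients flagged as low risk


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return the four threshold-based metrics named in the abstract.

    Assumes every denominator is non-zero, i.e. both classes are present
    and at least one positive prediction was made.
    """
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "recall": c.tp / (c.tp + c.fn),        # also called sensitivity
        "precision": c.tp / (c.tp + c.fp),     # positive predictive value
        "specificity": c.tn / (c.tn + c.fp),   # true negative rate
    }


# Hypothetical counts for illustration only (NOT taken from the study).
counts = ConfusionCounts(tp=64, fp=8, tn=92, fn=36)
metrics = classification_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

A pattern of high precision and specificity with lower recall, as in this illustration, describes a classifier that rarely flags low-risk patients in error but misses a share of truly high-risk ones. The two area-under-curve figures are not computed here because they require predicted scores across all decision thresholds rather than a single confusion matrix.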
