Machine Learning Applications in Spine Surgery

This literature review sought to identify and evaluate the current applications of artificial intelligence (AI)/machine learning (ML) in spine surgery that can effectively guide clinical decision-making and surgical planning. Using specific keywords to maximize search sensitivity, a thorough literature search was conducted in several online databases (Scopus, PubMed, and Google Scholar), and the findings were filtered according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. A total of 46 studies met the requirements and were included in this review. In the included studies, AI/ML models were sufficiently accurate, with a mean overall accuracy of 74.9%, and performed best at preoperative patient selection, cost prediction, and length-of-stay prediction. Performance was also good at predicting functional outcomes and postoperative mortality. Regression analysis was the most frequently utilized application, whereas deep learning/artificial neural networks had the highest sensitivity score (81.5%). Despite the relatively brief history of engagement with AI/ML, as evidenced by the fact that 77.5% of studies were published after 2018, the outcomes have been promising. In light of the Big Data era, the increasing prevalence of National Registries, and the wide-ranging applications of AI, as exemplified by ChatGPT (OpenAI, San Francisco, California), it is highly likely that the field of spine surgery will gradually adopt and integrate AI/ML into its clinical practices. Consequently, it is important for spine surgeons to acquaint themselves with the fundamental principles of AI/ML, as these technologies hold the potential for substantial improvements in overall patient care.


Introduction And Background
The advent of machine learning (ML) applications within clinical medicine signifies a new era for addressing healthcare challenges, with artificial intelligence (AI) tools that can leverage large datasets to improve healthcare systems and minimize human error [1][2][3][4][5][6]. The field of spine surgery is no exception: technologies like augmented reality, computer navigation, and robotics are already leaving their mark in both clinical settings and operating rooms [7][8][9]. Grasping the foundational principles of ML and AI is essential to unlocking their potential effectively and safely [10][11][12]. Because the field is still in its early stages, this literature review aims to offer an initial overview of ML/AI applications within spine surgery, examining their objectives, outcomes, and effectiveness [13].
ML, a subset of AI, is dedicated to building algorithms (learners) that improve their own performance through experience [14]. Notable instances of AI/ML integration in spine surgery include image classification (e.g., automating the detection of vertebral compression fractures in CT or MRI scans) [7,15,16], the creation of models for preoperative risk assessment [17][18][19], and the development of tools to support clinical decision-making [7,10,20].
The boundaries between classic statistics and ML might seem blurry because both are based on statistical models; however, while the former is derived from mathematics, the latter is derived from computer science. Furthermore, classic statistics infers relationships between variables, whereas ML endeavors to predict outcomes [21,22]. Inference (in statistics) involves testing the null hypothesis against an alternative hypothesis for an outcome with a confidence measure, whereas prediction (in ML) involves estimating outcomes from relationships the model has already derived, without requiring additional prior data [23]. As an example, Ogink et al. successfully trained a neural network to predict, early and accurately, which patients undergoing surgery for spinal stenosis would require admission to a rehabilitation facility after hospital discharge [24].
In the spine surgery literature, three of the most commonly used ML applications are (1) Artificial Neural Networks (ANN), (2) Support Vector Machines (SVM), and (3) Classification and Regression Trees (CART). These applications have some unique and some overlapping features [25,26].
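To make the tree-based approach more concrete for the nonexpert reader, the following minimal Python sketch shows the core step of a classification tree: choosing the single threshold on one variable that best separates two outcome classes by minimizing the weighted Gini impurity. It is an illustration only and is not taken from any of the reviewed studies; the function names (`gini`, `best_split`) and the toy ages and outcomes are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # proportion of positive cases
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Return (threshold, weighted_impurity) for the best single split.

    Candidate thresholds are the midpoints between consecutive distinct
    values; the split with the lowest weighted Gini impurity wins.
    """
    best = None
    distinct = sorted(set(values))
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or impurity < best[1]:
            best = (threshold, impurity)
    return best


# Hypothetical toy data: patient age and whether rehabilitation was needed (1 = yes).
ages = [45, 52, 58, 63, 70, 74, 79, 81]
needs_rehab = [0, 0, 0, 0, 1, 1, 1, 1]

threshold, impurity = best_split(ages, needs_rehab)
print(threshold, impurity)  # 66.5 0.0
```

A full CART model repeats this search over every available variable and then recursively within each resulting subgroup; real clinical data are rarely separated this cleanly.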
Taking into account the novelty of ML/AI, the aim of this literature review was, on the one hand, to bring the nonexpert reader closer to its terminology and principles and, on the other hand, to elucidate their
Original clinical studies investigating and assessing ML/AI applications in spinal surgery were included, while reviews, studies of implant design and development, non-English language studies, congress lectures, and non-spine surgery-related studies were excluded. The primary database search yielded 335 articles; after applying the inclusion and exclusion criteria, 46 articles were eligible for final inclusion (Figure 1). Over 80% of the studies had a low level of evidence as per the Oxford Centre for Evidence-Based Medicine (cohort and case-control studies), and 77.5% were published after 2018 (

Results
The studies included in this review were divided into two main categories: (a) 22 studies on the use of ML/AI to assist clinical decision-making through classification of the given pathology, preoperative patient selection, and preoperative planning (Table 2) [20,27,28,29] and (b) 24 studies focusing on the capability of ML/AI to predict postoperative outcomes (Table 3) [30].
The performance of the AI/ML models in the examined studies was evaluated with several measures: the area under the curve (AUC) derived from receiver operating characteristic (ROC) curves, along with accuracy (%), sensitivity (%), and specificity (%) [31,32]. The AUC indicates the ML model's capacity to discriminate between outcomes, with values of interest spanning from 0.50 (no better than chance) to 1. A value nearing 1 signifies high predictive ability, while values within the range of 0.51-0.69 suggest less effective performance. Statistical analysis entailed a one-way analysis of variance (ANOVA), followed by post hoc Tukey tests. The level of statistical significance was predefined as p<0.05.
Regression analysis was the most commonly applied ML/AI model (58.5%) [54], whereas Bayesian Point Machines had the highest mean AUC score (0.80). AUC was the most frequently utilized performance measure (90.2%) and was good across all models, with a mean score of 0.75 [55][56][57]. Accuracy was also good overall, with a mean of 74.9% [58]. Deep learning/ANN had the highest mean sensitivity (81.5%) (Table 4). Typically, the ML/AI models demonstrated their strongest performance in tasks related to preoperative patient assessment and planning, as well as cost prediction and estimating the length of hospital stay [59][60][61].
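For readers unfamiliar with the performance measures named in this section (accuracy, sensitivity, specificity, and AUC), the following minimal Python sketch shows how they can be computed for a binary outcome. It is an illustration under stated assumptions, not the method of any reviewed study: outcomes are coded 1 (event) and 0 (no event), the 0.5 decision threshold is an arbitrary choice, the function names are hypothetical, and the eight toy patients are invented. The AUC is computed through its rank interpretation, that is, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)  # share of true events that were detected
    specificity = tn / (tn + fp)  # share of non-events correctly ruled out
    return accuracy, sensitivity, specificity


def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Tied scores count as half a correctly ranked pair.
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: observed outcomes and model-predicted probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

# Assumed decision threshold of 0.5 to turn probabilities into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(classification_metrics(y_true, y_pred))  # (0.75, 0.75, 0.75)
print(roc_auc(y_true, scores))                 # 0.8125
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas the AUC summarizes discrimination across all possible thresholds, which may be one reason it is so widely reported.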

Discussion
This review comprehensively analyzed and evaluated the prevailing trends in the utilization of ML/AI applications within the realm of spine surgery. The outcomes of these investigations demonstrated an overall positive trajectory, particularly in terms of preoperative planning and cost optimization. This positive trajectory signifies their potential to emerge as a promising tool for ensuring precise and efficient treatment and management for spine patients.
To successfully incorporate ML into the healthcare sector, healthcare practitioners need to acquaint themselves with ML terminology and techniques, such as decision trees, SVM, and ANN. Notably, ML's predictive capabilities are strongest when dealing with substantial datasets, such as patient-reported outcome measures (PROMs) [64]. This was exemplified in the study by Khan et al. [29], where multiple supervised learners accurately predicted improvements in Short Form-36 (SF-36) scores after surgery for degenerative cervical myelopathy. Their models effectively integrated various factors, such as comorbidities, examination findings, imaging, and basic characteristics, to provide comprehensive predictive insights.
Additionally, a review by Varghese et al. found that ML's potential extends to characterizing the performance of medical devices such as pedicle screws [57]. They used ML to analyze input permutations in their pedicle screw strength protocol. Their study utilized diverse foam densities and angles for pedicle screw insertion, achieving a promising model with low error rates and high predictive accuracy for pedicle screw failure.
Within the scope of this review, 22 studies (47.8%) explored AI/ML applications for classifying pathological
This review stands as a pioneering effort to evaluate and consolidate AI/ML applications for optimizing patient selection, predicting surgical outcomes, and managing complications in spine surgery. Encompassing 46 studies, the review showcases AI/ML-based prediction and optimization models that have the potential to guide clinical decision-making and surgical planning. Across various AI/ML methods, the models demonstrated satisfactory performance, averaging 74.9% overall accuracy and an AUC of 0.75. Notably, these models excelled in optimizing preoperative patient selection and planning and in predicting cost, hospital discharge, and length of stay. They also performed commendably in predicting postoperative mortality, functional outcomes, and clinical results (AUC between 0.70 and 0.89).
While AI/ML models showed limited success in predicting postoperative complications (AUC 0.50-0.69), they still hold the potential to improve preoperative planning and enhance the cost-effectiveness of healthcare services. Furthermore, the review points out that AI/ML models could help minimize unnecessary healthcare costs and offer models for risk-adjusted reimbursement. It also highlights AI/ML's role in enhancing the precision of clinical decision-making and patient care, allowing resource optimization for postoperative follow-up and focused care for high-risk patients.

Limitations
The current study is subject to several limitations. Firstly, it is important to acknowledge that the field of ML/AI remains relatively nascent, particularly in its application to spine surgery, and thus its complete impact and potential are yet to be fully realized. Additionally, it is essential to recognize the limited availability of relevant literature, which necessitates a cautious approach when interpreting our findings. Lastly, the retrospective nature of the study introduces inherent limitations that must be duly acknowledged. Notwithstanding the aforementioned limitations, this study contributes to the existing body of literature.

Conclusions
This review delineates the specific domains within spine surgery where the influence of ML/AI is most pronounced, shedding light on the precise manner in which ML/AI can exert its impact. Furthermore, it serves to bridge the gap between spine surgeons and the emerging field of ML/AI, thereby facilitating a better understanding of its potential applications. Notably, this review provides evidence of promising outcomes stemming from the use of ML/AI in spine surgery, even in its early stages. This observation implies that, as the field matures, even more favorable results may be anticipated, particularly in supporting and guiding clinical decision-making by refining the massive data extracted from PROMs and National Registries and in improving outcomes overall.
As the field progresses, future research directions should include creating externally validated and commercially viable systems that can seamlessly integrate with existing hospital infrastructures. Additionally, further exploration of optimal methods for identifying surgical candidates from a diverse range of preoperative data is warranted. With the rapid expansion of literature, technology accessibility, and clinical applications, understanding AI/ML-based applications is becoming increasingly crucial in the context of spine surgery. It is important to note that while this review presents statistical findings and trends from recent studies, it does not establish definitive relationships between AI/ML and clinical effectiveness.

FIGURE 1: PRISMA study flowchart PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses