Accuracy of Ultrasound Scans as Compared to Fine Needle Aspiration Cytology in the Diagnosis of Thyroid Nodules

Introduction: Thyroid nodules (TNs) are among the more common findings on physical examinations. Due to the fear of the TN harboring malignancy and with the increasing incidence of thyroid cancer, ultrasound (US) scanning is used as an important diagnostic tool in the assessment of a TN. The American College of Radiology's Thyroid Imaging Reporting and Data System (TI-RADS) was established based on specific patterns composed of two or more features. According to the TI-RADS guidelines, a nodule that is suspicious on US should undergo fine-needle aspiration cytology (FNAC), the results of which guide further management. Objective: This study was carried out to assess the accuracy of US as compared to FNAC in the diagnosis of a thyroid nodule. Methodology: This retrospective study involved 213 cases that were sent for FNAC after a US scan of the thyroid. Data were gathered from the files of all patients referred for thyroid FNAC between 01/02/2018 and 30/06/2021 in Al-Ahli Hospital in the state of Qatar. The US scans were interpreted and reported according to the TI-RADS criteria. The FNAC samples were interpreted and reported according to the Bethesda System for Reporting Thyroid Cytopathology. Data were tabulated and analyzed with Excel (Microsoft, Redmond, WA, USA) and SPSS version 25 (IBM Corp., Armonk, NY, USA). Results: The study showed that US had a sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of 73.9%, 72.6%, 24.6% and 95.8%, respectively, with a significant association between the results of US and the results of FNAC (χ²(1, n = 213) = 20.295, p < .001) and a significant positive correlation (phi coefficient = .309, p < .001). In addition, the data showed that the odds of having a positive FNAC were 7.519 (95% CI: 2.811, 20.112) times greater for cases with positive US compared with cases with negative US. The relative risk of having a positive FNAC when the US was positive was 5.913 (95% CI: 2.440, 14.332) times greater compared to when the US was negative. Conclusion: While our results showed that US cannot be solely relied on in diagnosing TNs, they did show that US can reliably rule out a malignancy in TNs. Recent studies have been showing increasing accuracy of US in diagnosing TNs and more studies are needed to explore this topic.


Introduction
Thyroid nodules (TNs) are solid or fluid-filled lumps that form within the thyroid gland. The estimated prevalence by palpation is 3%-7% in some countries [1]. The prevalence is higher when randomly selected individuals are examined by high-resolution ultrasonography, where it may reach 67% according to one study [2]. Thyroid nodules are routinely investigated because of the concern that they may be malignant. The incidence of thyroid cancer is rising; it is now the fifth most common cancer diagnosed in adult women worldwide and the second most common in women over 50 years of age [3,4]. Advances in diagnostic technologies may be contributing to the increasing prevalence of TNs, but the increase may also be explained by traditional risk factors such as increasing age, insufficient iodine intake, exposure to radiation, and unhealthy lifestyles. Unhealthy lifestyles in turn increase the risk of obesity and metabolic syndrome, which are regarded as risk factors for TNs [5,6].
Several conditions can cause nodules to develop in the thyroid gland, including overgrowth of normal thyroid tissue, thyroid cysts, Hashimoto's disease, multinodular goiter and thyroid cancer [7]. Most TNs are not serious and do not cause symptoms. However, a small percentage of TNs are caused by thyroid malignancy. This percentage varies between countries. A study in the United States found that only one out of every 20 clinically identifiable nodules turns out to be malignant [8]. Another study found that the proportion of TNs that are thyroid cancer may reach up to 15% [9,10].
One of the important diagnostic tools in the assessment of TNs is ultrasound (US). It is currently the most accurate imaging modality for detecting TNs. It provides the best information about the shape and structure of nodules. Furthermore, it is useful as a guide in performing fine-needle aspiration (FNA) if required [11].
Different guidelines were proposed in order to help radiologists and clinicians readily recognize the sonographic patterns and classify nodules into categories. In 2009, the Thyroid Imaging Reporting and Data System (TI-RADS) was established based on specific patterns composed of two or more features. This model offers a standardized and simplified approach for radiologists to follow, with good diagnostic performance: a sensitivity of 88%, a negative predictive value of 88% and an accuracy of 94% [12]. However, radiologic findings alone are inconclusive. Therefore, according to the TI-RADS guidelines, a nodule that is suspicious on US should undergo FNA cytology (FNAC), the results of which guide further management [12,13].
In 2015, the American Thyroid Association (ATA) constructed new guidelines with a risk stratification model from very low suspicion to high suspicion for malignancy. It utilizes sonographic features based on the TI-RADS criteria. Patients with a TI-RADS score of 2 or 3 are considered low risk and are not routinely aspirated. This resulted in a reduction of the number of unnecessary FNAs [14]. A recent meta-analysis showed that the TI-RADS categories were a promising tool to differentiate between benign and malignant nodules, with a sensitivity and specificity of 0.79 (95% CI = 0.77-0.81) and 0.71 (95% CI = 0.70-0.72), respectively [15]. The objective of this study is to build on this aspect of the literature by assessing the accuracy of thyroid US as compared to FNAC in the prediction of thyroid cancer.

Materials And Methods
This was a retrospective study of 213 cases that were sent for FNAC after a US scan of the thyroid. After gaining ethical approval from Al-Ahli Hospital, Doha, Qatar (EIC number EC2-2022), data were gathered from the files of all patients referred for thyroid FNAC between 01/02/2018 and 30/06/2021 at Al-Ahli Hospital. This amounted to 320 files. Of these, 25 cases had FNAC samples reported as inadequate and were therefore excluded from the study, leaving 295 cases. Of these, 82 did not have records of US scans of the thyroid and were therefore excluded from the study. This left a sample of 213 cases that had both a US scan and an FNAC of the thyroid.
The US scans were interpreted and reported by experienced radiologists according to the American College of Radiology's TI-RADS criteria. US features were scored as shown in Table 1, and the TI-RADS score was determined accordingly. For the purposes of this study, TI-RADS 1, 2 and 3 were considered negative US scan results as they are not very suspicious for malignancy and are not a direct indication for FNAC. These cases are managed according to clinical suspicion, where clinical factors are taken into consideration rather than relying on the US result.
On the other hand, TI-RADS 4 and 5 were considered positive US scans as they carry a high suspicion of malignancy and are a direct indication for FNAC, regardless of the clinical picture.
Bethesda I denotes an insufficient sample and, as previously mentioned, these cases were excluded from comparison in our study. Bethesda II is reported as benign. Bethesda III and IV are the "borderline" results and are considered for follow-up studies to further evaluate the nodule. Bethesda V and VI, by contrast, are considered malignant. Therefore, for the purposes of this study, Bethesda II, III and IV were considered negative and Bethesda V and VI were considered positive.
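The dichotomization rules described above can be written down as a minimal sketch. The function names `us_positive` and `fnac_positive` are hypothetical and are introduced here only for illustration; the cut-offs themselves are the ones stated in the text.

```python
from typing import Optional

BETHESDA_CATEGORIES = ("I", "II", "III", "IV", "V", "VI")


def us_positive(tirads: int) -> bool:
    """TI-RADS 4-5 count as a positive US scan, TI-RADS 1-3 as negative."""
    if tirads not in (1, 2, 3, 4, 5):
        raise ValueError(f"unexpected TI-RADS score: {tirads!r}")
    return tirads >= 4


def fnac_positive(bethesda: str) -> Optional[bool]:
    """Bethesda V-VI are positive, II-IV negative.

    Bethesda I (insufficient sample) returns None because such cases
    were excluded from the comparison.
    """
    if bethesda not in BETHESDA_CATEGORIES:
        raise ValueError(f"unexpected Bethesda category: {bethesda!r}")
    if bethesda == "I":
        return None
    return bethesda in ("V", "VI")
```

Returning `None` for Bethesda I keeps excluded cases distinguishable from negative ones, so they cannot silently enter a 2x2 table.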
Various demographic, clinical and sonographic criteria were considered in this research. The sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were calculated for US scans of the thyroid. Odds ratio and relative risk were also calculated to compare US results to those of FNAC. The association between the various criteria considered in this study and the results of the US and FNAC was analyzed using the chi-squared test. Data were tabulated and analyzed with Excel (Microsoft, Redmond, WA, USA) and SPSS version 25 (IBM Corp., Armonk, NY, USA).
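The diagnostic accuracy measures named above all follow from a single 2x2 table of US result against FNAC result. The sketch below is illustrative only: the four cell counts are not taken from the study's tables but are back-calculated, as an assumption, from the percentages and the sample size of 213 reported in the abstract. The chi-squared statistic is the uncorrected Pearson statistic; whether SPSS output was read from that row is likewise an assumption.

```python
import math

# ASSUMPTION: cell counts back-calculated from the abstract (n = 213),
# not copied from the study's tables.
TP, FP, FN, TN = 17, 52, 6, 138  # US+/FNAC+, US+/FNAC-, US-/FNAC+, US-/FNAC-


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity, PPV and NPV from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def chi_square_2x2(tp: int, fp: int, fn: int, tn: int):
    """Pearson chi-squared (no continuity correction), p-value and phi."""
    n = tp + fp + fn + tn
    cross = tp * tn - fp * fn
    margins = (tp + fp) * (fn + tn) * (tp + fn) * (fp + tn)
    chi2 = n * cross ** 2 / margins
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    phi = cross / math.sqrt(margins)
    return chi2, p_value, phi


metrics = diagnostic_metrics(TP, FP, FN, TN)
chi2, p_value, phi = chi_square_2x2(TP, FP, FN, TN)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
print(f"chi2 = {chi2:.3f}, p = {p_value:.2g}, phi = {phi:.3f}")
```

Under these assumed counts the sketch returns the sensitivity, specificity, PPV, NPV, chi-squared value and phi coefficient given in the abstract, which suggests the back-calculated table is at least consistent with the reported results.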

Results
The number of cases with each of the criteria considered in the study is shown in Table 4 and Table 5.
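The odds ratio and relative risk reported in the abstract can be checked with a short sketch. It assumes the same back-calculated 2x2 counts (not taken from the study's tables) and uses Wald intervals on the log scale with z = 1.96; that SPSS used exactly this interval is an assumption, although the figures agree.

```python
import math

# ASSUMPTION: counts back-calculated from the abstract, not from the tables.
TP, FP, FN, TN = 17, 52, 6, 138  # US+/FNAC+, US+/FNAC-, US-/FNAC+, US-/FNAC-
Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def odds_ratio(tp: int, fp: int, fn: int, tn: int, z: float = Z_95):
    """Odds ratio of a positive FNAC (US+ vs US-) with a Wald CI."""
    estimate = (tp * tn) / (fp * fn)
    se_log = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


def relative_risk(tp: int, fp: int, fn: int, tn: int, z: float = Z_95):
    """Risk of a positive FNAC in US+ cases relative to US- cases."""
    risk_pos = tp / (tp + fp)  # positive FNAC among positive US
    risk_neg = fn / (fn + tn)  # positive FNAC among negative US
    estimate = risk_pos / risk_neg
    se_log = math.sqrt(1 / tp - 1 / (tp + fp) + 1 / fn - 1 / (fn + tn))
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


or_est, or_lo, or_hi = odds_ratio(TP, FP, FN, TN)
rr_est, rr_lo, rr_hi = relative_risk(TP, FP, FN, TN)
print(f"OR = {or_est:.3f} (95% CI {or_lo:.3f}, {or_hi:.3f})")
print(f"RR = {rr_est:.3f} (95% CI {rr_lo:.3f}, {rr_hi:.3f})")
```

The wide intervals reflect the small number of FNAC-positive cases, in particular the few with a negative US scan, which dominate the standard error on the log scale.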

Discussion
As previously discussed, the current global consensus is that thyroid US on its own is insufficient to diagnose thyroid cancer. However, with the constantly evolving and advancing field of radiology, new studies are emerging that attempt to challenge this idea. A study conducted in Turkey in 2021 compared the reliability of the ATA and TI-RADS guidelines for thyroid US with FNAC results. It concluded that both guidelines can accurately predict malignancy, and may in fact eventually lead to a decrease in unnecessary FNAs [18].
Another study comparing the ATA, British Thyroid Association and TI-RADS guidelines showed that all three had sensitivities and NPVs of over 90%, with the ATA guidelines performing best at 98% and 95%, respectively [19].
In our study, we demonstrated a sensitivity, specificity, NPV and PPV of 75.0%, 68.7%, 97.9% and 12.7%, respectively.  [20]. The latter showed the combined approach resulted in at least 97% sensitivity and 97% NPV [21]. The findings of these studies, among others, show promise in the possibility of eliminating or reducing the need for FNAC in thyroid nodules.
The current gold standard for diagnosing thyroid cancer is FNAC. A meta-analysis of the Bethesda reporting system found that the sensitivity, specificity, NPV and PPV were 97%, 50.7%, 96.3% and 55.9%, respectively [26]. Another study comparing the effectiveness of TI-RADS criteria in US to the Bethesda reporting system of FNAC demonstrated that the Bethesda reporting system had a sensitivity, specificity, and accuracy of 90%, 94.3% and 91.1%, respectively [27]. These values are not significantly superior to the values demonstrated by ultrasonography alone in some of the previously mentioned studies. Furthermore, a study conducted in one center demonstrated a false negative rate of FNAC of 15%, concluding that the Bethesda risk stratification system often underestimates malignancy rates [28].
The results of our study should be considered in the context of the following limitations, one being our small sample size. A larger study would produce more reliable data, especially if conducted in a specialist center. A second limitation is the nature of the study itself. Patients are only referred for FNAC if they have suspicious findings on US. Therefore, it is difficult to assess the proportion of false-negative cases, that is, patients who would have a positive FNAC despite a negative US scan. Once again, a larger sample size may give more reliable data in light of this issue. Another limitation is the nature of healthcare in the region where this study was conducted, where many patients seek private healthcare, often outside of the country. This resulted in a large proportion of missing patient data in the hospital system and is one of the reasons why so many patients had to be excluded from our study, as previously mentioned.

Conclusions
US was shown to be a reliable tool in the assessment of TNs. There is considerable discrepancy in the literature regarding this topic. Some studies, and the current consensus, suggest that US cannot be used without FNAC to diagnose TNs. Other studies, however, suggest that advances in US quality and techniques have brought US up to par with FNAC in terms of accuracy. With the constant development and evolution of imaging techniques, we expect US scans to become increasingly reliable.
Highly specialized radiology centers with modern equipment and adequate experience may soon be able to, without the use of FNACs, achieve results that are on par with our current accepted standard. Therefore, we recommend that large studies be conducted in such centers to assess and compare the reliability of modern US scans in diagnosing TNs to the current gold standard of FNAC.

Additional Information
Disclosures
Human subjects: Consent was obtained or waived by all participants in this study. Ethics Committee, Al-Ahli Hospital, Doha, Qatar issued approval EC2-2022. Animal subjects: All authors have confirmed that this study did not involve animal subjects or tissue. Conflicts of interest: In compliance with the ICMJE uniform disclosure form, all authors declare the following: Payment/services info: All authors have declared that no financial support was received from any organization for the submitted work. Financial relationships: All authors have declared that they have no financial relationships at present or within the previous three years with any organizations that might have an interest in the submitted work. Other relationships: All authors have declared that there are no other relationships or activities that could appear to have influenced the submitted work.