An Analysis of the Readability of Online Sarcoidosis Resources

Introduction: Sarcoidosis is an inflammatory disease characterized by the formation of noncaseating granulomas in multiple organ systems. The presentation varies widely: some patients are asymptomatic, whereas others present with symptomatic multiorgan involvement. Considering the potential severity of the disease, patients need to be well-informed about sarcoidosis to better manage their health. This study aims to assess the readability levels of online resources about sarcoidosis.
Methods: We conducted a retrospective cross-sectional study. The term "sarcoidosis" was searched online using both Google and Bing to find websites written in English. Each website was categorized by type: academic, commercial, government, nonprofit, or physician. The readability scores for each website were calculated using six different readability tests: the Flesch-Kincaid reading ease (FKRE), Flesch-Kincaid grade level (FKGL), Gunning fog score (GFS), Simple Measure of Gobbledygook (SMOG), automated readability index (ARI), and Coleman-Liau index (CLI). FKRE gives a score that corresponds to the difficulty of the text, while the remaining tests give a score that corresponds to a grade level in terms of reading ability. A one-sample t-test was used to compare all test scores with the national recommended standard of a sixth-grade reading level. Our null hypothesis was that the readability scores of the websites searched would not differ statistically significantly from the sixth-grade reading level and that there would be no significant differences across website categories. To evaluate the difference between the categories of websites, ANOVA testing was used.
Results: Thirty-four websites were analyzed. For each of the six readability tests, the mean score corresponded to text that was significantly harder to read than the nationally recommended sixth-grade reading level (p<0.001). None of the mean readability scores showed a statistically significant difference across the five website categories.
Conclusions: This is the first study, to our knowledge, to examine the readability of online English resources on sarcoidosis and calculate standardized readability scores for them. It indicates that the online English material for sarcoidosis is written above the reading level recommended for patients. There is a need to simplify the material so that it is easier for patients to read.


Introduction
Sarcoidosis is an inflammatory disease characterized by the formation of noncaseating granulomas in multiple organ systems [1]. The prevalence of this disease is approximately 10-20 per 100,000 individuals [1]. Sarcoidosis affects Black Americans more frequently than White Americans and typically occurs in younger patients [2]. Its presentation ranges from asymptomatic in some patients to symptomatic multiorgan involvement in others [3]. It is theorized that a combination of genetic and environmental factors is responsible, with possible infectious and autoimmune involvement, but the exact cause remains undefined [4]. Pulmonary fibrosis is the most common cause of death from sarcoidosis in Western countries [5]. Sarcoidosis is typically treated with corticosteroids and immunosuppressive agents, such as methotrexate, azathioprine, or an anti-tumor necrosis factor medication [6]. The mortality rate from this disorder is 1%-8% [7].
Given the potential severity of the disease, it is important for patients to adhere to their medication. Research has shown that patients who are more educated about their chronic disease tend to be more adherent to medication than patients who are not [8]. This is plausible, as such patients have a better understanding of what the medication does to combat the disease and of the risks of not taking it [8]. Patients with sarcoidosis who are more adherent to their medication have also been found to have a higher quality of life than those who are not [9]. Patients who are more "health-literate" have been reported to have higher rates of medication adherence than those who are less "health-literate" [10]. Health literacy is defined as "the personal, cognitive, and social skills that determine the ability of individuals to gain access to, understand, and use information to promote and maintain good health" [11].
Internet healthcare resources play a critical role in patient information and decision-making [12]. These can include websites that give patients more information about identifying and managing their diseases. Sarcoidosis is no exception, as there are plenty of articles online discussing its symptoms, when patients should suspect it and talk to a physician, certain treatments, and more. One study found that more than half of people looking for health information had their health behavior influenced by online resources [13]. Because of the great influence the Internet can have over prospective patients, it is important to ensure that these sources are easily accessible. However, many online health sources can be difficult to understand because they are written with complex terminology and sentence structure [14].
The average adult reads at about an eighth-grade level [15]. Meanwhile, the average Medicare beneficiary reads at only a fifth-grade level [16]. Given the low literacy rates in America, the National Institutes of Health (NIH), the American Medical Association (AMA), and the United States Department of Health and Human Services (USDHHS) have all recommended that health education materials for patients be written at a sixth-grade reading level or below [17,18].
Given this fact, we wished to discover the readability level of articles on sarcoidosis commonly found on the Internet and whether they matched the official recommendations.

Materials And Methods
This study did not require Institutional Review Board approval, and there was no patient involvement. Based on the methodology by Mc Carthy et al., we performed a retrospective cross-sectional study [19]. Their methodology consisted of searching for "slipped upper femoral epiphysis" on different search engines, collecting the websites from the first two pages of search results, eliminating websites that met their exclusion criteria, and then analyzing the readability scores of the remaining websites using WebFx.com [19].
In March 2023, "sarcoidosis" was entered into Google and Bing. Table 1 shows the number of results for each Internet search engine. The inclusion criteria comprised the first 25 websites in English from each search engine; prior research has shown that people are unlikely to look at search results beyond the first 25 [20]. This corresponds to roughly the first two pages of search results on an Internet search engine. Prior readability studies have also used this cutoff [21,22].

Search engine    Hits returned
Google           24,500,000
Bing             359,000

TABLE 1: Results by search engine
Exclusion criteria removed duplicate websites, medical journals, pages requiring login information, and websites that could not be analyzed for readability. These exclusions are summarized in Table 2. Medical journals were excluded as they were considered too complex for a layperson to understand, following the reasoning in prior readability studies [19].

Medical journal           2
Login required            3
Unable to be analyzed     3
Websites included         34

TABLE 2: Summary of websites excluded
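The screening steps described above (pooling the results of both search engines, dropping duplicates, then applying the exclusion criteria) can be sketched as follows. This is only an illustrative sketch: the `Result` record, its flag names, and the `screen` function are hypothetical, and the study does not state how duplicates were identified, so exact URL matching is an assumption. The reported counts follow the same arithmetic: 25 + 25 - 8 duplicates = 42 unique websites, and 42 - (2 + 3 + 3) = 34 included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One search result (hypothetical record, not the authors' data format)."""
    url: str
    is_medical_journal: bool = False
    requires_login: bool = False
    analyzable: bool = True  # False if the readability tool cannot parse the page


def screen(google_results, bing_results):
    """Return (unique, included) after deduplication and exclusion."""
    seen = set()
    unique = []
    # Assumption: two results are duplicates when their URLs match exactly.
    for result in list(google_results) + list(bing_results):
        if result.url not in seen:
            seen.add(result.url)
            unique.append(result)
    included = [
        r for r in unique
        if not r.is_medical_journal and not r.requires_login and r.analyzable
    ]
    return unique, included
```

In practice, URL matching would probably need some normalization (trailing slashes, tracking parameters), which this sketch leaves out.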
Afterwards, the websites to be used in the analysis were classified by type, drawing on the methodology used by a previous readability study [19]. The five categories were academic, commercial, government, nonprofit, and physician. "Academic" included websites that were owned by or associated with a university. "Commercial" included websites that had advertisements. "Government" included websites associated with national governments or government agencies. "Nonprofit" included websites operated by nonprofit groups or nongovernmental organizations (NGOs). "Physician" included websites owned by individual physicians or physician groups (e.g., American Academy of Dermatology). These categories are shown in Table 3.

TABLE 3: Websites by category
WebFx.com is a free website that calculates the readability of other websites. This tool was used to analyze and collect data on the websites we chose. Notably, as indicated in Table 2, three websites were excluded from the final analysis because WebFx.com could not interpret them. WebFx.com provides scores for six readability tests: Flesch-Kincaid reading ease (FKRE), Flesch-Kincaid grade level (FKGL), Gunning fog score (GFS), Simple Measure of Gobbledygook Index (SMOG), Coleman-Liau index (CLI), and automated readability index (ARI). These tests are elaborated on in Table 4, adapted from Zhou et al., which offers a brief description of each test and the formula used to calculate the readability score [23].
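To make the six scores concrete, the sketch below computes each one from raw text counts. The GFS, SMOG, ARI, and CLI expressions follow the formulas listed in Table 4; the FKRE and FKGL expressions are the commonly published forms and are an assumption here. The function names and the sample counts are hypothetical, and this is not a description of how WebFx.com counts words, sentences, or syllables.

```python
import math


def flesch_reading_ease(words, sentences, syllables):
    # Commonly published form (assumption); a higher score means easier text.
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    # Commonly published form (assumption); the result is a grade level.
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def gunning_fog(words, sentences, complex_words):
    # Complex words are words with three or more syllables.
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))


def smog(sentences, complex_words):
    return 1.043 * math.sqrt(complex_words * 30 / sentences) + 3.1291


def automated_readability_index(characters, words, sentences):
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43


def coleman_liau(letters, words, sentences):
    letters_per_100_words = letters / words * 100
    sentences_per_100_words = sentences / words * 100
    return 0.0588 * letters_per_100_words - 0.296 * sentences_per_100_words - 15.8


# Hypothetical passage: 100 words, 5 sentences, 150 syllables,
# 12 complex words, 470 letters/characters.
sample = {
    "FKRE": flesch_reading_ease(100, 5, 150),
    "FKGL": flesch_kincaid_grade(100, 5, 150),
    "GFS": gunning_fog(100, 5, 12),
    "SMOG": smog(5, 12),
    "ARI": automated_readability_index(470, 100, 5),
    "CLI": coleman_liau(470, 100, 5),
}
```

For this hypothetical passage, every grade-level score lands well above 6 and the FKRE falls below 80, which is the pattern the study tests for.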

Test name    Description    Formula
Flesch-Kincaid reading ease (FKRE)
Gunning fog score (GFS): Predicts the reading grade level of written material. Invented by Robert Gunning. Uses the number of sentences, words, and complex words (defined as words with three or more syllables) in its equation.

TABLE 4: Information on readability tests
Adapted from Zhou et al. [23]
The FKRE is one of the most used measures of readability; a higher score corresponds to higher readability [24]. Table 5, adapted from Spadaro et al., shows FKRE values and their corresponding readability levels [24]. For example, a score of 60 would be appropriate for a 9th- or 10th-grade reading level. For the remaining five tests, scores are intended to correspond to grade levels. Thus, a score of around seven would be suitable for a seventh-grade reading level, while a score of nine would be appropriate for a ninth-grade reading level [23,25]. As such, for these five tests, in contrast to FKRE, a lower score corresponds to higher readability [25].
1.2.5042; RStudio Team, Boston, MA). RStudio was used for calculations of one-sided, one-sample t-tests; all remaining statistical tests were performed in SPSS, which is unable to calculate one-sided, one-sample t-tests. Significance was set at P values less than 0.05. ANOVA testing was used to compare the means and analyze differences among the five types of websites.
One-sample t-tests were performed using a sixth-grade reading level as the standard, which corresponds to a score of 80 for FKRE and 6 for non-FKRE tests (FKGL, GFS, SMOG, ARI, CLI). This is because the AMA and NIH have recommended that education materials for patients should not be written with a readability score higher than sixth-grade level [17,18].
Our null hypothesis was that the mean FKRE scores of the websites would be greater than or equal to 80 (so at sixth-grade reading level or easier), while the mean scores for non-FKRE tests (i.e., FKGL, GFS, SMOG, ARI, CLI) would be less than or equal to 6 (also at sixth-grade reading level or easier).
Our alternative hypothesis was that the mean FKRE score of the websites would be less than 80 (indicating a reading level harder than sixth grade), while the mean non-FKRE test scores for the websites would be greater than six (also indicating a reading level harder than sixth grade).
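The hypotheses above translate into a one-sample, one-sided t-test. The sketch below is a minimal standard-library version, not the RStudio procedure used in the study: the function names are hypothetical, the t distribution is integrated numerically with Simpson's rule, and the scores are made-up illustrative values rather than study data. Use `alternative="less"` for FKRE (tested against 80) and `alternative="greater"` for the grade-level tests (tested against 6).

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution (fine for moderate df, e.g. df < 300)."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """P(T <= t), via Simpson's rule on [0, |t|] and symmetry around zero."""
    a = abs(t)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t > 0 else 0.5 - area


def t_quantile(prob, df):
    """Inverse CDF by bisection; used for the one-sided confidence bound."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def one_sided_t_test(scores, target, alternative, level=0.95):
    """Return (t, df, p, bound) for H1: mean < target or H1: mean > target."""
    n = len(scores)
    df = n - 1
    se = stdev(scores) / math.sqrt(n)
    t = (mean(scores) - target) / se
    margin = t_quantile(level, df) * se
    if alternative == "less":
        return t, df, t_cdf(t, df), mean(scores) + margin  # upper bound
    if alternative == "greater":
        return t, df, 1 - t_cdf(t, df), mean(scores) - margin  # lower bound
    raise ValueError("alternative must be 'less' or 'greater'")


# Hypothetical FKRE scores for six websites (illustrative only).
hypothetical_fkre = [45.0, 50.0, 40.0, 55.0, 48.0, 42.0]
t_stat, dof, p_value, upper_bound = one_sided_t_test(hypothetical_fkre, 80, "less")
```

With `alternative="less"` only an upper confidence bound is meaningful, and with `alternative="greater"` only a lower bound, which is why each one-sided test reports a single bound.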

Results
Twenty-five websites were collected from the first two pages of results on each of Bing and Google, for a total of 50 websites. These websites are listed in Table 10 as supplementary material. Eight websites were duplicated between the two search engines, leaving 42 unique websites. Eight websites were excluded because they were medical journals, required logins, or could not be analyzed.
Table 7 shows the results of the t-test for FKRE. It was a one-sample, one-sided t-test conducted against a value of 80 with a significance level of 0.05. The P value was <0.05, indicating that the mean FKRE value of the websites analyzed on sarcoidosis is statistically significantly less than the recommended FKRE value of 80. Because the one-sided t-test examined whether the mean FKRE value would be less than 80, only the upper bound of the 95% confidence interval is reported. The upper bound of the 95% confidence interval is 50.88. This means that we are 95% confident that the mean FKRE value of the websites is lower than 50.88.

Type of test                   Mean value    Significance (one-tailed P value)    Upper bound of 95% confidence interval
Flesch-Kincaid reading ease    46.9          <0.001                               50.8834

TABLE 7: One-sample t-test comparing mean readability score with recommended standards for FKRE
Table 8 shows the results of the t-tests for the non-FKRE reading tests. All five of these t-tests were one-sample, one-sided t-tests, conducted against a value of 6 with a significance level of 0.05. Since our alternative hypothesis was that mean non-FKRE test scores for the websites would be greater than 6, we only obtained the lower bound of the 95% confidence interval. For all five t-tests, the P value is <0.05, indicating that each reading test value analyzed for the websites on sarcoidosis is statistically significantly greater than the recommended value of 6. The lower bound of the 95% confidence interval is also listed for each test in Table 8, indicating that we are 95% confident that the mean value for each of those tests for the websites is greater than the number listed.

FKRE tests
Table 9 presents the results of the one-way ANOVA, which compares each readability test's mean value across the five website categories. None of the readability tests showed a statistically significant difference across the website categories, as the P value was >0.05 for each. Therefore, the website categories did not differ statistically significantly in readability on any of the readability tests.
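For readers unfamiliar with one-way ANOVA, the sketch below computes the F statistic that such a comparison rests on: the ratio of between-category to within-category variance. It is an illustration only (the study ran its ANOVA in SPSS); the function name and the category scores are hypothetical, and the P value, which would come from the F distribution with the returned degrees of freedom, is not computed here.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    all_scores = [x for group in groups for x in group]
    grand_mean = mean(all_scores)
    k = len(groups)        # number of website categories
    n = len(all_scores)    # total number of websites
    # Variation of the category means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual websites around their own category mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical FKGL scores per category (illustrative values, not study data).
hypothetical_fkgl = {
    "academic": [8.9, 9.4, 8.1],
    "commercial": [8.2, 9.0, 8.8],
    "government": [8.5, 8.0, 9.1],
    "nonprofit": [9.2, 8.4, 8.7],
    "physician": [8.6, 9.3, 8.3],
}
f_stat, df_b, df_w = one_way_anova_f(list(hypothetical_fkgl.values()))
```

An F statistic near or below 1 means the category means differ no more than the spread within categories would predict, which is consistent with a non-significant result.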

Discussion
This study shows that current online resources for sarcoidosis are inadequate for patient education. As was mentioned earlier, an FKRE value of 80 corresponds to the government-recommended sixth-grade reading level [17,18]. The mean FKRE value of the websites in our analysis was 46.9. Using a one-sample, one-tailed t-test, this mean value was statistically significantly lower than a score of 80 (p<0.001). This indicates that these websites are more difficult to read than a sixth-grade reading level. We found similar results for the other reading tests. As mentioned, the non-FKRE tests are grade-level indicators, meaning that their scores correspond directly to the reading level of the text. For the non-FKRE tests (FKGL, GFS, SMOG, ARI, CLI), when comparing their respective mean values (8.7, 9.1, 7.0, 7.2, 15.1) using a one-sample, one-tailed t-test, they were all statistically significantly greater than the government-recommended sixth-grade reading level (p<0.001). This demonstrates that websites about sarcoidosis found on Google and Bing are not written at the optimal level for patient education and health literacy regarding this disease.
The disappointing results for readability in online sarcoidosis material appear to match similar results for other immune-mediated diseases. Studies on the readability of online resources on lupus, systemic sclerosis, and Sjogren's syndrome reveal readability numbers that are well beyond the recommended standards [26-28]. In fact, the FKGL scores for systemic sclerosis (11.5) and Sjogren's syndrome (12.21) are even higher than the FKGL score we found for sarcoidosis (8.7) in our analysis [26,28]. This readability issue is not isolated to this field of medicine. Various other disciplines across the medical field also struggle with online resources being above the recommended patient reading levels, including urology, neurosurgery, pediatrics, and more [19,29-32].
The ANOVA revealed that there were no statistically significant differences in readability among the different website types. This shows that there is no particular resource type that patients can rely on for education on sarcoidosis. Unfortunately, all of the website categories are above the recommended reading levels set by expert organizations. While one could potentially excuse academic and physician websites for being written at a high reading level, as their target audience likely includes clinicians or scientists trying to refresh their knowledge about a disease rather than patients, this reasoning does not apply to commercial, government, and nonprofit websites. The creators of these websites should try to make them easier for patients to read by using a less complex lexicon. Moreover, academic and physician websites could also simplify their language and avoid jargon so that patients who do use them are more likely to understand the content.
These findings have real consequences. The end result is a missed opportunity to tackle health literacy across these fields and regarding these diseases. Low health literacy has been estimated to cost between $50 billion and $73 billion per year [33]. Health literacy is the single best predictor of health status, so patients need to have a good understanding to have the best medical outcome [33]. Patients with limited health literacy have considerable gaps in their knowledge; therefore, they struggle to follow self-care and medical advice [34]. Health literacy also enhances patient-physician communication, which can substantially improve compliance and clinical outcomes [35]. A lack of comprehensible resources can cause issues with the patient-physician relationship and the patient's understanding of their ailment. Thus, boosting health literacy by having more easy-to-read material on the internet is imperative to help lead to more favorable outcomes for patients, including patients managing sarcoidosis. This could potentially be achieved by using simpler language, avoiding jargon, and other means.
Limitations of this study include the fact that only English-language websites were analyzed. Census data indicate that approximately 22% of nearly 42 million Spanish-speaking Americans either did not speak English well or did not speak it at all [36]. This means that there are 9-10 million Americans who cannot get health information from English websites; rather, they use websites written in Spanish for their health information needs. This is worth investigating in a future research project. Additionally, readability formulas, while convenient, are not definitive tools for measuring readability. These tools may have shortcomings, as some of the formulas use the number of syllables to gauge the difficulty of reading. For example, text containing words such as "dermal" and "pleural" would be considered easier to understand than text with words such as "steroids" and "disease." The former, while shorter, are part of the medical lexicon that the average individual is less likely to know, while the latter are more mainstream and likely to be known.

Conclusions
To our knowledge, this is the first study to evaluate the readability of content on sarcoidosis on websites written in English. It provides significant evidence that the material available on the internet is beyond the recommended literacy level for patients. Interventions such as intentionally simplifying the language and avoiding jargon should be undertaken to increase readability and simplify information for patient education. This could lead to improved disease outcomes, as patients would be better equipped to make appropriate healthcare decisions. Further studies should be conducted to examine the readability of online resources written in Spanish, given the prevalence of Spanish-speaking populations.

Gunning fog score (GFS) formula: 0.4[(words/sentences) + 100(complex words/words)]
Simple Measure of Gobbledygook (SMOG): Predicts the reading grade level of written material. Designed by G Harry McLaughlin. Uses the number of sentences and complex words (defined as words with three or more syllables) in its equation. Formula: 1.043√(complex words x 30 / number of sentences) + 3.1291
Automated readability index (ARI): Predicts the reading grade level of written material. Developed by RJ Senter and EA Smith for the US Air Force. Uses the number of words, sentences, and characters in its equation. Formula: 4.71(characters/words) + 0.5(words/sentences) - 21.43
Coleman-Liau index (CLI): Predicts the reading grade level of written material. Created by Meri Coleman and TL Liau. Uses the average number of letters per 100 words and the average number of sentences per 100 words in its equation. Formula: 0.0588(average number of letters per 100 words) - 0.296(average number of sentences per 100 words) - 15.8

The mean readability values for FKRE are presented in Figure 1. The mean readability values for non-FKRE tests are shown in Figure 2. All categories had average readability scores above the recommended sixth-grade reading level.

Table 2 summarizes the websites excluded. A total of 34 websites were used in the analysis, with 11 unique to Google, 15 unique to Bing, and 8 found on both search engines. Table 3 shows the included websites separated into five categories.

Table 6. All readability test scores were above the sixth-grade reading level.