Algorithms Identifying Patients With Acute Exacerbation of Interstitial Pneumonia and Acute Interstitial Lung Diseases Developed Using Japanese Administrative Data

Background: We aimed to develop algorithms to identify patients with acute exacerbation of interstitial pneumonia and acute interstitial lung diseases using Japanese administrative data. Methods: This single-center validation study examined diagnostic algorithm accuracies. We included patients >18 years old with at least one claim that was a candidate for acute exacerbation of interstitial pneumonia, acute interstitial lung diseases, and pulmonary alveolar hemorrhage who were admitted to our hospital between January 2016 and December 2021. Diagnoses of these conditions were confirmed by at least two respiratory physicians through a chart review. The positive predictive value was calculated for the created algorithms. Results: Of the 1,109 hospitalizations analyzed, 285 and 243 were for acute exacerbation of interstitial pneumonia and acute interstitial lung diseases, respectively. As there were only five cases of pulmonary alveolar hemorrhage, we decided not to develop an algorithm for it. For acute exacerbation of interstitial pneumonia, acute interstitial lung diseases, and acute exacerbation of interstitial pneumonia or acute interstitial lung diseases, algorithms with high positive predictive value (0.82, 95% confidence interval: 0.76-0.86; 0.82, 0.74-0.88; and 0.89, 0.85-0.92, respectively) and algorithms with slightly inferior positive predictive value but more true positives (0.81, 0.75-0.85; 0.77, 0.71-0.83; and 0.85, 0.82-0.88, respectively) were developed. Conclusion: We developed algorithms with high positive predictive value for identifying patients with acute exacerbation of interstitial pneumonia and acute interstitial lung diseases, useful for future database studies on such patients using Japanese administrative data.


Introduction
Interstitial lung disease (ILD) is the term used to describe a rare and heterogeneous group of diseases with high mortality. Some ILDs have an acute course, while others have a chronic fibrosing course with occasional acute exacerbation (AE), and severe respiratory failure occurs in both types [1]. The poor prognosis associated with AE of idiopathic pulmonary fibrosis (IPF; AE-IPF) is well reported. However, AEs of other fibrotic ILDs are also associated with poor prognoses, although little is known about them [2]. Pulmonary alveolar hemorrhage (PAH) is another disease that requires differentiation from the aforementioned diseases [1]. Each of these diseases requires hospitalization and sometimes results in fatal outcomes. Although two randomized controlled trials (RCTs) exist for one of these conditions, i.e., AE-IPF [3,4], it is not easy to conduct RCTs for such rare diseases and their external validity is limited. Therefore, it is necessary to conduct large observational studies on diverse patient populations utilizing real-world data to complement RCTs [5].
However, large-scale database studies on AE of interstitial pneumonia (IP), acute ILDs, and PAH remain scarce [6]. This scarcity may be attributed to challenges in the accurate diagnosis of these diseases, especially because important imaging information cannot be accessed in database studies. For accurately identifying cases in large-scale database research, it is crucial to conduct validation studies that evaluate the accuracy

Setting and patients
We conducted a retrospective, single-center validation study to examine the accuracy of diagnostic algorithms for AE-IP, acute ILDs, and PAH. The study was conducted at the Saiseikai Kumamoto Hospital, which is a 400-bed tertiary teaching hospital located in Kumamoto, Japan. We included patients aged >18 years with at least one claim for the following diagnostic International Classification of Diseases, 10th Revision (ICD-10) codes: J679, J702, J704, J80, J82, J841, J849, J984, M331, M332, or R048 (detailed explanation in the subsequent paragraph) who were admitted to our hospital between January 2016 and December 2021 (Table 1). We did not exclude patients with multiple hospitalizations and considered each hospitalization separately.

TABLE 1: Disease names adopted as inclusion criteria
AE, acute exacerbation; AIP, acute interstitial pneumonia; ALI, acute lung injury; ARDS, acute respiratory distress syndrome; ARF, acute respiratory failure; CPFE, combined pulmonary fibrosis and emphysema; DM, dermatomyositis; DAD, diffuse alveolar damage; DPC, diagnosis procedure combination; EP, eosinophilic pneumonia; HP, hypersensitivity pneumonitis; ICD-10, the International Statistical Classification of Diseases and Related Health Problems, Tenth Revision; IIP, idiopathic interstitial pneumonia; ILD, interstitial lung disease; IP, interstitial pneumonia; IPF, idiopathic pulmonary fibrosis; NSIP, nonspecific interstitial pneumonia; OP, organizing pneumonia; PM, polymyositis; UIP, usual interstitial pneumonia
*Disease name in main diagnosis, diagnosis that triggered hospitalization, diagnosis using the most medical resources, or diagnosis using the second most medical resources in the DPC data

We extracted data from the Diagnosis Procedure Combination (DPC), a payment system used in acute care hospitals throughout Japan to reimburse healthcare providers for their medical services [10,11]. Under DPC, providers are paid a fixed amount for each diagnosis-procedure combination, rather than payment according to a fee-for-service model. It is intended to encourage efficient and cost-effective healthcare delivery by reducing unnecessary testing and procedures. Each diagnosis and procedure is assigned a specific DPC code, which is used to determine the amount of money reimbursed to the healthcare provider. The DPC codes are used to classify diagnoses and procedures into categories based on their complexity and resource use. In the DPC, the diagnoses are divided into the following six categories: (1) the main disease, (2) the disease that triggered hospitalization, (3) the disease that required the use of most medical resources, (4) the disease that required the use of second-most medical resources, (5) comorbidities present at admission, and (6) conditions that developed after admission [12]. Disease names belonging to all these categories are assigned ICD-10 codes. Moreover, modifiers such as disease course (acute/chronic), suspicion, and AE can be appended to the disease names. In this study, we used the ICD-10 code for any of the diseases in the first three categories as the algorithm for patient inclusion. We ignored all the claims with "suspicion" attached to them. Meanwhile, receipt data, which are part of the DPC content, are submitted to the insurer for billing purposes. These include basic patient information (e.g., age, sex), disease names, certain medical procedures, and drugs (Figure 1). Although these disease names are not as finely classified as the DPC diagnoses, the main disease, disease that triggered hospitalization, and disease using the most medical resources are each flagged for the hospitalized patients. Some information on the administered procedures and drugs is reflected in the receipts, but not necessarily for those that are not calculated according to the DPC system as mentioned above. We extracted the following data: demographic data (age, sex), body mass index, Brinkman index, Barthel index score, Hugh-Jones classification, disease names, procedures (blood test: D007-28 for Krebs von den Lungen-6 (KL-6) testing and D007-35 for surfactant protein-D (SP-D) testing, imaging: E002-1 for chest CT, oxygen administration information: J024 for oxygen inhalation, J026-4 for high-flow therapy, and J045 for mechanical ventilation), medications (steroid therapy), and additional fee for emergency medical management.
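The inclusion rule described above can be illustrated with a minimal Python sketch. It assumes a hypothetical record layout (the `Diagnosis` class, its field names, and the category labels are illustrative and are not the actual DPC file format); only the ICD-10 codes, the use of the first three diagnosis categories, and the exclusion of "suspicion" claims come from the text.

```python
from dataclasses import dataclass

# ICD-10 codes listed in the inclusion criteria of this study.
TARGET_CODES = {
    "J679", "J702", "J704", "J80", "J82", "J841",
    "J849", "J984", "M331", "M332", "R048",
}

# Hypothetical labels for the first three DPC diagnosis categories:
# main disease, disease that triggered hospitalization, and disease
# that required the use of most medical resources.
INCLUSION_CATEGORIES = {"main", "trigger", "most_resources"}


@dataclass
class Diagnosis:
    """One diagnosis entry of a hospitalization (illustrative layout)."""
    admission_id: str
    category: str
    icd10: str
    suspected: bool = False  # True if the "suspicion" modifier is attached


def eligible_admissions(diagnoses):
    """Return the IDs of hospitalizations that meet the inclusion rule."""
    return {
        d.admission_id
        for d in diagnoses
        if d.icd10 in TARGET_CODES
        and d.category in INCLUSION_CATEGORIES
        and not d.suspected
    }
```

Because each hospitalization was considered separately, the sketch keys on an admission identifier rather than a patient identifier.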

Reference standard
AE-IP was diagnosed based on the 2016 AE-IPF criteria [13] as follows: (1) previous or concurrent diagnosis of ILD; (2) acute worsening or development of dyspnea, typically within 30 days; (3) high-resolution CT with new bilateral ground-glass opacity and/or consolidation on the background of fibrosis; and (4) deterioration not fully explained by cardiac failure or fluid overload. Acute ILDs included acute IP (AIP), cryptogenic organizing pneumonia (OP), acute hypersensitivity pneumonitis (HP), connective tissue disease-associated ILD (especially idiopathic inflammatory myopathy-associated ILD), acute eosinophilic pneumonia (EP), and drug-associated lung injury [1]. Among these, AIP and cryptogenic OP were diagnosed based on the statement of the American Thoracic Society/European Respiratory Society [14], and acute HP was diagnosed based on the international guidelines of the American Thoracic Society/Japanese Respiratory Society/Latin American Thoracic Association [15]. The diagnosis of idiopathic inflammatory myopathy was based on the European League Against Rheumatism/American College of Rheumatology classification criteria [16]. The diagnosis of acute eosinophilic pneumonia was based on modified Philit criteria [17], and that of drug-associated lung injury was based on criteria reported in a previous study [18]. Meanwhile, because the exact diagnosis of these diseases is sometimes difficult, some cases of acute ILDs were diagnosed through clinical judgment based on the above criteria. If a patient met the criteria for both AE-IP and acute ILDs, AE-IP was considered the diagnosis. For example, patients with drug-associated lung injury were classified as having AE-IP if they had a background of fibrosis on CT and as having acute ILDs if they did not. In the case of PAH, there is no gold standard for diagnosis. Hemoptysis and anemia, along with the findings of imaging (new bilateral infiltrates on chest radiographs/CT) and bronchoalveolar lavage (BAL) studies (progressively hemorrhagic BAL or hemosiderin-laden macrophages) help diagnose PAH [19][20][21]. However, it was reported that hemoptysis was initially absent in approximately one-third of the cases [20]. Furthermore, BAL is not always possible if a patient has severe respiratory failure. Therefore, PAH was diagnosed clinically based on the aforementioned findings.
Two respiratory specialists preliminarily established the above-mentioned criteria. Diagnoses of AE-IP, acute ILDs, and PAH were confirmed by at least two respiratory physicians based on the aforementioned criteria via chart review. Any conflicts were resolved through discussion. The physicians were blinded to how the claims were coded. The algorithms were derived using data on the disease names, emergency medical care claims, prescription drugs (corticosteroids), blood tests (KL-6, SP-D), imaging tests (chest CT), and oxygen administration (O2 supplementation, high-flow therapy, and mechanical ventilation). First, the accuracy of each disease name in the inclusion criteria, as well as the combination of these disease names with other information (e.g., the AE modifier and emergency admission flag), were evaluated for the diagnosis of AE-IP, acute ILDs, and PAH. Based on previous studies [8,22], we also evaluated the accuracy of combining each disease name with acute respiratory failure (ARF) (J960). If fewer than 10 patients had any of the three conditions, we discontinued the creation of an algorithm for that disease. Furthermore, for disease combinations with <10 cases, no additional information on drugs, tests, or procedures was considered. Next, algorithms were created to identify each disease by combining the disease names with a high positive predictive value (PPV). We then developed two sets of algorithms: 1) specific algorithms (narrow algorithms) and 2) algorithms with a marginally lower PPV but higher true positives (TPs) (broad algorithms). When creating the narrow algorithms, we combined disease names with PPVs of 0.8 or above, with higher PPVs receiving preference. Meanwhile, the broad algorithms were developed by merging disease names with PPVs of 0.7 or greater, with a focus on having more TPs. A sample size of approximately 100 is sufficient for validation studies that only attempt to obtain the PPV [23]. Therefore, we created each algorithm using more than 100 cases.
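The threshold-based combination step can be sketched as follows. This is a simplified illustration, not the study's actual procedure: it assumes each candidate disease name (or disease name plus modifier) is represented by the set of hospitalization IDs it flags, and it selects every candidate at or above the threshold, without modeling the preference for higher PPVs or the manual judgment about TPs described above. The function names and the toy data are hypothetical.

```python
def name_ppv(flagged, confirmed):
    """PPV of one candidate: confirmed cases among flagged hospitalizations."""
    if not flagged:
        return 0.0
    return len(flagged & confirmed) / len(flagged)


def combine(flagged_by_name, confirmed, threshold):
    """Combine all candidates whose individual PPV meets the threshold.

    Returns the selected names, the true positives, the false positives,
    and the PPV of the combined (union) algorithm.
    """
    selected = [
        name
        for name, flagged in flagged_by_name.items()
        if name_ppv(flagged, confirmed) >= threshold
    ]
    union = set().union(*(flagged_by_name[name] for name in selected))
    tp = len(union & confirmed)
    fp = len(union) - tp
    return selected, tp, fp, name_ppv(union, confirmed)


# Toy data: hospitalization IDs flagged by three hypothetical disease names,
# and the IDs confirmed by chart review.
flagged_by_name = {
    "name_a": {1, 2, 3, 4, 5},
    "name_b": {6, 7, 8, 9},
    "name_c": {10, 11, 12},
}
confirmed = {1, 2, 3, 4, 5, 6, 7, 8}

narrow = combine(flagged_by_name, confirmed, threshold=0.8)
broad = combine(flagged_by_name, confirmed, threshold=0.7)
```

In this toy example the narrow rule keeps only `name_a`, whereas the broad rule also admits `name_b`, gaining three TPs at the cost of one false positive.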

Statistical analysis
For diagnosis, we evaluated the interobserver agreement between the two respiratory physicians using Gwet's AC1-index, a measure of interobserver agreement that is less affected by prevalence than the kappa statistic [24].
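For readers unfamiliar with the index, the following minimal sketch computes the point estimate of Gwet's AC1 for two raters using its standard definition: observed agreement corrected by a chance-agreement term built from the average marginal proportion of each category. The function name and the example labels are illustrative, and the confidence interval reported in this study is not reproduced here.

```python
from collections import Counter


def gwet_ac1(rater_a, rater_b, categories):
    """Point estimate of Gwet's AC1 for two raters.

    rater_a, rater_b: equal-length sequences of category labels.
    categories: all possible categories (at least two).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if len(categories) < 2:
        raise ValueError("at least two categories are required")

    n = len(rater_a)
    # Observed agreement: share of subjects given the same label by both raters.
    p_a = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Average marginal proportion of each category across the two raters.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    pi = [(counts_a[c] + counts_b[c]) / (2 * n) for c in categories]

    # Chance agreement under Gwet's model.
    p_e = sum(p * (1 - p) for p in pi) / (len(categories) - 1)

    return (p_a - p_e) / (1 - p_e)
```

For example, `gwet_ac1(["AE", "AE", "ILD", "none"], ["AE", "ILD", "ILD", "none"], ["AE", "ILD", "none"])` gives an observed agreement of 3/4 and a chance agreement of 21/64, so the index equals 27/43, or about 0.63.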
For each disease name and its combinations, the PPV was computed as the proportion of TPs to the sum of TPs and false positives. Moreover, the 95% confidence intervals (CIs) for the binomial distribution were computed using an exact method. As a chart review was not performed for patients without the target ICD-10 codes, the sensitivity, specificity, and negative predictive value could not be calculated. Statistical analyses were performed using Stata software (Stata Statistical Software, Release 1, StataCorp LLC, College Station, TX).
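As an illustration of the PPV calculation, the sketch below computes the PPV together with an exact binomial interval. It assumes that the "exact method" refers to the Clopper-Pearson interval, which is a common choice but is not stated explicitly in the text; the analyses in this study were run in Stata, not with this code. The bounds are found by bisection on the binomial tail probabilities using only the Python standard library, which is adequate for the moderate sample sizes considered here.

```python
from math import comb


def ppv(tp, fp):
    """Positive predictive value: TP / (TP + FP)."""
    return tp / (tp + fp)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect_decreasing(f, iterations=100):
    """Root of a function that decreases from positive to negative on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(tp, n, alpha=0.05):
    """Clopper-Pearson interval for tp successes out of n trials."""
    # Lower bound: the p at which P(X >= tp) equals alpha / 2.
    if tp == 0:
        lower = 0.0
    else:
        lower = _bisect_decreasing(
            lambda p: alpha / 2 - (1 - binom_cdf(tp - 1, n, p))
        )
    # Upper bound: the p at which P(X <= tp) equals alpha / 2.
    if tp == n:
        upper = 1.0
    else:
        upper = _bisect_decreasing(lambda p: binom_cdf(tp, n, p) - alpha / 2)
    return lower, upper
```

A call such as `exact_ci(tp, tp + fp)` returns the lower and upper bounds that would accompany `ppv(tp, fp)`.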

Ethics
This study was approved by the Ethics Committee of Saiseikai Kumamoto Hospital (approval number: 1065) and Kyoto University Graduate School and Faculty of Medicine (approval number: R3596). The requirement for written informed consent was waived because of the retrospective nature of the study. We explained this study on the website of our hospital and the patients were given the opportunity to opt out.

Results
A total of 1,109 hospitalizations were analyzed, including 285 for AE-IP and 243 for acute ILDs. We judged that 576 admissions were not for AE-IP, acute ILDs, or PAH, of which 94 were not for IP and 482 were for IP but not for an AE or acute ILDs. Of the 842 included patients, 667, 114, 42, 12, 3, 3, and 1 had been admitted once, twice, three times, four times, five times, six times, and seven times, respectively. Only 5 patients had PAH; therefore, we did not develop an algorithm for PAH detection. The patient characteristics are shown in Table 2. The interobserver agreement assessed using Gwet's AC1-index was 0.78 (95% CI, 0.76-0.81). The PPVs with 95% CIs for each disease name are shown in Table 3. Several disease names (such as diffuse IP, nonspecific IP, diffuse alveolar damage, and acute lung injury) occurred in zero patients. In addition, Table 4 shows the disease names with additional information on steroid therapy, and Table 5 shows those with information on KL-6 and SP-D tests, chest CT, O2 supplementation, high-flow therapy, and mechanical ventilation. We included only additional information on steroid therapy in the algorithm because including the other information reduced the number of patients and the information did not outweigh the information on steroid therapy. In the receipt data, the recorded percentages of blood tests, imaging tests, procedures, and drugs were low, even though they were actually performed in clinical practice (KL-6: 37/771, 5%; SP-D: 56/574, 10%; chest CT: 34/437, 8%; oxygen inhalation: 48/776, 6%; high-flow therapy: 5/87, 6%; mechanical ventilation: 21/174, 12%; and steroid therapy: 101/668, 15%). However, this information was reflected in the DPC dataset. Therefore, we decided to create two groups of algorithms, depending on whether additional information on steroid therapy was available: 1) when only receipt data were available and 2) when data from the entire DPC dataset were available. In cases of prolonged hospitalization, where multiple hospitalization receipts could have been generated, the receipts from the last month were used. The PPVs and 95% CIs for each disease, generated using receipt data alone, are shown in Table 6.

Narrow Algorithm Without Corticosteroid Information
Several disease names that had PPVs higher than 0.8 for AE-IP were identified, including those with AE modifiers (AE-idiopathic IP, AE-IP, and AE-HP). Combining these disease names, a narrow algorithm without corticosteroid information with a PPV of 0.82 (95% CI, 0.76-0.86) was derived (Table 7).

Narrow Algorithm With Corticosteroid Information
In addition to the narrow algorithm of AE-IP without corticosteroid information, several disease names that combined IPF with AE modifiers, emergency hospitalization, and information on steroid therapy had PPVs of 0.8 or higher. A narrow algorithm with corticosteroid information with a PPV of 0.84 (95% CI, 0.78-0.88) was derived by combining these disease names (Table 7).

Broad Algorithm Without Corticosteroid Information
In addition to the narrow algorithm of AE-IP without corticosteroid information, AE-IPF had a PPV higher than 0.7, and a broad algorithm without corticosteroid information with a PPV of 0.81 (95% CI, 0.75-0.85) was derived (Table 7).

Broad Algorithm With Corticosteroid Information
In addition to the narrow algorithm of AE-IP with corticosteroid information, AE-IPF had a PPV higher than 0.7, and a broad algorithm with corticosteroid information and a PPV of 0.82 (95% CI, 0.76-0.86) was derived (Table 7).

Narrow Algorithm Without Corticosteroid Information
Several disease names with PPVs higher than 0.8 for acute ILDs were identified, including EP, AE-OP, acute drug-induced ILDs, and drug-induced IP + emergency admission. Combining these disease names led to the development of a narrow algorithm without corticosteroid information with a PPV of 0.82 (95% CI, 0.74-0.88; Table 7).

Narrow Algorithm With Corticosteroid Information
In addition to the narrow algorithm of acute ILDs without corticosteroid information, several disease names that combined idiopathic IP, IP, OP, and pulmonary fibrosis with ARF, emergency hospitalization, and information on steroid therapy had PPVs higher than 0.8, and a narrow algorithm with corticosteroid information and a PPV of 0.82 (95% CI, 0.75-0.87) was derived by combining these disease names (Table 7).

Broad Algorithm Without Corticosteroid Information
Among the disease names of the narrow algorithm for acute ILDs without corticosteroid information, drug-induced IP and OP had PPVs higher than 0.7, without combinations with AE modifiers or emergency admission flags. Furthermore, acute IP and idiopathic IP + ARF had PPVs >0.7. Combining these disease names led to the development of a broad algorithm without corticosteroid information with a PPV of 0.77 (95% CI, 0.71-0.83; Table 7).

Broad Algorithm With Corticosteroid Information
In addition to the broad algorithm of acute ILDs without corticosteroid information, several disease names including idiopathic IP, pulmonary fibrosis, IP + emergency admission + information on steroid therapy, and IP + ARF + corticosteroid information had PPVs of 0.7 or higher. Combining these disease names, we finally derived a broad algorithm with corticosteroid information and a PPV of 0.77 (95% CI, 0.71-0.82; Table 7).

Algorithm for AE-IP or acute ILDs
Narrow Algorithm Without Corticosteroid Information
Several disease names with PPVs higher than 0.8 for either AE-IP or acute ILDs were identified, including AIP, acute drug-induced ILDs and IP, dermatomyositis-ILD, idiopathic IP + AE/ARF, OP + AE/emergency admission, and IP/HP + AE. We derived a narrow algorithm without corticosteroid information with a PPV of 0.89 (95% CI, 0.85-0.92) by combining these disease names (Table 7).

Narrow Algorithm With Corticosteroid Information
In addition to the narrow algorithm of AE-IP or acute ILDs without corticosteroid information, several disease names that combined IPF, idiopathic IP, IP, and pulmonary fibrosis with AE modifiers, ARF, emergency hospitalization, and information on steroid therapy had PPVs of 0.8 or higher. A narrow algorithm with corticosteroid information and a PPV of 0.89 (95% CI, 0.85-0.91) was derived by combining these disease names (Table 7).

Broad Algorithm Without Corticosteroid Information
Among the narrow algorithms of AE-IP or acute ILDs without corticosteroid information, OP had a PPV >0.7, without combinations with AE modifiers or emergency admission flags. In addition, IPF/pulmonary fibrosis + AE and IP/HP + ARF had PPVs >0.7. Finally, we derived a broad algorithm without corticosteroid information and with a PPV of 0.85 (95% CI, 0.82-0.88) by combining these disease names (Table 7).

Broad Algorithm With Corticosteroid Information
In addition to the broad algorithm of AE-IP or acute ILDs without corticosteroid information, IPF/idiopathic IP/IP + emergency admission + corticosteroid information and pulmonary fibrosis + emergency admission/ARF + information on steroid therapy had PPVs of 0.7 or higher. A broad algorithm with corticosteroid information and a PPV of 0.85 (95% CI, 0.82-0.88) was derived by combining these disease names (Table 7).
The narrow and broad algorithms for AE-IP and acute ILDs derived in this study had favorable PPVs. We recently reported a validation study of AE-IPF in which the PPVs for the narrow and broad algorithms were 0.72 (95% CI, 0.62-0.81) and 0.61 (0.53-0.68), respectively [8]. This multicenter study was conducted in eight Japanese tertiary centers and focused only on AE-IPF, excluding AE of other ILDs. Furthermore, a validation study of AE-IPF performed in the United States showed a PPV of 0.621 (95% CI, 0.533-0.704) [22]. Both algorithms of AE-IP derived in the present study had a higher PPV than the algorithms used in the previous studies. This may be because we did not distinguish only AE-IPF but also considered the AE of other IPs. It is often difficult to distinguish IPF from other fibrotic ILDs. However, after developing AE, patients with IPF and those with other fibrotic ILDs receive similar treatment, mainly high-dose corticosteroid therapy, and their prognosis is equally poor [25]. Therefore, we see no problem in lumping the AEs of IPF and other fibrotic ILDs together, and the algorithm developed in this study can be applied to studies of patients with AE-IP. A recent validation study examined the validity of DPC data for various respiratory diseases and found high specificity and PPVs for both IPF and non-IPF IP [9]. Both IPF (J841 for IPF in Japanese) and non-IPF (C966, D219, E85, and J841 for IPs other than IPF in Japanese and J60-J65, J67, J70, J840, M05, M06, and M30-35) IP data were extracted if one of these disease names was listed in the DPC database, and their algorithm did not include modifiers such as AE. Therefore, IP with an acute course was not considered extractable using their algorithm. In addition, the ICD-10 codes for non-IPF IPs considered in that study were more extensive than those in our study. However, the ICD-10 codes that were not included in our study denoted rare IPs and IPs with non-acute courses. Therefore, not using these ICD-10 codes for identifying AE-IP and acute ILDs was not considered a major problem. We investigated the validity of diagnosis for IP in a larger number of patients, especially those with AE-IP and acute ILDs, and developed algorithms with a high PPV.
The receipt data of hospitalized patients may be an unreliable source of information regarding the use of drugs and procedures. This is because a low percentage of these treatments and procedures are typically recorded on receipts, even if they were actually administered, whereas they are more thoroughly documented in the DPC database. The reason for this is that, under the Japanese DPC system, a fixed amount is paid based on the combination of disease and procedure instead of fee-for-service; thus, detailed claims are meaningless. It is inappropriate to use inpatient drug and procedure claims when conducting studies using receipt data.
Additional information on steroid therapy may be useful in identifying AE-IP or acute ILD cases; however, the impact of this information may depend on the algorithm used. In the present study, adding information on steroid therapy increased the PPV for most disease names. For a narrow algorithm, adding this information can increase the number of TPs without reducing the PPV and can help in the accurate identification of AE-IP or acute ILDs. In other words, algorithms based on steroid information help to extract target patients with higher sensitivity without compromising specificity. Meanwhile, for the broad algorithm, adding information on steroid therapy did not markedly increase the number of TPs. This suggests that including this information may be more helpful in identifying cases of AE-IP or acute ILDs in a limited patient population specified by researchers than in the general patient population.

Clinical and research implications
The algorithms developed in our study could contribute to future database studies using Japanese administrative data on AE-IP and acute ILDs. Some previous studies have used thresholds of 80-85% to denote the PPVs of their algorithms as "high" [26,27], and our algorithms have PPVs close to these thresholds and are sufficient to identify patients with AE-IP and acute ILDs, rare diseases with no established treatment. Although the algorithms developed in this study generally have PPVs above 80%, many of the false-positive patients were characterized by having IP but not AE or acute ILDs, which requires caution when interpreting eligible patients. Studies that examine drug efficacy and identify drug toxicities require large sample sizes. We believe our algorithms will be useful for database studies on drug efficacy and toxicity that require large sample sizes.
However, in this study, we did not calculate the sensitivity or specificity, but only calculated the PPV. The algorithms are not designed to include all of the target population, as there may be other patients with AE-IP and acute ILDs than those included in the present study. In other words, if the results of the present study are used in a database study, the entire population of patients with AE-IP and acute ILDs cannot be included. Therefore, further studies using random or "all possible cases" sampling are needed to calculate the sensitivity and specificity. Furthermore, large-scale studies are required to develop an algorithm for PAH detection, which we could not achieve due to the small number of TPs.

Strengths and limitations
Our study has several strengths. First, this is the first study to develop algorithms for AE-IP and acute ILDs using Japanese administrative data. Second, we have studied a large number of patients and developed algorithms with a high PPV.
The present study has several limitations. First, it was conducted at a single tertiary hospital in Japan, which may limit the generalizability of the insurance claims data and DPC in Japan as a whole. Furthermore, the results of this study are naturally not applicable to patients in other countries. Therefore, our results should be validated in other Japanese medical institutions and other countries. Second, AE-IP, acute ILDs, and PAH are sometimes difficult to accurately diagnose, and the decision of the two respiratory physicians depended on the tests conducted by the physicians who treated patients. Therefore, misclassification cannot be completely ruled out. Third, although patients with acute ILDs comprise a heterogeneous population, we considered them as a single group because it was clinically challenging to precisely identify the underlying cause in many cases. Therefore, the findings for this group should be interpreted with caution.

Conclusions
The algorithms developed in this study had a high PPV for identifying patients with AE-IP and acute ILDs. Incorporating information on steroid treatment may enhance the specificity of patient selection for AE-IP and acute ILD cases. However, administered drugs, procedures, and tests were found to be infrequently recorded in hospital claim data. Therefore, reliance on claim data alone for these data is not recommended for inpatient studies. These insights are valuable for future database research in Japan. The algorithms we have created can be used in future epidemiological studies using Japanese administrative data to investigate the efficacy and toxicity of treatments in patients with AE-IP and acute ILDs.

Appendices
Appendix A describes the checklist of reporting criteria for studies validating health administrative data algorithms (Table 8).

FIGURE 1: The contents of the entire DPC data and receipt data
ADL, activities of daily living; DPC, Diagnosis Procedure Combination

2024 Anan et al. Cureus 16(1): e53073. DOI 10.7759/cureus.53073

TABLE 8: Checklist of reporting criteria for studies validating health administrative data algorithms

TITLE, KEYWORDS, ABSTRACT
Identify article as study of assessing diagnostic accuracy ✔︎
Identify article as study of administrative data ✔︎

INTRODUCTION
State disease identification & validation one of goals of study ✔︎

METHODS
Participants in validation cohort:
Describe validation cohort (cohort of patients to which reference standard was applied) ✔︎
Describe recruitment procedure of validation cohort
• Inclusion criteria ✔︎
• Exclusion criteria ✔︎
Describe patient sampling (random, consecutive, all, etc.) ✔︎
Describe data collection ✔︎
• Who identified patients and did the selection adhere to patient recruitment criteria? ✔︎
• Who collected data ✔︎
• A priori data collection form ✔︎
• Disease classification ✔︎
• Split sample (i.e., re-validation using a separate cohort) ✔︎
Test Methods:
Describe number, training and expertise of persons reading reference standard ✔︎
If >1 person reading reference standard, quote measure of consistency (e.g., kappa) ✔︎
Blinding of interpreters of reference standard to results of classification by administrative data, e.g., chart abstractor blinded to how that chart was coded ✔︎
Statistical Methods:
Describe methods of calculating/comparing diagnostic accuracy ✔︎

RESULTS
Participants:
Report when study done, start/end dates of enrollment ✔︎
Describe number of people who satisfied the inclusion/exclusion criteria ✔︎
Study flow diagram ✔︎
Test results:
Report distribution of disease severity ✔︎
Report cross-tabulation of index tests by results of reference standard ✔︎
Estimates:
Report at least 4 estimates of diagnostic accuracy ✔︎
Diagnostic Accuracy Measures Reported:
Report accuracy for subgroups (e.g., age, geography, different sex, etc.) ✔︎
If PPV/NPV reported, ratio of cases/controls of validation cohort approximate prevalence of condition in the population ✔︎

Additional Information