Utilization of Electronic Health Records for Chronic Disease Surveillance: A Systematic Literature Review

This study reviews the current utilization of electronic health records (EHRs) for chronic disease surveillance, discusses approaches that are used in obtaining EHR-derived disease prevalence estimates, and identifies health indicators that have been studied using EHR-based surveillance methods. PubMed was searched for relevant keywords: (electronic health records [Title/Abstract] AND surveillance [Title/Abstract]) OR (electronic medical records [Title/Abstract] AND surveillance [Title/Abstract]). Articles were assessed based on detailed inclusion and exclusion criteria and organized by common themes, as per the PRISMA review protocol. The study period was limited to 2015-2021 because EHRs have been widely adopted in the U.S. only since 2015. The review included only U.S. studies and only those that focused on chronic disease surveillance. Seventeen studies were included in the review. The most common approaches the review identified focused on validating EHR-derived estimates against those from traditional national surveys. The most studied conditions were diabetes, obesity, and hypertension. The majority of reviewed studies demonstrated comparable prevalence estimates with traditional population health surveillance surveys. The most common approach for the estimation of chronic disease conditions was to use small-area estimation by geographic patterns, neighborhoods, or census tracts. The use of EHR-based surveillance systems for public health purposes is feasible, and the population health estimates appear comparable to those obtained through traditional surveillance surveys. The application of EHRs for public health surveillance appears promising and could offer a real-time alternative to traditional surveillance methods. A timely assessment of population health at local and regional levels would ensure a more targeted allocation of public health and healthcare resources as well as more effective intervention and prevention initiatives.


Introduction And Background
The wide adoption of electronic health records (EHRs) in the United States in recent years has presented an opportunity to consider additional use of the vast clinical data collected through EHRs in population health surveillance [1][2][3]. To date, most population health surveillance is conducted through national surveys, such as the Behavioral Risk Factor Surveillance System (BRFSS) and the National Health and Nutrition Examination Survey (NHANES). A wide variety of EHR data has already been used in infectious disease surveillance [4]. However, the use of EHRs for chronic disease surveillance is less established and is a novel area of inquiry and knowledge.
In contrast to traditional surveillance methods, which involve self-reporting and a time lag in disseminating collected data and public health responses, EHR-based chronic disease surveillance presents two main advantages: timeliness and population-specific disease prevalence estimates that could inform locally relevant programs and interventions [5]. Currently, a few EHR-based surveillance systems already exist. It is yet to be established whether such systems would be able to provide reliable disease prevalence estimates at the state and local levels [6].
A recent study identified and classified challenges and solutions in the application of EHR to disease surveillance systems worldwide, with no specific focus on either infectious or chronic disease surveillance models [7]. The purpose of this study is to specifically examine the uptake of EHRs in chronic disease surveillance in the United States and explore the themes and patterns that have emerged so far in the approaches used and the health indicators studied.
The objectives of this study were (1) to assess the uptake of EHRs in chronic disease surveillance nationwide; (2) to examine the approaches that are used within EHR-based surveillance systems or undertakings; and (3) to determine what chronic conditions have been studied through EHR-based surveillance methods.
The findings of this study were previously presented as a poster at the 2023 American Medical Informatics Association Informatics Summit on March 14, 2023.

Review Methods
This study followed the PRISMA review protocol; however, it has not yet been registered.
Eligibility for this systematic review was determined by a predefined list of inclusion and exclusion criteria (Table 1). These criteria were strictly followed to provide an accurate assessment of the current uptake of EHR-based population health surveillance models for chronic conditions nationwide. Because EHRs were not widely adopted until 2015, we limited the study period to 2015-2021. Only U.S. studies were included in the final review.

Inclusion criteria | Exclusion criteria
Studies that report using EHR data for disease surveillance purposes | Studies with publication years prior to 2015
Studies that focus on chronic conditions only | Opinion articles that do not discuss practical applications
Studies that report non-communicable disease prevalence or incidence | Systematic reviews and/or conference proceedings
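The search string and study period described in this review can be expressed as a reproducible query. The following is a minimal sketch, assuming the search were run through the NCBI E-utilities ESearch endpoint; the source states only that PubMed was searched, so the use of E-utilities, the function names, and the retmax value are illustrative assumptions. The code only builds the request URL and does not contact PubMed.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint (assumed here; the review does not
# say which PubMed interface was used).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query() -> str:
    """Reproduce the Title/Abstract search string reported in the review."""
    terms = ["electronic health records", "electronic medical records"]
    clauses = [
        f"({term}[Title/Abstract] AND surveillance[Title/Abstract])"
        for term in terms
    ]
    return " OR ".join(clauses)


def build_esearch_url(min_year: int = 2015, max_year: int = 2021,
                      retmax: int = 1000) -> str:
    """Build an ESearch URL restricted to the review's publication years.

    retmax=1000 is a hypothetical page size, not a value from the review.
    """
    params = {
        "db": "pubmed",
        "term": build_query(),
        "datetype": "pdat",        # filter on publication date
        "mindate": str(min_year),
        "maxdate": str(max_year),
        "retmax": str(retmax),
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"
```

Calling build_esearch_url() returns a URL that could be fetched with any HTTP client. A search run today would not necessarily return the same number of records as the original search, because PubMed indexing changes over time.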

Screening by relevant titles/topics and abstract content
Screening involved removing articles that were clearly identified as systematic reviews and opinion statements that were clearly identified as such in the title. Articles were also excluded on the basis of irrelevant titles: those with infectious disease themes, those addressing a chronic disease without mention of EHRs or electronic medical records (EMRs), and those using EHR data for purposes other than disease surveillance. Examples of such excluded topics include assessing the quality of primary care, using EHR data for predictive modeling, and using EHRs as a source of data to answer research questions that are not relevant to population health surveillance. We identified 135 articles based on the initial screening process.
Further screening by relevant abstracts resulted in identifying initial patterns for data synthesis and organization: (1) EHR data to assess the prevalence of chronic conditions; (2) evaluation, development, and/or validation of a new or existing surveillance system; (3) EHR data repositories used for analyzing disease characteristics; and (4) evaluation of EHR primary care data for generalizability, sensitivity, and specificity of disease detection.

Data collection process
Data were independently extracted from the reports. Variables of interest in the review studies are shown in Table 2.

Risk of bias across studies
The studies included in the review may not be representative of all existing EHR-based chronic disease surveillance efforts that are ongoing, introducing publication bias. A few studies were retrospective by design, so there may be a time lag in reporting surveillance estimates or the results of validation measures that have been achieved.
We identified 596 records using the search keywords. After initial screening, we selected 135 articles, which were further screened for relevance and for U.S.-only utilization of EHR-based chronic disease surveillance. Of the 44 articles that were screened based on full-text content, 20 were identified for final review (Figure 1). Four of those articles were combined into one study because they reported estimates on the same health indicators from the same surveillance system [5,23,24,25], leaving 17 studies in the review.
Table note: Only health indicators of interest were included in this table. *When no study design was specified or could be identified, EHR-based public health surveillance was used. **One of the three most common diagnostic categories in the college student population.
Table 4 presents all the variables of interest for each reviewed study. We identified eight health indicators across all studies. The most common chronic diseases to have been reported with EHR-based disease prevalence estimations were obesity, diabetes, and hypertension. The two most common study approaches were the validation of EHR-derived estimates against those from traditional surveillance surveys, such as the BRFSS or the NHANES, and the use of small-area estimation by geographic patterns, neighborhoods, or census tracts.
The most common limitation addressed in the studies was the sampling bias of EHR-derived population health estimates. The majority of the studies reviewed herein demonstrated comparable prevalence estimates between EHR-derived and traditional population surveillance surveys.
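The selection counts reported above can be checked with simple arithmetic. The following minimal sketch tallies how many items dropped out between consecutive stages and how merging four articles into one study yields the final study count. The stage labels and function names are illustrative and are not taken from the review's flow diagram.

```python
# Counts as reported in the text; labels are paraphrased for illustration.
STAGES = [
    ("records identified by keyword search", 596),
    ("articles retained after initial screening", 135),
    ("articles screened on full text", 44),
    ("articles selected for final review", 20),
]


def exclusions(stages):
    """Return (stage label, number dropped before reaching that stage)."""
    return [
        (later_label, earlier_n - later_n)
        for (_, earlier_n), (later_label, later_n) in zip(stages, stages[1:])
    ]


def merged_study_count(n_articles, merged_group_sizes):
    """Count studies when each group of related articles counts as one study."""
    return n_articles - sum(size - 1 for size in merged_group_sizes)


for label, dropped in exclusions(STAGES):
    print(f"{dropped} dropped before: {label}")

# Four articles on the same surveillance system were treated as one study.
print(merged_study_count(20, [4]), "studies in the final review")
```

The differences between stages are 461, 91, and 24, and the 20 articles collapse to 17 studies, which matches the number of studies stated in the abstract.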

Summary of Evidence
The utilization of EHR-based chronic disease surveillance is still nascent, with most of the reviewed studies being proof-of-concept, feasibility, or validation efforts. The most common validation approach is to compare EHR-derived estimates with those from traditional nationally representative surveillance surveys (e.g., NHANES, BRFSS, and ACS). This approach has shown comparable results across studies. Only four studies address disease prevalence estimates in children. Hence, there may be an opportunity and a need to develop more robust chronic disease surveillance models in the pediatric population.
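To make the validation idea concrete, the sketch below compares a crude EHR-derived prevalence with a survey estimate. It is an illustration only and is not the method of any reviewed study. It assumes the EHR sample can be treated as a simple random sample, which ignores the sampling bias the reviewed studies report, and it treats the survey estimate as a fixed number with no uncertainty of its own. All function names and input numbers are hypothetical.

```python
import math


def wilson_interval(cases, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = cases / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half


def compare_with_survey(ehr_cases, ehr_n, survey_prevalence):
    """Summarize how a crude EHR prevalence relates to a survey estimate."""
    ehr_prevalence = ehr_cases / ehr_n
    low, high = wilson_interval(ehr_cases, ehr_n)
    return {
        "ehr_prevalence": ehr_prevalence,
        "ci": (low, high),
        "absolute_difference": ehr_prevalence - survey_prevalence,
        "survey_within_ci": low <= survey_prevalence <= high,
    }


# Hypothetical inputs: 1,200 patients with a diagnosis among 10,000 patients,
# compared with a hypothetical survey prevalence of 11.5%.
summary = compare_with_survey(1200, 10000, 0.115)
print(summary)
```

In practice, a comparison of this kind would usually also require weighting or standardizing the EHR population to the survey's target population before the estimates are set side by side.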
The approach of small-area disease prevalence estimates with EHR-derived data is well recognized and shows promise in the estimation of local disease burdens and subsequent intervention programs and policy initiatives.
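As a rough illustration of estimating prevalence for small geographic units, the sketch below computes a naive direct estimate per census tract from patient-level records. This is not a model-based small-area estimator and is not drawn from any reviewed study; real systems may add weighting, smoothing, or modeling. The record layout, the function name, and the minimum denominator of 30 are hypothetical choices made for the example.

```python
from collections import defaultdict


def tract_prevalence(records, min_denominator=30):
    """Compute crude prevalence per tract from (tract_id, has_condition) pairs.

    Tracts with fewer than min_denominator patients are reported as None,
    a simple stand-in for suppressing unstable small-cell estimates.
    """
    counts = defaultdict(lambda: [0, 0])  # tract_id -> [cases, patients]
    for tract_id, has_condition in records:
        counts[tract_id][1] += 1
        if has_condition:
            counts[tract_id][0] += 1

    result = {}
    for tract_id, (cases, patients) in counts.items():
        if patients >= min_denominator:
            result[tract_id] = cases / patients
        else:
            result[tract_id] = None
    return result


# Hypothetical data: tract "A" has 40 patients, tract "B" has only 5.
example = (
    [("A", True)] * 10 + [("A", False)] * 30
    + [("B", True)] * 2 + [("B", False)] * 3
)
print(tract_prevalence(example))
```

Suppressing tracts with few patients reflects a common concern with local estimates: a prevalence computed from a handful of records is too unstable to guide intervention planning.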
That obesity and diabetes were the most studied health indicators in both adults and children suggests a great need for timely local prevalence estimates and trend data to determine the best path forward in addressing those chronic disease epidemics more effectively.
Also of note, only three studies used both EHR and claims data to derive chronic condition prevalence estimates [13,16,21]. Two of those studies were part of the same clinical data research network, OneFlorida [16,21]. Because EHR data are often incomplete, it would be important to explore whether adding claims data to chronic disease surveillance systems would further improve the sensitivity of such systems and their concordance with traditional surveillance survey estimates.

Limitations
This study may not have identified all EHR-based surveillance systems or models currently in use or in development in the U.S., both because of a lag in reporting and because the inclusion criteria restricted the review to studies published in 2015-2021.

Conclusions
The use of detailed clinical data from electronic health records for chronic disease surveillance is still limited but is gaining traction. Future studies are needed to assess whether EHR-based chronic disease surveillance can augment or entirely replace current estimates derived using traditional surveillance methods, such as national surveys, in the coming years. To date, national surveys remain problematic due to the time lag between data collection and the dissemination of relevant prevalence estimates. Based on the reviewed surveillance systems, the estimates derived from the electronic health records for population health surveillance appear to be comparable to those from the national surveys while promising a timely assessment of population health and a more targeted allocation of public health resources.

Conflicts of interest:
In compliance with the ICMJE uniform disclosure form, all authors declare the following: Payment/services info: All authors have declared that no financial support was received from any organization for the submitted work. Financial relationships: All authors have declared that they have no financial relationships at present or within the previous three years with any organizations that might have an interest in the submitted work. Other relationships: All authors have declared that there are no other relationships or activities that could appear to have influenced the submitted work.