- Very Large and Varied Population:
  - More than 1.8 million unique patients meeting the COVID-19 case definition have been included in the sample since February 1, 2020.
  - 21 different health systems from across the nation with diverse populations.
- The Potential of Longitudinal Data: Because each patient maintains a unique identifier and health system data were updated regularly, we can:
  - Evaluate the frequency and types of re-treatment (inpatient or outpatient) for both COVID-19 and non-COVID-19 illnesses after the original COVID-19 diagnosis over the 24 months of data collection.
  - Assess how the evolution of the disease and its treatments affects outcomes (e.g., changes in death rates within and across health systems).
- Limited Data Set: In calendar year 2020, the data were fully de-identified and HIPAA compliant. In calendar year 2021, the data set was converted to a limited data set, allowing collection of dates of service and five-digit ZIP codes.
- Extensive Array of Data Elements: Approximately 250 discrete EHR data elements provide the capacity to examine relations between these variables and outcomes.
- Regular Data Pulls Retrospective to February 1, 2020: Health system data have been extracted regularly since February 1, 2020. These repeated data extractions allow for the addition and refinement of data elements over time to:
  - Add data elements of interest (e.g., vaccine data, Remdesivir, neurologic symptoms).
  - Improve accuracy of data elements collected (e.g., COVID-19 diagnostic criteria, death definition).
- COVID-19 Cases Only: The cohort is limited to individuals diagnosed with COVID-19; thus, we are unable to examine the risk of developing COVID-19 or compare findings of cohort members to individuals without a COVID-19 diagnosis.
- Discrete Data Elements Only: The data pulled are almost exclusively limited to discrete EHR data elements and typically do not include data from text fields within the EHR.
- Missing Data: For many EHR data elements (e.g., vaping), a large proportion of patients have no data reported.
- Harmonizing EHR Data across Health Systems: Merging data from different health systems highlights the idiosyncratic nature of how each system records EHR data, even when systems share the same EHR platform (e.g., Epic).
- Patient Data Are Limited to the Health System Where COVID-19 Was Diagnosed: If a patient dies or subsequently receives care at another health system, those events are not captured.
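
The missing-data and harmonization limitations above can be made concrete with a small sketch. It is illustrative only: the system names, the raw vaping codes, the record layout, and the helper functions `harmonize` and `missing_proportion` are all hypothetical and are not taken from the cohort's actual data dictionary. The idea is to map each health system's own encoding of a discrete element to a shared vocabulary, then report the share of patients per system with no usable value.

```python
from collections import defaultdict

# Hypothetical per-system encodings of one discrete element (vaping status).
# Real encodings used by the participating health systems are not known here.
VAPING_MAPS = {
    "system_a": {"Y": "yes", "N": "no"},
    "system_b": {"1": "yes", "0": "no", "9": None},  # "9" assumed to mean "unknown"
}


def harmonize(system, raw_value):
    """Map a system-specific raw value to a shared vocabulary.

    Returns None when the value is absent, unmapped, or explicitly unknown.
    """
    if raw_value is None:
        return None
    return VAPING_MAPS.get(system, {}).get(str(raw_value).strip())


def missing_proportion(records, field="vaping"):
    """Return {system: share of records with no usable value for `field`}."""
    total = defaultdict(int)
    missing = defaultdict(int)
    for rec in records:
        system = rec["system"]
        total[system] += 1
        if harmonize(system, rec.get(field)) is None:
            missing[system] += 1
    return {system: missing[system] / total[system] for system in total}


# Made-up records for illustration; these are not cohort data.
records = [
    {"patient_id": "p1", "system": "system_a", "vaping": "Y"},
    {"patient_id": "p2", "system": "system_a", "vaping": None},
    {"patient_id": "p3", "system": "system_b", "vaping": "0"},
    {"patient_id": "p4", "system": "system_b", "vaping": "9"},
    {"patient_id": "p5", "system": "system_b"},
]

rates = missing_proportion(records)
```

Treating unmapped and "unknown" codes the same as absent values is a design choice made for this sketch; an analysis that needs to distinguish "not asked" from "not recorded" would keep them separate.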