The rapid pace of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2; COVID-19) pandemic presents challenges to the real-time collection of population-scale data to inform near-term public health needs as well as future investigations. We established the COronavirus Pandemic Epidemiology (COPE) consortium to address this unprecedented crisis on behalf of the epidemiology research community. As a central component of this initiative, we have developed a COVID Symptom Study (previously known as the COVID Symptom Tracker) mobile application as a common data collection tool for epidemiologic cohort studies with active study participants. This mobile application collects information on risk factors, daily symptoms, and outcomes through a user-friendly interface that minimizes participant burden. Combined with our efforts within the general population, data collected from nearly 3 million participants in the United States and United Kingdom are being used to address critical needs in the emergency response, including identifying potential hot spots of disease and clinically actionable risk factors. The linkage of symptom data collected in the app with information and biospecimens already collected in epidemiology cohorts will position us to address key questions related to the influence of diet, lifestyle, environmental, and socioeconomic factors on susceptibility to COVID-19, clinical outcomes related to infection, and long-term physical, mental health, and financial sequelae. We call upon additional epidemiology cohorts to join this collective effort to strengthen our impact on the current health crisis and generate a new model for a collaborative and nimble research infrastructure that will lead to more rapid translation of our work for the betterment of public health.
In facing the unprecedented challenge posed by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) disease pandemic (COVID-19), there has perhaps never been a moment in history when the importance of epidemiologic research has been clearer. Within the medical community, leaders have described an “urgent need to expand public health activities to elucidate the epidemiology of the novel [SARS-CoV-2] virus and characterize its potential impact” (1). For the most part, infectious disease epidemiologists have been answering this call. The call to action for cancer and, more broadly, chronic disease epidemiologists has been less obvious. Work-as-usual has been sidelined, with many standard activities of cohort studies or clinical trials unable to proceed. Study visits for participants have been suspended. Biospecimens are left uncollected and unanalyzed. The short-term consequence is that research into cancer, an important cause of mortality and a chronic disease with a long time horizon, has received less attention given the immediacy and lethality of the COVID-19 threat before us. The cancer epidemiology community is left with many questions. To what extent should we adjust our research priorities? What long-term impact will COVID-19 have on cancer-related or other outcomes due to the virus itself or delays in screening, diagnosis, and treatment? What new tools should we use and what infrastructure needs to be set in place to assess such outcomes? Can we leverage ongoing cohort studies to address the COVID-19 crisis in the present while laying a foundation for the future? What can the cancer epidemiology community do together in ways that we cannot do alone?
We herein describe the development of the COronavirus Pandemic Epidemiology (COPE) consortium (www.monganinstitute.org/cope-consortium), a collaborative effort in the cancer epidemiology and clinical research community that has come together to address these and other questions posed by the COVID-19 crisis.
Adjusting Our Research Priorities
While COVID-19 has acutely become a leading cause of death in the United States (2), cardiovascular disease, cancer, and other chronic diseases retain their position as leading causes of death and will remain there upon resolution of the pandemic. Nonetheless, COVID-19 will have an indelible impact on both ongoing and future studies. We must understand how this highly prevalent infection with substantial near-term risk will impact our study participants, particularly those with chronic underlying health conditions. Practically, this has implications for participant response to our usual assessments and for loss to follow-up. Scientifically, this has implications for shorter-term clinical outcomes related to infection as well as downstream consequences. These may include chronic complications in patients with cancer related to COVID-19 disease, as well as indirect effects related to deferring medical care (including delays in screening, preventive interventions, or cancer treatment), the short- and long-term effects of stress on mental and physical health (and related health disparities), changes in important behaviors (e.g., diet and/or physical activity), and economic and social disruptions related to the COVID-19 public health response (3–5). Indeed, the long-term potential impact of such disruptions may be realized as spikes in cancer incidence or mortality in the years to come. There are also intriguing potential mechanistic connections between cancer oncogenes (e.g., TMPRSS2) and COVID-19 that may not yet be fully realized (6). Moreover, as we consider shifting research priorities, we must also remain cognizant of the primary goal of existing cohort studies. We must strike a balance where we address the unique effects of COVID-19 on cohort participants, without compromising the original mission of the cohorts.
For example, we must carefully develop instruments for our cohorts that are not so burdensome that they lead to decreased future response rates or greater drop-out rates among critical subgroups.
Novel Approaches to Data Collection to Address Unprecedented Questions during a Health Crisis
The usual approaches to data collection in cancer epidemiology cohorts leverage instruments that are methodologically rigorous but typically slow to deploy. For example, traditional surveys or biospecimen collection protocols usually require a stepwise approach to pilot testing and validation to ensure scientific rigor and participant engagement. Even after this initial, often time-consuming process, population-wide implementation is purposefully gradual. For the study of a slowly evolving, noncommunicable disease process such as cancer, such a deliberative approach makes sense. However, the speed at which the COVID-19 pandemic is unfolding poses an unprecedented challenge to rapidly collect data to characterize the full breadth of COVID-19 disease and its impact on chronic disease–related exposures and outcomes.
Although the urgency of the situation may require some compromise in our approach, we must also ensure scientific rigor and minimize bias while maximizing relevance (7). Therefore, the true challenge to epidemiology in the time of COVID-19 is identifying the optimal balance of speed and precision in our data collection and analysis. We recently discussed the need for novel tools that deploy an adaptable real-time data-capture platform to rapidly and prospectively collect actionable data that encompass the spectrum of subclinical and acute presentations of COVID-19 (8). To that end, we at Massachusetts General Hospital and the Harvard T.H. Chan School of Public Health partnered with physicians and epidemiologists at King's College London and software engineers at Zoe Global Ltd., a health data science company, to develop a free mobile app (COVID Symptom Study, previously known as COVID Symptom Tracker; http://covid.joinzoe.com/us) to collect real-time data on COVID-19 symptoms, baseline health factors, infection status, and clinical outcomes, especially where they may lead to insights quickly enough to change practices that can improve the prognosis of persons with COVID-19 (8).
The app enables quick (less than 3–5 minutes initially and less than 1 minute per day on follow-up) self-report of data related to COVID-19 exposure and infections (Fig. 1). At first use, the app queries location, age, and core health risk factors. Daily prompts query for updates on interim symptoms, health care visits, and COVID-19 testing results. In those who are self-isolating or seeking health care, the level of intervention and related outcomes are collected. Individuals without symptoms are also encouraged to use the app so as to allow for prospective collection and sensitive capture of onset of symptoms. This prospective approach can overcome some of the challenges with cross-sectional assessment during an ongoing health crisis, where variation in the individuals participating at multiple timepoints may contribute to bias. Through pushed software updates, we can add or modify questions in real time to test emerging hypotheses about COVID-19 or fine-tune data capture. Importantly, participants enrolled in ongoing epidemiology studies, clinical trials, or patient registries can provide informed consent to link survey information collected through the app, via a one-way flow of data back to cohorts, in a Health Insurance Portability and Accountability Act (HIPAA)–compliant manner. Cohorts can then link these data to their preexisting study cohort data. A specific module is also provided for participants who identify as health care workers to determine the intensity and type of their direct patient care experiences, the availability and use of personal protective equipment (PPE), and work-related stress and anxiety. In future modules, it may be possible to ask additional questions of individuals living with cancer (e.g., on treatment and survivorship), information that current cohort studies may not be well positioned to collect. Through deployment of this tool, we can rapidly gain critical insights into population dynamics of the disease (Fig. 2).
We have considered individual privacy at every step of planning for our approach and this is central to the one-way data flow structure to protect existing cohort data. Furthermore, we have minimized the number of personal identifiers collected by the app. For example, we collect year of birth, not day or month. Emails are used to register an individual to allow for secure access to the app via a user-set password, password recovery, and downstream linkage with existing cohort data. Participants may provide their names and phone numbers to assist with linkage, but these are optional data collection fields. Precise device location data are not collected; however, participants are asked to report their five-digit zip code to help with geolocations and public health recommendations based on area. All personal identifiers will be stripped in aggregate data repositories.
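As an illustration only, the identifier-stripping step described above might be sketched as follows. The record layout and field names (`AppRecord`, `IDENTIFIER_FIELDS`, `deidentify`) are hypothetical and are not the app's actual schema; treating the zip code as retained for area-level analysis is likewise an assumption based on the description above.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class AppRecord:
    """Hypothetical app record; field names are illustrative assumptions."""
    email: str                      # used for registration and linkage
    year_of_birth: int              # year only; no day or month is collected
    zip_code: str                   # self-reported five-digit zip code
    name: Optional[str] = None      # optional, assists cohort linkage
    phone: Optional[str] = None     # optional, assists cohort linkage
    symptoms: List[str] = field(default_factory=list)


# Direct identifiers assumed to be removed before aggregation.
IDENTIFIER_FIELDS = ("email", "name", "phone")


def deidentify(record: AppRecord) -> dict:
    """Return a copy of the record with direct identifiers removed."""
    data = asdict(record)
    for key in IDENTIFIER_FIELDS:
        data.pop(key, None)
    return data


example = AppRecord(
    email="participant@example.org",
    year_of_birth=1970,
    zip_code="02114",
    name="Example Participant",
    symptoms=["fever", "fatigue"],
)
aggregate_row = deidentify(example)
```

In this sketch the original record is left unchanged, so the identified copy could still support consented linkage to a parent cohort while only the stripped copy reaches an aggregate repository.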
Leveraging Cancer and Other Large-scale Epidemiology Cohorts
With our collective experience in designing and maintaining large prospective studies, investigators with epidemiologic cohorts are uniquely positioned to leverage the goodwill of a large number of highly engaged study participants across the United States. Thus, we moved rapidly to establish an international consortium of epidemiologic cohort studies, the COPE team, to bring together multidisciplinary expertise in “big data” research and state-of-the-art epidemiologic methods (www.monganinstitute.org/cope-consortium). Within COPE, we have made the COVID Symptom Tracker app available for investigators to use in their cohorts. These cohorts range in size from small clinical populations to large population cohort studies. Members of the general public can also download the app. To date, we have deployed the COVID Symptom Tracker app in long-standing U.S. prospective cohorts that have contributed significantly to cancer epidemiology research, including the Nurses' Health Study (NHS), NHSII, NHS3 (9), the Growing Up Today Study (GUTS; ref. 10), the Health Professionals Follow-up Study (HPFS; ref. 11), the Multiethnic Cohort Study (MEC; ref. 12), the American Cancer Society Cancer Prevention Study-3 (CPS-3; ref. 13), the California Teachers Study (CTS; ref. 14), the Black Women's Health Study (BWHS; ref. 15), the Sister Study (16), the CHASING COVID Cohort Study (17), Aspirin in Reducing Events in the Elderly (ASPREE; ref. 18), the PREDETERMINE study, and the Predicting Progression of Developing Myeloma in a High-Risk Screened Population (PROMISE; ref. 19) and Precursor Crowdsourcing (PCROWD) Studies (20). Several other cohorts with a focus on nutrition (Stanford Nutrition Studies) and environmental health [the Gulf Long-term Follow-up (GuLF) Study (21), the Agricultural Health Study (22), and the NIEHS Environmental Polymorphisms Registry (EPR; ref. 23)] have also joined our efforts.
Our partners in the United Kingdom have enrolled participants within the TwinsUK Study and U.K. Biobank, and collaborators in Canada, Australia, and Sweden are planning to launch versions of the COVID Symptom Study in their respective countries. Those already enrolled in participating cohort studies were sent an invitation to download the COVID Symptom Tracker and asked to indicate their involvement in these cohorts. They were then invited to complete a study consent form that allows for the return of data collected through the app to the parent cohort study [under Institutional Review Board (IRB) for the specific institution and data sharing guidelines] for data linkage and future analysis.
COPE: Addressing the Near-term COVID-19 Crisis
In the near term, COPE's goal is to provide data on the COVID-19 pandemic that can be used to contribute in real time to the emergency response and address critical needs raised by this outbreak. The use of the COVID Symptom Study app within COPE is embedded within a larger effort to encourage members of the general public to self-report symptoms through the app. This has also benefited from outreach led by organizations such as Stand Up to Cancer, the Susan Love Foundation for Breast Cancer Research, the Damon Runyon Cancer Research Foundation, and the American Cancer Society. Deidentified data collected through the app from cohort members, combined with data from contributors in the general public, will facilitate real-time assessment of the spread of COVID-19, including recognition of symptom presentation patterns and identification of new “hot spots” or regions at high risk for COVID-19 disease that may benefit from greater testing or expansion of hospital capacity. This information can also be used to inform guidelines around isolation measures. For example, within the United Kingdom, the app has grown to include over 2.7 million participants, with the data collected being used to inform the U.K. government about areas predicted to experience a high burden of COVID-19 approximately 5 days later (8). The health authorities, in turn, have acted upon the information and further promoted use of the app to their communities. In the United States, investigators at University of Texas School of Public Health aim to use data collected from the COVID Symptom Study app to conduct state-wide surveillance of COVID-19 symptoms in Texas to support public health decision making, especially as mitigation strategies are being deimplemented. Tools like these trackers are important not only for short-term tracking but also for long-term surveillance after areas attempt to reopen to normal activity.
We are likewise committed to working together with other participatory infectious disease surveillance efforts in the United States (24, 25), including collaborations with CovidNearYou and the HowWeFeel Project (26). Although several other digital collection tools for COVID-19 symptoms have been launched in the United States, these applications have largely been configured to offer a single assessment of symptoms to tailor recommendations for further evaluation. Others have been developed for researchers to report information on behalf of patients enrolled in clinical registries. While these approaches offer critical public health insights, they may not be tailored to the scalable longitudinal data collection that epidemiologists require to perform comprehensive, well-powered investigations. Moreover, this tool is distinct from current efforts underway by large technology companies to embed digital contact tracing within cell phones. Such tools are capable of alerting individuals who were in close contact with cell phone users who became infected with COVID-19, but they may lead to additional surveillance or privacy concerns among potential app users. Nonetheless, a collective effort will be required to advance our understanding of COVID-19 and reach additional populations. For example, our app is currently available only in English and only in the United States and the United Kingdom, and while we are working to make it available in additional regions or languages, this takes significant development time and engineering resources. This creates a gap that complementary approaches may fill through continued collaboration, including translating the COVID Symptom Tracker questions and offering these via other platforms.
COPE: A Foundation for Future Research
A longer-term goal of COPE is to link real-time data collected on COVID-19 symptoms with previously collected data on lifestyle, diet, and mental and physical health conditions as well as germline genetic and other biomarker information among the individuals participating in cohort studies. Based largely on reports from other countries, COVID-19 incidence and outcomes appear to vary according to age, sex, race, and comorbidities (5, 27–31). Conflicting reports that angiotensin-converting enzyme (ACE) inhibitors, thiazolidinediones, and ibuprofen may contribute to disease severity (32–34) led the European Medicines Agency to call for “epidemiology studies to be conducted in a timely manner” (35). Moreover, the risk of COVID-19–associated mortality appears highest in individuals with extensive comorbidities. In Wuhan, China, the case fatality rate was elevated among patients with pre-existing comorbid conditions (10.5% for cardiovascular disease, 7.3% for diabetes, 6.3% for chronic respiratory disease, 6.0% for hypertension, and 5.6% for cancer) compared with 2.3% in all confirmed cases (36). Additional evidence suggests infection risk differs by blood group (37), implicating genetic susceptibility in pathogenesis. Other clinical biomarkers (e.g., IL6/IFNγ ratio; ref. 38) may be useful to understand risk and/or severity of COVID-19. Although there is no evidence yet for COVID-19, data from pooled randomized trials suggest vitamin D supplementation may lower the risk of respiratory infections (39). The influence of air pollution and climate on SARS-CoV-2 outcomes also requires investigation. These critical questions are hypotheses that can be readily tested in the living laboratories offered by cancer epidemiology cohorts with populations that have been extensively phenotyped. In COPE, investigators can pursue such studies within their own cohorts.
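To put the Wuhan case fatality figures quoted above on a common scale, the short sketch below divides each comorbidity-specific rate by the 2.3% rate in all confirmed cases. These are crude, unadjusted ratios computed only from the percentages in the text; they do not account for age or overlapping conditions, and the variable names are illustrative.

```python
# Case fatality rates (%) as quoted in the text (ref. 36).
cfr_by_condition = {
    "cardiovascular disease": 10.5,
    "diabetes": 7.3,
    "chronic respiratory disease": 6.3,
    "hypertension": 6.0,
    "cancer": 5.6,
}
overall_cfr = 2.3  # all confirmed cases

# Crude ratio of each comorbidity-specific rate to the overall rate.
crude_ratios = {
    condition: round(cfr / overall_cfr, 1)
    for condition, cfr in cfr_by_condition.items()
}

for condition, ratio in crude_ratios.items():
    print(f"{condition}: {ratio}x the overall case fatality rate")
```

By this crude measure, the rate among patients with cancer is roughly 2.4 times the overall rate and the rate among those with cardiovascular disease roughly 4.6 times; adjusted estimates of the kind cohort data could support may differ.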
These cohorts are uniquely positioned to address other key questions such as the impact of socioeconomic factors and health disparities, and potential risk modifiers such as diet, exercise, and supplement use. Specific cohorts will have the ability to link COVID Symptom app data with existing environmental and neighborhood-level data (e.g., socioeconomic status, air pollution, poverty, rurality, etc.). However, we hope that cohorts can leverage the power of larger sample sizes by working together on joint analyses. We will also have an opportunity to investigate the long-term sequelae of COVID-19 infection and the response to the pandemic. We will want to understand the long-term impact of COVID-induced damage to multiple organ systems that may contribute to mechanisms of tumorigenesis or how interruptions to screening and surveillance may impact future cancer incidence. Continued follow-up within these cohorts will allow us to study the long-term impact on mental health and chronic disease incidence and management, mortality, and social and financial strain and how they relate to cancer incidence and survival. This may require collaborations that incorporate data collected through other efforts, including supplementary questionnaires or the collection of biospecimens that might be relevant to assess disease status (e.g., serologies). Eventually, such work within COPE can be further connected with other large-scale collaborations, including the COVID-19 Host Genetics Initiative (https://www.covid19hg.org/data-sharing/). Taken together, COPE will provide critical insights into the symptoms of infection, epidemiologic and molecular predictors of disease, its underlying biology, and its impact on long-term health outcomes.
Overcoming Collective Challenges
Typical challenges in developing consortia include identifying committed leaders within each cohort who can facilitate human subjects approval and data use agreements. In COPE, we have encouraged participation through the development of a “toolkit” to help facilitate human subjects IRB approval and establishment of clear channels for data collection and return of information. We are encouraged that most IRBs and academic medical center contracting authorities have streamlined their approval of research protocols and agreements related to COVID-19 to minimize delays. Some IRBs have also determined that initial promotion of a data collection tool does not require approval; study protocols can be modified at a later time for data linkage. Concerns have also been raised about participant burden during a time of crisis, particularly for cohorts that have already planned their own data collection initiatives for COVID-19–related outreach. However, as cohort leaders we bear a responsibility to offer a few opportunities for our participants to report on their experiences during this unprecedented life-changing event. It is especially important to prioritize approaches that may lead to near-term data to address the public health emergency.
A Call to Action
We encourage additional cohorts to join COPE, particularly as it has become clear that the COVID-19 crisis will likely be prolonged. We ask member cohorts to promote the COVID Symptom Tracker to their study participants, subject to local IRB approval. To enhance the potential impact on public health decision-making, we hope to recruit more participants across the United States. We have made targeted efforts through outreach to individual investigators, the AACR Molecular Epidemiology Working Group, and Stand Up To Cancer. As of April 28, 2020, there were approximately 200,000 users in the United States, and additional users are being added daily. We anticipate that these numbers will continue to increase as cohorts at the earliest stages of joining this effort reach out to additional participants. We hope that more investigators will break out of traditional research silos to join us in deploying new technologies to overcome the usual barriers of establishing consortia. Collectively, we will have a stronger voice in advocating for the needs of the cancer epidemiology community, particularly if we can demonstrate that our work can have near-term impact. Our colleagues leading clinical trials of experimental agents for the treatment of COVID-19 are experiencing unprecedented fast-tracking of their protocols for approval, funding, and dissemination of results. Cohort studies should also demand that the usual rules of engagement for conducting our research be adapted to accommodate a quick response and that this research be adequately resourced to be successful. This call to action for the cancer epidemiology community to mount a collective response will not only have an impact on the current health crisis but also generate a new model for a collaborative and nimble research infrastructure that will lead to more rapid translation of our work for the betterment of public health.
Disclosure of Potential Conflicts of Interest
A.T. Chan reports receiving a commercial research grant from Zoe Global Ltd. P.W. Franks is a consultant for Zoe Global Ltd. C.R. Marinac reports receiving other commercial research support from GRAIL, Inc. S. Ourselin is a consultant for Johnson & Johnson. J. Wolf is CEO of and has ownership interest (including patents) in Zoe Global Ltd. T. Spector is a consultant for Zoe Global Ltd. No potential conflicts of interest were disclosed by the other authors.
Zoe Global Ltd. provided in-kind support for all aspects of building, running, and supporting the app and service to all users worldwide. S. Ourselin, C.J. Steves, and T. Spector were supported by the Wellcome Trust and Engineering and Physical Sciences Research Council (WT212904/Z/18/Z, WT203148/Z/16/Z, WT213038/Z/18/Z), the National Institute for Health Research (NIHR) Guy's and St Thomas'/King's College London Biomedical Research Centre, the Medical Research Council/British Heart Foundation (MR/M016560/1), the NIHR, and the Alzheimer's Society (AS-JF-17-011). A.T. Chan is the Stuart and Suzanne Steele Massachusetts General Hospital (MGH) Research Scholar. The Massachusetts Consortium on Pathogen Readiness (MassCPR) and Mark and Lisa Schwartz supported MGH investigators (A.T. Chan, D.A. Drew, L.H. Nguyen, A.D. Joshi, W. Ma, C.-G. Guo, C.-H. Lo, R.S. Mehta, S. Kwon, D.R. Sikavi, and M.V. Magicheva-Gupta).