Abstract
Physical activity and sleep are behavioral risk factors for cancer that may be influenced by environmental exposures, including built and natural environments. However, many studies in this area are limited by residence-based exposure assessment and/or self-reported, time-aggregated measures of behavior.
The Nurses' Health Study 3 (NHS3) Mobile Health Substudy is a pilot study of 500 participants in the prospective NHS3 cohort who use a smartphone application and a Fitbit for seven-day periods, four times over a year, to measure minute-level location, physical activity, heart rate, and sleep.
We have collected data on 435 participants, comprising over 6 million participant-minutes of heart rate, step, sleep, and location. Over 90% of participants had five days of ≥600 minutes of Fitbit wear-time in their first sampling week, and this percentage dropped to 70% for weeks 2 to 4. Over 819 sampling weeks, we observed an average of 7,581 minutes of heart rate and step data [interquartile range (IQR): 6,651–9,645] per participant-week, and >2 million minutes of sleep in over 5,700 sleep bouts. We have recorded location data for 5,237 unique participant-days, averaging 104 location observations per participant-day (IQR: 103–107).
This study describes a protocol to incorporate mobile health technology into a nationwide prospective cohort to measure high-resolution objective data on environment and behavior.
This project could provide translational insights into interventions for urban planning to optimize opportunities for physical activity and healthy sleep patterns to reduce cancer risk.
See all articles in this CEBP Focus section, “Modernizing Population Science.”
Introduction
Physical activity, sleep patterns, and obesity are major behavioral risk factors for cancer that may be influenced by environmental factors, including the built environment and green spaces. Inadequate physical activity in the United States contributes to over 12% of breast and colon cancers, on par with the disease burden of smoking (1). However, 80% of Americans do not report meeting the guideline of 150 minutes of moderate-intensity aerobic activity per week (2). In addition, 50–70 million U.S. adults have chronic sleep and wakefulness disorders: 35% sleep <7 hours on average per night and 38% unintentionally fall asleep during the day (3, 4). Epidemiologic studies have reported that sleep disturbances are linked to breast (5–11), colon (12, 13), prostate (14–16), and endometrial cancer (17). In addition, lack of physical activity and sleep act together to drive obesity (18, 19), also a known risk factor for cancer (20–22).
Health behaviors are strongly influenced by the dynamic social and physical environments within which individuals interact (23). Built and natural environments can provide “walkable” neighborhoods with safe, shaded, and protected parks that provide opportunities for routine physical activity, which may subsequently drive sleep patterns. These environments may also buffer individuals from exposure to noise, light at night, and air pollution, which may in turn lead to improved sleep quality. Research is evolving on how the built and natural environments influence health behaviors; however, most of these studies are cross-sectional and therefore limited by the resolution of the data collected. Using only a participant's residential address to define environmental exposure ignores the fact that many people spend most of their day outside the home (24, 25). Physical activity and sleep are often based on time-aggregated, self-reported data, which do not provide objective information on timing or duration of these behaviors (26–30). Measurement error in environmental exposure assessment and self-report of physical activity and sleep may lead to bias in assessing the impact of environmental factors on health behaviors.
New consumer technologies such as smartphones and wearable accelerometer devices have unlocked the potential to gather data on location-based behavior at a greatly increased spatiotemporal resolution. High precision global positioning systems (GPS) data from a smartphone can be integrated with spatial datasets (e.g., built and natural environments, noise, and air pollution) (31–36), to create personalized, intraday, dynamic environmental exposure metrics. High precision activity data from consumer wearable devices containing miniaturized accelerometers, such as Fitbits, can be used as detailed, objective measures of participants' physical activity. Fitbits measure physical activity as well as research-grade accelerometry in lab settings (37), are comparable with electrocardiography (ECG) for heart rate during routine physical activity (38–42), and have been tested for validity and usability in free-living subjects (43–47), although it is worth noting that not all studies are consistent (41, 46, 48). In measuring sleep, Fitbit performs similarly to research-grade accelerometry when compared with the gold standard of polysomnography (49–51), although Fitbit may systematically overestimate sleep (51) and some sleep metrics (e.g., total sleep time) may be more valid than others (e.g., sleep efficiency; ref. 49). Wearable devices are often designed to communicate with a smartphone and securely upload data in near real-time, alleviating the need to mail devices back to researchers, which is costly and burdensome (52). Modern mobile health (mHealth) technologies offer a passive, low-cost opportunity to address current limitations in measuring the impact of environment on cancer-related health behaviors.
In this protocol article, we describe our methodology to integrate a custom smartphone application and a consumer-wearable device protocol into the ongoing Nurses' Health Study 3 (NHS3), as well as our plans to use the data generated from this study. This pilot protocol, known as the NHS3 Mobile Health Substudy, aims to test approaches that will provide detailed information on environmental exposures and health behaviors using modern technology. The substudy collects data on minute-level smartphone-based GPS, and Fitbit-based physical activity, heart rate, and sleep over 7-day sampling periods four times across a year for a maximum of 28 days of sampling per participant. We have provided some preliminary statistics on our data gathering to date; however, future analyses will assess the success of this pilot study and implications for scaling up measures to the full NHS3 cohort. Embedding these novel mHealth measures within an ongoing prospective cohort study will enable researchers to (i) examine how these mHealth measures compare with traditional environmental (GPS- vs. residence-based) and behavioral (Fitbit vs. self-report) measures; (ii) understand the distribution of and variability in physical activity, heart rate, and sleep both between participants and within participants; and (iii) analyze the longitudinal associations between minute-level environmental exposure and health behaviors.
Materials and Methods
NHS3 mobile health substudy
The NHS3 is an ongoing internet-based open cohort study of male and female nurses in the United States and Canada that began in 2010 (53). To be eligible for the study, participants have to be either a registered nurse, licensed practical/vocational nurse, or nursing student and born on or after January 1, 1965. As of February 2020, 48,809 participants had joined the study. Participants complete questionnaires approximately every 6 months on lifestyle and medical characteristics. The response rate for the second questionnaire is currently at 72%; for participants who have completed at least two questionnaires, subsequent response rates exceed 80%.
The NHS3 Mobile Health Substudy is a pilot study that asks a subset of NHS3 participants (N = 500) to download a custom smartphone application and to wear a consumer-wearable fitness tracker. The aims of the substudy are to quantify the relationships between dynamic measures of geographic context and objective measures of physical activity and sleep, and to develop measurement correction models for geographic context, physical activity, and sleep that can be applied to the full NHS3 cohort. This study is intended as a pilot to see whether the use of smartphone applications and consumer-wearable devices will be accepted by NHS3 participants. In the substudy, participants undertake 7-day sampling periods four times across a year, spaced 3 months apart to capture seasonal variability in behaviors. Consistent with other physical activity and GPS studies, we chose to conduct a 7-day protocol to capture behaviors and exposures on both work and nonwork days (32). During each sampling period, participants wear a researcher-provided Fitbit to measure physical activity, heart rate, and sleep, and run an application on their smartphones. Participants were e-mailed instructions to run the smartphone application on their phone and to wear the Fitbit simultaneously during all 7-day sampling periods. During these sampling periods, they were asked to have their smartphone on them during waking hours and to wear the Fitbit for 24 hours a day (except when showering, bathing, or swimming). Participants were asked to download the Fitbit app and to sync their data with the app every 3–4 days. They were also asked to charge their Fitbit before each sampling period, as well as every 3–4 days during sampling periods. Participants were free to keep their Fitbit devices after the study concluded. We worked with app developers (Overlap Health, Inc.) to develop a customized NHS3 iPhone app that enables us to distribute short surveys, collect smartphone accelerometer and location (GPS) data, and sync with the Fitbit Application Programming Interface (API) to gather Fitbit wearable device data. Enrollment in the substudy began in March 2018, and data collection is ongoing.
To be eligible for the substudy, participants had to be 21 years old as of March 12, 2018; to have reported height and weight on their first questionnaire; to have completed the physical activity questions on their second questionnaire (6 months into the main NHS3 study); to have completed the sleep questions on their fourth questionnaire (one and a half years into the study); and to have reported on the fourth questionnaire that they did not have a doctor-diagnosed sleep disorder. Furthermore, participants had to own an iPhone (for the pilot, our app was designed only for iOS) and to live in the contiguous United States (where data on the environmental exposures of interest were available).
Potential participants were e-mailed an invitation to participate in the substudy and were sent a link to an eligibility screener to determine whether they had an iPhone and to confirm that they did not have a doctor-diagnosed sleep disorder. If participants were determined to be eligible, they were sent an electronic informed consent form. Once consented, they were mailed a Fitbit and sent substudy instructions that included a link to download the custom NHS3 smartphone application. The substudy was approved by the Institutional Review Board of the Brigham and Women's Hospital (Boston, MA) and Harvard Pilgrim Health Care Institute (Boston, MA).
NHS3 smartphone application
The NHS3 iOS application walks participants through the onboarding process and obtains permission to gather location services data and to send notifications. See Supplementary Fig. S1 for screenshots of the participant-facing side of the app. The application also directly connects participants to Fitbit, so that they can give permission to access their Fitbit data. Over the course of one year, the smartphone application sends substudy participants a notification four times, asking whether the upcoming week is a typical week for them. If the participant responds yes, then sampling begins and the participant is advised to wear their Fitbit. If the participant responds no, they can delay up to four times until the sampling period is considered missed.
Before and after the sampling period, participants fill out a brief survey assessing whether the week and their sleep patterns were typical for them. The survey includes questions about stress level, factors affecting sleep (including a range of responses from “young children who don't sleep through the night,” to “partner snoring,” to “had a cold”), night shifts, and breastfeeding. At the end of the year, the app provides several questions on the quality of the experience participating in the study and opportunities for improvement. After completing the four sampling sessions, participants were e-mailed a link to a questionnaire on self-reported physical activity (54), sleep, and sedentary behavior (ref. 55; Supplementary Table S1). Data from these questionnaires will be used to compare self-reported data on these behaviors to objectively collected data from wearable devices. The combination of self-report and Fitbit data will be used to inform regression calibration methods (56).
The NHS3 application utilizes smartphone location services, which employ a hybrid of assisted GPS, WiFi positioning, and cellular network positioning to precisely estimate location (57–59). The NHS3 app uses smartphone location services to record the latitude and longitude of participants during the sampling periods at 10-minute intervals. We chose this interval to reduce battery consumption while maintaining the ability to impute participant trajectories throughout the day (60). Each geolocation coordinate also receives a measure of horizontal accuracy in meters. On the basis of the distribution of horizontal accuracy from our participant data, we chose 65 meters as a cut-off point for adequate accuracy.
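The accuracy screen described above can be sketched as follows. This is a minimal Python illustration, not the study's own code (which was written in R). The record layout and the field names `lat`, `lon`, and `horizontal_accuracy_m` are hypothetical; only the 65-meter cut-off comes from the text.

```python
from dataclasses import dataclass

ACCURACY_CUTOFF_M = 65.0  # cut-off reported in the text


@dataclass(frozen=True)
class LocationFix:
    """One location observation (hypothetical layout)."""
    timestamp: str                # ISO 8601 string, e.g. "2018-06-01T08:10:00"
    lat: float
    lon: float
    horizontal_accuracy_m: float  # smaller values mean a tighter estimate


def keep_accurate(fixes, cutoff_m=ACCURACY_CUTOFF_M):
    """Return the fixes whose horizontal accuracy is at or below the cut-off."""
    return [f for f in fixes if f.horizontal_accuracy_m <= cutoff_m]


# Made-up example values, for illustration only.
fixes = [
    LocationFix("2018-06-01T08:00:00", 42.33600, -71.10700, 5.0),
    LocationFix("2018-06-01T08:10:00", 42.33610, -71.10690, 65.0),
    LocationFix("2018-06-01T08:20:00", 42.34000, -71.10000, 24000.0),
]
accurate = keep_accurate(fixes)
```

Here the cut-off is treated as inclusive (65 meters or less), which is one reading of the text.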
Fitbit: consumer-wearable devices
We have provided our participants with a range of Fitbit devices over the course of enrollment, including the Fitbit Charge HR, the Fitbit Charge 2, and the Fitbit Charge 3. All of these devices measure physical activity, heart rate, and sleep, and last up to 7 days on a single charge. Physical activity and sleep data are measured through miniaturized accelerometers in the Fitbit, and proprietary algorithms parse accelerometry data into steps, activity intensity, and sleep duration at the minute level. Heart rate is measured by the Fitbit device through photoplethysmography. Photoplethysmography uses light to measure blood flow by measuring the absorption of green light by the blood flowing under the skin. Higher absorption means more blood pumping through the veins. The device uses photodiodes to measure the light absorption and uses the information from the photodiodes to calculate heart rate through a proprietary algorithm, and has been shown to be reasonably accurate (38, 61).
After participants provide permission, the Fitbit API enables researchers to download minute-level datasets. These datasets contain timestamped information on steps (continuous), activity intensity (categorical), heart rate (continuous), and sleep duration (continuous).
Data processing
The volume and structure of data provided by the customized app posed unique challenges for processing and analysis. Each of the approximately 40 million records downloaded from the mobile application comes in JSON format, a syntax for structured data that is highly standardized, but is not tabulated, and therefore gives no way to obtain aggregate numbers, statistics, or visualizations. In addition, the data provided to us by the app developers contained excess information that was not part of our study protocol and needed to be discarded.
Storage for the data was the first problem we had to solve. We chose to store raw JSON files by uploading them to a MongoDB database, an open-source NoSQL database that is designed to store, process, and query very large collections of JSON documents. Overlap Health database schemas follow the mobile health standards recommended by Open mHealth (https://www.openmhealth.org/) to ensure interoperability. The MongoDB database provides a repository for the raw JSON, but gives no way to obtain aggregate numbers, statistics, or visualizations.
To create the graphics and statistics in this article, it was necessary to restructure all the data into tabulated format, but due to the size it was not feasible to pull all the data at once and work with it in aggregate in R. To filter and restructure the data, we wrote custom software tools in R to interact with the MongoDB and extract, transform, and save the reformatted data in smaller chunks. We used some existing functionality in well-vetted packages such as mongolite and jsonlite, but also wrote specific reusable functions in R for this project. The reusability of the custom functions allowed us to create a download system that is reproducible and well documented.
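The extract, transform, and save pattern described above can be illustrated with a short sketch. The study's tools were written in R with mongolite and jsonlite. The Python version below is only an analogy using the standard library, and the JSON layout and field names (`id`, `body.time`, `body.heart_rate.value`, `debug`) are hypothetical.

```python
import csv
import io
import json
from itertools import islice


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into a single-level dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def chunks(iterable, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def json_lines_to_csv(lines, out, keep_fields, chunk_size=1000):
    """Parse JSON lines chunk by chunk, keep selected fields, write CSV rows."""
    # extrasaction="ignore" drops fields that are not part of the protocol;
    # fields missing from a record are written as empty cells.
    writer = csv.DictWriter(out, fieldnames=keep_fields, extrasaction="ignore")
    writer.writeheader()
    n_rows = 0
    for chunk in chunks(lines, chunk_size):
        rows = [flatten(json.loads(line)) for line in chunk]
        writer.writerows(rows)
        n_rows += len(rows)
    return n_rows


# Made-up records, for illustration only.
raw = [
    '{"id": "A1", "body": {"time": "2018-06-01T08:00:00", "heart_rate": {"value": 62}}, "debug": "x"}',
    '{"id": "A1", "body": {"time": "2018-06-01T08:01:00", "heart_rate": {"value": 64}}, "debug": "y"}',
    '{"id": "A1", "body": {"time": "2018-06-01T08:02:00"}}',
]
buffer = io.StringIO()
n = json_lines_to_csv(raw, buffer, ["id", "body.time", "body.heart_rate.value"], chunk_size=2)
```

Processing a fixed number of records at a time keeps memory use bounded regardless of the total record count, which is the motivation the text gives for working in smaller chunks.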
We calculated nonwear time based on missing heart rate data. If there was no observation for heart rate in the JSON file, we considered that minute to be nonwear time. For step data, if there was a heart rate observation and no step data, we set the step count to zero and considered it wear time with no steps.
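The two rules above can be written as a short sketch. This is a Python illustration under assumed data structures (dictionaries keyed by minute), not the study's R code. Setting steps to `None` for nonwear minutes is an assumption, since the text does not say how such minutes are stored.

```python
def classify_minutes(minutes, heart_rate, steps):
    """Label each minute as wear or nonwear and fill in missing step counts.

    minutes    : iterable of minute keys (e.g., timestamps) in the sampling window
    heart_rate : dict mapping minute -> heart rate, only for observed minutes
    steps      : dict mapping minute -> step count, only for observed minutes
    """
    result = {}
    for minute in minutes:
        if minute not in heart_rate:
            # No heart rate observation: treat the minute as nonwear time.
            result[minute] = {"wear": False, "heart_rate": None, "steps": None}
        else:
            # Heart rate present: wear time; a missing step count becomes zero.
            result[minute] = {
                "wear": True,
                "heart_rate": heart_rate[minute],
                "steps": steps.get(minute, 0),
            }
    return result


# Made-up example values, for illustration only.
minutes = ["08:00", "08:01", "08:02"]
heart_rate = {"08:00": 61, "08:01": 63}
steps = {"08:00": 12}
labeled = classify_minutes(minutes, heart_rate, steps)
wear_minutes = sum(1 for m in labeled.values() if m["wear"])
```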
Statistical analysis plan
In this manuscript, we present response rates for each step of substudy participant recruitment, and show demographic characteristics (means and frequencies) comparing Substudy participants to the full NHS3 cohort. Although data collection is ongoing, we also present preliminary data on compliance for Fitbit wear time using two cutoffs: (i) five days with at least 600 minutes (10 hours) per day, which is commonly used in Actigraph studies (62), and (ii) five days with at least 1,200 minutes (20 hours) per day, which we chose to ascertain how well participants did at wearing the Fitbit for the full 24-hour period. Compliance percentages are presented by sampling week. We also present means, medians, and IQRs for step, heart rate, sleep, and GPS data. To illustrate the distribution of GPS observations by participant, we present a histogram of GPS observations per participant in one sampling week.
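The two compliance cutoffs can be expressed as a small calculation. The Python sketch below assumes a hypothetical input of daily wear minutes per participant and sampling week; only the thresholds (five days, 600 or 1,200 minutes) come from the text.

```python
from collections import defaultdict


def compliant(wear_minutes_by_day, min_minutes=600, min_days=5):
    """True if at least `min_days` days have at least `min_minutes` of wear time."""
    valid_days = sum(1 for m in wear_minutes_by_day if m >= min_minutes)
    return valid_days >= min_days


def compliance_by_week(records, min_minutes=600, min_days=5):
    """Percent of participants meeting the wear-time criterion in each sampling week.

    records: iterable of (participant_id, sampling_week, [daily wear minutes]).
    """
    totals = defaultdict(int)
    met = defaultdict(int)
    for _participant, week, daily in records:
        totals[week] += 1
        if compliant(daily, min_minutes, min_days):
            met[week] += 1
    return {week: 100.0 * met[week] / totals[week] for week in totals}


# Made-up wear-time values, for illustration only.
records = [
    ("p1", 1, [1400, 1300, 1250, 1210, 1205, 300, 0]),
    ("p2", 1, [700, 650, 620, 610, 600, 100, 0]),
    ("p1", 2, [1400, 1300, 1250, 1210, 1205, 300, 0]),
    ("p2", 2, [700, 650, 620, 610, 590, 100, 0]),
]
pct_10h = compliance_by_week(records)                    # 600-minute cutoff
pct_20h = compliance_by_week(records, min_minutes=1200)  # 1,200-minute cutoff
```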
To achieve study aims, in future analyses, data will be examined at the minute level to estimate associations between contemporaneous environmental exposures (e.g., spatial datasets on built and natural environments) derived from GPS and physical activity/heart rate outcomes. For sleep analyses, GPS data linked to environmental data will be aggregated to create dynamic measures of daily average exposure across the day, and nightly sleep measures will be linked to daily average exposures based on the day prior to the sleep period. We will conduct hierarchical mixed models that account for the correlation of data within each individual. A priori, based on previous research, we will examine biological sex, age, race, and individual- and area-level socioeconomic status as potential confounders and effect modifiers. To identify subpopulations at higher susceptibility to environmental exposures, we will conduct stratified analyses and conduct likelihood ratio tests comparing models with and without interaction terms.
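One simple form such a hierarchical model could take is a random-intercept model, written out below. The text does not give the model specification, so every symbol here is an assumption made for illustration: $y_{ij}$ is the outcome (e.g., steps or heart rate) for participant $i$ at time $j$, $E_{ij}$ is the contemporaneous environmental exposure, $\mathbf{z}_i$ collects covariates such as age and socioeconomic status, and $b_i$ is a participant-level random intercept that captures within-person correlation.

```latex
% Illustrative random-intercept model; the symbols are assumptions,
% not the authors' stated specification.
\begin{equation}
  y_{ij} = \beta_0 + \beta_1 E_{ij} + \boldsymbol{\gamma}^{\top}\mathbf{z}_{i} + b_i + \varepsilon_{ij},
  \qquad b_i \sim N(0, \sigma_b^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2)
\end{equation}
```

Under this sketch, effect modification by a covariate $m_i$ would be assessed by adding a term $\beta_2 E_{ij} m_i$ and comparing the models with and without it using a likelihood ratio test.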
For measurement error correction analyses, data from the substudy will be used as an internal validation study in which the error-prone measures (self-reported sleep, physical activity, sedentary behavior, and residential addresses) are validated against the reference measures (Fitbit behavioral data and smartphone GPS). The noniterative regression calibration method can be used to obtain consistent point estimates and valid interval estimates of associations in regression models with measurement error in one or more continuous covariates (56). The substudy data will be used to estimate the regression model for E(x|X) and true covariates will be predicted for all NHS3 study participants using this model.
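For a single mismeasured covariate and a linear calibration model, the approach can be written out as below. This is a simplified sketch rather than the full method of ref. 56. Following the notation above, $x$ is the reference measure and $X$ the error-prone measure. The coefficients $\alpha_0$, $\alpha_1$, and the naive coefficient $\beta_1^{*}$ (from regressing the outcome on $X$ in the full cohort) are symbols introduced here for illustration.

```latex
% Simplified regression calibration with one mismeasured covariate;
% alpha and beta symbols are introduced for illustration.
\begin{align}
  E(x \mid X) &= \alpha_0 + \alpha_1 X
    && \text{(fitted in the substudy)} \\
  \hat{x}_i &= \hat{\alpha}_0 + \hat{\alpha}_1 X_i
    && \text{(predicted for all NHS3 participants)} \\
  \hat{\beta}_1 &= \hat{\beta}_1^{*} / \hat{\alpha}_1
    && \text{(corrected association)}
\end{align}
```

In a linear outcome model, dividing the naive coefficient by $\hat{\alpha}_1$ gives the same point estimate as refitting the model with $\hat{x}_i$ in place of $X_i$.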
Security and confidentiality
All smartphone application and Fitbit data are encrypted prior to transmission. The Overlap platform is protected by two-factor authentication, and all data are downloaded from the platform directly to study servers hosted at Brigham and Women's Hospital (Boston, MA). To protect the identity of participants, neither identifiable data nor health data are shared with app developers or Fitbit beyond a substudy ID, which in turn is not directly linkable to all other participant data.
Results
As of August 12, 2019, we have invited 1,337 NHS3 participants to complete the eligibility screener. Of those invited, 743 completed the eligibility screener (56%) and 597 of these individuals were found eligible (80%; Supplementary Fig. S2). Subsequently, 500 participants completed the consent process, were mailed a Fitbit and sent instructions to download the NHS3 app, and are actively participating in the study. Demographics of the Mobile Health Substudy subsample compared with the full NHS3 cohort are shown in Table 1. In general, Mobile Health Substudy participants were of similar age (35.7 vs. 35.4 years), had lower body mass index (25.6 vs. 26.8), and were more likely to be white (90% vs. 85%), married (59% vs. 47%), and to never smoke (80% vs. 69%) compared with the full NHS3 cohort. Compared with those who were invited and chose not to participate in the substudy, participants were younger, had lower BMI, and were more likely to never smoke, but had similar distribution by race and marital status (Table 1). These differences were generally consistent comparing those who were invited and found eligible, but chose not to participate.
Variable | Full NHS3 cohort, excluding Mobile Health Substudy participants (N = 48,320), mean (SD) or n (%) | NHS3 participants invited to complete an eligibility screener, excluding Mobile Health Substudy participants (n = 837) | NHS3 participants eligible to participate in the substudy, excluding Mobile Health Substudy participants (n = 97) | Mobile Health Substudy participants (N = 500), mean (SD) or n (%) |
---|---|---|---|---|
Age (years) | 35.4 (7.9) | 36.5 (7.4) | 37.0 (7.3) | 35.7 (7.2) |
Weight (lbs) | 162.2 (41.8) | 161.0 (41.3) | 160.0 (40.7) | 154.4 (34.4) |
BMI (kg/m²) | 26.8 (6.6) | 26.7 (6.7) | 26.5 (6.3) | 25.6 (5.6) |
Race | | | | |
White | 40,877 (85%) | 757 (90%) | 93 (96%) | 452 (90%) |
Black | 1,714 (4%) | 11 (1%) | 0 | 8 (2%) |
Hispanic | 2,526 (5%) | 30 (4%) | 3 (3%) | 23 (5%) |
Asian | 1,634 (3%) | 2 (3%) | 1 (1%) | 6 (1%) |
American Indian | 200 (<1%) | 1 (<1%) | 0 | 0 |
Mixed race | 970 (2%) | 12 (1%) | 0 | 7 (1%) |
Missing/not provided | 399 (1%) | 4 (<1%) | 0 | 4 (1%) |
Married | 22,916 (47%) | 503 (60%) | 59 (61%) | 297 (59%) |
Smoking status | | | | |
Never | 33,204 (69%) | 638 (76%) | 68 (70%) | 399 (80%) |
Current | 2,733 (6%) | 41 (5%) | 5 (5%) | 16 (3%) |
Former | 8,622 (18%) | 155 (19%) | 4 (25%) | 84 (17%) |
Missing/not provided | 3,761 (8%) | 3 (<1%) | 0 | 1 (<1%) |
Sampling week | % of participants with ≥5 days of ≥10 hours/day | % of participants with ≥5 days of ≥20 hours/day |
---|---|---|
Sampling week 1 | 90 | 75 |
Sampling week 2 | 71 | 51 |
Sampling week 3 | 70 | 51 |
Sampling week 4 | 71 | 47 |
Fitbit data
While some devices were still in transit and data gathering will not end until March 2020, as of August 2019, we had received data from 435 participants, comprising over 6 million participant-minutes of heart rate and step data. A summary of Fitbit wear-time based on minutes with recorded heart rate values is shown in Supplementary Fig. S3 and Supplementary Table S2. In sampling week 1, 90% of participants had at least 5 days with at least 600 minutes each of Fitbit wear-time, which is a commonly used standard in research-grade Actigraph studies (62). This percentage declined to about 70% in sampling week 2, but remained steady for all following sampling weeks. Using a more stringent cutoff of 5 days with at least 1,200 minutes of Fitbit wear-time, 75% of participants met these criteria in sampling week 1. Again, this number decreased to about 50% in sampling week 2 and remained at that level for all later sampling weeks. Over 819 completed sampling weeks, we observed an average of 7,581 minutes of heart rate and step data (IQR: 6,651–9,645) per participant-week. The mean number of steps per week per participant was 86,143 (IQR: 71,520–110,302), and average steps per minute ranged from 0 to 225. We have also recorded a total of over 2 million minutes of sleep in over 5,700 unique sleep bouts, including naps and main sleep periods. For sampling week 1, sleep bouts were on average 6 hours long (IQR: 5 hours–7.5 hours).
The collected and parsed data provide an extremely high-resolution, external record of activity and rest patterns. Figure 1 shows a visualization of one participant-week of data and illustrates the day-to-day variability in heart rate and steps that appears for each person at the week level. Data for all figures are from a test subject collected as part of protocol development. Figure 2 provides a visualization of rest–activity patterns for one participant-week on a 24-hour clock. Variations in peak activity levels between days can be teased out, along with variability in sleep timing and duration.
Geolocation data
We have collected continuous smartphone GPS data for 5,237 unique participant-days. Our participants are drawn from across the United States (Supplementary Fig. S4), and the geolocation for each participant provides information on daily routines, including commuting, home, and work locations (Fig. 3). The geolocation data from the smartphone app are reported to five decimal places for both latitude and longitude, and each GPS estimate also comes with an estimate of its accuracy in meters. Horizontal accuracy ranged from very accurate (5 meters) to highly inaccurate (24,000 meters). In a visual examination of multiple participants' geolocation data accuracy over time, we found that when accuracy was poorer than 65 meters, the app would repeatedly assess location until accuracy was 65 meters or better. After filtering for an accuracy of 65 meters or better, we had approximately 550,000 GPS data points. There were on average 104 GPS observations with accuracy of 65 meters or less per participant per day (IQR: 103–107; median: 104), which roughly corresponds to one GPS observation every 15 minutes. The distribution of GPS observations per participant for the first sampling week is shown in Supplementary Fig. S5. For this first week of sampling, the median ranged from 107 to 116 observations per day per person, and the lowest number of observations in a single participant-day was zero, while the maximum number of observations in a single participant-day was 280. Examining multiple participants' geolocation data in relation to the observations' time stamps confirmed that the observations were spread evenly throughout the day.
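Summaries of this kind (observations per participant-day, with mean, median, and IQR) can be computed along the following lines. The Python sketch assumes a hypothetical input of (participant ID, ISO 8601 timestamp) pairs that have already passed the accuracy filter. It takes the calendar date from the timestamp string, which ignores time-zone handling. As a rough check on spacing, 1,440 minutes per day divided by 104 observations is about 13.8 minutes between observations.

```python
from collections import Counter
from statistics import median, quantiles


def observations_per_participant_day(fixes):
    """Count fixes per (participant_id, calendar date).

    fixes: iterable of (participant_id, ISO 8601 timestamp string) pairs.
    Participant-days with no fixes do not appear in the result.
    """
    counts = Counter()
    for participant_id, timestamp in fixes:
        counts[(participant_id, timestamp[:10])] += 1  # first 10 chars: YYYY-MM-DD
    return counts


def summarize(counts):
    """Mean, median, and interquartile range of the per-day counts."""
    values = sorted(counts.values())
    q1, _q2, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "mean": sum(values) / len(values),
        "median": median(values),
        "iqr": (q1, q3),
    }


# Made-up observations, for illustration only.
fixes = [
    ("p1", "2018-06-01T08:00:00"),
    ("p1", "2018-06-01T08:10:00"),
    ("p1", "2018-06-01T08:20:00"),
    ("p1", "2018-06-02T09:00:00"),
    ("p2", "2018-06-01T07:30:00"),
    ("p2", "2018-06-01T07:40:00"),
]
counts = observations_per_participant_day(fixes)
summary = summarize(counts)
```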
Discussion
Physical inactivity and inadequate sleep are major behavioral risk factors for cancer that are widespread in the U.S. population and may be driven by environmental exposures. In parallel, mobile health technology is becoming increasingly prevalent and enables novel perspectives on physical activity and sleep. The NHS3 Mobile Health Substudy integrates high spatiotemporal resolution measurement of environmental exposures with objective physical activity, heart rate, and sleep measurement into a nationwide prospective cohort.
This study brings mobile health technology into epidemiologic research to quantify relationships between minute-level environmental factors and health behaviors measured with smartphones and wearable devices. This integration is notable for the high granularity of information it provides on participants' behaviors as well as the objectivity of the measurements. It offers an opportunity to examine the features of places in which individuals are physically active, estimate timing and duration of sleep, derive metrics of circadian patterns, or assess social jetlag (misalignment between social and biological times measured through sleep onset on free days compared with workdays; refs. 63, 64). In addition, we plan to use these data to create regression calibration methods that can be applied to correct for measurement error in residence-based measures of environmental exposure and in self-reported physical activity and sleep in the full NHS3 cohort. Data from this project will advance the field of environmental exposure assessment and behavioral measurement within cohort studies and will provide novel insights on how environment and behavior drive cancer.
Mobile health data are an important new resource for health and epidemiology research, but gathering and analyzing these data is not without challenges and limitations. Challenges include the investment required in developing a custom smartphone application, the number of staff required to ship Fitbits and communicate with participants, as well as managing the data once it is collected. While the volume and velocity of streaming data from mobile health technologies is a strength of this approach, researchers must not underestimate the learning curve of dealing with these “big data.” Parsing, cleaning, and visualizing these high-dimensional data are no small task. To effectively make sense of these data, it is fundamental to partner with transdisciplinary teams of computer scientists, app developers, behavioral scientists, and epidemiologists. Major limitations beyond the challenges above mainly have to do with the Fitbit device itself. Fitbits are designed to increase physical activity and alter behavior, so the distribution of physical activity, heart rate, and sleep may not be representative of normal behavior. However, within-participant analyses of the relationship between environmental exposures and health behaviors should be internally valid. Another major concern is the proprietary “black-box” algorithms Fitbit uses to process the raw sensor data. These algorithms can change at any time unbeknownst to researchers, and the lack of raw data precludes the sharing of approaches across studies that utilize different devices.
However, the potential strengths of incorporating mobile health technology into prospective cohorts are substantial. Our preliminary data suggest that compliance with mobile health protocols is high and remains relatively high over follow-up, likely because participants already carry their smartphones with them at most times and seem to like the wearable devices, but we will reevaluate this when data collection is complete. Because there is no need for participants to mail devices back to researchers, researchers have near-real-time access to the data and can assess compliance while the study is ongoing. Consumer wearables have battery life up to a week on one charge, are low cost and low burden for participants and researchers, and are easy to use. This low-burden approach creates a potentially scalable methodology that might be expanded to tens of thousands of participants to efficiently gather objective, passive data on environmental exposures and health behaviors over long time periods.
The data collected in this pilot study will be used to inform efforts to expand a scalable mobile health protocol to the entire NHS3 cohort. For example, we will compare data across different Fitbit devices to assess the feasibility of a bring-your-own-device protocol in which participants volunteer the data from wearable devices that they already own. As we move forward, we envision creating digital phenotypes, or the “moment-by-moment quantification of the individual-level human phenotype in situ using data from personal digital devices,” to understand patterns of behavior predictive of health outcomes within the full NHS3 cohort (65, 66). We also hope to incorporate geotagged micro-surveys, also known as ecologic momentary assessment (67–69), to validate health behaviors and to assess mental health and positive health outcomes in real time, as participants experience them. Finally, we may capitalize on other sensors on the phone to assess, for instance, noise or light exposure and the influence of these exposures on sleep or stress.
The NHS3 Mobile Health Substudy has substantial implications for public health. Physical inactivity and inadequate sleep are dominant risk factors for cancer that are widespread in the U.S. population, and new insight into what influences these behaviors, as well as their inter-relationships, has great implications for public health and society. This project may provide translational data to inform interventions to improve urban planning policies and green space development to optimize opportunities for increased physical activity, healthy sleep patterns, and lowered obesity prevalence. Ultimately, mobile health technology is growing in popularity, and tremendous amounts of data are gathered each day. The NHS3 Mobile Health Substudy provides viable next steps to advance methods to find meaningful signals in the noise of streaming, semi-continuous health data. These approaches hold great promise for advancing epidemiologic research on contextual factors that promote healthy behaviors and reduce cancer risk.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: J.E. Hart, F. Laden, J.E. Chavarro, P. James
Development of methodology: P. James
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): C. Choirat, J.W. Thompson, F. Laden, J.E. Chavarro, P. James
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): R. Fore, C. Choirat, J.W. Thompson, P. James
Writing, review, and/or revision of the manuscript: R. Fore, J.E. Hart, C. Choirat, F. Laden, J.E. Chavarro, P. James
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): R. Fore, J.W. Thompson, K. Lynch, P. James
Study supervision: J.E. Chavarro, P. James
Acknowledgments
This research was supported by grant R00CA201542 (to P. James) from NCI and U01HL145386 (to J.E. Chavarro) from NHLBI. The authors acknowledge Overlap Health, Inc., including David Haddad and Emerson Farrugia, for their support in developing the NHS3 smartphone application, and for managing the platform for this study.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.