Background:

Physical activity and sleep are behavioral risk factors for cancer that may be influenced by environmental exposures, including built and natural environments. However, many studies in this area are limited by residence-based exposure assessment and/or self-reported, time-aggregated measures of behavior.

Methods:

The Nurses' Health Study 3 (NHS3) Mobile Health Substudy is a pilot study of 500 participants in the prospective NHS3 cohort who use a smartphone application and a Fitbit for seven-day periods, four times over a year, to measure minute-level location, physical activity, heart rate, and sleep.

Results:

We have collected data on 435 participants, comprising over 6 million participant-minutes of heart rate, step, sleep, and location. Over 90% of participants had five days of ≥600 minutes of Fitbit wear-time in their first sampling week, and this percentage dropped to 70% for weeks 2 to 4. Over 819 sampling weeks, we observed an average of 7,581 minutes of heart rate and step data [interquartile range (IQR): 6,651–9,645] per participant-week, and >2 million minutes of sleep in over 5,700 sleep bouts. We have recorded location data for 5,237 unique participant-days, averaging 104 location observations per participant-day (IQR: 103–107).

Conclusions:

This study describes a protocol to incorporate mobile health technology into a nationwide prospective cohort to measure high-resolution objective data on environment and behavior.

Impact:

This project could provide translational insights into interventions for urban planning to optimize opportunities for physical activity and healthy sleep patterns to reduce cancer risk.

See all articles in this CEBP Focus section, “Modernizing Population Science.”

Physical activity, sleep patterns, and obesity are major behavioral risk factors for cancer that may be influenced by environmental factors, including the built environment and green spaces. Inadequate physical activity in the United States contributes to over 12% of breast and colon cancers, on par with the disease burden of smoking (1). However, 80% of Americans do not report meeting the guideline of 150 minutes of moderate-intensity aerobic activity per week (2). In addition, 50–70 million U.S. adults have chronic sleep and wakefulness disorders: 35% sleep <7 hours on average per night and 38% unintentionally fall asleep during the day (3, 4). Epidemiologic studies have reported that sleep disturbances are linked to breast (5–11), colon (12, 13), prostate (14–16), and endometrial cancer (17). In addition, lack of physical activity and sleep act together to drive obesity (18, 19), also a known risk factor for cancer (20–22).

Health behaviors are strongly influenced by the dynamic social and physical environments within which individuals interact (23). Built and natural environments can provide “walkable” neighborhoods with safe, shaded, and protected parks that provide opportunities for routine physical activity, which may subsequently drive sleep patterns. These environments may also buffer individuals from exposure to noise, light at night, and air pollution, which may in turn lead to improved sleep quality. Research is evolving on how the built and natural environments influence health behaviors; however, most of these studies are cross-sectional and are limited by the resolution of the data collected. Using only a participant's residential address to define environmental exposure ignores the fact that many people spend most of their day outside the home (24, 25). Physical activity and sleep are often based on time-aggregated, self-reported data, which do not provide objective information on timing or duration of these behaviors (26–30). Measurement error in environmental exposure assessment and self-report of physical activity and sleep may lead to bias in assessing the impact of environmental factors on health behaviors.

New consumer technologies such as smartphones and wearable accelerometer devices have unlocked the potential to gather data on location-based behavior at a greatly increased spatiotemporal resolution. High precision global positioning systems (GPS) data from a smartphone can be integrated with spatial datasets (e.g., built and natural environments, noise, and air pollution) (31–36), to create personalized, intraday, dynamic environmental exposure metrics. High precision activity data from consumer wearable devices containing miniaturized accelerometers, such as Fitbits, can be used as detailed, objective measures of participants' physical activity. Fitbits measure physical activity as well as research-grade accelerometry in lab settings (37), are comparable with electrocardiography (ECG) for heart rate during routine physical activity (38–42), and have been tested for validity and usability in free-living subjects (43–47), although it is worth noting that not all studies are consistent (41, 46, 48). In measuring sleep, Fitbit performs similarly to research-grade accelerometry when compared with the gold standard of polysomnography (49–51), although Fitbit may systematically overestimate sleep (51) and some sleep metrics (e.g., total sleep time) may be more valid than others (e.g., sleep efficiency; ref. 49). Wearable devices are often designed to communicate with a smartphone and securely upload data in near real-time, alleviating the need to mail devices back to researchers, which is costly and burdensome (52). Modern mobile health (mHealth) technologies offer a passive, low-cost opportunity to address current limitations in measuring the impact of environment on cancer-related health behaviors.

In this protocol article, we describe our methodology to integrate a custom smartphone application and a consumer-wearable device protocol into the ongoing Nurses' Health Study 3 (NHS3), as well as how we plan to use the data generated from this study. This pilot protocol, known as the NHS3 Mobile Health Substudy, aims to test approaches that will provide detailed information on environmental exposures and health behaviors using modern technology. The substudy collects data on minute-level smartphone-based GPS, and Fitbit-based physical activity, heart rate, and sleep over 7-day sampling periods four times across a year for a maximum of 28 days of sampling per participant. We have provided some preliminary statistics on our data gathering to date; however, future analyses will assess the success of this pilot study and implications for scaling up measures to the full NHS3 cohort. Embedding these novel mHealth measures within an ongoing prospective cohort study will enable researchers to (i) examine how these mHealth measures compare with traditional environmental (GPS- vs. residence-based) and behavioral (Fitbit vs. self-report) measures; (ii) understand the distribution of and variability in physical activity, heart rate, and sleep both between and within participants; and (iii) analyze the longitudinal associations between minute-level environmental exposure and health behaviors.

NHS3 mobile health substudy

The NHS3 is an ongoing internet-based open cohort study of male and female nurses in the United States and Canada that began in 2010 (53). To be eligible for the study, participants have to be either a registered nurse, licensed practical/vocational nurse, or nursing student and born on or after January 1, 1965. As of February 2020, 48,809 participants had joined the study. Participants complete questionnaires approximately every 6 months on lifestyle and medical characteristics. The response rate for the second questionnaire is currently at 72%; for participants who have completed at least two questionnaires, subsequent response rates exceed 80%.

The NHS3 Mobile Health Substudy is a pilot study that asks a subset of NHS3 participants (N = 500) to download a custom smartphone application and to wear a consumer-wearable fitness tracker. The aims of the substudy are to quantify the relationships between dynamic measures of geographic context and objective measures of physical activity and sleep, and to develop measurement correction models for geographic context, physical activity, and sleep that can be applied to the full NHS3 cohort. This study is intended as a pilot to see whether the use of smartphone applications and consumer-wearable devices will be accepted by NHS3 participants. In the substudy, participants undertake 7-day sampling periods four times across a year, spaced 3 months apart to capture seasonal variability in behaviors. Consistent with other physical activity and GPS studies, we chose to conduct a 7-day protocol to capture behaviors and exposures on both work and nonwork days (32). During this sampling period, they wear a researcher-provided Fitbit to measure physical activity, heart rate, and sleep, and run an application on their smartphones. Participants were e-mailed instructions to run the smartphone application on their phone and to wear the Fitbit simultaneously during all 7-day sampling periods. During these sampling periods, they were asked to have their smartphone on them during waking hours and to wear the Fitbit for 24 hours a day (except when showering, bathing, or swimming). Participants were asked to download the Fitbit app and to sync their data with the app every 3–4 days. They were also asked to charge their Fitbit before each sampling period, as well as every 3–4 days during sampling periods. Participants were free to keep their Fitbit devices after the study concluded. We worked with app developers (Overlap Health, Inc.) to develop a customized NHS3 iPhone app that enables us to distribute short surveys, collect smartphone accelerometer and location (GPS) data, and sync with the Fitbit Application Programming Interface (API) to gather Fitbit wearable device data. The Substudy began enrollment in March 2018 and data collection is ongoing.

To be eligible for the substudy, participants had to have been 21 years old as of March 12, 2018; reported height and weight on their first questionnaire; completed the physical activity questions on their second questionnaire (6 months into the main NHS3 study); completed the sleep questions on their fourth questionnaire (one and a half years into the study); and reported on the fourth questionnaire that they did not have a doctor-diagnosed sleep disorder. Furthermore, participants had to own an iPhone (for the pilot, our app was only designed for iOS) and live in the contiguous United States (where data on the environmental exposures of interest were available).

Potential participants were e-mailed an invitation to participate in the substudy and were sent a link to an eligibility screener to determine whether they had an iPhone and to confirm that they did not have a doctor-diagnosed sleep disorder. If participants were determined to be eligible, they were sent an electronic informed consent form. Once consented, they were mailed a Fitbit and sent substudy instructions that included a link to download the custom NHS3 smartphone application. The substudy was approved by the Institutional Review Board of the Brigham and Women's Hospital (Boston, MA) and Harvard Pilgrim Health Care Institute (Boston, MA).

NHS3 smartphone application

The NHS3 iOS application walks participants through the onboarding process and obtains permission to gather location services data and to send notifications. See Supplementary Fig. S1 for screenshots of the participant-facing side of the app. The application also directly connects participants to Fitbit, so that they can give permission to access their Fitbit data. Over the course of one year, the smartphone application sends substudy participants a notification four times, asking whether the upcoming week is a typical week for them. If the participant responds yes, then sampling begins and the participant is advised to wear their Fitbit. If the participant responds no, they can delay up to four times, after which the sampling period is considered missed.

Before and after the sampling period, participants fill out a brief survey assessing whether the week and their sleep patterns were typical for them. The survey includes questions about stress level, factors affecting sleep (including a range of responses from “young children who don't sleep through the night,” to “partner snoring,” to “had a cold”), night shifts, and breastfeeding. At the end of the year, the app provides several questions on the quality of the experience participating in the study and opportunities for improvement. After completing the four sampling sessions, participants were e-mailed a link to a questionnaire on self-reported physical activity (54), sleep, and sedentary behavior (ref. 55; Supplementary Table S1). Data from these questionnaires will be used to compare self-reported data on these behaviors to objectively collected data from wearable devices. The combination of self-report and Fitbit data will be used to inform regression calibration methods (56).

The NHS3 application utilizes smartphone location services, which employ a hybrid of assisted GPS, WiFi positioning, and cellular network positioning to precisely estimate location (57–59). The NHS3 app uses smartphone location services to record the latitude and longitude of participants during the sampling periods at 10-minute intervals. We chose this interval to reduce battery consumption while maintaining the ability to impute participant trajectories throughout the day (60). Each geolocation coordinate also receives a measure of horizontal accuracy in meters. On the basis of the distribution of horizontal accuracy from our participant data, we chose 65 meters as a cut-off point for adequate accuracy.
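The accuracy screen described above can be sketched in a few lines. This is a minimal Python illustration, not the study's own code: the record layout and field names (participant_id, horizontal_accuracy_m, and so on) are hypothetical, and dropping negative accuracy values as invalid is an assumption of this sketch. Only the 65-meter cutoff comes from the text.

```python
from dataclasses import dataclass

ACCURACY_CUTOFF_M = 65.0  # cutoff reported in the text


@dataclass
class Fix:
    """One location observation (field names are hypothetical)."""
    participant_id: str
    timestamp: str  # e.g. an ISO 8601 string; the format is an assumption
    lat: float
    lon: float
    horizontal_accuracy_m: float  # uncertainty radius in meters


def filter_accurate(fixes, cutoff_m=ACCURACY_CUTOFF_M):
    """Keep fixes whose horizontal accuracy is cutoff_m or better.

    A smaller value means a tighter uncertainty radius. Negative values
    are treated as invalid and dropped (an assumption of this sketch).
    """
    return [f for f in fixes if 0 <= f.horizontal_accuracy_m <= cutoff_m]


fixes = [
    Fix("p001", "2019-08-01T08:00:00", 42.33612, -71.10521, 5.0),
    Fix("p001", "2019-08-01T08:10:00", 42.33701, -71.10433, 65.0),
    Fix("p001", "2019-08-01T08:20:00", 42.35000, -71.09000, 24000.0),
    Fix("p001", "2019-08-01T08:30:00", 0.0, 0.0, -1.0),
]
kept = filter_accurate(fixes)
```

In this toy input, the two fixes at 5 and 65 meters are kept and the 24,000-meter and negative-accuracy fixes are discarded.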

Fitbit: consumer-wearable devices

We have provided our participants with a range of Fitbit devices over the course of enrollment, including the Fitbit Charge HR, the Fitbit Charge 2, and the Fitbit Charge 3. All of these devices measure physical activity, heart rate, and sleep, and last up to 7 days on a single charge. Physical activity and sleep data are measured through miniaturized accelerometers in the Fitbit, and proprietary algorithms parse accelerometry data into steps, activity intensity, and sleep duration at the minute level. Heart rate is measured by the Fitbit device through photoplethysmography. Photoplethysmography uses light to measure blood flow by measuring the absorption of green light by the blood flowing under the skin. Higher absorption means more blood pumping through the veins. The device uses photodiodes to measure the light absorption and uses the information from the photodiodes to calculate heart rate through a proprietary algorithm, and has been shown to be reasonably accurate (38, 61).

After participants provide permission, the Fitbit API enables researchers to download minute-level datasets. These datasets contain timestamped information on steps (continuous), activity intensity (categorical), heart rate (continuous), and sleep duration (continuous).

Data processing

The volume and structure of data provided by the customized app posed unique challenges for processing and analysis. Each of the approximately 40 million records downloaded from the mobile application comes in JSON format, a syntax for structured data that is highly standardized but not tabulated. In addition, the data provided to us by the app developers contained excess information that was not part of our study protocol and needed to be discarded.

Storage for the data was the first problem we had to solve. We chose to store the raw JSON files in MongoDB, an open-source NoSQL database that is designed to store, process, and query very large collections of JSON documents. Overlap Health database schemas follow the mobile health standards recommended by Open mHealth (https://www.openmhealth.org/) to ensure interoperability. MongoDB provides a repository for the raw JSON, but on its own gives no way to obtain aggregate numbers, statistics, or visualizations.

To create the graphics and statistics in this article, it was necessary to restructure all the data into tabulated format, but due to the size it was not feasible to pull all the data at once and work with it in aggregate in R. To filter and restructure the data, we wrote custom software tools in R to interact with the MongoDB and extract, transform, and save the reformatted data in smaller chunks. We used some existing functionality in well-vetted packages such as mongolite and jsonlite, but also wrote specific reusable functions in R for this project. The reusability of the custom functions allowed us to create a download system that is reproducible and well documented.
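The extract, transform, and save pattern described above can be illustrated with a short sketch. The authors' tools were written in R with mongolite and jsonlite; the Python below is not that code, and it reads JSON lines from an in-memory list rather than from MongoDB. The record structure and column names (id, body.heart_rate.value) are hypothetical. It shows the general idea only: process records in small chunks, flatten nested fields into columns, and keep only the columns the protocol needs.

```python
import json
from itertools import islice


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


def chunked(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def tabulate_chunks(json_lines, keep, size=1000):
    """Parse, flatten, and subset records chunk by chunk.

    Columns not listed in `keep` are discarded; columns missing from a
    record are filled with None.
    """
    for chunk in chunked(json_lines, size):
        rows = (flatten(json.loads(line)) for line in chunk)
        yield [{col: row.get(col) for col in keep} for row in rows]


lines = [
    '{"id": "a1", "body": {"heart_rate": {"value": 62}, "extra": 1}}',
    '{"id": "a2", "body": {"heart_rate": {"value": 70}}}',
    '{"id": "a3", "body": {}}',
]
tables = list(tabulate_chunks(lines, keep=["id", "body.heart_rate.value"], size=2))
```

Each yielded chunk is a list of flat rows that could be written to disk before the next chunk is read, so the full dataset never has to be held in memory at once.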

We calculated nonwear time based on missing heart rate data. If there was no observation for heart rate in the JSON file, we considered that minute to be nonwear time. For step data, if there was a heart rate observation and no step data, we set the step count to zero and considered it wear time with no steps.
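The wear-time rule above can be sketched as follows. This is a minimal Python illustration under stated assumptions: minutes are keyed by simple strings, heart rate and step data arrive as dictionaries, and steps during nonwear minutes are set to missing (the text does not say how those minutes were coded). The function name is hypothetical.

```python
def classify_minutes(minutes, heart_rate, steps):
    """Label each minute as wear or nonwear based on heart rate.

    minutes:    iterable of minute keys covering the sampling period
    heart_rate: dict mapping minute key -> beats per minute
    steps:      dict mapping minute key -> step count
    """
    rows = []
    for minute in minutes:
        hr = heart_rate.get(minute)
        worn = hr is not None
        if worn:
            # Heart rate present but no step record: wear time with zero steps.
            step_count = steps.get(minute) or 0
        else:
            # No heart rate: nonwear; steps left missing (assumption).
            step_count = None
        rows.append(
            {"minute": minute, "worn": worn, "heart_rate": hr, "steps": step_count}
        )
    return rows


minutes = ["00:00", "00:01", "00:02"]
heart_rate = {"00:00": 61, "00:01": 64}
steps = {"00:00": 12}
rows = classify_minutes(minutes, heart_rate, steps)
wear_minutes = sum(row["worn"] for row in rows)
```

Summing the worn flag over a day gives the daily wear time in minutes.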

Statistical analysis plan

In this manuscript, we present response rates for each step of substudy participant recruitment, and show demographic characteristics (means and frequencies) comparing Substudy participants to the full NHS3 cohort. Although data collection is ongoing, we also present preliminary data on compliance for Fitbit wear time using two cutoffs: (i) five days with at least 600 minutes (10 hours) per day, which is commonly used in Actigraph studies (62), and (ii) five days with at least 1,200 minutes (20 hours) per day, which we chose to ascertain how well participants did at wearing the Fitbit for the full 24-hour period. Compliance percentages are presented by sampling week. We also present means, medians, and IQRs for step, heart rate, sleep, and GPS data. To illustrate the distribution of GPS observations by participant, we present a histogram of GPS observations per participant in one sampling week.
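The two compliance cutoffs can be expressed as a small calculation. The Python sketch below assumes that each participant-week has already been reduced to a list of daily wear-time totals in minutes; the function names are hypothetical and the input values are made up for illustration.

```python
def is_compliant(daily_wear_minutes, min_minutes=600, min_days=5):
    """True if at least min_days days reach min_minutes of wear time."""
    valid_days = sum(1 for minutes in daily_wear_minutes if minutes >= min_minutes)
    return valid_days >= min_days


def percent_compliant(weeks, min_minutes=600, min_days=5):
    """Percentage of participant-weeks meeting the compliance cutoff."""
    if not weeks:
        return 0.0
    n_ok = sum(is_compliant(week, min_minutes, min_days) for week in weeks)
    return 100.0 * n_ok / len(weeks)


# Four illustrative participant-weeks (daily wear minutes, made-up values).
weeks = [
    [700] * 7,
    [700] * 4 + [100] * 3,
    [1300] * 5 + [0, 0],
    [1250, 1250, 1250, 1250, 900, 900, 900],
]
pct_10h = percent_compliant(weeks, min_minutes=600)
pct_20h = percent_compliant(weeks, min_minutes=1200)
```

With these inputs, three of the four weeks meet the 10-hour cutoff and one meets the 20-hour cutoff.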

To achieve study aims, in future analyses, data will be examined at the minute level to estimate associations between contemporaneous environmental exposures (e.g., spatial datasets on built and natural environments) derived from GPS and physical activity/heart rate outcomes. For sleep analyses, GPS data linked to environmental data will be aggregated to create dynamic measures of daily average exposure across the day, and nightly sleep measures will be linked to daily average exposures based on the day prior to the sleep period. We will conduct hierarchical mixed models that account for the correlation of data within each individual. A priori, based on previous research, we will examine biological sex, age, race, and individual- and area-level socioeconomic status as potential confounders and effect modifiers. To identify subpopulations at higher susceptibility to environmental exposures, we will conduct stratified analyses and conduct likelihood ratio tests comparing models with and without interaction terms.
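The planned linkage of nightly sleep to prior-day exposure can be sketched as follows. This Python illustration makes two assumptions that the text does not specify: each sleep period is keyed by the calendar date on which the participant woke up, and "the day prior" is the calendar day before that date. The function names and values are hypothetical.

```python
from datetime import date, timedelta
from statistics import mean


def daily_mean_exposure(observations):
    """Average exposure values by calendar day.

    observations: iterable of (date, value) pairs, e.g. one per GPS fix.
    """
    by_day = {}
    for day, value in observations:
        by_day.setdefault(day, []).append(value)
    return {day: mean(values) for day, values in by_day.items()}


def link_sleep_to_prior_day(sleep_by_wake_date, exposure_by_day):
    """Attach the previous day's mean exposure to each sleep period.

    Nights with no exposure data on the prior day get None.
    """
    linked = []
    for wake_date, sleep_minutes in sorted(sleep_by_wake_date.items()):
        prior_day = wake_date - timedelta(days=1)
        linked.append(
            {
                "wake_date": wake_date,
                "sleep_minutes": sleep_minutes,
                "prior_day_exposure": exposure_by_day.get(prior_day),
            }
        )
    return linked


observations = [
    (date(2019, 8, 1), 1.0),
    (date(2019, 8, 1), 3.0),
    (date(2019, 8, 2), 5.0),
]
exposure = daily_mean_exposure(observations)
sleep = {date(2019, 8, 2): 410, date(2019, 8, 4): 380}
linked = link_sleep_to_prior_day(sleep, exposure)
```

The resulting rows, one per night, are the kind of repeated-measures input that a hierarchical mixed model with a participant-level random effect would take.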

For measurement error correction analyses, data from the substudy will be used as an internal validation study in which the error-prone measures (self-reported sleep, physical activity, sedentary behavior, and residential addresses) are validated against the reference measures (Fitbit behavioral data and smartphone GPS). The noniterative regression calibration method can be used to obtain consistent point estimates and valid interval estimates of associations in regression models with measurement error in one or more continuous covariates (56). The substudy data will be used to estimate the regression model for E(x|X) and true covariates will be predicted for all NHS3 study participants using this model.
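The calibration step can be illustrated with a deliberately simplified sketch. The Python below handles a single covariate with no adjustment variables, fits E(x|X) by ordinary least squares in the validation data, and predicts the calibrated value for participants who have only the error-prone measure. It does not implement the interval estimation of the method cited above (56), and the variable names and numbers are made up for illustration.

```python
def fit_line(error_prone, reference):
    """Least-squares fit of reference = intercept + slope * error_prone."""
    n = len(error_prone)
    mean_X = sum(error_prone) / n
    mean_x = sum(reference) / n
    sxx = sum((v - mean_X) ** 2 for v in error_prone)
    sxy = sum((v - mean_X) * (w - mean_x) for v, w in zip(error_prone, reference))
    slope = sxy / sxx
    intercept = mean_x - slope * mean_X
    return intercept, slope


def calibrate(error_prone_all, intercept, slope):
    """Predict E(x|X) for every participant from the error-prone measure."""
    return [intercept + slope * v for v in error_prone_all]


# Validation substudy: self-reported (X) and device-measured (x) values.
self_report = [1.0, 2.0, 3.0, 4.0]
device = [1.5, 2.0, 2.5, 3.0]
intercept, slope = fit_line(self_report, device)

# Full cohort: only the self-reported measure is available.
predicted = calibrate([2.0, 6.0], intercept, slope)
```

The predicted values would then replace the error-prone measure in the outcome model for the full cohort.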

Security and confidentiality

All smartphone application and Fitbit data are encrypted prior to transmission. The Overlap platform is protected by two-factor authentication, and all data are downloaded from the platform directly to study servers hosted at Brigham and Women's Hospital (Boston, MA). To protect the identity of participants, neither identifiable data nor health data are shared with app developers or Fitbit beyond a substudy ID, which in turn is not directly linkable to any other participant data.

As of August 12, 2019, we had invited 1,337 NHS3 participants to complete the eligibility screener. Of those invited, 743 completed the eligibility screener (56%) and 597 of these individuals were found eligible (80%; Supplementary Fig. S2). Subsequently, 500 participants completed the consent process, were mailed a Fitbit and sent instructions to download the NHS3 app, and are actively participating in the study. Demographics of the Mobile Health Substudy subsample compared with the full NHS3 cohort are shown in Table 1. In general, Mobile Health Substudy participants were of similar age (35.7 vs. 35.4 years), had lower body mass index (25.6 vs. 26.8), and were more likely to be white (90% vs. 85%), married (59% vs. 47%), and to never smoke (80% vs. 69%) compared with the full NHS3 cohort. Compared with those who were invited and chose not to participate in the substudy, participants were younger, had lower BMI, and were more likely to never smoke, but had similar distribution by race and marital status (Table 1). These differences were generally consistent comparing those who were invited and found eligible, but chose not to participate.

Table 1.

Demographic characteristics of the full NHS3 cohort, participants invited to complete an eligibility screener, participants who were eligible to participate in the substudy, and participants in the mobile health substudy.

| Variable | Full NHS3 cohort, excluding Mobile Health Substudy participants (N = 48,320), mean (SD) or n (%) | NHS3 participants invited to complete an eligibility screener, excluding Mobile Health Substudy participants (n = 837) | NHS3 participants eligible to participate in the substudy, excluding Mobile Health Substudy participants (n = 97) | Mobile Health Substudy participants (N = 500), mean (SD) or n (%) |
|---|---|---|---|---|
| Age (years) | 35.4 (7.9) | 36.5 (7.4) | 37.0 (7.3) | 35.7 (7.2) |
| Weight (lbs) | 162.2 (41.8) | 161.0 (41.3) | 160.0 (40.7) | 154.4 (34.4) |
| BMI (kg/m²) | 26.8 (6.6) | 26.7 (6.7) | 26.5 (6.3) | 25.6 (5.6) |
| Race | | | | |
|   White | 40,877 (85%) | 757 (90%) | 93 (96%) | 452 (90%) |
|   Black | 1,714 (4%) | 11 (1%) | | 8 (2%) |
|   Hispanic | 2,526 (5%) | 30 (4%) | 3 (3%) | 23 (5%) |
|   Asian | 1,634 (3%) | 22 (3%) | 1 (1%) | 6 (1%) |
|   American Indian | 200 (<1%) | 1 (<1%) | | |
|   Mixed race | 970 (2%) | 12 (1%) | | 7 (1%) |
|   Missing/not provided | 399 (1%) | 4 (<1%) | | 4 (1%) |
| Married | 22,916 (47%) | 503 (60%) | 59 (61%) | 297 (59%) |
| Smoking status | | | | |
|   Never | 33,204 (69%) | 638 (76%) | 68 (70%) | 399 (80%) |
|   Current | 2,733 (6%) | 41 (5%) | 5 (5%) | 16 (3%) |
|   Former | 8,622 (18%) | 155 (19%) | 24 (25%) | 84 (17%) |
|   Missing/not provided | 3,761 (8%) | 3 (<1%) | | 1 (<1%) |

Table 2.

Percentage of participants meeting compliance cutoffs by sampling week.

| Sampling week | % of participants with ≥5 days of ≥10 hours/day | % of participants with ≥5 days of ≥20 hours/day |
|---|---|---|
| Sampling week 1 | 90 | 75 |
| Sampling week 2 | 71 | 51 |
| Sampling week 3 | 70 | 51 |
| Sampling week 4 | 71 | 47 |

Fitbit data

While some devices were still in transit and data gathering will not end until March 2020, as of August 2019, we had received data from 435 participants, comprising over 6 million participant-minutes of heart rate and step data. A summary of Fitbit wear-time based on minutes with recorded heart rate values is shown in Supplementary Fig. S3 and Supplementary Table S2. In sampling week 1, 90% of participants had at least one study period with 5 days of at least 600 minutes each of Fitbit wear-time, which is a commonly used standard in research-grade Actigraph studies (62). This percentage declined to about 70% in sampling week 2, but remained steady for all following sampling weeks. Using a more stringent cutoff of 5 days with at least 1,200 minutes of Fitbit wear-time, 75% of participants met these criteria in sampling week 1. Again, this number decreased to about 50% in sampling week 2 and remained at that level for all later sampling weeks. Over 819 completed study period weeks, we observed an average of 7,581 minutes of heart rate and step data (IQR: 6,651–9,645) per participant-week. The mean number of steps per week per participant was 86,143 (IQR: 71,520–110,302), and average steps per minute ranged from 0 to 225. We have also recorded a total of over 2 million minutes of sleep in over 5,700 unique sleep bouts, including naps and main sleep periods. For sampling week 1, sleep bouts were on average 6 hours long (IQR: 5 hours–7.5 hours).

The collected and parsed data provide an extremely high-resolution, external record of activity and rest patterns. Figure 1 shows a visualization of one participant-week of data and illustrates the day-to-day variability in heart rate and step counts that appears for each person at the week level. Data for all figures are from a test subject collected as part of protocol development. Figure 2 provides a visualization of rest–activity patterns for one participant-week on a 24-hour clock. Variations in peak activity levels between days can be teased out, along with variability in sleep timing and duration.

Figure 1.

Visualization of one hypothetical participant week, where red represents heart beats per minute, blue represents step counts per minute, and green is latitude as a proxy for GPS availability. Data are from a test subject collected as part of protocol development.

Figure 2.

One week of heart rate and sleep data for a hypothetical participant represented on a 24-hour clock. Thin colored lines represent heart rate across each day, while the thick blue line is average heart rate. The shaded gray areas represent sleep periods, where overlapping sleep periods are darker gray. Data are from a test subject collected as part of protocol development.

Geolocation data

We have collected continuous smartphone GPS data for 5,237 unique participant-days. Our participants are drawn from across the United States (Supplementary Fig. S4), and the geolocation for each participant provides information on daily routines, including commuting, home, and work locations (Fig. 3). The geolocation data from the smartphone app are reported to five decimal places of a degree for both latitude and longitude, and also include an estimate of the accuracy of each GPS observation in meters. Horizontal accuracy ranged from very accurate (5 meters) to highly inaccurate (24,000 meters). In a visual examination of multiple participants' geolocation data accuracy over time, we found that when accuracy was poorer than 65 meters, the app would repeatedly assess location until accuracy was 65 meters or better. After filtering for an accuracy of 65 meters or better, we had approximately 550,000 GPS data points. There were on average 104 GPS observations with an accuracy of 65 meters or better per participant per day (IQR: 103–107; median: 104), which roughly corresponds to one GPS observation every 15 minutes. The distribution of GPS observations per participant for the first sampling week is shown in Supplementary Fig. S5. For this first week of sampling, the median ranged from 107 to 116 observations per day per person, and the lowest number of observations in a single participant-day was zero, while the maximum number of observations in a single participant-day was 280. Examining multiple participants' geolocation data in relation to the observations' time stamps confirmed that the observations were spread evenly throughout the day.

Figure 3.

GPS observations for one participant-week by day based on data from a test subject collected as part of protocol development.

Physical inactivity and inadequate sleep are major behavioral risk factors for cancer that are widespread in the U.S. population and may be driven by environmental exposures. In parallel, mobile health technology is becoming increasingly prevalent and enables novel perspectives on physical activity and sleep. The NHS3 Mobile Health Substudy integrates high spatiotemporal resolution measurement of environmental exposures with objective physical activity, heart rate, and sleep measurement into a nationwide prospective cohort.

This study brings mobile health technology into epidemiologic research to quantify relationships between minute-level environmental factors and health behaviors measured with smartphones and wearable devices. This integration is notable for the high granularity of information it provides on participants' behaviors as well as the objectivity of the measurements. It offers an opportunity to examine the features of places in which individuals are physically active, estimate timing and duration of sleep, derive metrics of circadian patterns, or assess social jetlag (misalignment between social and biological times measured through sleep onset on free days compared with workdays; refs. 63, 64). In addition, we plan to use these data to create regression calibration methods that can be applied to correct for measurement error in residence-based measures of environmental exposure and in self-reported physical activity and sleep in the full NHS3 cohort. Data from this project will advance the field of environmental exposure assessment and behavioral measurement within cohort studies and will provide novel insights on how environment and behavior drive cancer.
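As one example of a derived metric, social jetlag as described above (sleep onset on free days compared with workdays) could be computed along the following lines. This is a minimal Python sketch, not a validated implementation: it assumes sleep onsets are given as clock hours, that all onsets fall between noon and the following noon, and that days have already been classified as work or free days. The function names and values are hypothetical.

```python
from statistics import mean


def hours_after_noon(clock_hour):
    """Re-express a clock time as hours since noon.

    This keeps onsets on either side of midnight on one continuous
    scale (23:00 -> 11.0, 01:00 -> 13.0).
    """
    return (clock_hour - 12) % 24


def social_jetlag(workday_onsets, freeday_onsets):
    """Mean free-day sleep onset minus mean workday sleep onset, in hours."""
    work = mean(hours_after_noon(h) for h in workday_onsets)
    free = mean(hours_after_noon(h) for h in freeday_onsets)
    return free - work


# Sleep onsets in clock hours (22.5 means 22:30); made-up values.
workdays = [22.5, 23.0, 23.5]
freedays = [0.5, 1.5]
jetlag_hours = social_jetlag(workdays, freedays)
```

A positive value indicates later sleep onset on free days than on workdays.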

Mobile health data are an important new resource for health and epidemiology research, but gathering and analyzing these data is not without challenges and limitations. Challenges include the investment required in developing a custom smartphone application, the number of staff required to ship Fitbits and communicate with participants, and the management of the data once they are collected. While the volume and velocity of streaming data from mobile health technologies are a strength of this approach, researchers must not underestimate the learning curve of dealing with these “big data.” Parsing, cleaning, and visualizing these high-dimensional data is no small task. To effectively make sense of these data, it is fundamental to partner with transdisciplinary teams of computer scientists, app developers, behavioral scientists, and epidemiologists. Major limitations beyond the challenges above mainly have to do with the Fitbit device itself. Fitbits are designed to increase physical activity and alter behavior, so the distribution of physical activity, heart rate, and sleep may not be representative of normal behavior. However, within-participant analyses of the relationship between environmental exposures and health behaviors should be internally valid. Another major concern is the proprietary “black-box” algorithms Fitbit uses to process the raw sensor data. These algorithms can change at any time unbeknownst to researchers, and the lack of raw data precludes the sharing of approaches across studies that utilize different devices.

However, the potential strengths of incorporating mobile health technology into prospective cohorts are substantial. Our preliminary data suggest that compliance with mobile health protocols is high and remains relatively high over follow-up, likely because participants already carry their smartphones with them at most times and seem to like the wearable devices; we will reevaluate this when data collection is complete. Because participants do not need to mail devices back, researchers have near-real-time access to the data and can assess compliance while the study is ongoing. Consumer wearables have a battery life of up to a week on one charge, are low cost and low burden for participants and researchers, and are easy to use. This low-burden approach creates a potentially scalable methodology that might be expanded to tens of thousands of participants to efficiently gather objective, passive data on environmental exposures and health behaviors over long time periods.
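
To make the idea of assessing compliance during data collection concrete, the sketch below flags a sampling week as compliant when at least five days each have ≥600 minutes of wear-time, the criterion reported in the Results. It is an illustration only: the input format (one timestamp per worn minute), the function names, and the synthetic data are assumptions, and how a minute is classified as "worn" (for example, from the presence of a heart rate reading) is left to upstream processing.

```python
from collections import Counter
from datetime import datetime, timedelta

MIN_WEAR_MINUTES = 600  # per-day wear-time threshold reported in the Results
MIN_VALID_DAYS = 5      # valid days required within a sampling week


def valid_days(worn_minutes):
    """Return the calendar dates with at least MIN_WEAR_MINUTES worn minutes.

    `worn_minutes` is an iterable of datetime stamps, one per minute that the
    device was judged to be worn (hypothetical input format).
    """
    per_day = Counter(ts.date() for ts in worn_minutes)
    return sorted(d for d, n in per_day.items() if n >= MIN_WEAR_MINUTES)


def week_is_compliant(worn_minutes) -> bool:
    """True when the week contains at least MIN_VALID_DAYS valid days."""
    return len(valid_days(worn_minutes)) >= MIN_VALID_DAYS


# Synthetic week: five days with 700 worn minutes, two days with 300.
start = datetime(2019, 1, 7, 8, 0)
stamps = []
for day in range(7):
    n_minutes = 700 if day < 5 else 300
    stamps += [start + timedelta(days=day, minutes=m) for m in range(n_minutes)]

print(len(valid_days(stamps)), week_is_compliant(stamps))  # 5 True
```

Because the check depends only on timestamps already synced from the device, it could in principle be rerun whenever new data arrive rather than after devices are returned.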

The data collected in this pilot study will be used to inform efforts to expand a scalable mobile health protocol to the entire NHS3 cohort. For example, we will compare data across different Fitbit devices to assess the feasibility of a bring-your-own-device protocol in which participants volunteer the data from wearable devices that they already own. As we move forward, we envision creating digital phenotypes, or the “moment-by-moment quantification of the individual-level human phenotype in situ using data from personal digital devices,” to understand patterns of behavior predictive of health outcomes within the full NHS3 cohort (65, 66). We also hope to incorporate geotagged micro-surveys, or ecologic momentary assessment protocols (67–69), to validate health behaviors and to assess mental health and positive health outcomes in real time, as participants experience them. Finally, we may capitalize on other sensors on the phone to assess, for instance, noise or light exposure and the influence of these exposures on sleep or stress.

The NHS3 Mobile Health Substudy has substantial implications for public health. Physical inactivity and inadequate sleep are dominant risk factors for cancer that are widespread in the U.S. population, and new insight into what influences these behaviors, as well as their interrelationships, has great implications for public health and society. This project may provide translational data to inform interventions to improve urban planning policies and green space development to optimize opportunities for increased physical activity, healthy sleep patterns, and lowered obesity prevalence. Ultimately, mobile health technology is growing in popularity, and tremendous amounts of data are gathered each day. The NHS3 Mobile Health Substudy provides viable next steps to advance methods to find meaningful signals in the noise of streaming, semi-continuous health data. These approaches hold great promise for advancing epidemiologic research on contextual factors that promote healthy behaviors and reduce cancer risk.

No potential conflicts of interest were disclosed.

Conception and design: J.E. Hart, F. Laden, J.E. Chavarro, P. James

Development of methodology: P. James

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): C. Choirat, J.W. Thompson, F. Laden, J.E. Chavarro, P. James

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): R. Fore, C. Choirat, J.W. Thompson, P. James

Writing, review, and/or revision of the manuscript: R. Fore, J.E. Hart, C. Choirat, F. Laden, J.E. Chavarro, P. James

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): R. Fore, J.W. Thompson, K. Lynch, P. James

Study supervision: J.E. Chavarro, P. James

This research was supported by grant R00CA201542 (to P. James) from NCI and U01HL145386 (to J.E. Chavarro) from NHLBI. The authors acknowledge Overlap Health, Inc., including David Haddad and Emerson Farrugia, for their support in developing the NHS3 smartphone application, and for managing the platform for this study.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Lee IM, Shiroma EJ, Lobelo F, Puska P, Blair SN, Katzmarzyk PT. Effect of physical inactivity on major non-communicable diseases worldwide: an analysis of burden of disease and life expectancy. Lancet 2012;380:219–29.
2. Adult participation in aerobic and muscle-strengthening physical activities–United States, 2011. MMWR Morb Mortal Wkly Rep 2013;62:326–30.
3. Institute of Medicine. Sleep disorders and sleep deprivation: an unmet public health problem. Washington (DC): National Academies Press; 2006.
4. Unhealthy sleep-related behaviors–12 States, 2009. MMWR Morb Mortal Wkly Rep 2011;60:233–8.
5. Hansen J. Increased breast cancer risk among women who work predominantly at night. Epidemiology 2001;12:74–7.
6. Davis S, Mirick DK, Stevens RG. Night shift work, light at night, and risk of breast cancer. J Natl Cancer Inst 2001;93:1557–62.
7. Schernhammer ES, Kroenke CH, Laden F, Hankinson SE. Night work and risk of breast cancer. Epidemiology 2006;17:108–11.
8. Schernhammer ES, Laden F, Speizer FE, Willett WC, Hunter DJ, Kawachi I, et al. Rotating night shifts and risk of breast cancer in women participating in the Nurses' Health Study. J Natl Cancer Inst 2001;93:1563–8.
9. Wang F, Yeung KL, Chan WC, Kwok CC, Leung SL, Wu C, et al. A meta-analysis on dose-response relationship between night shift work and the risk of breast cancer. Ann Oncol 2013;24:2724–32.
10. Lie JA, Kjuus H, Zienolddiny S, Haugen A, Kjaerheim K. Breast cancer among nurses: is the intensity of night work related to hormone receptor status? Am J Epidemiol 2013;178:110–7.
11. He C, Anand ST, Ebell MH, Vena JE, Robb SW. Circadian disrupting exposures and breast cancer risk: a meta-analysis. Int Arch Occup Environ Health 2014;88:533–47.
12. Schernhammer ES, Laden F, Speizer FE, Willett WC, Hunter DJ, Kawachi I, et al. Night-shift work and risk of colorectal cancer in the nurses' health study. J Natl Cancer Inst 2003;95:825–8.
13. Papantoniou K, Kogevinas M, Martin Sanchez V, Moreno V, Pollan M, Moleon JJ, et al. 0058 Colorectal cancer risk and shift work in a population-based case-control study in Spain (MCC-Spain). Occup Environ Med 2014;71:A5–6.
14. Kubo T, Ozasa K, Mikami K, Wakai K, Fujino Y, Watanabe Y, et al. Prospective cohort study of the risk of prostate cancer among rotating-shift workers: findings from the Japan collaborative cohort study. Am J Epidemiol 2006;164:549–55.
15. Gapstur SM, Diver WR, Stevens VL, Carter BD, Teras LR, Jacobs EJ. Work schedule, sleep duration, insomnia, and risk of fatal prostate cancer. Am J Prev Med 2014;46(3 Suppl 1):S26–33.
16. Sigurdardottir LG, Valdimarsdottir UA, Fall K, Rider JR, Lockley SW, Schernhammer E, et al. Circadian disruption, sleep loss, and prostate cancer risk: a systematic review of epidemiologic studies. Cancer Epidemiol Biomarkers Prev 2012;21:1002–11.
17. Viswanathan AN, Hankinson SE, Schernhammer ES. Night shift work and the risk of endometrial cancer. Cancer Res 2007;67:10618–22.
18. Patel SR, Hu FB. Short sleep duration and weight gain: a systematic review. Obesity 2008;16:643–53.
19. Patel SR, Malhotra A, White DP, Gottlieb DJ, Hu FB. Association between reduced sleep and weight gain in women. Am J Epidemiol 2006;164:947–54.
20. Calle EE, Kaaks R. Overweight, obesity and cancer: epidemiological evidence and proposed mechanisms. Nat Rev Cancer 2004;4:579–91.
21. Calle EE, Thun MJ. Obesity and cancer. Oncogene 2004;23:6365–78.
22. Arnold M, Pandeya N, Byrnes G, Renehan AG, Stevens GA, Ezzati M, et al. Global burden of cancer attributable to high body-mass index in 2012: a population-based study. Lancet Oncol 2015;16:36–46.
23. Krieger N. Embodiment: a conceptual glossary for epidemiology. J Epidemiol Community Health 2005;59:350–5.
24. Perchoux C, Chaix B, Cummins S, Kestens Y. Conceptualization and measurement of environmental exposure in epidemiology: accounting for activity space related to daily mobility. Health Place 2013;21:86–93.
25. Chaix B. Geographic life environments and coronary heart disease: a literature review, theoretical contributions, methodological updates, and a research agenda. Annu Rev Public Health 2009;30:81–105.
26. James P, Troped PJ, Hart JE, Joshu CE, Colditz GA, Brownson RC, et al. Urban sprawl, physical activity, and body mass index: Nurses' Health Study and Nurses' Health Study II. Am J Public Health 2013;103:369–75.
27. Lee IM, Ewing R, Sesso HD. The built environment and physical activity levels: the Harvard alumni health study. Am J Prev Med 2009;37:293–8.
28. Hurley S, Goldberg D, Bernstein L, Reynolds P. Sleep duration and cancer risk in women. Cancer Causes Control 2015;26:1037–45.
29. Girschik J, Heyworth J, Fritschi L. Self-reported sleep duration, sleep quality, and breast cancer risk in a population-based case-control study. Am J Epidemiol 2013;177:316–27.
30. Wu Y, Zhang D, Kang S. Physical activity and risk of breast cancer: a meta-analysis of prospective studies. Breast Cancer Res Treat 2013;137:869–82.
31. James P, Kioumourtzoglou MA, Hart JE, Banay RF, Kloog I, Laden F. Interrelationships between walkability, air pollution, greenness, and body mass index. Epidemiology 2017;28:780–8.
32. James P, Hart JE, Hipp JA, Mitchell JA, Kerr J, Hurvitz PM, et al. GPS-based exposure to greenness and walkability and accelerometry-based physical activity. Cancer Epidemiol Biomarkers Prev 2017;26:525–32.
33. James P, Hart JE, Banay RF, Laden F. Exposure to greenness and mortality in a nationwide prospective cohort study of women. Environ Health Perspect 2016;124:1344–52.
34. Rudolph KE, Shev A, Paksarian D, Merikangas KR, Mennitt DJ, James P, et al. Environmental noise and sleep and mental health outcomes in a nationally representative sample of urban US adolescents. Environ Epidemiol 2019;3:e056.
35. VoPham T, Bertrand KA, DuPre NC, James P, Vieira VM, Tamimi RM, et al. Ultraviolet radiation exposure and breast cancer risk in the Nurses' Health Study II. Environ Epidemiol 2019;3:pii: e057.
36. VoPham T, Hart JE, Bertrand KA, Sun Z, Tamimi RM, Laden F. Spatiotemporal exposure modeling of ambient erythemal ultraviolet radiation. Environ Health 2016;15:111.
37. Lee JM, Kim Y, Welk GJ. Validity of consumer-based physical activity monitors. Med Sci Sports Exerc 2014;46:1840–8.
38. Haghayegh S, Khoshnevis S, Smolensky MH, Diller KR. Accuracy of purepulse photoplethysmography technology of fitbit charge 2 for assessment of heart rate during sleep. Chronobiol Int 2019;36:927–33.
39. Nelson BW, Allen NB. Accuracy of consumer wearable heart rate measurement during an ecologically valid 24-hour period: intraindividual validation study. JMIR Mhealth Uhealth 2019;7:e10828.
40. Gorny AW, Liew SJ, Tan CS, Muller-Riemenschneider F. Fitbit charge HR wireless heart rate monitor: validation study conducted under free-living conditions. JMIR Mhealth Uhealth 2017;5:e157.
41. Benedetto S, Caldato C, Bazzan E, Greenwood DC, Pensabene V, Actis P. Assessment of the Fitbit charge 2 for monitoring heart rate. PLoS One 2018;13:e0192691.
42. Bai Y, Hibbing P, Mantis C, Welk GJ. Comparative evaluation of heart rate-based monitors: Apple watch vs. Fitbit charge HR. J Sports Sci 2018;36:1734–41.
43. Vooijs M, Alpay LL, Snoeck-Stroband JB, Beerthuizen T, Siemonsma PC, Abbink JJ, et al. Validity and usability of low-cost accelerometers for internet-based self-monitoring of physical activity in patients with chronic obstructive pulmonary disease. Interact J Med Res 2014;3:e14.
44. Adam Noah J, Spierer DK, Gu J, Bronner S. Comparison of steps and energy expenditure assessment in adults of Fitbit tracker and ultra to the actical and indirect calorimetry. J Med Eng Technol 2013;37:456–62.
45. Diaz KM, Krupka DJ, Chang MJ, Peacock J, Ma Y, Goldsmith J, et al. Fitbit(R): An accurate and reliable device for wireless physical activity tracking. Int J Cardiol 2015;185:138–40.
46. Nelson MB, Kaminsky LA, Dickin DC, Montoye AH. Validity of consumer-based physical activity monitors for specific activity types. Med Sci Sports Exerc 2016;48:1619–28.
47. Evenson KR, Goto MM, Furberg RD. Systematic review of the validity and reliability of consumer-wearable activity trackers. Int J Behav Nutr Phys Act 2015;12:159.
48. Feehan LM, Geldman J, Sayre EC, Park C, Ezzat AM, Yoo JY, et al. Accuracy of fitbit devices: systematic review and narrative syntheses of quantitative data. JMIR Mhealth Uhealth 2018;6:e10527.
49. Mantua J, Gravel N, Spencer RM. Reliability of sleep measures from four personal health monitoring devices compared to research-based actigraphy and polysomnography. Sensors (Basel) 2016;16. pii: E646.
50. Montgomery-Downs HE, Insana SP, Bond JA. Movement toward a novel activity monitoring device. Sleep Breath 2012;16:913–7.
51. de Zambotti M, Baker FC, Willoughby AR, Godino JG, Wing D, Patrick K, et al. Measures of sleep and cardiac functioning during sleep using a multi-sensory commercially-available wristband in adolescents. Physiol Behav 2016;158:143–9.
52. Lee IM, Shiroma EJ. Using accelerometers to measure physical activity in large-scale epidemiological studies: issues and challenges. Br J Sports Med 2014;48:197–201.
53. Bao Y, Bertoia ML, Lenart EB, Stampfer MJ, Willett WC, Speizer FE, et al. Origin, methods, and evolution of the three nurses' health studies. Am J Public Health 2016;106:1573–81.
54. Wolf AM, Hunter DJ, Colditz GA, Manson JE, Stampfer MJ, Corsano KA, et al. Reproducibility and validity of a self-administered physical activity questionnaire. Int J Epidemiol 1994;23:991–9.
55. Marshall AL, Miller YD, Burton NW, Brown WJ. Measuring total and domain-specific sitting: a study of reliability and validity. Med Sci Sports Exerc 2010;42:1094–102.
56. Spiegelman D, Carroll RJ, Kipnis V. Efficient regression calibration for logistic regression in main study/internal validation study designs with an imperfect reference instrument. Stat Med 2001;20:139–60.
57. Zandbergen PA, Barbeau SJ. Positional accuracy of assisted GPS data from high-sensitivity GPS-enabled mobile phones. J Navig 2011;64:381–99.
58. Zandbergen PA. Accuracy of iPhone locations: a comparison of assisted GPS, WiFi and cellular positioning. Transactions in GIS 2009;13:5–25.
59. Seto E, Yan P, Kuryloski P, Bajcsy R, Abresch T, Henricson E, et al. Mobile phones as personal environmental sensing platforms: development of the CalFit system. The 23rd Annual Conference of the International Society of Environmental Epidemiology; 2011 Sep 13-16; Barcelona, Spain.
60. Barnett I, Onnela JP. Inferring mobility measures from GPS traces with missing data. Biostatistics 2018;kxy059.
61. Stahl SE, An HS, Dinkel DM, Noble JM, Lee JM. How accurate are the wrist-based heart rate monitors during walking and running activities? Are they accurate enough? BMJ Open Sport Exerc Med 2016;2:e000106.
62. Mitchell JA, Godbole S, Moran K, Murray K, James P, Laden F, et al. No evidence of reciprocal associations between daily sleep and physical activity. Med Sci Sports Exerc 2016;48:1950–6.
63. Jankowski KS. Actual versus preferred sleep times as a proxy of biological time for social jet lag. Chronobiol Int 2017;34:1175–6.
64. Jankowski KS. Social jet lag: sleep-corrected formula. Chronobiol Int 2017;34:531–5.
65. Torous J, Staples P, Onnela JP. Realizing the potential of mobile mental health: new methods for new data in psychiatry. Curr Psychiatry Rep 2015;17:602.
66. Onnela JP, Rauch SL. Harnessing smartphone-based digital phenotyping to enhance behavioral and mental health. Neuropsychopharmacology 2016;41:1691–6.
67. Dunton GF. Sustaining health-protective behaviors such as physical activity and healthy eating. JAMA 2018;320:639–40.
68. Dunton GF. Ecological momentary assessment in physical activity research. Exerc Sport Sci Rev 2017;45:48–54.
69. Liao Y, Intille SS, Dunton GF. Using ecological momentary assessment to understand where and with whom adults' physical and sedentary activity occur. Int J Behav Med 2015;22:51–61.