Abstract
Serum miRNAs are potential biomarkers for ovarian cancer; however, many factors may influence miRNA expression. To understand potential confounders in miRNA analysis, we examined how sociodemographic factors and comorbidities, including known ovarian cancer risk factors, influence serum miRNA levels in women without ovarian cancer.
Data from 1,576 women from the Mass General Brigham Biobank collected between 2012 and 2019, excluding subjects previously or subsequently diagnosed with ovarian cancer, were examined. Using a focused panel of 179 miRNA probes optimized for serum profiling, miRNA expression was measured by flow cytometry using the Abcam FirePlex assay and correlated with subjects’ electronic medical records.
The study population broadly reflected the New England population. The median age of subjects was 49 years, 34% were current or prior smokers, 33% were obese (body mass index ≥ 30 kg/m2), 49% were postmenopausal, and 11% had undergone prior bilateral oophorectomy. Significant differences in miRNA expression were observed among ovarian cancer risk factors such as age, obesity, menopause, BRCA1 or BRCA2 germline mutations, and a family history of breast cancer. Additionally, miRNA expression was significantly altered by prior bilateral oophorectomy, hypertension, and hypercholesterolemia. Other variables, such as smoking; parity; age at menarche; hormonal replacement therapy; oral contraception; breast, endometrial, or colon cancer; and diabetes, were not associated with significant changes in the panel when corrected for multiple testing.
Serum miRNA expression patterns are significantly affected by patient demographics, exposure history, and medical comorbidities.
Understanding confounders in serum miRNA expression is important for refining clinical assays for cancer screening.
Introduction
Ovarian cancer is the fifth most common cause of cancer-related deaths and the leading cause of gynecologic cancer deaths among American women. More than 70% of ovarian cancer cases are diagnosed at an advanced stage, which carries a poor prognosis with 5-year overall survival rates of only 20% to 25% (1). Detecting ovarian cancer at earlier stages has the potential to reduce morbidity and mortality, but a noninvasive test with sufficient specificity and sensitivity for population-level screening remains elusive (2, 3). Although the glycoprotein CA125 is currently used as a biomarker to determine the response to treatment and to monitor for recurrence in ovarian cancer, its use for screening purposes is limited owing to its low sensitivity for early-stage disease and high false-positive rate (4–8). Similarly, combining CA125 with transvaginal ultrasound for early detection failed to improve ovarian cancer survival in both the US and UK clinical trials (9, 10).
Recently, miRNAs have attracted interest as novel biomarkers for ovarian cancer. miRNAs are small ncRNA molecules that modify gene expression through posttranscriptional regulation (11). Research over the last two decades has shown that miRNAs can act as oncogenes as well as tumor suppressors depending on the cancer type and that miRNA signatures are associated with diagnosis, prognosis, progression, and response to cancer treatment (11–16). Besides being found in tissue samples, miRNAs have been stably detected in various body fluids, including serum, making them particularly suitable for minimally invasive screening purposes (17, 18). Several groups have proposed a role for circulating miRNAs in the early detection of ovarian cancer (19–24). However, these case–control and cohort studies did not assess the broad variations in miRNA expression patterns across the general population.
Because serum miRNAs are influenced by a variety of patient-specific factors, a comprehensive analysis of the factors that may alter miRNA networks is essential prior to the clinical implementation of miRNA-based testing (11, 13, 25, 26). Developing a clinical assay based on miRNA expression requires consideration of a wide range of variables including ovarian cancer risk factors, sociodemographic factors, and gynecologic, reproductive, and chronic medical conditions. To understand the potential impact of these factors on serum miRNAs as an ovarian cancer screening tool, we aimed to characterize the confounding effects on miRNA expression in a large cohort drawn from the general population without ovarian cancer.
Materials and Methods
Study population
The study participants included women treated at a medical facility affiliated with Mass General Brigham, a regional network of hospitals and healthcare centers based in New England, who elected to participate in the Mass General Brigham Biobank between 2012 and 2019 (27). All study subjects provided electronic consent to donate dedicated blood samples collected during routine medical care for biomarker studies and to correlate their electronic health records with the study analyses. Participants in the current cohort were identified using medical record queries with the Research Patient Data Registry tool based on at least one documented gynecologic or obstetric encounter (28). An additional subset (n = 85) of women participating in the Mass General Brigham Biobank was added to the overall cohort, based on known BRCA1/2 germline mutations. Subjects were filtered for those with at least 500 µL of serum available in the biobank. Subjects with ovarian cancer prior to the index blood draw or those diagnosed with ovarian cancer subsequent to biobank enrollment were excluded. Clinical data were manually retrieved from electronic health records after Institutional Review Board approval (Brigham and Women’s Hospital Institutional Review Board Protocol 2018P001680) and managed using REDCap electronic data capture tools hosted at Mass General Brigham (29, 30).
Clinical data
We collected 33 different clinical variables based on sociodemographic, reproductive, and gynecologic factors with an emphasis on factors influencing ovarian cancer development as well as some chronic medical conditions with a high prevalence in the general population. The date of the first study blood draw was used as a reference date (i.e., chronic conditions or demographic factors were assumed to be present at the time of blood collection). Any type of cancer, family history of cancer, genetics, and COVID-19 infections were recorded when diagnosed, even if diagnosed after the study blood draw. Data were censored on December 31, 2020. Sociodemographic factors included age at the time of blood draw, ethnicity, race, body mass index (BMI, kg/m2; obesity defined as BMI ≥ 30 kg/m2), hypertension, hypercholesterolemia, diabetes mellitus, and smoking status (including current and former smokers). With regard to reproductive and gynecologic history, we recorded parity, age at menarche (≤12 or >13 years), use of hormonal replacement therapy (HRT, including former and current use of oral or transdermal therapy), use of oral contraceptives, tubal ligation, high-risk human papillomavirus (HPV) positivity on cervical swab, endometriosis, bilateral salpingo-oophorectomy (BSO), hysterectomy, and menopausal status. Women with prior hysterectomy or whose menopausal status was not explicitly mentioned in the health records were assumed to be postmenopausal if they were aged ≥50 years. Additional data were collected for a more detailed medical history focusing on malignancies, genetics, and family history. Additional clinical data can be found in the supplementary data (Supplementary Table S1).
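The two derived variables defined above (obesity and assumed menopausal status) can be expressed as simple rules. The following is a minimal sketch, not the study's own code: the function and parameter names are hypothetical, and it assumes that women under 50 years to whom the age rule applies were treated as premenopausal, which the text implies but does not state.

```python
from typing import Optional


def is_obese(bmi_kg_m2: float) -> bool:
    """Obesity as defined in the text: BMI >= 30 kg/m2."""
    return bmi_kg_m2 >= 30.0


def is_postmenopausal(age_years: float,
                      recorded_status: Optional[bool],
                      prior_hysterectomy: bool) -> bool:
    """Return True if the subject is treated as postmenopausal.

    recorded_status is True/False when the health record states the
    menopausal status explicitly, and None when it does not.
    Following the text, the age rule (>= 50 years) is applied to women
    with a prior hysterectomy or without an explicit record.
    Assumption: under the age rule, women younger than 50 are treated
    as premenopausal.
    """
    if recorded_status is not None and not prior_hysterectomy:
        return recorded_status
    return age_years >= 50
```

For example, a 55-year-old subject with no recorded menopausal status would be classified as postmenopausal under this rule, whereas a 45-year-old subject with no recorded status would not.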
miRNA profiling
Serum samples were stored at −80°C until use, brought to room temperature (RT), and gently vortexed prior to analysis. All samples were analyzed using a custom panel of 179 miRNA probes produced by Abcam, Inc. using FirePlex particle technology (Abcam, Inc.; ref. 31). The probe list is presented in Supplementary Table S2. The FirePlex assay works directly on crude biological fluids without RNA purification and can accommodate 68 miRNA probes per well in a 96-well assay plate. The hydrogel particle carries a combination of fluorophores at the two ends for particle recognition by flow cytometry and probes for miRNA capture and adapter ligation. Processing includes a PCR step after the initial capture for signal amplification, after which nucleic acids are recaptured and analyzed. The sensitivity of the assay is similar to RT-PCR (32).
The 179 miRNA probes were distributed across three panels for each sample and were processed on a single plate. A reference panel of miRNA probes provided by Abcam was included in all plates. The experimental probe list of 179 miRNAs was determined by examining miRNA expression among all known human miRNA species using the FirePlex assay with a pilot cohort of 100 samples selected from a previous study that included healthy controls, benign adnexal masses, and ovarian cancer cases (24). Only miRNAs detectable in at least 50% of the samples above the limit of detection and with coefficients of variation of less than 20% were included. The probe list for each well also included a positive control probe for a miRNA-like target, X-control, which was present in the FirePlex Hybridization Buffer. This control indicated that the assay was successfully performed in each well. Blank particles bearing no probe were included in each well to measure the level of background fluorescence in every well, as well as two off-species controls, to test for nonspecific signals. The miRNA probe list for each well also included seven overlapping miRNAs (hsa-let-7a-5p, hsa-let-7d-5p, hsa-mir-17-5p, hsa-mir-20b-5p, hsa-mir-93-5p, hsa-mir-16-5p, and hsa-mir-122-5p) across all three panels for signal normalization. Each complete miRNA profile required 25 µL of human serum. The samples were processed 29 at a time using a liquid handling robot (STARlet, Hamilton Robotics). Each plate also included a pooled human serum control to serve as an interplate calibrator.
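The probe inclusion rule described above (detectable above the limit of detection in at least 50% of pilot samples, with a coefficient of variation below 20%) can be sketched as follows. This is an illustration only: the function names are hypothetical, and the text does not say which measurements the coefficient of variation was computed over, so the sketch simply takes them as a separate list supplied by the caller.

```python
from statistics import mean, stdev


def detection_rate(signals, limit_of_detection):
    """Fraction of samples whose signal exceeds the limit of detection."""
    return sum(s > limit_of_detection for s in signals) / len(signals)


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


def keep_probe(signals, cv_values, limit_of_detection,
               min_detection=0.5, max_cv=0.20):
    """Apply both inclusion criteria to one candidate probe.

    signals: one signal per pilot sample (used for the detection rate).
    cv_values: measurements over which the CV is computed (assumption:
    the source does not specify which measurements these were).
    """
    detected_often_enough = detection_rate(signals, limit_of_detection) >= min_detection
    stable_enough = coefficient_of_variation(cv_values) < max_cv
    return detected_often_enough and stable_enough
```

Keeping the two criteria in separate helper functions makes it easy to report which criterion excluded a given probe.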
Samples were prepared and analyzed according to the FirePlex miRNA assay protocol. First, samples were digested at 60°C for 45 minutes at 100 revolutions per minute (rpm). Three custom panels and one reference panel were added to the sample filter plates. Each panel was vortexed for 10 seconds before being added to the plates. Twenty µL of the X-control spike-in circulating reference RNA was added after vortexing for 10 seconds. The samples were incubated at 37°C for 60 minutes at 100 rpm. After hybridization, a labeling mix was prepared and added to each well. The plates were then incubated at RT for 60 minutes while shaking at 750 rpm. After labeling, the plates were prepared for elution and incubated for 30 minutes at 62.5°C at 100 rpm. The eluant was captured from all the wells in catch plates, and then PCR reagents were vortexed and centrifuged at 12,000 rpm for 10 seconds each, before pooling and preparation of the PCR master mix. The PCR master mix was then added to each sample. Then the samples were subjected to PCR according to the FirePlex protocol. After PCR, all samples were prepared for rehybridization at 37°C for 30 minutes at 100 rpm. Post-rehybridization, the reporter mix was prepared and added to all wells. The samples were then transferred to a post-PCR section in the laboratory for scanning.
The samples were scanned using a Guava easyCyte 5HT flow cytometer (Luminex). Data were recorded in .fcs files. The FirePlex Analysis Workbench software platform was used to decode miRNA data from .fcs files, and the results were extracted into .csv files. The .csv files were then used for analysis. Any samples failing to meet quality control as determined by minimal signal using the X-control or exceeding maximal background levels using the blank or off-species probes were run again and excluded if they failed on more than three attempts.
Statistical analysis
For the univariate analyses, we used Pearson correlations to investigate the linear relationships between the continuous metadata variables (e.g., age and parity) and miRNA levels, and two-sample t tests for categorical metadata variables (e.g., menopausal status and BRCA status). The correlations and corresponding P values were calculated using the MATLAB (MathWorks) function “corr.” The t test P values were calculated using the MATLAB function “ttest2.” The sample variances were not assumed to be equal. For the 179 miRNAs considered in our panel, we found that the log2 expression values were distributed following an approximate bell curve (see Supplementary Fig. S1). Thus, a t test was deemed appropriate for assessing statistical significance. A Bonferroni correction was applied for multiple testing. In addition to the univariate correlation and t test analyses, we used the t-distributed stochastic neighbor embedding (tSNE) algorithm to visualize the miRNA and metadata in two dimensions (33). tSNE places observations according to their similarity or “closeness” in a high-dimensional space. Thus, neighboring observations in the tSNE space share a similar metadata profile (e.g., similar age and smoking status). We applied k-means clustering (34) to the reduced-dimension metadata and labeled the miRNA data using fitted cluster memberships. We did this to visually identify, in a global sense, whether patients with similar metadata profiles also shared similar miRNA profiles.
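The univariate statistics described above can be sketched in a few lines. This is a minimal Python illustration rather than the MATLAB code used in the study; the function names are hypothetical. It computes the Pearson correlation coefficient, the unequal-variance (Welch) t statistic with its Welch-Satterthwaite degrees of freedom, and Bonferroni-adjusted P values. The sketch assumes the Bonferroni factor equals the number of P values passed in; the text does not specify the family of tests over which the correction was applied.

```python
import math
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def welch_t(group_a, group_b):
    """Two-sample t statistic without assuming equal variances.

    Returns (t, df), where df is the Welch-Satterthwaite approximation.
    """
    va = variance(group_a) / len(group_a)
    vb = variance(group_b) / len(group_b)
    t = (mean(group_a) - mean(group_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(group_a) - 1) + vb ** 2 / (len(group_b) - 1))
    return t, df


def bonferroni(p_values):
    """Multiply each P value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

Converting the t statistic and degrees of freedom into a P value requires the Student t distribution, which the Python standard library does not provide; a statistics package would be needed for that step.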
Data availability
Raw data are available in supplementary materials and methods.
Results
From an initial cohort of 1,613 potential study participants, 1,576 women were included in our study after excluding women diagnosed with ovarian cancer. The median age of the study population was 49 years, with an age range of 21 to 87 years. Among the study participants, 34% were current or former smokers, 33% were obese (BMI ≥ 30 kg/m2), 49% were postmenopausal, and 11% had undergone prior bilateral oophorectomy. The study population consisted of 78% White and 19% non-White subjects, with 80% non-Hispanic and 13% Hispanic subjects (Table 1). Additional information about patient characteristics and comorbidities can be found in the supplementary data (Supplementary Table S1).
Subject characteristics.
. | All participants n = 1,576 n (%) . |
---|---|
Age (years) | |
Mean (SD) | 49.36 (15.22) |
<30 | 178 (11) |
30–39 | 354 (22) |
40–49 | 285 (18) |
50–59 | 330 (21) |
60–69 | 261 (17) |
70–79 | 147 (9) |
>80 | 21 (1) |
Race | |
White | 1,225 (78) |
Black or African American | 95 (6) |
American Indian or Alaska Native | 2 (0) |
Asian | 42 (3) |
Native Hawaiian or other Pacific Islander | 2 (0) |
Other | 158 (10) |
Unknown/unavailable | 35 (2) |
Declined | 17 (1) |
Ethnicity | |
Hispanic or Latino | 210 (13) |
Not Hispanic or Latino | 1,267 (80) |
N/A | 99 (6) |
Postmenopausal | |
No | 811 (51) |
Yes | 765 (49) |
For our analysis, we focused on 33 clinical variables (Table 2) and their impact on the serum levels of 179 miRNAs (list of all miRNAs in Supplementary Table S2). Using unsupervised machine learning techniques, the clustering of metadata (33 variables) in a tSNE plot revealed six groups (Fig. 1A). Subjects in each cluster share a similar metadata profile which is demonstrated in detail in Supplementary Table S3. Using the same color-coded labeling from the groups in Fig. 1A, we did not observe a clustering effect in the reduced-dimension miRNA data (Fig. 1B), but in the univariate analyses, miRNA profiles were shown to vary significantly based on several metadata variables (Fig. 2A–D; Table 3; Supplementary Fig. S2). We present the correlation of each miRNA with age (Fig. 2A) and parity (Fig. 2B) as examples, with specific miRNAs highlighted in each figure, whereas the heatmaps in Fig. 2C and D summarize all of the metadata variables and miRNAs.
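The label-transfer step described above (cluster the two-dimensional metadata embedding, then color each subject's point in the miRNA embedding by that cluster) can be illustrated with a small k-means routine. This is a sketch under stated assumptions, not the study's implementation: the function names and subject identifiers are hypothetical, the centroids are initialized from the first k points for reproducibility, and the tSNE embedding itself is taken as given.

```python
import math


def kmeans(points, k, iterations=50):
    """Lloyd's algorithm on a list of coordinate tuples; returns one label per point.

    Assumption: centroids start at the first k points (deterministic),
    which is a simplification of the usual random initialization.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels


def transfer_labels(subject_ids, metadata_embedding, k):
    """Cluster the metadata embedding and map each subject to its cluster."""
    return dict(zip(subject_ids, kmeans(metadata_embedding, k)))
```

Because the labels are keyed by subject, the same mapping can be used to color the points of the miRNA embedding; if subjects with similar metadata also had similar miRNA profiles, points of the same color would group together in that second plot.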
List of 33 clinical variables included for the univariate analyses.
Index . | Metavariable . |
---|---|
1 | Age |
2 | Obesity (BMI ≥ 30 kg/m2) |
3 | Smoking |
4 | Parity |
5 | Age at menarche |
6 | Use of HRT |
7 | Menopause |
8 | Use of oral contraceptives |
9 | Tubal ligation |
10 | Endometriosis |
11 | Bilateral oophorectomy |
12 | High-risk HPV positivity |
13 | Ovarian cancer in family (FDR) |
14 | Breast cancer in family (FDR) |
15 | Endometrial cancer in family (FDR) |
16 | Colon cancer in family (FDR) |
17 | Ovarian cancer in family (SDR) |
18 | Breast cancer in family (SDR) |
19 | Endometrial cancer in family (SDR) |
20 | Colon cancer in family (SDR) |
21 | BRCA1 mutation in family (FDR + SDR) |
22 | BRCA2 mutation in family (FDR + SDR) |
23 | Lynch syndrome in family (FDR + SDR) |
24 | BRCA1/2 mutation in family (FDR + SDR) |
25 | BRCA1 mutation |
26 | BRCA2 mutation |
27 | Lynch syndrome |
28 | Breast cancer |
29 | Endometrial cancer |
30 | Colon cancer |
31 | Hypertension |
32 | Hypercholesterolemia/atherosclerosis |
33 | Diabetes mellitus |
Abbreviations: FDR, first-degree relative; SDR, second-degree relative.
tSNE plots of miRNA and clinical data. tSNE scatter plot of (A) 33 clinical variables and of (B) 179 miRNAs. Categories 0–5 refer to sample clusters in A and are designated by color. The clusters were determined using k-means clustering on the tSNE plot in A. Supplementary Table S3 summarizes the patient characteristics in each cluster.
Correlation and univariate analysis of clinical variables with 179 miRNAs. A and B, Correlation plots for the continuous variables age and parity. The indicated miRNAs are those whose levels were significantly associated with age (compare Table 3). C, Plot of log2 fold-change (FC) of 33 clinical variables with 179 miRNAs. FC for age (≤50 vs. >50 years) and parity (0 vs. >0 births) was calculated from categorical data here. D, Univariate analysis of the same 33 clinical variables and 179 miRNAs using t tests for categorical data and Pearson correlation for the continuous variables age and parity. Significant results after Bonferroni correction are presented in yellow (P < 0.05). The miRNA variable index in the heatmaps (C and D) refers to the panel of 179 miRNAs analyzed using the FirePlex assay. Please see Supplementary Table S2 for a list of miRNA variable indices and miRNAs. The arrows highlight the significant miRNAs as listed in Table 3. A univariate analysis plot with raw P values is shown in Supplementary Fig. S2. BSO, bilateral salpingo-oophorectomy; FDR, first-degree relative; SDR, second-degree relative.
miRNAs significantly associated with clinical variables.
Variables with significant miRNA serum level changes . | Number of significant miRNAs . | Top miRNAᵃ . | Adjusted P values . | Log FCᵇ . |
---|---|---|---|---|
Ageᶜ | 34 | miR-205-5p, miR-574-3p, miR-199a-3p, miR-342-3p, miR-200c-3p | <0.001, <0.001, <0.001, <0.001, <0.001 | 0.95ᵈ, 0.97ᵈ, 0.97ᵈ, 0.98ᵈ, 0.96ᵈ |
Menopauseᵉ | 26 | miR-205-5p, miR-574-3p, miR-200c-3p, miR-191-5p, miR-199a-5p | <0.001, <0.001, <0.001, <0.001, <0.001 | 0.95, 0.97, 0.95, 0.98, 0.97 |
Obesityᵉ | 4 | miR-193a-5p, miR-483-5p, miR-122-5p, miR-487b-3p | <0.001, <0.001, <0.001, 0.001 | 1.03, 1.04, 1.03, 0.96 |
Bilateral oophorectomyᵉ | 44 | miR-425-5p, miR-185-3p, miR-32-5p, miR-18b-5p, miR-143-3p | <0.001, <0.001, <0.001, <0.001, <0.001 | 0.95, 0.91, 0.87, 0.92, 0.93 |
Bilateral oophorectomyᵉ (in the non-BRCA cohort) | 5 | miR-32-5p, miR-93-5p, miR-17-5p, miR-18b-5p, miR-425-5p | 0.019, 0.021, 0.023, 0.031, 0.037 | 0.019, 0.98, 0.98, 0.94, 0.97 |
BRCA1 mutationᵉ | 10 | miR-1237-3p, miR-185-3p, miR-211-5p, miR-133a-3p, miR-503-5p | <0.001, <0.001, 0.001, 0.001, 0.001 | 0.86, 0.80, 0.88, 0.82, 0.86 |
BRCA2 mutationᵉ | 48 | miR-1237-3p, miR-133a-3p, miR-185-3p, miR-151b, miR-143-3p | <0.001, <0.001, <0.001, <0.001, <0.001 | 0.84, 0.80, 0.78, 0.84, 0.89 |
Breast cancer in family (FDR)ᵉ | 2 | miR-185-3p, miR-1237-3p | 0.015, 0.030 | 0.95, 0.96 |
Hypercholesterolemiaᵉ | 17 | miR-574-3p, miR-766-3p, miR-205-5p, miR-199a-3p, miR-191-5p | <0.001, <0.001, <0.001, <0.001, <0.001 | 0.97, 0.95, 0.95, 0.97, 0.98 |
Hypertensionᵉ | 39 | miR-199a-3p, miR-574-3p, miR-191-5p, miR-205-5p, miR-17-5p | <0.001, <0.001, <0.001, <0.001, <0.001 | 0.96, 0.97, 0.98, 0.95, 0.99 |
ᵃ Top five miRNAs with the smallest adjusted P value.
ᵇ FC represents the ratio of mean miRNA levels in affected subjects divided by unaffected subjects (e.g., obese women/nonobese women), calculated on a log2 scale.
ᶜ Pearson correlation.
ᵈ FC for ratio ≤50 vs. >50 years.
ᵉ Student t test.
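The fold-change definition in the table footnote can be made concrete with a short sketch. This is one possible reading, not a confirmed description of the study's calculation: it assumes that "calculated on a log2 scale" means the group means are taken over log2-transformed expression values before the ratio is formed, which would be consistent with the tabulated values lying close to 1. The function name is hypothetical.

```python
import math
from statistics import mean


def log_scale_fold_change(affected, unaffected):
    """Ratio of mean log2 expression: affected group over unaffected group.

    Assumption: inputs are raw (positive) expression values that are
    log2-transformed here before averaging.
    """
    mean_affected = mean(math.log2(v) for v in affected)
    mean_unaffected = mean(math.log2(v) for v in unaffected)
    return mean_affected / mean_unaffected
```

Under this reading, a value below 1 indicates lower mean log2 expression in the affected group, and a value above 1 indicates higher mean log2 expression.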
Among the 33 clinical variables selected for the study, 9 showed a significant relationship with at least 1 of the 179 circulating miRNAs (Fig. 2D). The top miRNAs (i.e., the five with the smallest P values) of all variables with significant changes in the miRNA profiles are presented in Table 3. Our univariate analyses showed significant changes in the miRNA serum levels for the ovarian cancer risk factors of advancing age, obesity, menopause, BRCA1 or 2 mutations, and a family history of breast cancer [first-degree relative (FDR)], although the fold changes (FC) were small (Fig. 2C and D; Table 3). As indicated by a log FC below 1 in Table 3, the miRNA expression was lower for most of the miRNAs in all variables except obesity. Many miRNAs (34 and 26, respectively) showed changes with advancing age and in postmenopausal women compared with premenopausal women. For example, both variables were associated with lower serum levels of miR-205-5p and miR-574-3p (Table 3; Fig. 3A and B). Significant variations in 44 miRNAs were identified when comparing ovarian status (intact ovaries vs. bilateral oophorectomy; Table 3; Fig. 3F). Because a high proportion of BRCA1 or 2 mutation–positive subjects had undergone BSO, we stratified subjects according to BRCA mutational status. Five miRNAs varied significantly according to ovarian status in the non-BRCA–mutated group (miR-32-5p, miR-93-5p, miR-17-5p, miR-18b-5p, and miR-425-5p, all adjusted P < 0.05; n = 1,491; Fig. 3G), but no changes were associated with oophorectomy specifically in the BRCA-positive cohort. A personal BRCA1 or 2 mutation was associated with lower levels of several circulating miRNAs, most notably miR-1237-3p (all adjusted P < 0.001; Table 3; Fig. 3D and E). A family history of breast cancer in an FDR was associated with a significant change in the levels of miR-185-3p and miR-1237-3p (adjusted P < 0.05; Table 3).
The comorbidities obesity, hypertension, and hypercholesterolemia had a significant influence on the expression of miRNAs with up to 39 significantly altered miRNAs (Table 3; Fig. 3C). Other variables, such as smoking, parity, age at menarche, HRT, oral contraception, HPV positivity, endometriosis, tubal ligation, and diabetes, did not significantly influence any of the 179 miRNAs analyzed when corrected for multiple testing. Furthermore, a personal or family history of non–ovarian cancer, such as endometrial or colon cancer, did not affect the serum miRNA profiles (Fig. 2C).
Significant miRNA changes for age, menopause, BRCA1/2 mutations, and ovarian status. (A) Correlation of miR-205-5p with advancing age. B–G, Significant changes in miRNA levels for different clinical variables, (B) menopausal status, (C) obesity, (D and E) BRCA1 or 2 mutations, (F) ovarian status, and (G) ovarian status in the non-BRCA cohort (all adjusted P < 0.05). In the box plots, the boxes represent the interquartile range (IQR), the red line represents the median, the whiskers represent scores outside the middle 50% between the minimum and maximum values, and outliers are shown as red crosses.
Discussion
Circulating miRNAs have emerged as promising biomarkers for the early detection of ovarian cancer. In our previous work, we showed that using neural network analysis, a miRNA algorithm could help diagnose epithelial ovarian cancer with high specificity and thereby distinguish cases of epithelial ovarian cancer from benign ovarian tumors, borderline tumors, and healthy controls. The model also outperformed CA125 with a higher sensitivity for stage I/II ovarian cases (24). However, to advance an miRNA-based algorithm for the early detection of ovarian cancer in the clinic, knowledge about confounding factors is crucial as miRNAs have been associated with a variety of physiologic and pathophysiologic conditions that could bias the interpretation of a miRNA-based test (11, 13, 25, 26, 35). For example, in a subsequent study, we described that BRCA mutations are associated with unique miRNA profiles, even in the absence of malignancy (36).
In the current study, we identified ovarian cancer risk factors and common medical conditions that significantly influence serum miRNA profiles in a non–ovarian cancer female cohort. Among the focused panel of 179 circulating miRNAs, we observed a significant change in the serum levels of up to 48 miRNAs for 9 of 33 different clinical variables. Specifically, the ovarian cancer risk factors of advancing age, menopause, obesity, BRCA1/2 mutations in the subject, and a family history of breast cancer had a significant impact on circulating miRNA expression. Chronic diseases, such as hypertension and hypercholesterolemia, were also associated with differentially expressed miRNAs. Twenty-four additional variables that we studied, such as oral contraception, history of breast cancer, and age at menarche, seem unlikely to be confounding factors as they had no significant influence on our panel of 179 miRNAs.
Heterogeneity in the methodology of individual studies on circulating miRNAs is important when comparing our results with those of previous studies. Differences in the starting material (whole blood, plasma, and serum), use of RNA extraction steps, detection methods, endogenous controls, and assay platforms can strongly impact miRNA measurements (37). For example, Zhang and colleagues detected significant age-dependent changes in miR-142-5p in serum samples in accordance with our findings but observed changes in other miRNAs (miR-92, miR-222, miR-375, miR-29b, miR-106, and miR-130) that were unchanged in our study. Differences in the study population and methodology could further explain these discrepancies in the findings (38). However, the large size of our sample set provides a substantial and internally consistent dataset for other groups to explore.
We noted several miRNAs that seemed to be related to estrogen exposure. miRNAs (such as miR-17-5p, miR-18b-5p, and miR-93-5p) differentially expressed after menopause were significantly altered in older women and after BSO as well (Supplementary Fig. S3), thus raising the question whether these miRNAs share gene targets and target pathways and if estrogen receptors might play a role. Notably, estrogen receptor 1 (ER1) and ER2 are predicted target genes of miR-93-5p, miR-18b-5p, and miR-17-5p according to the databases TargetScanHuman 8.0 and miRPathDB 2.0 (39, 40). The potential of ER as a target gene of miR-93, miR-18b, and miR-17 is supported by discoveries from several groups on the differential expression of these miRNAs depending on the ER status in breast cancer (40–47). Interestingly, we did not observe significant changes in miR-93-5p, miR-18b-5p, and miR-17-5p expression among the breast cancer subjects in our study population, but we did not stratify for ER status in these cases as detailed pathology reports were not always available.
Our interest in miR-93-5p, miR-18b-5p, and miR-17-5p was further motivated by reports that these miRNAs have either tumor-suppressive or oncogenic features in ovarian cancer. Furthermore, the sex steroid hormone receptors ER1 and ER2 are found in ovarian tissue and influence ovarian carcinogenesis. ER1 and ER2 have opposing effects on ovarian cancer cell proliferation and apoptosis, and the ER1/ER2 ratio seems to correlate with malignant progression in the ovary (48, 49). Thus, the roles of ERs and estrogens in the miR-93-5p, miR-18b-5p, and miR-17-5p target pathways in ovarian cancer should be studied further to elucidate their potential as diagnostic or therapeutic targets.
A miRNA enrichment analysis was performed using the publicly available miRNA Enrichment Analysis and Annotation Tool from Saarland University (50). The miRNAs significantly associated with menopause and/or bilateral salpingo-oophorectomy were examined for overrepresented relationships with target proteins, organs, disease states, or signaling pathways. The top 100 categories ranked by lowest P value can be found in Supplementary Fig. S4. Interestingly, very similar profiles were noted for three miRNAs, miR-17-5p, miR-32-5p, and miR-93-5p, suggesting mechanistic convergence of these miRNAs. Among the miRNAs examined, associations were noted with induced pluripotent stem cells and with upregulation in colon cancer and dilated cardiomyopathy.
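As a rough illustration of what "overrepresented" means in this context, the sketch below computes a one-sided hypergeometric P value for a single annotation category. It is a generic over-representation test and is not claimed to be the procedure implemented by the enrichment tool used here; the function name and all counts are hypothetical, apart from the panel size of 179.

```python
from math import comb


def overrepresentation_p(k: int, n: int, K: int, N: int) -> float:
    """One-sided hypergeometric P value, P(X >= k).

    N: miRNAs in the background panel
    K: background miRNAs annotated to the category
    n: miRNAs in the list of interest (e.g., significantly altered)
    k: miRNAs in the list of interest annotated to the category
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probability of every overlap at least as large as the one seen.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts for illustration only: a 179-miRNA background,
# 20 miRNAs annotated to a category, 30 miRNAs of interest, 8 in both.
p_value = overrepresentation_p(k=8, n=30, K=20, N=179)
print(f"P(X >= 8) = {p_value:.4g}")
```

In practice, one such P value is obtained per category, and the resulting set is corrected for multiple testing before categories are ranked.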
An important limitation of our study is that we restricted our analysis to a panel of 179 miRNAs. The association of our metadata with miRNAs that were not included in our panel remains unknown and should be evaluated in future research. However, our analyses focused on miRNAs that were reliably detectable in serum [in a cohort of healthy controls, benign adnexal masses, and patients with ovarian cancer (24)], as indicated by the low number of absent miRNA signals in our dataset. Other studies have selected different miRNAs as biomarker candidates, which limits the comparability and integration of results across the miRNA biomarker research field; this issue should be addressed in the future. For example, Yokoi and colleagues, who selected their miRNAs from a dataset of ovarian cancer cell lines and ovarian cancer serum samples, used a panel that has limited overlap with ours (51). Similarly, we were limited to data present in electronic health records. Focusing on subjects with prior encounters in obstetrics or gynecology helped increase the number of subjects with relevant reproductive health information available. Finally, although our cohort contained large absolute numbers of subjects from minority groups, the relative proportions of these groups in New England are small, and these results should be compared with populations from regions with different proportions of racial and ethnic groups.
Our study has several strengths. Most prior studies have focused on the association of a handful of miRNAs with only one or a few medical conditions, whereas we comprehensively analyzed the effects of more than 30 clinically relevant variables on a panel of 179 miRNAs. Most of these variables can be easily evaluated in medical encounters and are part of the routine workup for ovarian tumors. This study provides novel information about possible confounding factors for miRNAs proposed as biomarkers for ovarian cancer. Furthermore, our study included a large sample of 1,576 women ranging in age from 21 to 87 years.
In summary, our study strengthens our understanding of the impact of clinical variables, particularly ovarian cancer risk factors, on serum miRNA profiles in women without ovarian cancer. This knowledge of possible confounders will improve the design of miRNA-based tools for ovarian cancer detection.
Authors’ Disclosures
D. Chowdhury reports grants from NIH and support from cancer foundations outside the submitted work. K.M. Elias reports other support from Abcam, Inc. during the conduct of the study. D. Chowdhury, K.M. Elias, and W. Fendler report other support from Aspira Women’s Health and a patent for Circulating miRNAs for Diagnosis of Ovarian Cancer issued and licensed to Aspira Women’s Health. No disclosures were reported by the other authors.
Authors’ Contributions
L. Wollborn: Conceptualization, data curation, formal analysis, methodology, writing–original draft, writing–review and editing. J.W. Webber: Formal analysis, visualization, methodology, writing–review and editing. S. Alimena: Data curation. S. Mishra: Data curation. C.B. Sussman: Data curation. C.E. Comrie: Data curation. D.G. Packard: Data curation. M. Williams: Data curation. T. Russell: Data curation. W. Fendler: Writing–review and editing. D. Chowdhury: Writing–review and editing. K.M. Elias: Conceptualization, resources, formal analysis, supervision, funding acquisition, validation, methodology, project administration, writing–review and editing.
Acknowledgments
The authors acknowledge funding support for this work from Deutsche Forschungsgemeinschaft (German Research Foundation)—project 450518177 (L. Wollborn); the Massachusetts Life Sciences Center Bits to Bytes grant (K.M. Elias); the Honorable Tina Brozman Foundation (K.M. Elias and D. Chowdhury); the Deborah and Robert First Family fund (K.M. Elias and D. Chowdhury); Team Detect Me If You Can (K.M. Elias and D. Chowdhury); the Mighty Moose 5K Foundation (K.M. Elias); the V Foundation (K.M. Elias and D. Chowdhury); and the Dana-Farber/Harvard Cancer Center Ovarian Cancer SPORE grant from the NCI at the NIH under award number P50CA240243 (K.M. Elias and D. Chowdhury).
Note: Supplementary data for this article are available at Cancer Epidemiology, Biomarkers & Prevention Online (http://cebp.aacrjournals.org/).
References
Supplementary data
Significant miRNA changes for age, menopause and bilateral salpingo-oophorectomy (BSO)
Gene set enrichment analysis
Patient characteristics
List of miRNA panel
Raw data, miRNAs and clinical data