Abstract
Prediction of spatially variant response to cancer therapies can inform risk-adaptive management within precision oncology. We developed the “Voxel Forecast” multiscale regression framework for predicting spatially variant tumor response to chemoradiotherapy on fluorodeoxyglucose (FDG) positron emission tomography/computed tomography (PET/CT) imaging.
Twenty-five patients with locally advanced non–small cell lung cancer, enrolled on the FLARE-RT phase II trial (NCT02773238), underwent FDG PET/CT imaging prior to (PETpre) and during week 3 (PETmid) of concurrent chemoradiotherapy. Voxel Forecast was designed to predict tumor voxel standardized uptake value (SUV) on PETmid from baseline patient-level and voxel-level covariates using a custom generalized least squares (GLS) algorithm. Matérn covariance matrices were fit to patient- specific empirical variograms of distance-dependent intervoxel correlation. Regression coefficients from variogram-based weights and corresponding standard errors were estimated using the jackknife technique. The framework was validated using statistical simulations of known spatially variant tumor response. Mean absolute prediction errors (MAEs) of Voxel Forecast models were calculated under leave-one-patient-out cross-validation.
Patient-level forecasts resulted in tumor voxel SUV MAE on PETmid of 1.5 g/mL while combined patient- and voxel-level forecasts achieved lower MAE of 1.0 g/mL (P < 0.0001). PETpre voxel SUV was the most important predictor of PETmid voxel SUV. Patients with a greater percentage of under-responding tumor voxels were classified as PETmid nonresponders (P = 0.030) with worse overall survival prognosis (P < 0.001).
Voxel Forecast multiscale regression provides a statistical framework to predict voxel-wise response patterns during therapy. Voxel Forecast can be extended to predict spatially variant response on multimodal quantitative imaging and may eventually guide optimized spatial–temporal dose distributions for precision cancer therapy.
Advanced prediction algorithms based on imaging biomarkers, as complements to tissue and peripheral biomarkers, enhance clinical decision support for patient risk stratification and response assessment. We demonstrate that multiscale patient factors and imaging volume element (voxel) variables on fluorodeoxyglucose positron emission tomography combine to predict spatially variant response during the third week of concurrent chemoradiotherapy in patients with locally advanced non–small cell lung cancer. Our novel Voxel Forecast tool can highlight tumor voxels that are responding well and those that are not, empowering cancer medicine providers to initiate personalized treatment adaptations. Forecasting spatially variant cancer response dynamics in individual patients provides powerful imaging biomarker–based guidance to the growing therapeutic arsenal in precision oncology.
Introduction
Precision oncology is predicated on delivering the right set of treatments at the right time to each patient (1, 2). Tissue and peripheral biomarkers establish baseline tumor pan-“omic” profiles for selection of therapeutic regimens, which continue to expand with the emergence of next-generation molecularly targeted agents (3), immune-checkpoint inhibitors (4–10), and cancer vaccines (11–14). Quantitative imaging is gaining prominence as a noninvasive biomarker of early response to cancer therapies, providing rich high-dimensional radiomic data (15) for tumor phenotyping (16, 17) in precision oncology (18). Among a variety of imaging applications to cancer therapy assessment, the quantitative evaluation of early tumor response on fluorodeoxyglucose (FDG) positron emission tomography/computed tomography (PET/CT) stands out as a success story for predicting outcomes in patients with lung cancer and enabling more precise patient risk stratification (19–25). Unlike conventional tissue or peripheral biomarkers, FDG PET/CT as an imaging biomarker contains spatial information over the entire disease burden that can be correlated with lung cancer recurrence patterns (26–29).
Mounting clinical evidence spurred the launch of several FDG PET/CT imaging biomarker-guided clinical trials in lung cancer (30–33), which seek to deliver risk-adapted radiation therapies in combination with systemic therapies to improve efficacy. However, current trial designs limit their assessments to global patient or tumor region-of-interest response that ignore intratumoral spatial heterogeneity in response. Analogous to tumor genomic heterogeneity driving clonal evolution and treatment resistance (34), tumor spatial radiomic heterogeneity captures both interpatient and intratumoral variation in response (35). There are potential advantages to examining tumor response on multiple spatial scales of imaging, spanning individual imaging volume elements (voxels) to voxel clusters, lesions within the same patient, and lesions between patients. Multiscale response prediction can incorporate information at all spatial scales to better capture tumor radiomic heterogeneity, providing clinical decision support based on the differentiation of response between patients, between tumors, and within tumors.
Multiscale prediction of tumor voxel response poses technical challenges. Conventional statistical methods such as ordinary least squares (OLS) regression to predict tumor voxel response have been investigated. Prior exploratory studies applied linear and linear-mixture regression models to characterize preliminary spatial relationships between longitudinal PET images, including identification of spatially coherent radioresistent tumor voxel subpopulations (36). Some investigations used logistic regression to predict FDG PET tumor metabolic control probabilities (37); others compared linear regression, logistic regression, and classification and regression trees to predict voxel locations of tumor recurrence (38). These initial studies did not fully account for the spatially variant correlation between all voxels within the same tumor relative to independent data between patients. Underestimation of statistical variability by OLS regression can inflate false discoveries and relative importance of predictor variables. Furthermore, OLS regression does not leverage the spatial structure of voxel data to efficiently power prediction modeling.
Predicting spatially variant response to cancer therapy on longitudinal imaging with a novel statistical framework can inform individualized risk-adaptive treatment strategies. The purpose of this investigation was to develop a multiscale regression framework for predicting spatially variant tumor response at the voxel scale while accounting for intervoxel correlation, described henceforth as “Voxel Forecast.” We first conducted simulations with known ground truth voxel response and spatial correlation structure to evaluate the statistical accuracy and precision of Voxel Forecast. We then applied Voxel Forecast to a cohort of patients with locally advanced non–small cell lung cancer enrolled on the FLARE-RT protocol (NCT02773238) to predict week 3 mid-treatment FDG PET/CT tumor voxel response. We focused on mid-treatment PET/CT response prediction rather than posttreatment PET/CT response as this application is conducive to timely alterations in patient management strategies. Lastly, we explored the clinical implications of Voxel Forecast as a decision support system by correlating prediction residuals with clinical factors, treatment factors, and patient outcomes.
Materials and Methods
Patient population and clinical trial protocol
The phase II FLARE-RT clinical trial (NCT02773238) eligibility criteria included patients with locally advanced, inoperable stage IIB–IIIB non–small cell lung cancer (AJCC v7) and performance status ECOG 0–1, as well pulmonary function test, liver function test, and renal function test adequacy. All patients received 6 weeks of concurrent chemotherapy and radiotherapy (RT) with functional lung avoidance planning strategies, while a select subgroup of patients (n = 9) received adaptive and concomitant dose escalation based on mid-treatment PET imaging response assessment. Pre-RT FDG PET/CT imaging occurred at a median time of 10 days (range, 1–33 days) prior to treatment start. Twenty-eight consecutive patients were enrolled on this study following obtained written consent, approval by the University of Washington/Fred Hutchinson Cancer Consortium institutional review board (CC9599), and compliance with the Declaration of Helsinki ethical guidelines. After excluding patients with metastatic disease on repeat baseline FDG PET/CT (n = 2) or those who withdrew from the protocol (n = 1), a total of 25 patients underwent mid-RT FDG PET/CT imaging at median prescribed dose of 24 Gy in 12 fractions and range of 20–26 Gy in 10–13 fractions during the third week of radiotherapy. These patients formed the clinical data set for Voxel Forecast analysis.
Image acquisition and processing
All baseline and mid-treatment FDG PET/CT imaging was acquired with patients immobilized in RT position and adhered to standard imaging protocols on the same or cross-calibrated PET/CT scanners. Between imaging time points, the majority of patients (n = 21) were scanned on the Discovery STE (GE Healthcare), while the remainder (n = 4) were scanned on the Discovery MI (GE Healthcare). In compliance with a 12-hour fasting protocol, patients were injected with median 10.0 mCi (IQR, 9.6–10.6 mCi) [18F]FDG following pre- and postinjection assays. Baseline PET scans were started at a median time postinjection of 60 minutes (IQR, 60–63 minutes), and week 3 mid-treatment PET scan start delays were matched to baseline scan start delays for each patient with a median time difference of 0 minutes (IQR, 0–2 minutes). Patients scanned on the GE DSTE PET/CT underwent acquisitions of 5 minutes per bed position [15 cm axial field-of-view (aFOV)], spanning from base-of-skull to mid-thigh (6–7 aFOV, 30–35 minutes); patients scanned on the GE DMI PET/CT underwent acquisitions of 2.5 minutes per bed position (25 cm aFOV), spanning from base-of-skull to mid-thigh (4–5 aFOV, 10–13 minutes). All images were reconstructed using an ordered subset expectation maximization iterative algorithm without time-of-flight or resolution recovery corrections to improve harmonization of patient data between DSTE and DMI PET/CT scanners. Standard quantitative PET data corrections, including CT-based attenuation correction, were applied.
The primary metabolic tumor volumes (MTVs) were semiautomatically defined by the maximum gradient descent algorithm PET Edge (MIM Software). Interobserver variability was minimized by independent generation of MTV by two observers followed by group review and consensus from a board-certified radiation oncologist. Localized rigid alignment of primary MTV between the planning respiratory-correlated (4D) CT, pre-RT PET/CT, mid-RT PET/CT, and RT dose DICOM objects was performed for all patients using a built-in mutual information algorithm (MIM Software). Co-registered PET voxel grids were resampled to match the RT dose 2-mm isotropic voxel grids, which guaranteed isomorphic voxel mapping between baseline and mid-treatment time points. We did not utilize deformable image registration of bulky locally advanced NSCLC tumors in order to minimize changes in the native FDG PET uptake spatial distribution, as well as mitigate uncertainties secondary to the image deformation algorithm. Example patient images with variable mid-treatment FDG PET response patterns are displayed in Fig. 1.
Baseline (pre-RT, top row) and week 3 response (mid-RT, bottom row) FDG PET/CT imaging during chemoradiotherapy in two example FLARE-RT protocol patients. FDG PET imaging SUV hot-metal color scale is overlaid on treatment planning CT imaging with rainbow color scale planned isodose lines between the pre-RT and mid-RT time points. Pre-RT MTV is displayed as a khaki contour. Patient (A) presents with large spatially heterogeneous response, while patient (B) presents with modest spatially homogeneous response that preserves the baseline ring uptake pattern.
Baseline (pre-RT, top row) and week 3 response (mid-RT, bottom row) FDG PET/CT imaging during chemoradiotherapy in two example FLARE-RT protocol patients. FDG PET imaging SUV hot-metal color scale is overlaid on treatment planning CT imaging with rainbow color scale planned isodose lines between the pre-RT and mid-RT time points. Pre-RT MTV is displayed as a khaki contour. Patient (A) presents with large spatially heterogeneous response, while patient (B) presents with modest spatially homogeneous response that preserves the baseline ring uptake pattern.
Voxel Forecast of simulated data
A schematic of the Voxel Forecast workflow for predicting spatially variant tumor response using customized multiscale regression is shown in Fig. 2. Full details of the Voxel Forecast modeling framework, the Matérn variogram estimation to account for intervoxel spatial dependence, and the simulation scenarios with ground truth for benchmarking prediction algorithm performance can be found in the Supplementary Material. At each iteration with a random sample of 10–75 simulated patient tumors characterized by a known spatially variant voxel response model, regression coefficients β of multiscale patient-level and voxel-level predictors were estimated with the following: (i) standard OLS algorithm for independent data (naïve method), (ii) the pure generalized least squares approach of the spatial modeling framework in which all parameters are weighted using the population-averaged variogram (GLS), and (iii) the custom OLS–GLS hybrid approach of the spatial modeling framework in Fig. 2, where only voxel-level parameters are weighted using the population-averaged variogram. The primary metrics to measure the performance of these three algorithms (naïve OLS, GLS, OLS–GLS hybrid) were the bias, relative to the true values of β specified when simulating the data; the standard error of β, estimated empirically as the standard deviation of the distribution across the 250 simulations for each sample size; and the 95% confidence interval coverage, estimated empirically as the percentage of times the confidence intervals for β included the true values specified when simulating the data.
Schematic of the Voxel Forecast variogram-weighted multiscale regression framework to predict tumor voxel response. Multiscale patient- and voxel-level predictors generate OLS residuals to initialize iterative OLS–GLS (generalized least squares) hybrid regression with Matérn variogram to account for spatial autocorrelation of voxels under leave-one-out cross-validation. Regression coefficient standard errors (SE) are estimated using the jackknife technique.
Schematic of the Voxel Forecast variogram-weighted multiscale regression framework to predict tumor voxel response. Multiscale patient- and voxel-level predictors generate OLS residuals to initialize iterative OLS–GLS (generalized least squares) hybrid regression with Matérn variogram to account for spatial autocorrelation of voxels under leave-one-out cross-validation. Regression coefficient standard errors (SE) are estimated using the jackknife technique.
Voxel Forecast of patient data
Univariate Voxel Forecast models of week 3 mid-RT PET standardized uptake value (SUV) within the MTV were generated using each patient-level and voxel-level predictor. Patient-level predictors included standard clinical characteristics and treatment factors, as well as quantitative PET SUV features that have demonstrated prognostic value for treatment response and clinical outcome stratification in locally advanced non–small cell lung cancer patient series (22, 39). Voxel-level predictors included pre-RT voxel SUV, voxel RT dose, and voxel distance from tumor centroid. High-dimensional engineered radiomic features from PET or CT were not included in this preliminary application in order to mitigate the risk of false discoveries while promoting transparency in the setting of modest patient samples. Two primary multivariate Voxel Forecast models were then constructed, one using only patient-level predictors (model P) and another using both patient-level and voxel-level predictors (model P + V). Right-skewed variables were log-transformed prior to inclusion in the models. Confidence intervals and Wald test P values for each regression coefficient were calculated using the jackknife as described in the Supplementary Material (40). To limit the number of patient-level predictors included in each model, only the patient-level predictors associated with a Wald test P value < 0.1 in a univariate model were included in the multivariate models.
Pairwise interaction terms were tested individually using Wald tests of the corresponding regression coefficients with model P + V as the base model. The interactions considered were the three patient-voxel combinations of stage IIIB and the voxel-level predictors, as well as the three pairwise voxel–voxel combinations of the voxel-level predictors. Prediction error of the models was measured per patient as the mean absolute error (MAE) in mid-RT SUV across all MTV voxels. Prediction error was calculated using leave-one-patient-out cross-validation. The Wilcoxon signed-rank test was used as an approximate test for any difference in prediction error between multivariate model P and model P + V.
A residue analysis of Voxel Forecast (multivariate model P + V) was conducted to test for associations between under-responding voxels, relative to prediction, and baseline clinical/treatment/imaging characteristics. The percentage of under-responding voxels in each patient tumor was calculated to stratify baseline characteristics into tertiles (i.e., patients with <25% under-responding voxels, 25%–75%, >75%). Differences between the tertiles of under-response in the same baseline factors considered during the univariate analysis and mid-RT PET clinical response status, defined prospectively under the FLARE-RT protocol by quantitative and multidisciplinary diagnostic assessment of metabolic changes, were tested using the Kruskal–Wallis test or Fisher exact test, as appropriate. Overall survival (OS) was summarized using the Kaplan–Meier estimator in patient subgroups that were significantly associated with under-response relative to Voxel Forecast predictions. Events were defined as death from any cause, and patients were censored at the time of last follow-up. Differences in OS between subgroups were evaluated by the log-rank test. All modeling and statistical calculations were performed using the R statistical computing language version 3.1.1 and 3.4.1 (R Foundation for Statistical Computing) and the gstat package version 1.1 (41, 42).
Results
Across all simulated scenarios (see Supplementary Material), estimates of the regression coefficients using the naïve OLS, GLS, and OLS–GLS hybrid methods had little to no bias. Even with 10 simulated patients, the estimates of bias were <10% of the standard error across all coefficients. The precision and 95% CI coverage are summarized in Fig. 3. For patient-level variables (the intercept and G, see Supplementary Material for descriptions of the variables), the naïve and hybrid methods had similar standard errors while GLS consistently had higher standard errors. However, both GLS and the hybrid methods produced 95% CIs with approximately nominal coverage across the sample sizes considered while the naïve approach resulted in CI coverage < 20%. For voxel-level variables (X1 and X4), the naïve method had standard errors five to 60 times higher than both the GLS and hybrid methods, which themselves were very similar. For X1, which had extensive spatial correlation, the naïve method had CI coverage <70% while CI coverage was close to the nominal level for X4. CI coverage from the GLS and hybrid methods was close to the nominal level, though slightly below it, across all variables (87%–97%). CI coverage from the GLS and hybrid methods was 88% at N = 35 for X1, which was within the simulation margin of error of ∼5% of the CI coverage estimated at nearby sample sizes (∼92%).
Standard errors (top row) and coverage of 95% confidence intervals (CI, bottom row) for regression coefficients across different sample sizes and methods (naïve OLS, GLS, OLS–GLS hybrid) using simulated data. The horizontal light gray solid line in the low panels indicates the nominal 95% coverage. The margin of error for differences between different points on the 95% CI coverage curves for the naïve and GLS/hybrid methods is approximately ±5% (for both) for the intercept, ±5% and ±4% for G, ±8% and ±5% for X1, and ±4% and ±5% for X4, respectively.
Standard errors (top row) and coverage of 95% confidence intervals (CI, bottom row) for regression coefficients across different sample sizes and methods (naïve OLS, GLS, OLS–GLS hybrid) using simulated data. The horizontal light gray solid line in the low panels indicates the nominal 95% coverage. The margin of error for differences between different points on the 95% CI coverage curves for the naïve and GLS/hybrid methods is approximately ±5% (for both) for the intercept, ±5% and ±4% for G, ±8% and ±5% for X1, and ±4% and ±5% for X4, respectively.
The custom OLS–GLS hybrid algorithm was selected to apply to the FLARE-RT clinical trial cohort (n = 25) due to its unbiased and high-precision performance for predicting simulated spatial response patterns. The cross-validated estimates of patient variograms are displayed for an example patient (Fig. 4A) and the patient cohort (Fig. 4B), demonstrating robustness of the Matérn function for characterizing intervoxel spatial dependence. Matérn function parameters included a population-averaged range of 1.97 cm and smoothness of 3.5, along with patient-specific sill estimates to preserve the native dynamic range of SUV (see Supplementary Material).
Estimated variogram based on the Matérn function for an example FLARE-RT patient (A, black curve) and patient population (B, black curve). The sill represents the dynamic range of spatial variance, normalized to 1 for the population estimate. The range represents the intervoxel distance over which data remain correlated. Gray curves show leave-one-patient-out cross-validation (LOOCV) estimates. Each patient has an individualized variogram with common range and smoothness but patient-specific sill.
Estimated variogram based on the Matérn function for an example FLARE-RT patient (A, black curve) and patient population (B, black curve). The sill represents the dynamic range of spatial variance, normalized to 1 for the population estimate. The range represents the intervoxel distance over which data remain correlated. Gray curves show leave-one-patient-out cross-validation (LOOCV) estimates. Each patient has an individualized variogram with common range and smoothness but patient-specific sill.
Clinical and tumor characteristics are summarized in Table 1. Of the 25 patients, 15 (60%) were female, and age ranged from 50 to 75 years (median: 62 years). Eight patients (32%) were stage IIIB. Most tumors (76%) were adenocarcinoma and the rest were squamous cell carcinoma. Pre-RT SUVmax ranged from 3 to 32 g/cm3 (median: 11 g/cm3) and SUVmean ranged from 2 to 16 g/cm3 (median: 6 g/cm3). Over an average of 10,891 voxels per tumor, MTV voxel SUV decreased from 7 to 4 g/cm3 in response to 25 Gy on average, with large baseline interpatient (standard deviation [SD]: 4 g/cm3) and intratumor SUV variability (SD: 3 g/cm3).
Patient and tumor characteristics
Patient/tumor variable (n = 25) . | No. (%) or median (range) . |
---|---|
Female sex | 15 (60.0) |
Age, y | 62 (50–75) |
Adenocarcinoma (vs. squamous cell carcinoma) | 19 (76.0) |
Stage IIIB (vs. other) | 8 (32.0) |
Proton beam therapy (vs. X-ray radiotherapy) | 12 (48.0) |
Chemotherapy modality | |
Carboplatin + paclitaxel | 16 (64.0) |
Cisplatin + etoposide | 9 (36.0) |
Primary MTV, cm3 | 27 (2–396) |
SUVmax, g/cm3 | 11 (3–32) |
SUVmean, g/cm3 | 6 (2–16) |
TLG, g | 247 (5–4,408) |
Ring/concave uptake pattern (vs. foci/convex) | 12 (48.0) |
Time from pre-RT PET to mid-RT PET, d | 26 (16–51) |
Time from pre-RT PET to RT start, d | 10 (1–33) |
Patient/tumor variable (n = 25) . | No. (%) or median (range) . |
---|---|
Female sex | 15 (60.0) |
Age, y | 62 (50–75) |
Adenocarcinoma (vs. squamous cell carcinoma) | 19 (76.0) |
Stage IIIB (vs. other) | 8 (32.0) |
Proton beam therapy (vs. X-ray radiotherapy) | 12 (48.0) |
Chemotherapy modality | |
Carboplatin + paclitaxel | 16 (64.0) |
Cisplatin + etoposide | 9 (36.0) |
Primary MTV, cm3 | 27 (2–396) |
SUVmax, g/cm3 | 11 (3–32) |
SUVmean, g/cm3 | 6 (2–16) |
TLG, g | 247 (5–4,408) |
Ring/concave uptake pattern (vs. foci/convex) | 12 (48.0) |
Time from pre-RT PET to mid-RT PET, d | 26 (16–51) |
Time from pre-RT PET to RT start, d | 10 (1–33) |
Univariate results for predicting the mid-RT SUV per voxel are shown in Table 2. At the patient level, stage IIIB patients tended to have a 33% higher mid-RT SUV on average compared with stage IIB–IIIA patients (P = 0.073). Each 50% increment in SUVmax and SUVmean was associated with an 18% (P = 0.004) and 20% higher (P = 0.003) mid-RT SUV, respectively. At the voxel level, only pre-RT SUV was significantly associated with mid-RT SUV, with a 35% increase in mid-RT SUV per 50% increase in pre-RT SUV (P < 0.001). Voxel-level RT dose was not significantly associated with mid-RT SUV (P = 0.56), due in part to the spatial homogeneity of a uniform tumor dose prescription between pre-RT PET and mid-RT PET time points.
Univariate analysis of patient-level and voxel-level forecasting of mid-RT PET SUV
. | Univariate model . | . |
---|---|---|
Patient/tumor variable . | βa (95% CI) . | P valuec . |
Female sex | 1.2 (−26.5, 39.5) | 0.94 |
Age, per 5-y increase | 4.3 (−9.6, 20.5) | 0.55 |
Adenocarcinoma (vs. squamous cell carcinoma) | −6.0 (−40.8, 49.2) | 0.78 |
Stage IIIB (vs. other) | 33.4 (−2.9, 83.2) | 0.073 |
Proton beam therapy (vs. X-ray radiotherapy) | −10.8 (−35.2, 22.8) | 0.47 |
Chemotherapy modality | ||
Carboplatin + paclitaxel | (ref) | |
Cisplatin + etoposide | 1.2 (−27.2, 40.9) | 0.94 |
Primary metabolic tumor volume (MTV)b, per 50% increase | −0.4 (−4.8, 4.1) | 0.85 |
Primary SUVmaxb, per 50% increase | 18.1 (6.0, 31.7) | 0.004 |
Primary SUVmeanb, per 50% increase | 20.0 (6.9, 34.8) | 0.003 |
Primary total lesion glycolysis (TLG)b, per 50% increase | 1.3 (−2.5, 5.3) | 0.48 |
Ring/concave uptake pattern (vs. foci/convex) | −1.9 (−28.5, 34.8) | 0.90 |
Time from pre-RT PET to mid-RT PET, per 1-wk increase | −4.2 (−14.6, 7.5) | 0.45 |
Time from pre-RT PET to RT start, per 1-wk increase | −4.3 (−14.8, 7.5) | 0.44 |
No. of RT fractions between pre-RT and mid-RT PET | 1.6 (−14.5, 20.7) | 0.85 |
Mean dose between pre-RT and mid-RT PET, per 1-Gy increase | 0.7 (−6.4, 8.3) | 0.84 |
Voxel variable | ||
Voxel pre-RT SUVb, per 50% increase | 34.6 (27.1, 42.6) | <0.001 |
Voxel RT dose, per 1-Gy increase | −2.7 (−11.3, 6.9) | 0.56 |
Voxel distance from tumor centroid, per 1-cm increase | −7.4 (−23.5, 12.0) | 0.41 |
. | Univariate model . | . |
---|---|---|
Patient/tumor variable . | βa (95% CI) . | P valuec . |
Female sex | 1.2 (−26.5, 39.5) | 0.94 |
Age, per 5-y increase | 4.3 (−9.6, 20.5) | 0.55 |
Adenocarcinoma (vs. squamous cell carcinoma) | −6.0 (−40.8, 49.2) | 0.78 |
Stage IIIB (vs. other) | 33.4 (−2.9, 83.2) | 0.073 |
Proton beam therapy (vs. X-ray radiotherapy) | −10.8 (−35.2, 22.8) | 0.47 |
Chemotherapy modality | ||
Carboplatin + paclitaxel | (ref) | |
Cisplatin + etoposide | 1.2 (−27.2, 40.9) | 0.94 |
Primary metabolic tumor volume (MTV)b, per 50% increase | −0.4 (−4.8, 4.1) | 0.85 |
Primary SUVmaxb, per 50% increase | 18.1 (6.0, 31.7) | 0.004 |
Primary SUVmeanb, per 50% increase | 20.0 (6.9, 34.8) | 0.003 |
Primary total lesion glycolysis (TLG)b, per 50% increase | 1.3 (−2.5, 5.3) | 0.48 |
Ring/concave uptake pattern (vs. foci/convex) | −1.9 (−28.5, 34.8) | 0.90 |
Time from pre-RT PET to mid-RT PET, per 1-wk increase | −4.2 (−14.6, 7.5) | 0.45 |
Time from pre-RT PET to RT start, per 1-wk increase | −4.3 (−14.8, 7.5) | 0.44 |
No. of RT fractions between pre-RT and mid-RT PET | 1.6 (−14.5, 20.7) | 0.85 |
Mean dose between pre-RT and mid-RT PET, per 1-Gy increase | 0.7 (−6.4, 8.3) | 0.84 |
Voxel variable | ||
Voxel pre-RT SUVb, per 50% increase | 34.6 (27.1, 42.6) | <0.001 |
Voxel RT dose, per 1-Gy increase | −2.7 (−11.3, 6.9) | 0.56 |
Voxel distance from tumor centroid, per 1-cm increase | −7.4 (−23.5, 12.0) | 0.41 |
aPercent change in average SUV at the mid-RT PET scan.
bThe predictor was log-transformed prior to inclusion in the model to reduce right skewness.
cBoldfaced values indicate statistical significance (P < 0.05).
Multivariate voxel-scale prediction models for mid-RT SUV were constructed using patient-level predictors only (model P) and using patient- and voxel-level predictors together (model P + V; Table 3). Pre-RT SUVmean was included in the models without SUVmax due to high cross-correlation (r = 0.94). In model P, only pre-RT SUVmean was significantly and independently associated with mid-RT voxel SUV (21.5% increase per 50% increase, P = 0.001). Similarly, only pre-RT voxel SUV was significantly and independently associated with mid-RT voxel SUV (35% per 50% increase, P < 0.001) in model P + V. After including both patient- and voxel-level pre-RT SUV predictors for model P + V, pre-RT SUVmean was no longer significantly associated with mid-RT voxel SUV (P = 0.069). After patient-scale and voxel-scale multivariate adjustment for pre-RT PET predictors, a relatively stronger association between higher mean tumor radiation dose and lower mid-RT voxel SUV was observed (univariate P = 0.84, model P P = 0.13, model P + V P = 0.17, 4%–5% voxel SUV decrease per 1-Gy increase). Interactions between stage IIIB, a patient-level factor, and the three voxel-level predictors were tested, though none were statistically significant when added to model P + V (P > 0.54 for all). Similarly, the three pairwise interactions between the voxel-level predictors were tested and none were statistically significant (P > 0.13 for all).
Multivariate Voxel Forecast models with patient-level and voxel-level predictors for mid-RT PET SUV
. | Patient-level only model P . | Patient- and voxel-level model P + V . | ||
---|---|---|---|---|
Patient/tumor variable . | β (95% CI) . | Pb . | β (95% CI) . | P . |
Stage IIIB (vs. other) | 22.9 (−9.0, 65.8) | 0.17 | 23.6 (−7.6, 65.6) | 0.15 |
Primary SUVmeana, per 50% increase | 21.5 (9.6, 34.6) | 0.001 | −9.8 (−19.4, 0.9) | 0.069 |
Mean dose between pre-RT and mid-RT PET, per 1-Gy increase | −4.6 (−10.3, 1.5) | 0.13 | −4.0 (−9.5, 1.8) | 0.17 |
Voxel variable | ||||
Voxel pre-RT SUVa, per 50% increase | 34.8 (29.2, 40.7) | <0.001 | ||
Voxel RT dose, per 1-Gy increase | −0.3 (−0.8, 0.2) | 0.28 | ||
Voxel distance from tumor centroid, per 1-cm increase | −1.1 (−3.3, 1.1) | 0.30 |
. | Patient-level only model P . | Patient- and voxel-level model P + V . | ||
---|---|---|---|---|
Patient/tumor variable . | β (95% CI) . | Pb . | β (95% CI) . | P . |
Stage IIIB (vs. other) | 22.9 (−9.0, 65.8) | 0.17 | 23.6 (−7.6, 65.6) | 0.15 |
Primary SUVmeana, per 50% increase | 21.5 (9.6, 34.6) | 0.001 | −9.8 (−19.4, 0.9) | 0.069 |
Mean dose between pre-RT and mid-RT PET, per 1-Gy increase | −4.6 (−10.3, 1.5) | 0.13 | −4.0 (−9.5, 1.8) | 0.17 |
Voxel variable | ||||
Voxel pre-RT SUVa, per 50% increase | 34.8 (29.2, 40.7) | <0.001 | ||
Voxel RT dose, per 1-Gy increase | −0.3 (−0.8, 0.2) | 0.28 | ||
Voxel distance from tumor centroid, per 1-cm increase | −1.1 (−3.3, 1.1) | 0.30 |
aThe predictor was log-transformed prior to inclusion in the model to reduce right skewness.
bBoldfaced values indicate statistical significance (P < 0.05).
The median mid-RT PET SUV MAE estimated using leave-one-out cross-validation was 1.5 g/cm3 for model P and 1.0 g/cm3 for model P + V (P < 0.001 for the difference; Fig. 5). Predicted voxel-wise mid-RT SUV from model P + V as a function of pre-RT SUV is shown for two patients in Fig. 6. Voxel Forecast output in each patient was blinded to model training, highlighting an example case with typical MAE (1.1 g/cm3; Fig. 6A) and another with high MAE (3.3 g/cm3; Fig. 6B). The largest prediction errors in these patients tended to occur in voxels with high pre-RT SUV, where voxel forecasted response was either underestimated (Fig. 6A) or overestimated (Fig. 6B). Tumor voxel response forecasts were mapped back to the native PET imaging spatial coordinates and representative image slices (Fig. 7A and D) are displayed for comparison with the observed tumor voxel response image slices (Fig. 7B and E) in the same two example patients (Fig. 7A–F). Although the patient with typical MAE displays smaller regions that over-responded relative to prediction (Fig. 7C; ΔSUV > 0), the patient with high MAE displays large contiguous regions that under-responded relative to prediction (Fig. 7F; ΔSUV < 0).
Voxel Forecast model prediction errors. Leave-one-patient-out cross-validation (LOOCV) mean absolute error from OLS–GLS hybrid model of patient-level predictors only compared with OLS–GLS hybrid model of patient- and voxel-level predictors. Median (line), quartiles (box), and 1.5 IQR (whisker) are shown. Nonparametric pairwise comparisons were performed by Wilcoxon signed-rank testing.
Voxel Forecast model prediction errors. Leave-one-patient-out cross-validation (LOOCV) mean absolute error from OLS–GLS hybrid model of patient-level predictors only compared with OLS–GLS hybrid model of patient- and voxel-level predictors. Median (line), quartiles (box), and 1.5 IQR (whisker) are shown. Nonparametric pairwise comparisons were performed by Wilcoxon signed-rank testing.
Comparison of Voxel Forecast predicted (black markers) versus observed (gray markers) week 3 mid-treatment FDG PET voxel SUV in two example FLARE-RT protocol patients blinded to model training (A, B). Patient A presents with homogeneous response over moderate SUV range, resulting in MAE of 1.1 SUV (g/cm3), with most voxels over-responding to conventionally fractionated RT dose relative to prediction. Patient B presents with low response over large SUV range, resulting in biased prediction at high SUV and high MAE of 3.3 SUV (g/cm3). SUV voxels in patient B that under-responded to conventionally fractionated RT doses compared with prediction may define high-risk regions.
Comparison of Voxel Forecast predicted (black markers) versus observed (gray markers) week 3 mid-treatment FDG PET voxel SUV in two example FLARE-RT protocol patients blinded to model training (A, B). Patient A presents with homogeneous response over moderate SUV range, resulting in MAE of 1.1 SUV (g/cm3), with most voxels over-responding to conventionally fractionated RT dose relative to prediction. Patient B presents with low response over large SUV range, resulting in biased prediction at high SUV and high MAE of 3.3 SUV (g/cm3). SUV voxels in patient B that under-responded to conventionally fractionated RT doses compared with prediction may define high-risk regions.
Comparison of Voxel Forecast predicted (A and D) versus observed (B and E) week 3 mid-treatment FDG PET tumor images in the same blinded FLARE-RT protocol patients as Fig. 6. The first patient (top row) shows regions of over-response (C, ΔSUV = SUV predicted – SUV observed > 0) with low systematic bias in spatially variant prediction. The second patient (bottom row) shows systematic under-response (F, ΔSUV = SUV predicted – SUV observed < 0) throughout the tumor to conventionally fractionated RT doses, which may define high-risk regions.
Comparison of Voxel Forecast predicted (A and D) versus observed (B and E) week 3 mid-treatment FDG PET tumor images in the same blinded FLARE-RT protocol patients as Fig. 6. The first patient (top row) shows regions of over-response (C, ΔSUV = SUV predicted – SUV observed > 0) with low systematic bias in spatially variant prediction. The second patient (bottom row) shows systematic under-response (F, ΔSUV = SUV predicted – SUV observed < 0) throughout the tumor to conventionally fractionated RT doses, which may define high-risk regions.
A residue analysis of Voxel Forecast–defined under-responding tumor voxels, relative to prediction, was performed. Table 4 summarizes the percentage of under-responding voxels stratified into tertiles, which produced balanced subgroups for comparisons of baseline characteristics. Patients receiving proton pencil beam scanning therapy tended to have primary tumors with lower percentage of under-responding voxels compared with patients receiving conventional intensity modulated X-ray radiotherapy (P = 0.077). Specifically, six of 12 (50%) patients receiving proton therapy had fewer than 25% under-responding primary tumor voxels, compared with one of 13 (8%) patients receiving X-ray therapy. More importantly, patients whose tumors contained higher percentages of under-responding voxels relative to prediction were more likely to be classified as mid-treatment PET nonresponders according to the FLARE-RT protocol (P = 0.030), and in turn, these nonresponders had a significantly worse OS prognosis than responders (Fig. 8; P < 0.001).
Patient/tumor characteristics stratified by tertiles of under-responding voxel percentages relative to prediction
. | % of Under-responding voxels . | . | ||
---|---|---|---|---|
Variable . | <25% (N = 7) . | 25%–75% (N = 9) . | >75% (N = 9) . | P . |
Female sex | 2 (28.6) | 7 (77.8) | 6 (66.7) | 0.20 |
Age, y | 60 (50–70) | 66 (59–75) | 58 (51–72) | 0.22 |
Adenocarcinoma (vs. squamous cell carcinoma) | 5 (71.4) | 7 (77.8) | 7 (77.8) | >0.99 |
Stage IIIB (vs. other) | 3 (42.9) | 1 (11.1) | 4 (44.4) | 0.31 |
Proton beam therapy (vs. X-ray radiotherapy) | 6 (85.7) | 3 (33.3) | 3 (33.3) | 0.077 |
Chemotherapy modality | >0.99 | |||
Carboplatin + paclitaxel | 4 (57.1) | 6 (66.7) | 6 (66.7) | |
Cisplatin + etoposide | 3 (42.9) | 3 (33.3) | 3 (33.3) | |
Primary MTV, cm3 | 27 (3–305) | 32 (16–396) | 25 (2–351) | 0.56 |
SUVmax, g/cm3 | 11 (6–32) | 10 (4–23) | 11 (3–23) | 0.95 |
SUVmean, g/cm3 | 5 (2–16) | 5 (2–12) | 6 (2–11) | 0.97 |
TLG, g | 204 (12–1,566) | 257 (38–4,408) | 247 (5–2,847) | 0.63 |
Ring/concave uptake pattern (vs. foci/convex) | 3 (42.9) | 5 (55.6) | 4 (44.4) | >0.99 |
Time from pre-RT PET to mid-RT PET, d | 30 (23–44) | 26 (21–51) | 21 (16–44) | 0.23 |
Time from pre-RT PET to RT start, d | 14 (6–28) | 10 (5–33) | 5 (1–28) | 0.21 |
Number of fractions | 12(11–13) | 12 (11–13) | 12 (10–13) | 0.81 |
Mean dose, Gy | 24 (23–28) | 25 (23–28) | 25 (19–27) | 0.69 |
Mid-RT PET nonresponder | 0 (0.0) | 4 (44.4) | 6 (66.7) | 0.030 |
. | % of Under-responding voxels . | . | ||
---|---|---|---|---|
Variable . | <25% (N = 7) . | 25%–75% (N = 9) . | >75% (N = 9) . | P . |
Female sex | 2 (28.6) | 7 (77.8) | 6 (66.7) | 0.20 |
Age, y | 60 (50–70) | 66 (59–75) | 58 (51–72) | 0.22 |
Adenocarcinoma (vs. squamous cell carcinoma) | 5 (71.4) | 7 (77.8) | 7 (77.8) | >0.99 |
Stage IIIB (vs. other) | 3 (42.9) | 1 (11.1) | 4 (44.4) | 0.31 |
Proton beam therapy (vs. X-ray radiotherapy) | 6 (85.7) | 3 (33.3) | 3 (33.3) | 0.077 |
Chemotherapy modality | >0.99 | |||
Carboplatin + paclitaxel | 4 (57.1) | 6 (66.7) | 6 (66.7) | |
Cisplatin + etoposide | 3 (42.9) | 3 (33.3) | 3 (33.3) | |
Primary MTV, cm3 | 27 (3–305) | 32 (16–396) | 25 (2–351) | 0.56 |
SUVmax, g/cm3 | 11 (6–32) | 10 (4–23) | 11 (3–23) | 0.95 |
SUVmean, g/cm3 | 5 (2–16) | 5 (2–12) | 6 (2–11) | 0.97 |
TLG, g | 204 (12–1,566) | 257 (38–4,408) | 247 (5–2,847) | 0.63 |
Ring/concave uptake pattern (vs. foci/convex) | 3 (42.9) | 5 (55.6) | 4 (44.4) | >0.99 |
Time from pre-RT PET to mid-RT PET, d | 30 (23–44) | 26 (21–51) | 21 (16–44) | 0.23 |
Time from pre-RT PET to RT start, d | 14 (6–28) | 10 (5–33) | 5 (1–28) | 0.21 |
Number of fractions | 12(11–13) | 12 (11–13) | 12 (10–13) | 0.81 |
Mean dose, Gy | 24 (23–28) | 25 (23–28) | 25 (19–27) | 0.69 |
Mid-RT PET nonresponder | 0 (0.0) | 4 (44.4) | 6 (66.7) | 0.030 |
NOTE: Values are No. (%) or median (range) unless otherwise specified.
Kaplan–Meier estimated OS stratified by mid-treatment PET response status according to the FLARE-RT protocol (log-rank P < 0.001).
Kaplan–Meier estimated OS stratified by mid-treatment PET response status according to the FLARE-RT protocol (log-rank P < 0.001).
Discussion
We have proposed a Voxel Forecast modeling framework that uses variogram weighting to account for the spatial dependence between voxels within the same image, conducted a simulation study to measure the performance of the framework under known conditions, and applied the framework to predict week 3 mid-RT tumor voxel SUV in a cohort of patients with locally advanced NSCLC enrolled on the FLARE-RT phase II clinical trial. The overall mean absolute prediction SUV error of Voxel Forecast improved from 1.5 g/cm3 when only patient-level predictors were used to 1.0 g/cm3 when voxel-level predictors were included in the model. Pre-RT voxel SUV was the strongest individual predictor of week 3 mid-RT voxel SUV in response to uniformly prescribed tumor radiation dose.
Spatially heterogeneous prediction errors in mid-RT PET SUV across voxels and contiguous regions were influenced by patient-level and voxel-level factors. Some patients had tumors with low systematic bias in voxel forecasted FDG PET response, while other patients had tumors with large bias in voxel forecasted FDG PET response. These Voxel Forecast residuals have the potential to provide unique and powerful decision support for spatially variant response assessment in the context of adaptive cancer therapy, by defining tumor voxels that either under-respond or over-respond relative to prediction. Residue analysis revealed that primary tumor voxels treated with proton pencil beam scanning therapy may be less likely to under-respond compared with voxels treated with conventional X-ray intensity modulated radiotherapy. Reports of improved clinical outcomes in locally advanced NSCLC patients receiving proton beam therapy (43) motivate further investigation of spatially variant response patterns in this subpopulation. Patients with a higher percentage of under-responding primary tumor voxels were more likely to be classified as protocol-defined mid-RT PET nonresponders, which in turn was a significant detriment to OS. Under-responding tumor voxels on mid-treatment PET may define high-risk regions for altering patient management in clinical oncology, from selecting radiotherapy modalities to adaptively intensifying radiation dose (33). Further planned accrual of patients onto the open FLARE-RT phase II trial and outcome data maturation will support a future treatment failure pattern analysis in relation to Voxel Forecast response to refine high-risk region definition.
The Voxel Forecast multiscale regression framework can potentially support extensions to several clinical applications. Beyond early response imaging, posttreatment imaging patterns of response can be predicted. Specific to the FLARE-RT trial, post-treatment PET voxel response can be predicted to include pretreatment PET, mid-treatment PET, and RT dose voxel covariates from both spatially uniform and spatially adapted nonuniform prescriptions. Spatially variant posttreatment response prediction can consist of continuous voxel-level SUV changes as well as probabilities of voxel-level cancer recurrence from dichotomized PET SUV data. Voxel Forecast of temporal response dynamics across imaging time points, ranging from daily/weekly quantitative CT/MR imaging during therapy to quarterly/semiannual surveillance CT after therapy, can be explored. In the broader context of precision oncology, Voxel Forecast can be adapted to predict multiscale response from an array of molecularly targeted and immune-checkpoint blockade cancer therapies. These agents can be modeled as multiscale variables, both as global spatially uniform interventions and pharmacodynamically informed spatial biodistributions. In all cases, the spatial correlation structure of the therapeutic variable and its interaction with other patient-level and voxel-level input variables would help drive imaging-based response predictions. This approach has the potential to better differentiate treatment effects, so-called pseudoprogression, from true disease progression by leveraging spatial information from quantitative longitudinal imaging. Lastly, Voxel Forecast can operate on functional imaging of normal tissue, including but not limited to pulmonary perfusion/ventilation, hepatocyte activity, and bone marrow proliferation, to predict spatially variant organ function response to cancer therapies.
Other investigations have proposed technical methods for predicting spatial, temporal, or combined spatiotemporal data. Multiple linear regression was used to predict voxel-level PET response to radiotherapy and included covariates of voxel-neighborhood spatial correlation (38). Time-series predictions of voxel-neighborhood-averaged signals on functional MRI were built on Bayesian frameworks that leveraged prior knowledge to predict future time-intensity curve states and kinetic parameters at the regional scale (44–46) or voxel scale (47). Outside of oncology and medical imaging research, spatiotemporal Kriging models of geological and meteorological data have combined physical equations with empirical spatial statistics for forecasting (48–50). These methods rely on strong mathematical modeling assumptions regarding the spatiotemporal structure, which remains a challenge in predictive oncology when forecasting individual patient tumor response to cancer therapies (51). In contrast, Voxel Forecast makes assumptions about the spatial structure to gain statistical efficiency but does not rely on those assumptions to produce correct inferences. Random Forest ensemble learning models also do not require many mathematical assumptions of the underlying structure of the data; in fact, studies of densely sampled spatiotemporal data were used to train such models that closely matched the performance of Kriging spatiotemporal prediction models without strict requirements for data preprocessing and model parameterization (52). In contrast to Voxel Forecast, Random Forest models can automatically discover nonlinear relationships and interactions between variables. However, such models require substantially more data to train and are less transparent for model interpretation. Predictor variable importance can be ranked, but the quantitative effect size of each variable on the predicted outcome, as well as any interactions between variables, is difficult to estimate. In clinical oncology where therapies are rapidly evolving, independent patient sample sizes remain modest and therefore limit high-dimensional parametric spatiotemporal modeling.
The Voxel Forecast algorithm has several notable strengths. The algorithm is agnostic to quantitative imaging modality, therapeutic intervention, spatial scale, and correlation structure when making response predictions. Although our application focused on predicting mid-treatment FDG PET imaging-based tumor response to concurrent chemotherapy and radiotherapy at the voxel scale, Voxel Forecast could have been applied to quantitative CT and MRI to predict multiscale response to immune-checkpoint inhibitors in combination with radiotherapy. Furthermore, Voxel Forecast does not impose restrictions on the spatial scale of predictor variables, which can flexibly range from individual voxels to individual lesions and global patient parameters. This implies that Voxel Forecast could be trained to differentiate response to cancer therapies between patients, between lesions, between voxel clusters within lesions, and between lesion voxels as part of a comprehensive characterization of response heterogeneity. For example, region-level and voxel-level information on mediastinal and hilar lymph nodes would enable Voxel Forecast to predict spatially variant response patterns beyond the primary tumor. Statistical inference for the regression coefficients, including confidence intervals and P values from hypothesis tests, is based on a nonparametric jackknife approach that was validated for small patient sample sizes. This allows inferences to be robust to misspecification of the distribution of the outcome variable and the assumed voxel correlation structure, which could be more complex than a stationary Matérn variogram with parameter values shared across patients.
The current Voxel Forecast algorithm also has several limitations, some of which may be addressed by future algorithmic development. The core of the algorithm is a linear and additive structure which does not automatically capture many complex phenomena. However, this allows the model to achieve transparency and interpretability, and can serve as a useful approximation. This limitation can be addressed to some degree with inclusion of nonlinear terms and interactions, particularly those derived from engineered or learned imaging features. An additional feature selection or clustering routine for benchmarking a subsequent voxel radiomics response prediction model would be necessary. Extensions to allow nonparametric fitting of nonlinear terms such as with splines or kernel smoothers may also be possible (53, 54). At present, only a single time point can be predicted, which avoided the requirement for complex modeling of time and space at the same time. Approaches for this have been proposed, though with strong underlying mathematical assumptions (47, 55). We are exploring ways to extend the algorithm to handle multiple time points while maintaining the current semiparametric approach that minimizes structural assumptions. Voxel Forecast was originally developed for continuous response prediction. The estimating equations described in the Supplementary Material can, in principle, be extended to binary response classification using generalized linear models and generalized estimating equation frameworks (56), which have been previously modified for prediction of spatially correlated discrete and categorical data in 2D applications with a relatively small number of samples per patient (57, 58). However, Voxel Forecast performance may not be the same as for continuous outcomes, so further study and simulations are needed. Despite these present limitations, the statistical transparency and multiscale prediction flexibility of Voxel Forecast may complement techniques such as spatiotemporal deep learning (59), voxel-level optical flow combined with encoder–decoder convolutional neural networks (CNN; ref. 60), 3D CNN for voxel-to-voxel prediction (61), and generative adversarial networks for voxel image synthesis (62). Future investigations on spatially variant tumor voxel response prediction should seek to leverage the combined strengths of multiple statistical and artificial intelligence approaches.
In conclusion, the Voxel Forecast statistical framework consists of multiscale variogram-weighted regression to predict tumor voxel response patterns during and after cancer therapy. We have characterized the accuracy and precision of Voxel Forecast with simulations of response prediction, demonstrated that the framework remains robust in the setting of small sample sizes, and successfully generated a preliminary PET tumor voxel response prediction model in patients with locally advanced non–small cell lung cancer enrolled on the FLARE-RT clinical trial. Further training and validation of Voxel Forecast in larger patient cohorts for predicting quantitative imaging biomarker spatiotemporal response to cancer therapies may provide powerful clinical decision support and inform personalized dosing strategies.
Disclosure of Potential Conflicts of Interest
D.S. Hippe reports receiving commercial research grants from GE Healthcare, Philips Healthcare, Toshiba America Medical Systems, and Siemens Medical Solutions USA. R.S. Miyaoka has ownership interests (including patents) at Precision Sensing LLC, is a consultant/advisory board member for MIM Software, and reports receiving commercial research grants from GE Healthcare. H.J. Vesselle is a consultant/advisory board member for MIM Software. P. Kinahan is an employee of PET/X LLC, and reports receiving commercial research grants from GE Healthcare. No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
Conception and design: S.R. Bowen, D.S. Hippe, W.A. Chaovalitwongse
Development of methodology: S.R. Bowen, D.S. Hippe, W.A. Chaovalitwongse
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): S.R. Bowen, H.J. Vesselle, J. Zeng
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): S.R. Bowen, D.S. Hippe, J. Zeng
Writing, review, and/or revision of the manuscript: S.R. Bowen, D.S. Hippe, W.A. Chaovalitwongse, C. Duan, P. Thammasorn, R.S. Miyaoka, P.E. Kinahan, R. Rengan, J. Zeng
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): S.R. Bowen, P.E. Kinahan, R. Rengan
Study supervision: S.R. Bowen, R. Rengan
Other (participated in team discussions; supervised graduate students in related problems): X. Liu
Acknowledgments
This work was funded by NIH/NCI R01CA204301. We thank Hannah Thomas, Balu Sasidharan, and Anna Reyes for their coordination of clinical and imaging data collection. We acknowledge the efforts of nuclear medicine and radiation oncology technologists during PET/CT acquisitions, radiation therapy planning, and image-guided radiation therapy delivery.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.