Clinico-genomic databases favor inclusion of long-term survivors, leading to potentially biased overall survival (OS) analyses. Risk set adjustments relying on the independent delayed entry assumption may mitigate this bias. We aimed to determine whether this assumption is satisfied in a dataset of patients with advanced non–small cell lung cancer (aNSCLC), and to provide guidance for clinico-genomic OS analyses when the assumption does not hold.


We analyzed the association of the timing of next-generation sequencing (NGS) testing with real-world OS (rwOS) in patient data from a United States–based nationwide longitudinal deidentified electronic health records–derived database. Estimates of rwOS using risk set adjustment were compared with estimates computed across all patients, regardless of NGS testing status.
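To make the comparison concrete, a minimal sketch of the risk set adjustment described above is given below: a Kaplan-Meier estimator in which patients enter the risk set only after a delayed-entry time (here, the time of NGS testing). This is an illustrative pure-Python implementation, not the study's actual analysis code; variable names and the toy data are hypothetical.

```python
def km_survival(times, events, entries=None):
    """Kaplan-Meier survival estimates.

    times   -- follow-up time (death or censoring) for each patient
    events  -- 1 if death observed, 0 if censored
    entries -- delayed-entry times (e.g., time of NGS testing); when given,
               a patient contributes to the risk set only after entry
               (left truncation, i.e., risk set adjustment)
    Returns a list of (event_time, survival_probability) pairs.
    """
    if entries is None:
        entries = [0.0] * len(times)  # no adjustment: everyone at risk from t=0
    event_times = sorted({t for t, e in zip(times, events) if e})
    surv, s = [], 1.0
    for t in event_times:
        # risk set at t: entered before t and still under observation at t
        n_at_risk = sum(en < t <= ex for en, ex in zip(entries, times))
        d = sum(1 for ti, e in zip(times, events) if e and ti == t)
        s *= 1.0 - d / n_at_risk
        surv.append((t, s))
    return surv

# Toy cohort: four deaths at months 2, 3, 4, 5.
unadjusted = km_survival([2, 3, 4, 5], [1, 1, 1, 1])
# Same cohort, but two patients were sequenced (entered) only at month 2.5;
# the adjusted curve drops faster early on because early risk sets are smaller.
adjusted = km_survival([2, 3, 4, 5], [1, 1, 1, 1], entries=[0, 0, 2.5, 2.5])
```

Comparing `unadjusted` and `adjusted` on real data is one way to see how much the delayed-entry correction moves the curve.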


The independent delayed entry assumption was not satisfied in this database: later sequencing was associated with an increased hazard of death after sequencing. In a model adjusted for relevant characteristics, each month of delay in sequencing was associated with a 2% increase in the hazard of death. However, up to the median survival time, estimates of OS using risk set adjustment were similar to estimates computed for all patients, regardless of NGS testing.
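One way to probe the independent delayed entry assumption, in the spirit of the model above, is to fit a Cox proportional hazards model with the entry time (delay to sequencing) as a covariate and check whether its coefficient differs from zero. The sketch below is a minimal single-covariate Newton-Raphson fit of the Cox partial likelihood with risk sets adjusted for delayed entry; it is an assumed simplification for illustration, not the multivariable model used in the study.

```python
import math

def cox_beta(exits, events, x, entries=None, iters=50):
    """Fit a single-covariate Cox model by Newton-Raphson on the
    partial likelihood, with risk sets adjusted for delayed entry.

    exits   -- follow-up times; events -- 1 if death, 0 if censored
    x       -- covariate (e.g., months of delay before sequencing)
    entries -- delayed-entry times (left truncation); defaults to 0
    Returns the estimated log hazard ratio per unit of x.
    """
    if entries is None:
        entries = [0.0] * len(exits)
    b = 0.0
    for _ in range(iters):
        score, info = 0.0, 0.0
        for ti, ei, xi in zip(exits, events, x):
            if not ei:
                continue
            # risk set at ti: entered before ti, still under observation at ti
            risk = [xj for en, ex, xj in zip(entries, exits, x) if en < ti <= ex]
            w = [math.exp(b * xj) for xj in risk]
            sw = sum(w)
            mean = sum(wj * xj for wj, xj in zip(w, risk)) / sw
            var = sum(wj * xj * xj for wj, xj in zip(w, risk)) / sw - mean * mean
            score += xi - mean   # gradient of the log partial likelihood
            info += var          # negative Hessian (observed information)
        if info == 0:
            break
        step = score / info
        b += step
        if abs(step) < 1e-10:
            break
    return b
```

A hazard ratio of `math.exp(cox_beta(...))` near 1.02 per month of delay would correspond to the 2% increase reported above; a nonzero coefficient is evidence against independent delayed entry.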


rwOS analyses in clinico-genomic databases should assess the independent delayed entry assumption. Comparisons against a broader patient population may help quantify the difference between risk-set-adjusted rwOS estimates and estimates from cohorts in which the bias stems from overrepresentation of long-term survivors.


This study illustrates practices that can increase the interpretability of findings from OS analyses in clinico-genomic databases.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License.
