Abstract
Background: The recent decrease in myeloid leukemia incidence may be directly attributed to the 2001 guidelines for population-based cancer registries, which required the capture of only one malignancy in the myeloid lineage per person, and to the simultaneous adoption of myelodysplastic syndrome registration in the United States.
Methods: We constructed four claims-based algorithms to assess myeloid leukemia incidence, applied the algorithms to the 1999–2008 Surveillance Epidemiology and End Results (SEER)-Medicare database, and assessed algorithm validity using SEER-registered cases.
Results: Each algorithm had moderate sensitivity (75%–94%) and high specificity (>99.0%), with the 2+BCBM algorithm showing the highest specificity. On the basis of the 2+BCBM algorithm, SEER registered only 50% of the acute myelogenous leukemia cases and a third of the chronic myelogenous leukemia (CML) cases. The annual incidence of myeloid leukemia in 2005 was 26 per 100,000 persons 66 years or older, much higher than the 15 per 100,000 reported by SEER using the same sample.
Conclusion: Our findings suggest underreporting of myeloid leukemias in SEER by a magnitude of 50% to 70% as well as validate and support the use of the 2+BCBM claims algorithm in identifying myeloid leukemia cases. Use of this algorithm identified a high number of uncaptured myeloid leukemia cases, particularly CML cases.
Impact: Our results call for the commitment of more resources for centralized cancer registries so that they may improve myeloid leukemia case ascertainment, which would empower policy makers with the ability to properly allocate limited health care resources. Cancer Epidemiol Biomarkers Prev; 21(3); 474–81. ©2012 AACR.
Introduction
Since the mid-1970s, cancer registries have monitored myeloid leukemia incidence in the United States. According to data from 9 Surveillance Epidemiology and End Results (SEER) sites, age-adjusted trends in myeloid leukemia incidence decreased from 1975 to 1989, increased from 1990 to 2000, and decreased again after 2000 (1). We hypothesize that the drop in myeloid leukemia incidence in 2001 is largely attributable to a change in registry protocols and practice patterns favoring use of commercial pathology laboratories, as opposed to underlying changes in patterns of exposure within the general population.
In 2001, SEER issued a guideline stating that “a myeloid malignancy diagnosed after a previous myeloid malignancy would not be recorded as a subsequent primary,” which may have reduced the registration of myeloid leukemia incidence cases and changed the composition of registered cases, particularly if some subtypes are more likely to co-occur with other subtypes. In addition to changing guidelines on the registration of multiple primaries in the myeloid lineage, myelodysplastic syndromes (MDS) became a reportable malignancy to population-based registries for the first time in 2001, the year ICD-O-3 was implemented worldwide. About 30% of MDS cases progress to acute myelogenous leukemia (AML; ref. 2), and such cases would not be registered with AML under the 2001 guidelines. In 2010, SEER updated their guidelines to allow multiple myeloid primaries, such as when a patient is “originally diagnosed in a chronic (less aggressive) phase and second diagnosis of a blast or acute phase more than 21 days after the chronic diagnosis.” The interpretation of SEER AML evidence over the period 2001 to 2009 is complicated, particularly for trend analysis.
The 2001 guidelines have further implications for capturing cases of chronic myelogenous leukemia (CML). Like MDS, CML is often diagnosed in the outpatient setting, which potentially circumvents capture and registration as compared with hospital-based CML diagnoses. In addition, when CML progresses into a blast phase and patients are hospitalized, approximately two thirds of cases present as a myeloid leukemia and registrars may have difficulty distinguishing between CML blast phase and AML; yet only 1 of the 2 must be designated as the SEER primary malignancy. These surveillance conditions may lead to underreporting of CML.
The median age of diagnoses of MDS and myeloid leukemias is 65 years or older (1). The fact that myeloid leukemias primarily affect older individuals and increase with age (3) makes myeloid leukemia incidence particularly well suited for Medicare claims-based analysis. Claims-based algorithms have been used in prior reports to identify MDS and myeloid leukemia cases (4–6); yet no study has validated these myeloid leukemia algorithms using registry data (3). Given the SEER age-adjusted trends in myeloid leukemia incidence and the potential for myeloid leukemia incident cases to be underreported by population-based cancer registries from 2001 to 2005, we constructed and validated a claims-based algorithm of myeloid leukemia incidence using Medicare administrative data, assessed differential bias between AML and CML reporting, and estimated trends in AML and CML incidence from 2000 to 2005.
Methods
Data sources
We conducted a retrospective review of the SEER-Medicare database, 1999 to 2008. The SEER program is a national, population-based cancer registry sponsored by the National Cancer Institute, Bethesda, MD, with a catchment area roughly equal to 26% of the U.S. population (7). Of SEER-registered patients with cancer who were diagnosed at ages 65 years or older, 93% were matched with Medicare enrollment records and claims as previously described.
Medicare is the primary insurer for approximately 97% of the U.S. population 65 years of age or older (7). As an alternative to the traditional fee-for-service Medicare, the Medicare Advantage program, Part C, is a managed care benefit that enrolls approximately 11% to 14% of older Medicare beneficiaries (8–10). Part C was not included in this study because its reimbursement structure leaves no claims data available. All study procedures were approved by the University of Florida Institutional Review Board.
Study population
For study inclusion, a beneficiary must have resided in 1 of the 9 SEER regions between 1999 and 2005, been enrolled in fee-for-service Medicare due to age for 13 months or more, and not participated in Medicare Advantage. Beneficiaries were excluded if all claims after the first year of enrollment were in hospice. The study population included in the current analysis represents a 5% sample of registered and nonregistered beneficiaries (n = 287,854) and an oversampling of all beneficiaries registered in SEER with myeloid leukemia and other hematologic malignancies (ICD-O-3 codes: 9800 to 9989; n = 24,904). The oversampling allows for more in-depth examinations of beneficiaries diagnosed with hematologic malignancies.
Claims-based algorithms for myeloid leukemia incidence
A claims-based algorithm was constructed incorporating temporal patterns in the administrative data that correspond to the clinical presentation of myeloid leukemia. A minimalist claims-based algorithm of myeloid leukemia incidence requires one or more myeloid leukemia claims (1+ algorithm; ref. 4), whereas more specific algorithms may attempt to remove inaccurate diagnoses by requiring additional information, such as a second claim within a specified period of time (6) or after a delay in time to confirm the indication (5). Algorithms based only on ICD-9-CM diagnosis codes do not account for clinical services required for myeloid leukemia diagnosis (e.g., blood counts and bone marrow biopsy or aspiration).
We compared the sensitivity and specificity of 4 algorithms as follows: (i) The “1+” algorithm requires a single claim with an ICD-9-CM diagnosis of myeloid leukemia. (ii) The “2+” algorithm additionally requires either a second claim between 1 and 12 months after the first or death or hospice enrollment within 3 months of the first claim. This accounts for the censoring of patients with myeloid leukemia who were considered terminally ill within 3 months of the first claim. The final 2 algorithms are based on clinical knowledge of diagnostic services required to confirm myeloid leukemia, specifically blood count (BC) and bone marrow (BM) biopsy or aspiration. (iii) The “2+BC” algorithm further restricts the 2+ algorithm by requiring a blood count during the year prior to the first claim. (iv) The “2+BCBM” algorithm further requires both a blood count and a bone marrow biopsy or aspiration during the year prior to the first claim. Each of the claims-based algorithms captures only myeloid leukemia cases that are clinically diagnosed (e.g., no postmortem diagnoses) and does not depend on the use of any particular treatment (e.g., blood transfusions).
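The four decision rules above can be sketched in code. The following Python is a minimal illustration under stated assumptions, not the authors' implementation: the `Beneficiary` record and its field names are hypothetical, a month is approximated as 30 days and a year as 365 days, and treating a service dated on the same day as the first claim as "prior" is an assumption.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Dict, List, Optional


@dataclass
class Beneficiary:
    # Hypothetical record layout; these are not SEER-Medicare variable names.
    ml_claim_dates: List[date]  # claims carrying a myeloid leukemia diagnosis
    blood_count_dates: List[date] = field(default_factory=list)
    bone_marrow_dates: List[date] = field(default_factory=list)
    death_date: Optional[date] = None
    hospice_date: Optional[date] = None


def _within(events: List[date], start: date, end: date) -> bool:
    """True if any event date falls in the closed interval [start, end]."""
    return any(start <= d <= end for d in events)


def classify(b: Beneficiary) -> Dict[str, bool]:
    """Report which of the four algorithms flag this beneficiary as a case."""
    result = {"1+": False, "2+": False, "2+BC": False, "2+BCBM": False}
    if not b.ml_claim_dates:
        return result

    first = min(b.ml_claim_dates)
    result["1+"] = True

    # 2+: a second claim 1 to 12 months after the first, or death or
    # hospice entry within 3 months (30-day months are an assumption).
    second_claim = _within(
        b.ml_claim_dates, first + timedelta(days=30), first + timedelta(days=365)
    )
    terminal = [d for d in (b.death_date, b.hospice_date) if d is not None]
    early_exit = _within(terminal, first, first + timedelta(days=90))
    result["2+"] = second_claim or early_exit

    # 2+BC and 2+BCBM: diagnostic services in the year before the first claim.
    year_before = first - timedelta(days=365)
    has_bc = _within(b.blood_count_dates, year_before, first)
    has_bm = _within(b.bone_marrow_dates, year_before, first)
    result["2+BC"] = result["2+"] and has_bc
    result["2+BCBM"] = result["2+BC"] and has_bm
    return result
```

Because each rule adds a condition to the previous one, any beneficiary flagged by 2+BCBM is also flagged by 2+BC, 2+, and 1+, which matches the nested case counts expected from successively stricter algorithms.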
Statistical methods
To assess the validity of the claims-based algorithms, SEER registration was used as the gold standard, with myeloid leukemia cases defined by ICD-O-3 histology codes (Table 1). Sensitivity was defined as the proportion of SEER-registered patients with myeloid leukemia who were identified as myeloid leukemia cases by the claims-based algorithm. Specificity was defined as the proportion of individuals not registered in SEER who were not identified as a myeloid leukemia case by the claims-based algorithm. To avoid misclassification of prevalent myeloid leukemia cases as incident myeloid leukemia cases, all patients with claims for myeloid leukemia, unspecified leukemia (ICD-9-CM 208) or unspecified anemia (ICD-9-CM 285.9) in year 1999 or in their first year of Medicare enrollment were excluded from the analysis.
ICD-9-CM codes^a | ICD-O-3 code | WHO and FAB classification |
---|---|---|
AMLs | | |
AML w/recurrent genetic abnormalities | | |
207.0 | 9840 | Acute erythroid leukemia, M6 |
 | 9865 | AML w/t(6;9)(p23;q34)^b |
 | 9866 | Acute promyelocytic leukemia (APL), M3 |
 | 9869 | AML w/inv(3)(q21q26.2) or t(3;3)(q21;q26.2)^b |
 | 9871 | AML w/eosinophilia, M4Eo |
 | 9896 | AML w/t(8;21)(q22;q22)^b |
 | 9897 | AML w/11q23 abnormalities^b |
 | 9911 | Acute megakaryocytic leukemia (AMKL) w/t(1;22)(p13;q13)^b |
AML w/multilineage dysplasia^c | | |
 | 9895 | AML w/multilineage dysplasia^b |
AML MDS, therapy related | | |
 | 9920 | Therapy-related AML^b |
AML, not otherwise specified | | |
205.0 or 205.2 | 9861 | AML |
 | 9867 | Acute myelomonocytic leukemia (AMML), M4 |
 | 9870 | Acute basophilic leukemia (ABL)^b |
 | 9872 | Acute myeloblastic leukemia, M0 |
 | 9873 | AML w/o maturation, M1 |
 | 9874 | AML w/maturation, M2 |
 | 9880 | Acute eosinophilic leukemia (AEL)^b |
206.0 or 206.2 | 9891 | Acute monocytic leukemia (AMoL), M5 |
 | 9898 | Myeloid leukemia w/Down syndrome^b |
207.2 | 9910 | AMKL, M7 |
 | 9931 | Acute panmyelosis with myelofibrosis |
Myeloid sarcoma | | |
205.3 | 9930 | Myeloid sarcoma |
CMLs | | |
Chronic myeloproliferative neoplasms (MPN) | | |
205.1 | 9863 | CML |
 | 9875 | CML, BCR/ABL positive^b |
 | 9963 | Chronic neutrophilic leukemia (CNL)^b |
 | 9964 | Chronic eosinophilic leukemia (CEL)/hypereosinophilic^b |
Myelodysplastic/myeloproliferative neoplasms (MDN/MPN) | | |
 | 9876 | Atypical CML, BCR/ABL negative^b |
206.1 | 9945 | Chronic myelomonocytic leukemia (CMML) |
 | 9946 | Juvenile myelomonocytic leukemia (JMML)^b |
WHO codes are in underlined italic.
FAB classifications are listed in bold.
Abbreviations: WHO, World Health Organization; FAB, French-American-British classification.
^a ICD-9 codes for other or unspecified myeloid leukemias are not shown. Acute biphenotypic leukemia (ICD-O-3 9805) is not included as a myeloid leukemia, because it potentially represents a distinct disease entity and was not included in WHO or FAB classification systems.
^b Eight AML codes and 4 CML codes did not appear in the 2000 registry sample either due to their introduction after January 1, 2000 or their absence among older adults.
^c AML with multilineage dysplasia was introduced as ICD-O-3 code in 2001, listed as a subgroup of AML in the 2002 WHO classification system, and removed in 2008.
In addition, we applied the 4 algorithms to estimate the incidence of myeloid leukemia, AML, and CML among older adults. Each trend represents the number of incident cases per year per 100,000 beneficiaries in SEER regions and was compared with the trends in SEER incidence for the same population of older adults. Because the study population contains all registered myeloid leukemia cases but only 5% of other cancer and unregistered beneficiaries in SEER regions, we applied sampling weights to adjust for the oversampling. We applied a second set of sampling weights to adjust for differences in the age distribution between 2000 and 2005, using the 2005 age distribution as the standard.
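As an illustration of the weighting described above, the sketch below scales a 5% sample by a factor of 20 and then applies direct age standardization. Direct standardization is a standard technique assumed here as a stand-in, because the exact construction of the study's second set of weights is not described. The function names, the age strata, and all counts in the example are hypothetical.

```python
from typing import Dict


def weighted_count(n_fully_sampled: int, n_five_percent: int, weight: int = 20) -> int:
    """Combine fully sampled cases with cases found in a 5% sample.

    Each person in the 5% sample stands for `weight` people (1 / 0.05 = 20).
    """
    return n_fully_sampled + weight * n_five_percent


def standardized_rate(
    cases: Dict[str, float],
    population: Dict[str, float],
    standard_population: Dict[str, float],
) -> float:
    """Directly age-standardized incidence per 100,000 persons.

    cases, population: weighted counts per age stratum in the year of interest.
    standard_population: counts per stratum in the standard year (2005 in the study).
    """
    total_standard = sum(standard_population.values())
    rate = 0.0
    for stratum, standard_n in standard_population.items():
        stratum_rate = cases[stratum] / population[stratum]
        rate += stratum_rate * (standard_n / total_standard)
    return rate * 100_000


# Hypothetical example: three age strata in 2000, standardized to 2005.
cases_2000 = {"66-74": 10, "75-84": 20, "85+": 15}
population_2000 = {"66-74": 100_000, "75-84": 80_000, "85+": 30_000}
population_2005 = {"66-74": 90_000, "75-84": 85_000, "85+": 35_000}

rate_2000 = standardized_rate(cases_2000, population_2000, population_2005)
print(f"Standardized 2000 incidence: {rate_2000:.1f} per 100,000")
```

When the standard population equals the year's own population, the standardized rate reduces to the crude rate, which is a convenient sanity check on the weights.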
Results
Demographic characteristics of Medicare beneficiaries residing in SEER regions are described in Table 2 for 3 groups defined by SEER registration as follows: (i) those registered in SEER as myeloid leukemia cases (ICD-O-3 codes listed in Table 1); (ii) those registered in SEER with other hematologic malignancies, such as MDS and lymphocytic leukemias (other ICD-O-3 codes from 9800 to 9989); and (iii) those not registered in SEER with myeloid leukemia or another hematologic malignancy in this range. Persons in the Not Registered group may have been registered in SEER with hematologic malignancies prior to 1999 or for other cancers. Consistent with previous reports, patients with myeloid leukemia were older and more likely to be male and White than those not registered. Across the 3 groups, 12% to 24% of beneficiaries had claims for myeloid leukemia or unspecified leukemia or anemia in 1999 or in their first year of Medicare enrollment; these individuals were considered prevalent cases and excluded from the validation analysis.
Beneficiary characteristics | SEER registered^a: myeloid leukemia, N = 6,013 (%) | SEER registered^a: other hematologic malignancy, N = 18,891 (%) | Not registered, N = 287,854 (%) |
---|---|---|---|
Age, y, in January 2001 [median (interquartile range)] | 76 (70–81) | 76 (70–82) | 73 (67–80) |
Gender | | | |
Male | 3,246 (54) | 9,979 (53) | 115,182 (40) |
Female | 2,767 (46) | 8,912 (47) | 172,672 (60) |
Race/ethnicity | | | |
White | 5,365 (89) | 17,069 (91) | 244,724 (84) |
Black | 288 (5) | 977 (5) | 18,050 (7) |
Other | 93 (1) | 204 (1) | 6,254 (2) |
Asian | 176 (3) | 350 (2) | 11,367 (4) |
Hispanic | 73 (1) | 216 (1) | 5,866 (2) |
North American native | <11 (<0.4^b) | 38 (<1) | 1,010 (<1) |
Unknown | <11 (<0.4^b) | 37 (<1) | 583 (<1) |
Claims for myeloid leukemia or unspecified leukemia or anemia within first data year^c | 1,305 (22) | 4,594 (24) | 34,747 (12) |
NOTE: P values represent group comparisons on weighted t tests.
^a Other hematologic malignancies coded within the ICD-O-3 histology codes 9800 to 9989 largely involve the myeloid lineage (e.g., myelodysplastic syndrome); however, this range also includes a few lymphoid malignancies. Persons in the Not Registered group may be registered as incident cases for hematologic malignancies prior to 1999 or for other cancers.
^b Percentages less than 0.4% are suppressed to protect patient anonymity.
^c A person with myeloid leukemia claims within the first year of claim data may be a prevalent case and therefore must be removed from the validation analysis and incidence estimation.
After removing the prevalent cases (Table 3), 333 (7%) of the remaining 4,708 SEER-registered myeloid leukemia cases lacked claims for myeloid leukemia between 2000 and 2005. Among these cases, 231 (69%) had a claim for unspecified leukemia or anemia (ICD-9-CM 208 and 285.9) between 2000 and 2005, and an additional 14 (4%) had claims for myeloid leukemia between 2006 and 2008, which is outside the study period. The SEER-registered myeloid leukemia incident cases were separated in AML and CML categories based on their initial diagnosis, to assess differential bias in the underreporting of myeloid leukemia. The proportion with no myeloid leukemia claims was greater among the registered CML cases than AML cases (11% compared with 6%, P < 0.01).
Claims-based algorithms | SEER registered^a: all myeloid leukemia, N = 4,708 (%) | SEER registered^a: AML, N = 3,365 (%) | SEER registered^a: CML, N = 1,343 (%) | SEER registered^a: other hematologic malignancies, N = 14,297 (%) | Not registered, N = 253,107 (%) |
---|---|---|---|---|---|
No myeloid leukemia claims | 333 (7) | 187 (6) | 146 (11) | 11,589 (81) | 252,632 (100) |
1 or more myeloid leukemia claims (1+) | 4,375 (93) | 3,178 (94) | 1,197 (89) | 2,708 (19) | 475 (<1) |
And a second myeloid leukemia claim 1 to 12 months after first claim or death or hospice entry within 3 months (2+) | 4,266 (91) | 3,140 (93) | 1,120 (83) | 2,047 (14) | 147 (<1) |
And a blood count within 12 months before first claim (2+BC) | 4,260 (90) | 3,137 (93) | 1,117 (83) | 2,041 (14) | 143 (<1) |
And a bone marrow biopsy within 12 months before first claim (2+BCBM)^b | 4,190 (89) | 3,123 (93) | 1,061 (79) | 1,921 (13) | 107 (<1) |
2+BCBM using only CML claims | 1,551 (33) | 546 (16) | 1,005 (75) | 606 (4) | 44 (<1) |
2+BCBM using only AML claims | 3,414 (73) | 3,088 (92) | 326 (24) | 1,515 (11) | 65 (<1) |
^a Persons registered with myeloid leukemia were separated into AML and CML categories based on first diagnosis. Individuals with other hematologic malignancies were coded with a nonmyeloid malignant disease using an ICD-O-3 code between 9800 and 9989. The validation samples (All myeloid leukemia, AML, CML, and other hematologic malignancies) excluded persons with claims for myeloid leukemia or unspecified leukemia or anemia within first data year (i.e., prevalent cases). Persons in the Not Registered group may have been registered for a hematologic malignancy prior to 1999 or for other cancers.
^b For the myeloid leukemia 2+BCBM algorithm, sensitivity is 89.00% (4,190/4,708) and specificity is 99.96% [(253,107 − 107)/253,107]. Sensitivities for the 2+BCBM AML and CML algorithms are 92% (3,088/3,365) and 75% (1,005/1,343), respectively.
In Table 3, the sensitivities of the 4 claims-based algorithms are presented in the first column. Sensitivity was highest for the 1+ algorithm (93%). Requiring a second myeloid leukemia claim within 1 to 12 months after the first claim or death or hospice entry within 3 months of the first claim (i.e., the 2+ algorithm) reduced the sensitivity to 91%. Further requiring a blood count claim within the year before the first claim had little effect on sensitivity (90%); however, requiring a bone marrow biopsy reduced the sensitivity to 89%. The specificity is 100% minus the percentages in the last column and was above 99% for all algorithms.
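The sensitivity and specificity figures can be reproduced from the counts in Table 3. The sketch below is an illustrative check rather than the authors' code; the variable and function names are hypothetical, and the counts are taken from the All myeloid leukemia and Not registered columns.

```python
# Counts from Table 3 (validation sample, prevalent cases removed).
REGISTERED_ML = 4_708      # SEER-registered myeloid leukemia cases
NOT_REGISTERED = 253_107   # beneficiaries in the Not registered column

# algorithm -> (flagged among registered cases, flagged among not registered)
FLAGGED = {
    "1+": (4_375, 475),
    "2+": (4_266, 147),
    "2+BC": (4_260, 143),
    "2+BCBM": (4_190, 107),
}


def sensitivity(flagged_registered: int, registered: int) -> float:
    """Share of SEER-registered cases that the algorithm identifies."""
    return flagged_registered / registered


def specificity(flagged_unregistered: int, unregistered: int) -> float:
    """Share of unregistered beneficiaries that the algorithm does not flag."""
    return (unregistered - flagged_unregistered) / unregistered


for name, (true_pos, false_pos) in FLAGGED.items():
    sens = 100 * sensitivity(true_pos, REGISTERED_ML)
    spec = 100 * specificity(false_pos, NOT_REGISTERED)
    print(f"{name:7s} sensitivity {sens:5.1f}%  specificity {spec:6.2f}%")
```

Each added requirement trades a small amount of sensitivity for fewer flagged beneficiaries in the unregistered group, which is why the 2+BCBM algorithm has the highest specificity.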
The most conservative algorithm (2+BCBM) identified 107 myeloid leukemia cases among patients who were not registered. The 2+BCBM algorithm identified 1,921 additional myeloid leukemia cases among patients registered with other hematologic malignancies (e.g., MDS). In summary, SEER registered about half of the myeloid leukemia cases [51%; 4,190/(4,190 + 1,921 + 107 × 20)] as defined by the 2+BCBM algorithm, which is a conservative estimate.
Sensitivities for the 2+BCBM AML and CML algorithms were 92% (3,088/3,365) and 75% (1,005/1,343), respectively. The difference between sensitivities was largely attributable to the frequent absence of 2 myeloid leukemia claims for CML cases (−17% compared with −7%) and the reduced use of bone marrow biopsy within the year prior to diagnosis (−4% compared with 0%). Applying the same sampling weight adjustment as before, SEER registered half of the AML cases [50%; 3,088/(3,088 + 326 + 1,515 + 65 × 20)], but only a third of the CML cases [33%; 1,005/(1,005 + 546 + 606 + 44 × 20)] as defined by the 2+BCBM algorithm. On the basis of the 2+BCBM algorithm, 546 CML incident cases (18%) were SEER registered with AML and 326 AML cases (5%) were SEER registered with CML.
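The capture proportions reported above follow one formula: cases registered under the matching diagnosis divided by all cases the 2+BCBM algorithm identifies, with the 5% unregistered sample scaled by 20. The sketch below reproduces that arithmetic from the Table 3 counts; the function and argument names are hypothetical.

```python
SAMPLING_WEIGHT = 20  # unregistered beneficiaries are a 5% sample (1 / 0.05)


def seer_capture(
    matching: int,
    other_myeloid: int,
    other_hematologic: int,
    unregistered_sample: int,
    weight: int = SAMPLING_WEIGHT,
) -> float:
    """Proportion of algorithm-identified cases registered under the same diagnosis.

    matching: identified cases registered in SEER with the matching diagnosis.
    other_myeloid: identified cases registered with the other myeloid leukemia.
    other_hematologic: identified cases registered with another hematologic malignancy.
    unregistered_sample: identified cases in the 5% sample of unregistered beneficiaries.
    """
    total = matching + other_myeloid + other_hematologic + weight * unregistered_sample
    return matching / total


all_ml = seer_capture(4_190, 0, 1_921, 107)  # all myeloid leukemia claims
aml = seer_capture(3_088, 326, 1_515, 65)    # AML claims only
cml = seer_capture(1_005, 546, 606, 44)      # CML claims only

print(f"All myeloid leukemia: {all_ml:.0%}, AML: {aml:.0%}, CML: {cml:.0%}")
```

Weighting only the unregistered group reflects the sampling design, in which beneficiaries registered with a hematologic malignancy were fully sampled and all others were sampled at 5%.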
Figure 1 illustrates the trend over time in myeloid leukemia incidence rates estimated using 3 claims-based algorithms and SEER registry data with age adjustment. For the year 2005, myeloid leukemia incidence was estimated to be 70 per 100,000 persons based on the 1+ algorithm and 26 per 100,000 based on the more conservative 2+BCBM algorithm. Using the same sample, SEER-based myeloid leukemia incidence ranged from 15 to 18 cases per 100,000 over this period. The incidence rates were lower in 2000 due to the removal of all 1999 prevalent cases. Compared with the 1+ algorithm estimates, requiring a second claim halved the number of incident cases. The 2+BC algorithm results are not shown, because they were nearly identical to the 2+ results. The difference between the 1+ algorithm and the 2+ algorithm was lower in 2004, potentially related to the increased use of diagnostic services recommended by International Working Group 2003 AML guidelines (11, 12).
Figure 2 illustrates trends in AML and CML incidence based on the 2+BCBM algorithm and SEER registry data. In 2005, the number of AML incident cases was 19 per 100,000 based on the 2+BCBM algorithm, substantially higher than the 11 per 100,000 based on SEER. Likewise, the number of CML incident cases was 11 per 100,000 based on the 2+BCBM and 5 per 100,000 based on SEER.
Discussion
Given the potential for myeloid leukemia cases to be uncaptured by population-based cancer registries, we sought to develop and validate a Medicare claims-based algorithm for the identification of myeloid leukemia and estimate incidence in individuals ages 65 years and older (3, 13). The results of this validation study show that claims-based algorithms of myeloid leukemia incidence are moderately sensitive and highly specific. The 1+ algorithm has reduced specificity and may overestimate incidence and obscure patterns identified by more definitive algorithms. In comparison, the more rigorous 2+BCBM algorithm showed the highest specificity, which is of critical importance if this tool is to be used as a basis for large scale extrapolations and investigations of myeloid leukemia treatment patterns. Therefore, our work validates and supports the use of the 2+BCBM claims algorithm in identifying myeloid leukemia cases.
This is the first study to assess the validity of claims-based algorithms in leukemia; however, previous studies have applied claims-based measures. Three studies did not disclose specific ICD-9-CM or ICD-O-3 codes for identified AML cases in the SEER database (14–16). Three additional studies included ICD-9-CM codes for myeloid (205.XX) and monocytic (206.XX) leukemia; however, the use of this range of codes encompassed acute and chronic diagnoses, thereby potentially misrepresenting CML cases as AML cases (5, 6, 17). Two studies required 2 or more claims of AML (5, 6), but only 1 required AML claims more than 30 days apart, and it did not limit the difference to less than 1 year or correct for terminal diagnoses and hospice (i.e., the 2+ algorithm; ref. 5).
The 2+BCBM algorithm may be separately applied to all myeloid leukemia claims, AML claims, and CML claims; however, these algorithms provided overlapping results. Among the 2,201 cases identified by the 2+BCBM algorithm using only CML claims, nearly half of the cases (991; 45%) were independently identified by the 2+BCBM algorithm using only AML claims. This overlap has 5 possible interpretations as follows: (i) CML cases were being coded as AML for reimbursement; (ii) CML cases that progressed to the blast phase were miscoded as AML; (iii) AML cases were initially thought to be CML cases until results of cytogenetic and molecular testing became available; (iv) the process to rule out CML took over 1 month, rendering 2 claims 1 month apart; or (v) clerical inaccuracies were made in ICD-9-CM coding. Because SEER guidelines prohibited multiple primaries involving the myeloid lineage from 2001 to 2005, it was not possible to test between these interpretations. By and large these clinical interpretations favored the CML diagnosis; therefore, future work may first identify the CML cases and remove them before applying the AML 2+BCBM algorithm. This hierarchical approach agrees with 2001 SEER guidelines that prohibit AML registration after CML registration.
The primary results showing that SEER registered only half of the AML cases and a third of the CML cases may be overestimates, because these results are based on a conservative claims-based algorithm that requires bone marrow biopsy, as recommended by 2000 NCCN guidelines. Furthermore, SEER did not require biopsy confirmation for registration and may have included cases that would be excluded upon biopsy.
Absence of SEER registration may be overstated. Among the 4,994 cases identified by the 2+BCBM algorithm using only AML claims, over a third of the cases (37%; 1,841) were SEER registered as another hematologic malignancy, such as MDS. These patients with AML were therefore captured in SEER, but under a different disease. Consequently, SEER data between 2001 and 2005 may not accurately reflect the true incidence of AML over that period.
The implications of these findings extend beyond Medicare and suggest specific changes in the registry system. When analyzing the uncaptured myeloid leukemia cases, we found that many of these cases were linked to individuals already registered in SEER for another cancer. This gap in the registry system may be resolved by requiring myeloid leukemia registration regardless of other cancer diagnoses, which is partially addressed in the updated 2010 coding rules (18). However, a more difficult gap to address is registry reliance on inpatient surveillance for myeloid leukemia incidence (13, 19, 20). In particular, patients with CML and other chronic myeloid malignancies, like MDS and myeloproliferative neoplasms, are often diagnosed and managed in the outpatient setting and may be missed by surveillance systems relying on hospital registration. We have recently shown the bias of cancer registration in MDS toward a more advanced disease stage by comparing the clinical characteristics of patients with captured MDS with patients with uncaptured MDS (21). The evidence of bias against outpatient registration suggests that the CML cases in SEER may favor moderately advanced CML cases (e.g., accelerated phase, transfusion requiring) and exclude higher grade cases (e.g., blast phase, immediate outpatient referral to hospice) and lower grade CML cases (e.g., asymptomatic patients).
A primary limitation of claims-based algorithms is the reliance on ICD-9 codes. The ICD-9 codes generated by the treating physician at the time of billing are assumed to reflect the physician's impression of the diagnosis rather than a confirmed pathologic diagnosis. A small proportion of the SEER-registered cases did not have claims with myeloid leukemia codes, and patients with myeloid leukemia may not have consented to undergo bone marrow biopsy. The imprecise coding of myeloid leukemia as unspecified leukemia or anemia and myeloid leukemia treatment without confirmatory bone marrow biopsy suggest that claims-based algorithms may miss myeloid leukemia cases (i.e., reduced sensitivity). In addition, no claims algorithm can measure incidence in the first 12 months of Medicare enrollment because of the lack of data prior to enrollment to rule out myeloid leukemia prevalence. Claims-based algorithms are also limited to insured populations and exclude persons enrolled in managed care organizations due to the absence of claims (e.g., Medicare Advantage).
In summary, we evaluated 4 claims-based algorithms for sensitivity and specificity using the SEER-Medicare database, which included registered patients with myeloid leukemia as gold standard comparators. Our findings validate and support the use of the 2+BCBM claims algorithm in identifying myeloid leukemia cases and indicate that SEER registered half of AML cases and a third of CML cases. Moreover, use of this conservative and highly specific algorithm identified a high number of uncaptured myeloid leukemia cases. From a policy perspective, our results call for the commitment of more resources for centralized cancer registries so that they may improve myeloid leukemia case ascertainment. More accurate data would empower policy makers with the ability to properly allocate limited health care resources.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception of design: B.M. Craig, C.R. Cogle.
Interpretation of data: B.M. Craig, C.R. Cogle.
Analysis of data: B.M. Craig.
Writing and approval of manuscript: B.M. Craig, C.R. Cogle, D.E. Rollison, A.F. List.
Acknowledgments
The authors thank the staff in the laboratory of B.M. Craig at Lee H. Moffitt Cancer Center & Research Institute for their contributions to the research and creation of this paper, Riddhi Patel for research assistance, and Carol Templeton for copy editing.
Grant Support
This work was supported by the NIH Infrastructure Grant, Developing Information Infrastructure Focused on Cancer Comparative Effectiveness Research (RC2-CA148332; PI: Fenstermacher), and B.M. Craig's NCI Career Development Award (K25-CA122176).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.