Abstract
Lung cancer is the leading cause of cancer death worldwide. Here, we describe a genome-wide association study of chemically induced lung tumorigenesis on 593 mice from 21 inbred strains using 115,904 genotyped and 1,952,918 imputed single nucleotide polymorphisms (SNPs). Using a genetic background–controlled genome search, we identified a novel lung tumor susceptibility gene Las2 (Lung adenoma susceptibility 2) on distal chromosome 18. Las2 showed strong association with resistance to tumor induction (rs30245983; P = 1.87 × 10−9) as well as epistatic interactions (P = 1.71 × 10−3) with the pulmonary adenoma susceptibility 1 locus, a major locus affecting mouse lung tumor development (rs13459098, P = 5.64 × 10−27). Sequencing analysis revealed four nonsynonymous SNPs and two insertions/deletions in the susceptible allele of Las2, resulting in the loss of tumor suppressor activities in both cell colony formation and nude mouse tumorigenicity assays. Deletion of LAS2 was observed in ∼40% of human lung adenocarcinomas, implying that loss of function of LAS2 may be a key step for lung tumorigenesis. [Cancer Res 2009;69(15):6290–8]
Introduction
Lung cancer accounts for 30% of cancer-related deaths in the United States. In 2008, there will be an estimated 215,000 new cases of lung cancer diagnosed and only 15% of those patients will be alive after 5 years (1). Lung cancer is a result of complex gene-environment interactions. Prolonged exposure to carcinogens found in tobacco smoke and other environmental carcinogens that interact with various genetic susceptibility factors contribute to lung cancer development in humans (2, 3). The genetic component is highlighted by recent genome-wide association studies (GWAS) on lung cancer populations of European ancestry, which identified association of common variants on chromosomes 5p15.33 (4, 5), 6p21.33 (5), and 15q24-25.1 (6–8) with lung cancer susceptibility.
Lung tumors in mice are similar in morphology, histopathology, and molecular characteristics to human adenocarcinomas (9). Many genetic alterations identified in mouse models of lung cancer have genetic equivalents (or orthologues) in humans (10). Mouse models have a number of advantages that favor detection of genetic associations that might remain elusive in human populations, including large variations in disease susceptibility among inbred strains, controlled environmental influences, generation of large cohorts, and a homozygous inbred genome. These features are particularly useful for studying common diseases such as lung cancer, in which environmental exposure to carcinogens is more easily controlled and measured. Previous studies using linkage analyses in mice identified a number of quantitative trait loci (QTLs) affecting inherited predisposition to lung cancer, suggesting a polygenic basis of lung tumor susceptibility. Many of these QTLs are also implicated in human cancers (11).
Recent progress in genomic sequence analysis and high-density single nucleotide polymorphism (SNP) discovery has permitted the analysis of a wide range of genetic variation in laboratory mice (12, 13). The use of the mouse haplotype map has proven successful in refinement of previous QTL regions and identification of new genetic determinants of complex traits (14–18). Several lung tumor susceptibility loci have been identified through genome-wide association analysis in laboratory inbred mice (16). Among these loci, the pulmonary adenoma susceptibility 1 (Pas1) locus, which was mapped to the distal region of mouse chromosome 6 in previous linkage studies, was refined to a region of <0.5 Mb in which at least two genes, Kras2 and Casc1, are strong candidates. Kras2 is a common target of somatic mutation in chemically induced mouse lung tumors (19). Approximately 15% to 30% of lung adenocarcinomas contain Kras2 mutations, which are correlated with loss of heterozygosity (LOH) of chromosome 12p in humans (20).
In the present study, we used a genetic background–controlled genome search for lung cancer QTLs that takes into account the Pas1 allelic status in inbred mice. This search strategy for GWAS identified a novel tumor susceptibility gene, Las2, on the distal region of chromosome 18. The allelic effects of Las2 on cell proliferation were validated in cell colony formation and nude mouse tumorigenicity assays. Our data also showed that different search strategies for GWAS can help identify additional genetic variants that are undetectable with conventional genome searches.
Materials and Methods
Statistical analysis. Detailed descriptions of the statistical methods and LOH analysis, and any associated references, are provided in Supplementary Materials and Methods. Briefly, 593 mice from 21 inbred strains were measured for lung tumor multiplicity (tumors per mouse) at 14 to 16 wk after a single injection of urethane (1 mg/g; Supplementary Table S1). It is important to note that although no studies have implicated urethane as a carcinogen in humans, it is reasonably anticipated to be one because of its carcinogenicity in animals. Intraperitoneal (i.p.) injection of urethane into mice has been widely used as a carcinogen treatment for detecting lung cancer susceptibility and resistance loci. To correct for population structure and genetic relatedness among inbred strains, we used a recently developed method, efficient mixed-model association (EMMA), to assess association of lung tumor multiplicity with SNPs (21). Specifically, the mixed model in the EMMA method can be represented as $y = X\beta + Zu + e$, where $y$ is an $n \times 1$ vector of observed phenotypes (i.e., lung tumor multiplicity), and $X$ is an $n \times q$ matrix of fixed effects including mean, SNPs, and other covariate variables. $\beta$ is a $q \times 1$ vector representing coefficients of the fixed effects. $Z$ is an $n \times t$ incidence matrix mapping each observed phenotype to one of $t$ inbred strains. $u$ is the random effect (i.e., strain effects) of the mixed model with $\mathrm{Var}(u) = 2K\sigma_g^2$, where $K$ is the $t \times t$ kinship matrix inferred from genotypes, and $e$ is an $n \times 1$ vector of residual effects such that $\mathrm{Var}(e) = I\sigma_e^2$. The overall phenotypic variance-covariance matrix can be represented as $V = 2ZKZ'\sigma_g^2 + I\sigma_e^2$.
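As an illustration of the mixed model above, the following minimal Python sketch builds the phenotypic variance-covariance matrix V from an incidence matrix Z and a kinship matrix K, and then computes a generalized least squares estimate of the fixed effects. It is not the EMMA implementation: EMMA estimates the variance components from the data, whereas here they are fixed at arbitrary values, and the toy strains, kinship values, genotypes, and phenotypes are all hypothetical.

```python
import numpy as np

def phenotypic_covariance(Z, K, sigma2_g, sigma2_e):
    """V = 2 Z K Z' sigma2_g + I sigma2_e for n mice drawn from t strains."""
    n = Z.shape[0]
    return 2.0 * sigma2_g * (Z @ K @ Z.T) + sigma2_e * np.eye(n)

def gls_fixed_effects(y, X, V):
    """Generalized least squares estimate of beta and its covariance, given V."""
    V_inv = np.linalg.inv(V)
    XtVi = X.T @ V_inv
    cov_beta = np.linalg.inv(XtVi @ X)
    beta = cov_beta @ (XtVi @ y)
    return beta, cov_beta

# Hypothetical toy data: 6 mice from 3 inbred strains.
strain_of_mouse = np.array([0, 0, 1, 1, 2, 2])
Z = np.eye(3)[strain_of_mouse]                  # n x t incidence matrix
K = np.array([[0.5, 0.1, 0.0],                  # t x t kinship matrix (made up)
              [0.1, 0.5, 0.2],
              [0.0, 0.2, 0.5]])
snp = np.array([0, 0, 1, 1, 1, 1], dtype=float) # strain-level SNP genotype
X = np.column_stack([np.ones(6), snp])          # fixed effects: mean and SNP
y = np.array([0.2, 0.4, 3.1, 2.7, 9.8, 11.2])   # tumor multiplicity (made up)

# Variance components are assumed known here (arbitrary values).
V = phenotypic_covariance(Z, K, sigma2_g=1.0, sigma2_e=0.5)
beta, cov_beta = gls_fixed_effects(y, X, V)
print("beta:", beta)
```

Mice from the same strain share the full strain effect, so their covariance in V is driven by the diagonal of K; mice from different strains covary only through the off-diagonal kinship entries.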
In the conventional genome search for QTLs, one SNP was tested for association at a time using the EMMA method. A two-sided P value for each SNP was obtained for testing the hypothesis of no association between the SNP and lung tumor multiplicity. In the genetic background–controlled search, the Pas1 genetic background was taken into account for association tests. Specifically, the tumor multiplicity was first adjusted by the Pas1 allelic status using a simple linear regression model. The residual from this regression model was then used as a phenotypic variable for the association analysis. Once a SNP with strong association was identified, the epistatic interaction between that SNP and Pas1 was further assessed by the EMMA approach.
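The two-step adjustment described above can be sketched as follows. This is a simplified illustration under stated assumptions: the second step here uses an ordinary least squares slope test rather than EMMA, and the phenotype values, background genotypes, and candidate SNP genotypes are hypothetical.

```python
import numpy as np

def adjust_for_background(phenotype, background):
    """Residuals from a simple linear regression of phenotype on background status."""
    X = np.column_stack([np.ones(len(phenotype)), background])
    coef, *_ = np.linalg.lstsq(X, phenotype, rcond=None)
    return phenotype - X @ coef

def slope_t_statistic(y, x):
    """t statistic for the slope of an ordinary least squares regression of y on x."""
    x_c = x - x.mean()
    slope = (x_c @ y) / (x_c @ x_c)
    resid = y - y.mean() - slope * x_c
    df = len(y) - 2
    se = np.sqrt((resid @ resid) / df / (x_c @ x_c))
    return slope / se

# Hypothetical toy data: tumor multiplicity, background allele, candidate SNP.
multiplicity = np.array([0.1, 0.3, 0.5, 2.0, 12.0, 27.5])
pas1 = np.array([0, 0, 0, 1, 1, 1], dtype=float)  # 1 = susceptible allele
snp = np.array([0, 1, 1, 0, 1, 1], dtype=float)   # candidate SNP genotype

adjusted = adjust_for_background(multiplicity, pas1)  # step 1: remove Pas1 effect
t_stat = slope_t_statistic(adjusted, snp)             # step 2: test SNP on residual
print("adjusted phenotype:", adjusted)
print("t statistic:", t_stat)
```

With a binary background variable, the residual is simply each observation minus the mean of its background group, which is why a locus masked by a strong background effect can become visible after adjustment.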
Cell culture, plasmid constructs, transfection, and infection. The LM2 cell line used was originally derived from a urethane-induced papillary tumor of A/J (22). We grew LM2 cells in αMEM medium supplemented with 10% fetal bovine serum (FBS). We subcloned full-length Las2 SWR (susceptible or -S) or CBA (resistant or -R) into the NotI/ClaI restriction sites of the pcDNA3 vector (Invitrogen) with three NH2-terminal HA epitope tags. LM2 cells were transiently transfected with 4 μg of linearized pcDNA3-HA3-Las2-S, -R, or empty vector using Lipofectamine 2000 (Invitrogen). Cells were split 48 h after transfection and selected in 500 μg/mL of geneticin for 2 wk. Individual clones were picked and subjected to immunoblot analysis for HA-Las2 expression. The PCC4 cell line was derived from a urethane-induced BALB/c Clara cell tumor (23) and was grown in McCoy's 5A medium supplemented with 10% FBS. We subcloned full-length Las2-R or -S into the BamHI/EcoRI sites of pRetroX-Tight-Pur (Clontech, Inc.). Retrovirus was produced by transient transfection of pRetroX-Tight-Pur-Las2-S, -R, and pRetroX-Tet-On-Advanced (Clontech, Inc.) in PhoenixA cells (Orbigen). Virus was collected from 24 to 48 h after transfection and passed through a 0.45-μm filter. Virus and polybrene (4 μg/mL) were added to PCC4 cells for 24 h, after which cells were split to 20% confluency and selected in geneticin (0.5 μg/mL) and puromycin (6 μg/mL) for 2 wk. For Las2 knockdown assays, cells were infected with pSIREN retrovirus expressing either scrambled (Vector) or Las2-targeting (Las2-sh) short hairpin RNA and selected in culture medium containing 2 μg/mL puromycin for 3 d.
Colony-formation assay, 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide assay, and athymic mouse tumorigenicity assay. For the colony-formation assay, individual colonies were seeded at 200 cells per plate and grown for 2 wk. Cells were fixed with methanol/acetic acid (3:1; v:v) and stained with 0.5% crystal violet. PCC4 cells infected with retrovirus made from the plasmids pRETROX-Tight-Pur-Las2 and pRETROX-Tet-On-Advanced, and stably selected in geneticin and puromycin, were plated (500 cells/plate), treated with doxycycline (1 μg/mL) for 2 wk, and similarly stained. Cells stably expressing Las2-sh as described above were seeded onto six-well tissue culture dishes at a density of 50 cells per well. Cells were assayed for viable cell numbers using the CellTiter 96 One-Solution Proliferation Assay kit (Promega) periodically over 10 d in culture. The level of knockdown was determined at the protein level by HA immunoblot. For the athymic mouse tumorigenicity assay, BALB/c nude mice aged 5 wk were purchased from Harlan Bio-Products. We injected 2 million cells s.c. into one flank of each nude mouse. Nine animals were used per sample. We monitored the health of the animals and measured the size of tumors biweekly for 3 wk. Differences in tumor weight between groups were assessed by two-tailed Student's t test.
Results
GWAS for lung tumor susceptibility. To identify lung cancer susceptibility loci, we conducted GWAS on urethane-induced lung tumors in inbred mice. In our study, we used previously published data on a total of 593 mice from 21 inbred mouse strains, which were measured for urethane-induced lung tumor multiplicity (i.e., number of tumors per mouse; Supplementary Table S1). In our GWAS, we used a panel of 115,904 genotyped SNPs spanning the mouse genome at an average density of ∼20 kb per SNP. To increase statistical power and mapping resolution, we further analyzed 1,952,918 imputed SNPs with confidence levels of >0.9 (24), which span the mouse genome at an average density of ∼1 kb per SNP. We first assessed the cumulative distribution of P values from the GWAS using the EMMA approach. The distribution of observed P values was similar to the expected uniform distribution (0, 1), indicating no inflation of test statistics from population structure or any other form of bias (Supplementary Fig. S1).
We evaluated two genome search strategies for our GWAS (Fig. 1). In the conventional genome search, GWAS tests one SNP for association at a time. Using this conventional strategy, we identified a set of SNPs showing strong association with lung tumor multiplicity on chromosome 6 (e.g., rs13459098, P = 1.07 × 10−5; Supplementary Table S2; Fig. 1A). The identified QTL colocalizes with the Pas1 locus previously detected by linkage studies, spanning ∼26 cM on the distal region of chromosome 6 (25–27). Our association analysis refined this QTL to a region of <0.5 Mb in which Kras2 and Casc1 are candidate lung tumor susceptibility genes (Supplementary Fig. S2). This result is consistent with our recent GWAS for lung tumor incidence (16). In addition to the Pas1 locus, the analysis identified several other loci on chromosomes 2, 4, and 19 with strong associations with lung tumor susceptibility (Supplementary Table S3).
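The comparison of observed P values with the uniform (0, 1) distribution mentioned in this section can be illustrated with a short Python sketch. The helper names and the synthetic P values are hypothetical; the sketch computes quantile-quantile points on the −log10 scale and a Kolmogorov-Smirnov-type maximum deviation, which is one common way to perform such a check and not necessarily the procedure used for Supplementary Fig. S1.

```python
import math

def qq_points(p_values):
    """(expected, observed) pairs of -log10 P under a uniform(0, 1) null."""
    n = len(p_values)
    observed = sorted(p_values)
    expected = [(i + 0.5) / n for i in range(n)]
    return [(-math.log10(e), -math.log10(o)) for e, o in zip(expected, observed)]

def max_deviation(p_values):
    """Largest gap between the empirical CDF and the uniform CDF."""
    n = len(p_values)
    ordered = sorted(p_values)
    return max(max((i + 1) / n - p, p - i / n) for i, p in enumerate(ordered))

# Synthetic examples: well-calibrated P values versus inflated ones.
uniform_like = [(i + 0.5) / 100 for i in range(100)]
inflated = [p ** 3 for p in uniform_like]   # skewed toward small P values

print("calibrated:", max_deviation(uniform_like))
print("inflated:  ", max_deviation(inflated))
```

A small deviation indicates that the test statistics behave as expected under the null, whereas an excess of small P values across the whole genome would suggest uncorrected population structure.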
Pas1 is the major locus affecting inherited predisposition to lung cancer, but its effect is modulated by several other genetic loci (25). Because of the strong effect of Pas1 on lung tumor susceptibility, we hypothesized that Pas1 may mask effects of other important loci. Therefore, we conducted GWAS adjusting for Pas1 allelic status in inbred mice (28). In this genome search strategy, lung tumor multiplicity was first adjusted for the Pas1 effect, and then GWAS was performed on the Pas1-adjusted phenotype. Using this strategy, we identified one additional locus with strong associations on the distal region of chromosome 18 (e.g., rs30245983, P = 9.33 × 10−5; Supplementary Table S4; Fig. 1B). This locus was not identified by the conventional genome search. In the conventional GWAS, the strongest association (rs29542464) on chromosome 18 was located at 60.81 Mb, which is ∼10 Mb away from the linkage peak and resides outside either one-LOD support interval from previous linkage studies (29–31) or the refined region from congenic mice (Supplementary Fig. S3; refs. 32, 33).
Epistasis of Pas1 and Par2. Using the genetic background–controlled GWAS, we identified several SNPs with strong associations clustering in an ∼40-kilobase interval (70.60–70.64 Mb) on distal chromosome 18 (Fig. 2). This interval is within the pulmonary adenoma resistance 2 (Par2), a major locus suppressing urethane induction of pulmonary adenomas identified using (A/J × BALB/c) F2 mice (30). The QTL identified by our GWAS is coincident with that from fine mapping studies using congenic strains of mice (32, 33) and was significantly narrowed down to an interval ∼40-kb in size.
SNPs from the Pas1 and Par2 loci that showed significant associations with lung tumor multiplicity in the GWAS were used to further evaluate potential interactions between them on lung tumorigenesis (Table 1). Both Pas1 and Par2 loci had large main effects (P = 5.64 × 10−27 for rs13459098 at the Pas1, and P = 1.87 × 10−9 for rs30245983 at the Par2) as well as interaction effects (P = 1.71 × 10−3) on chemically induced pulmonary adenoma development. Interestingly, the genetic effects of the Par2 locus on lung tumorigenesis depend on the allelic status at the Pas1 locus. In the presence of the Pas1 susceptible allele, mice carrying the Par2-resistant allele averaged a 7.5-fold decrease in tumor multiplicity (15 tumors in Pas1SSPar2SS mice versus 2 tumors in Pas1SSPar2RR mice; Supplementary Fig. S4). In strains with the Pas1-resistant allele, the Par2-resistant allele reduced tumor multiplicity by 2.2-fold (0.35 tumors in Pas1RRPar2SS mice versus 0.16 tumors in Pas1RRPar2RR mice; Supplementary Fig. S4). This finding suggests that multiple loci exert a synergistic effect to modify the penetrance and/or expressivity of lung cancer in mice.
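The fold changes quoted above can be approximately reproduced from the strain-level multiplicities listed in Table 2. The sketch below is an assumption-laden illustration: it uses unweighted means of strain values (not individual mice), groups strains by their rs13459098 (Pas1) and rs30245983 (Par2) alleles, and the variable and function names are hypothetical.

```python
from statistics import mean

# Strain tumor multiplicities from Table 2, grouped by (Pas1, Par2) allele.
# "S" = susceptible allele, "R" = resistant allele.
groups = {
    ("S", "S"): [6.7, 8.9, 12.1, 20.1, 27.5],
    ("S", "R"): [0.5, 1.1, 1.7, 2, 2.1, 3.2, 3.3],
    ("R", "S"): [0.29, 0.3, 0.31, 0.5],
    ("R", "R"): [0, 0.1, 0.1, 0.2, 0.4],
}
group_mean = {genotype: mean(values) for genotype, values in groups.items()}

def fold_decrease(pas1_allele):
    """Ratio of mean multiplicity, Par2-susceptible over Par2-resistant strains."""
    return group_mean[(pas1_allele, "S")] / group_mean[(pas1_allele, "R")]

for pas1_allele in ("S", "R"):
    print(f"Pas1 {pas1_allele}: {fold_decrease(pas1_allele):.2f}-fold decrease")
```

The much larger ratio on the Pas1-susceptible background than on the Pas1-resistant background is what the interaction term in the association model captures.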
Pas1 (Chr 6) SNP* | Pas1 P | Par2 (Chr 18) SNP† or polymorphism | Position‡ | Par2 P | Epistasis P
---|---|---|---|---|---
rs13459098 | 7.19 × 10−28 | rs29622203 | 70,602,803 | 9.81 × 10−10 | 3.16 × 10−3
rs13459098 | 1.99 × 10−28 | rs29557643 | 70,613,785 | 6.30 × 10−10 | 1.81 × 10−3
rs13459098 | 5.68 × 10−45 | rs30301901 | 70,622,787 | 4.71 × 10−11 | 9.56 × 10−4
rs13459098 | 5.68 × 10−45 | rs29686328 | 70,622,848 | 4.71 × 10−11 | 9.56 × 10−4
rs13459098 | 5.64 × 10−27 | rs30273275 | 70,623,572 | 1.87 × 10−9 | 1.71 × 10−3
rs13459098 | 5.64 × 10−27 | rs30120015 | 70,626,673 | 1.87 × 10−9 | 1.71 × 10−3
rs13459098 | 2.36 × 10−25 | Las2-3-bp-deletion (234 codon) | 70,626,717-70,626,719 | 1.08 × 10−8 | 2.73 × 10−3
rs13459098 | 5.64 × 10−27 | rs29675801 | 70,628,766 | 1.87 × 10−9 | 1.71 × 10−3
rs13459098 | 5.64 × 10−27 | rs30259292 | 70,628,787 | 1.87 × 10−9 | 1.71 × 10−3
rs13459098 | 5.64 × 10−27 | rs30245983/Las2-T94I | 70,629,114 | 1.87 × 10−9 | 1.71 × 10−3
rs13459098 | 2.36 × 10−25 | Las2-S37A | 70,629,286 | 1.08 × 10−8 | 2.73 × 10−3
rs13459098 | 2.36 × 10−25 | Las2-9-bp-insertion (32-34 codons) | 70,629,292-70,629,301 | 1.08 × 10−8 | 2.73 × 10−3
rs13459098 | 2.36 × 10−25 | Las2-S30R | 70,629,307 | 1.08 × 10−8 | 2.73 × 10−3
rs13459098 | 5.64 × 10−27 | rs30066196/Las2-R13Q | 70,629,357 | 1.87 × 10−9 | 1.71 × 10−3
rs13459098 | 5.64 × 10−27 | rs51575278 | 70,630,714 | 1.87 × 10−9 | 1.71 × 10−3
rs13459098 | 5.64 × 10−27 | rs46078787 | 70,630,994 | 1.87 × 10−9 | 1.71 × 10−3
rs13459098 | 5.64 × 10−27 | rs49840286 | 70,631,087 | 1.87 × 10−9 | 1.71 × 10−3
rs13459098 | 5.64 × 10−27 | rs50790044 | 70,631,338 | 1.87 × 10−9 | 1.71 × 10−3
rs13459098 | 5.64 × 10−27 | rs45700290 | 70,637,806 | 1.87 × 10−9 | 1.71 × 10−3
rs13459098 | 5.64 × 10−27 | rs48238521 | 70,638,098 | 1.87 × 10−9 | 1.71 × 10−3
*rs13459098 is one of the most significant SNPs in the Pas1 locus. SNPs with the strongest association in the Pas1 locus are in complete linkage disequilibrium.
†SNPs in italic are imputed SNPs.
‡The SNP positions (bp) were based on the NCBI mouse genome build 37.1.
In addition, we identified strains SM/J, LP/J, CBA/J, PL/J, 129S1/SvImJ, ST/bJ, and BALB/c to be associated with the Par2-resistant allele G (rs30245983), and RIIIS/J, MA/MyJ, O20, SWR/J, and A/J to be associated with the Par2-susceptible allele A. All of these strains carry the high-penetrant Pas1 susceptible allele C (rs13459098), but show a large variation in lung tumor susceptibility due to allelic status at the Par2 locus (Table 2).
Strain | Multiplicity | rs13459098 (Pas1)* | rs30245983 (Par2)† | 27th codon in Poli‡ | Las2 (4930503L19Rik) R13Q§ | Las2 S30R | Las2 S37A | Las2 T94I | Las2 9-bp insertion (32–34 codons) | Las2 3-bp deletion (234 codon)
---|---|---|---|---|---|---|---|---|---|---
C57BR/cdJ | 0 | T | G | TCG(Ser) | G | A | T | C | / | /
AKR/J | 0.1 | T | G | TCG(Ser) | G | A | T | C | / | /
C57L/J | 0.1 | T | G | TCG(Ser) | G | A | T | C | / | /
C3H/HeJ | 0.2 | T | G | TCG(Ser) | G | A | T | C | / | /
SJL/J | 0.29 | T | A | TCG(Ser) | A | C | G | T | Ins | Del
NZB/BINJ | 0.3 | T | A | TCG(Ser) | A | C | G | T | Ins | Del
C57BL/10J | 0.31 | T | A | TCG(Ser) | A | C | G | T | Ins | Del
DBA/2J | 0.4 | T | G | TCG(Ser) | G | A | T | C | / | /
C57BL/6J | 0.5 | T | A | TCG(Ser) | A | C | G | T | Ins | Del
SM/J | 0.5 | C | G | TCG(Ser) | G | A | T | C | / | /
LP/J | 1.1 | C | G | TCA(Stop) | G | A | T | C | / | /
CBA/J | 1.7 | C | G | TCG(Ser) | G | A | T | C | / | /
PL/J | 2 | C | G | TCG(Ser) | G | A | T | C | / | /
129S1/SvImJ | 2.1 | C | G | TCA(Stop) | G | A | T | C | / | /
ST/bJ | 3.2 | C | G | TCG(Ser) | G | A | T | C | / | /
BALB/c | 3.3 | C | G | TCG(Ser) | G | A | T | C | / | /
RIIIS/J | 6.7 | C | A | TCG(Ser) | A | C | G | T | Ins | Del
MA/MyJ | 8.9 | C | A | TCG(Ser) | A | T | G | T | Ins | Del
O20 | 12.1 | C | A | NA | A | NA | NA | T | NA | NA
SWR/J | 20.1 | C | A | TCG(Ser) | A | C | G | T | Ins | Del
A/J | 27.5 | C | A | TCG(Ser) | A | C | G | T | Ins | Del
*rs13459098 is one of the most significant SNPs in the Pas1 locus; T and C are associated with the Pas1-resistant and susceptible alleles, respectively.
†rs30245983 is one of the most significant SNPs in the Par2 locus; G and A are associated with the Par2-resistant and susceptible alleles, respectively.
NA, SNP genotypes are not available.
§Sequence variants R13Q and T94I are identical to rs30066196 and rs30245983, respectively. These two SNPs have available genotype data in strain O20 in the SNP panel from the MPD.
Las2 is a novel lung tumor susceptibility gene. The 40-kb interval identified on chromosome 18 contains three genes, Poli, Stard6, and 4930503L19Rik, with two additional genes, 2310002L13Rik and Mbd2, nearby. Among them, Poli, Stard6, Mbd2, and 2310002L13Rik were excluded as likely candidates based on previous studies indicating lack of gene expression in lung tissues and nucleotide polymorphisms between A/J and BALB/c inbred strains as well as recent association and sequencing analyses (see Discussion; ref. 34).
The Par2 candidate 4930503L19Rik has nine exons encoding a transcript of 2,291 bp and a protein of 526 amino acids. The 4930503L19Rik gene is widely expressed in multiple mouse tissues with the highest expression in the lung (Supplementary Fig. S5A). 4930503L19Rik mRNA levels increased from embryonic day 7 to 11 and peaked at day 15, suggesting a possible developmental role. This gene was provisionally named Las2 (Lung adenoma susceptibility 2). To determine the potential effect of Las2 mRNA levels on lung tumor susceptibility, we analyzed the expression differences between the top five resistant strains (C57BR/cdJ, AKR/J, C57L/J, C3H/HeJ, and SJL/J) and the top five susceptible strains (BALB/cByJ, RIIIS/J, MA/MyJ, SWR/J, and A/J). No significant correlation between Las2 mRNA levels and lung tumor multiplicity was observed (P > 0.05; Supplementary Fig. S5B). Resequencing of the Las2 coding region identified three synonymous SNPs, eight nonsynonymous SNPs, one 9-bp insertion, and one 3-bp deletion in A/J mice compared with BALB/c mice. These polymorphisms were genotyped in the remaining DNA samples of inbred strains and their association with lung tumor multiplicity was examined. Four nonsynonymous SNPs (R13Q, S30R, S37A, and T94I) and the two insertions/deletions were significantly associated with tumor multiplicity (P = 9.33 × 10−5; Supplementary Fig. S6; Table 2). These SNPs and insertions/deletions were in complete linkage disequilibrium with the most significant SNP (rs30245983) in the refined Par2 region, and fully explain the association of this locus with lung tumor multiplicity (Table 2; Fig. 2).
Conservation of Las2 protein sequence among species. The Las2 protein sequence from C57BL/6J was used to query for putative orthologues in other species using the National Center for Biotechnology Information (NCBI) nonredundant protein sequences database. The highest homology match in each organism in the BLAST result is presented. Las2 protein is evolutionarily conserved among species and likely possesses an important cellular function. Mouse Las2 has 65% identity and 79% similarity to the human orthologue (Supplementary Fig. S7A). However, Las2 has no reported function in the literature and bears no obvious functional motifs. Among the four missense SNPs and two insertions/deletions that are significantly associated with the increased risk of lung cancer, the missense SNP in the 37th codon affects a residue that is highly conserved among species. All species, including mouse strains (such as BALB/c and 129S1/SvImJ) that carry the Par2-resistant allelic sequences, have serine at codon 37 (37S). In the Par2-susceptible strains (such as C57BL/6J and A/J), this codon is mutated to alanine (37A). Furthermore, C57BL/6J has a 9-bp insertion (codons 32–34), which is not observed in the other species (Supplementary Fig. S7B). Therefore, the S37A SNP and the 9-bp insertion may be causal genetic variants that confer increased risk for lung cancer.
Functional assays of Las2. To evaluate the effects of Las2 on cell proliferation, we carried out a series of transfection experiments to determine whether overexpression of the Las2 gene could affect lung tumor cell growth. Because large inhibitory effects of the Par2 locus were observed in mice carrying the Pas1 susceptible allele, we used the LM2 cell line, which is derived from a urethane-induced papillary tumor from A/J mice (22). Transfection of Las2-R (Par2-resistant allele) into LM2 cells inhibited colony formation in two clones compared with Las2-S–transfected cells or vector alone (Fig. 3A). Colonies of LM2 cells overexpressing Las2-R were somewhat dispersed compared with Las2-S–overexpressing and vector alone LM2 colonies (Supplementary Fig. S8). These Las2-R–overexpressing cells were also slightly larger than Las2-S or vector alone LM2. Therefore, cell size is not responsible for smaller colony size in Las2-R–overexpressing LM2 cells. We further confirmed the inhibitory nature of the Las2 resistance allele by establishing doxycycline-inducible Las2 expression in the PCC4 mouse lung tumor cell line (Fig. 3B). Las2-R inhibited colony formation, but Las2-S had no effect in PCC4 cells. Our overexpression studies confirm the inhibitory nature of the Las2 resistance allele (Fig. 3A and B).
To determine the function of Las2 in vivo, Las2-transfected LM2 cells were injected s.c. into athymic nude mice. Tumor diameter was monitored for 3 weeks, after which tumors were excised, weighed, and protein isolated for immunoblot analysis. As shown in Fig. 3C, 21 d after injection, tumors derived from LM2-Las2-R cells were significantly smaller compared with LM2-Las2-S or LM2-vector–derived tumors (P < 0.01).
To further evaluate the role of Las2 in modulating the rate of cell proliferation, LM2 cells overexpressing Las2-R were subjected to short hairpin RNA–mediated knockdown of Las2 as verified by immunoblot. Upon knockdown of Las2, growth rates were increased as measured by colony formation and 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) assays (Fig. 3D and E). Thus, these functional assays support the data from the association analysis and suggest that germline nonsynonymous mutations and insertions/deletions in Las2 confer increased risk of lung cancer in inbred mice through effects on proliferative potential.
LAS2 loss in human lung cancers. The human chromosome 18q21 locus, which contains the human homologue of murine Las2, is frequently deleted in colorectal cancer and many other human cancers (35–37). We analyzed LOH data around the LAS2 (C18orf54) region from the Tumor Sequencing Project public data sets. We analyzed 384 lung adenocarcinomas and matched normal DNAs that were genotyped with the StyI chip of the 500K Human Mapping Array set (Affymetrix Inc). LOH was inferred by comparing the genotype of each SNP in a tumor to that in its paired normal sample. Frequent allelic imbalance in chromosome 18q21 was observed in association with two microdeletions around the LAS2 region in lung tumors (Supplementary Fig. S9A–C). One is located at 49.766 to 49.911 Mb, spanning ∼145 kb including 17 consecutive SNPs with an average LOH frequency of 61%. The other is located at 49.990 to 50.216 Mb, which covers LAS2 and spans 16 consecutive SNPs with an average LOH frequency of 72%. These LOH data suggest that chromosome 18q21 may contain a lung tumor suppressor gene, with LAS2 as a strong candidate. We further validated LOH using microsatellite markers inside the microdeletion region and upstream of the microdeletion region in independent lung adenocarcinoma samples (Fig. 4). Out of 14 patients giving unambiguous results, 6 (42.9%) had LOH at the D18S487 marker within the microdeletion, whereas only 1 of 8 (12.5%) showed LOH at the D18S1127 marker 1.5 Mb upstream of the deletion. We also resequenced the coding exons of C18orf54 in the six human tumor samples that exhibited LOH, to see if allelic loss was a result of initial somatic mutation of the other allele. We did not observe any somatic mutations in these coding exons. Nevertheless, it is possible that somatic mutation of nonexonic regions or promoter methylation may lead to allelic inactivation.
Discussion
Previously, we and others reported Poli as a possible candidate for the Par2 locus (30, 32, 34, 38). Our resequencing analysis combined with the association results does not support Poli as a strong candidate at the Par2 locus. Both 129X1/SvJ and LP/J mice, which carry a nonsense mutation in codon 27 of exon 2 (named as Ser27Stop) in Poli, only develop about one to two tumors per lung after urethane treatment. This specific nonsense mutation causes a truncated Poli protein, which abrogates function (39). This observation is inconsistent with the prior hypothesis and our association results that both 129X1/SvJ and LP/J carry a functional resistant allele of Par2 (Table 2; refs. 30, 32, 34, 38, 40). If Poli is the true Par2-resistant gene, natural knockout of its tumor suppressor function in 129S1/SvImJ and LP/J strains would be expected to cause more tumors, which is not what is observed. Recently, Lee and Matsushita (40) conducted a linkage study in (A/J × 129X1/SvJ)F1 × A/J backcross and (A/J × 129X1/SvJ) F2 mice and observed the linkage peak at marker Ser27Stop. However, this study only confirmed the Par2 locus rather than establishing Poli as a Par2 candidate, as Las2 and Poli are both within the 40-kb interval (Fig. 2C) and are indistinguishable in linkage analysis. Another observation, that Poli-deficient mice and Polk and Poli double-deficient mice exhibit normal somatic hypermutation frequency, also suggests that Poli is less likely to be the Par2 candidate (39, 41).
The number of mouse inbred strains currently available for GWAS is relatively small, which precludes validation studies in independent samples, as is often done in humans. Therefore, functional analysis of candidate loci identified in GWAS is necessary to confirm the association findings. Because of limited statistical power (42), GWAS in inbred mouse strains is best applied in combination with other evidence, such as previous linkage mapping and congenic fine mapping. It is worth noting that the strongest associations identified in our GWAS were located within previously detected QTL regions (Supplementary Fig. S2; Figs. 1 and 2). This provides additional evidence for the validity of the associations identified in our GWAS. Where there is strong linkage evidence, region-wide thresholds could be used for declaring significant associations in GWAS. For example, the region-wide P value for the two confirmed associations in the Pas1 and Par2 loci reached high significance (P < 0.01) in the present study. Furthermore, the small number of inbred strains may produce a nonrandom correlation between different genetic loci, which can mask or change genetic effects at those loci in the association analysis. This may be the reason why the association in the Par2 locus was not identified in the conventional GWAS but was identified in the genetic background–controlled GWAS. The nonrandom correlation also reduced the significance of the identified associations in the GWAS. We compared the goodness-of-fit of three different models for analyzing the associations at the Pas1 and Par2 loci: one locus, two loci without interaction, and two loci with interaction. The best-fit model, based on the Bayesian information criterion, was the one that analyzed both loci and their epistatic interaction simultaneously (data not shown).
This analysis yielded a P value of 5.64 × 10−27 for rs13459098 at the Pas1 locus, a P value of 1.87 × 10−9 for rs30245983 at the Par2 locus, and a P value of 1.71 × 10−3 for their interaction effect. The joint model thus gave the two loci dramatically higher significance than either the conventional GWAS or the genetic background–controlled GWAS (Supplementary Tables S2 and S4). Therefore, modeling multiple loci in the association analysis may increase statistical power in GWAS in model organisms such as inbred mice and Arabidopsis thaliana.
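The comparison of one-locus, two-locus, and two-locus-plus-interaction models by the Bayesian information criterion can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the strain genotypes, effect sizes, and noise are synthetic, the fit is ordinary least squares with a Gaussian BIC, and the helper names (`solve`, `bic`) are hypothetical. It is not the model or the data used in this study.

```python
import math
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def bic(rows, y):
    """Least-squares fit of y on the design rows; Gaussian BIC up to a constant."""
    n, k = len(y), len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    rss = sum((v - sum(b * x for b, x in zip(beta, r))) ** 2
              for r, v in zip(rows, y))
    return n * math.log(rss / n) + k * math.log(n)

# Synthetic panel: 40 hypothetical strains, homozygous genotypes coded 0/1
# at two loci, 10 strains per genotype combination (assumed, not real data).
rng = random.Random(0)
genotypes = [(g1, g2) for g1 in (0, 1) for g2 in (0, 1) for _ in range(10)]
# Assumed effects: a major locus, a minor locus, and an epistatic term.
y = [1.0 + 10.0 * g1 + 3.0 * g2 + 6.0 * g1 * g2 + rng.gauss(0.0, 0.5)
     for g1, g2 in genotypes]

models = {
    "one locus": [[1.0, g1] for g1, g2 in genotypes],
    "two loci, additive": [[1.0, g1, g2] for g1, g2 in genotypes],
    "two loci + interaction": [[1.0, g1, g2, g1 * g2] for g1, g2 in genotypes],
}
scores = {name: bic(rows, y) for name, rows in models.items()}
best = min(scores, key=scores.get)  # lower BIC indicates the preferred model

for name, score in scores.items():
    print(f"{name}: BIC = {score:.1f}")
print("preferred model:", best)
```

Because BIC adds a penalty of ln(n) per parameter, the interaction model is preferred only when the extra term reduces the residual sum of squares enough to offset that penalty, as it does in this synthetic example where an epistatic effect was built in.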
Our results emphasize the critical importance of the genome search strategy used in genetic association studies. The common search strategy in current GWAS tests one SNP for association at a time. We applied a genome search for QTLs that takes the genetic background into account in the GWAS. The genetic background–controlled genome search was previously used in linkage analysis to increase the precision of QTL mapping (28). Application of this strategy to the GWAS led to the identification of a novel gene, Las2, associated with lung tumor susceptibility. Las2 showed strong association with resistance to tumor induction as well as epistatic interactions with the Pas1 locus. More importantly, Las2 seems to be a novel lung cancer tumor suppressor gene based on its activity in cell colony formation and nude mouse tumorigenicity assays and its frequent loss in human lung adenocarcinomas. Our approach highlights a different search strategy for GWAS that can help identify additional genetic variants that would otherwise remain hidden in conventional genome searches.
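The contrast between testing one SNP at a time and testing it with the genetic background controlled can be illustrated with a small Python sketch. It is a simplified stand-in, not the search procedure used in this study: the background is reduced to a single major locus, control is done by partial correlation (subtracting background-genotype group means from both the candidate genotype and the phenotype), and the strain counts, effect sizes, and helper names (`pearson`, `residualize`) are all hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def residualize(values, background):
    """Subtract the mean of each background-genotype group from its members."""
    groups = {}
    for v, g in zip(values, background):
        groups.setdefault(g, []).append(v)
    means = {g: sum(vs) / len(vs) for g, vs in groups.items()}
    return [v - means[g] for v, g in zip(values, background)]

# (background genotype, candidate genotype) -> number of hypothetical strains.
# The counts are chosen so the two loci are nonrandomly correlated.
cell_counts = {(0, 0): 6, (0, 1): 4, (1, 0): 3, (1, 1): 7}
strains = [cell for cell, count in cell_counts.items() for _ in range(count)]
background = [bg for bg, _ in strains]
candidate = [snp for _, snp in strains]

# Assumed effects: the background locus raises the trait strongly,
# the candidate locus lowers it modestly; noise is synthetic.
rng = random.Random(1)
phenotype = [10.0 * bg - 3.0 * snp + rng.gauss(0.0, 0.5) for bg, snp in strains]

marginal = pearson(candidate, phenotype)
controlled = pearson(residualize(candidate, background),
                     residualize(phenotype, background))

print(f"one SNP at a time:     r = {marginal:+.2f}")
print(f"background-controlled: r = {controlled:+.2f}")
```

In this constructed panel the candidate allele tends to co-occur with the high-trait background allele, so its marginal correlation with the phenotype is close to zero, while the correlation after removing the background effect is strongly negative. This mirrors, in miniature, how a locus can be masked in a conventional scan and recovered once the major locus is accounted for.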
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).
P.Y. Liu, H. Vikis, and M. James contributed equally to this work.
Acknowledgments
Grant support: NIH (CA099187, CA099147, ES012063, and ES013340).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
We thank the Broad Institute of Harvard and MIT, the Wellcome Trust Center for Human Genetics, and Perlegen Sciences for releasing inbred laboratory mouse SNP data; the Mouse Phenome Project for collecting mouse SNP data and the Center for Genome Dynamics at The Jackson Laboratory (Bar Harbor, ME) for generating imputed mouse SNP data; the NHGRI investigators for generating the LOH data in the Tumor Sequencing Project; and the Tissue Procurement Core of Washington University (St. Louis, MO) for providing tumor tissues.