CD8+ T cells recognize peptides displayed by HLA class I molecules on cell surfaces, monitoring pathologic conditions such as cancer. Advances in proteogenomic analysis of HLA ligandomes have demonstrated that cells present a subset of cryptic peptides derived from noncoding regions of the genome; however, the roles of cryptic HLA ligands in tumor immunity remain unknown. In the current study, we comprehensively and quantitatively investigated the HLA class I ligandome of a set of human colorectal cancer and matched normal tissues, showing that cryptic translation products accounted for approximately 5% of the HLA class I ligandome. We also found that a peptide encoded by the long noncoding RNA (lncRNA) PVT1 was predominantly enriched in multiple colorectal cancer tissues. The PVT1 gene is located downstream of the MYC gene in the genome and is aberrantly overexpressed across a variety of cancers, reflecting its oncogenic property. The PVT1 peptide was recognized by patient CD8+ tumor-infiltrating lymphocytes, as well as peripheral blood mononuclear cells, suggesting the presence of patient immune surveillance. Our findings show that peptides can be translated from lncRNAs and presented by HLA class I and that cancer patient T cells are capable of sensing aberrations in noncoding regions of the genome.
In contrast to protein-coding genes that account for only a few percent of the human genome, noncoding transcripts are pervasively transcribed across the genome (1). Long noncoding RNAs (lncRNA) are a subset characterized by size (>200 nucleotides) and structurally resemble mRNAs but do not encode proteins by definition (2). The functions of most lncRNAs remain unknown; however, dysregulation of lncRNAs is often found in cancer cells, implying their role in oncogenesis (3–5). Although large-scale proteomics analyses have demonstrated that the human proteome is mostly composed of proteins derived from coding genes (6, 7), spontaneous translation from lncRNAs has been reported, suggesting a noncanonical translation from short open-reading frames (ORF; refs. 8–11).
Cells continuously sample intracellular translation products, displaying a repertoire of peptide–MHC class I complexes (pMHC I) for CD8+ T–cell surveillance (12). Noncanonical translation products can be presented by the MHC, and the potential of cryptic peptides as a source of T-cell antigens has attracted attention (13). Studies employing proteogenomic approaches have postulated that up to 15% of the MHC class I immunopeptidome of both mouse and human tumors arises from outside canonical ORFs, including lncRNAs (14–16). Together, these findings suggest the possibility of tumor-associated cryptic HLA ligands of lncRNA origin, which may elicit spontaneous patient immune surveillance against tumors. The roles of cryptic HLA ligands in patient immune surveillance remain unclear to date. Here, we explored the HLA class I ligandome of colorectal cancer and patient-matched normal tissues, and identified a lncRNA-derived tumor antigen. The oncogenic properties of the lncRNA PVT1 gene encoding the antigen led to aberrant gene expression followed by HLA ligand presentation across multiple patients with colorectal cancer. Our findings demonstrate the presence of patient T-cell surveillance against colorectal cancer, which is mediated by a lncRNA-derived shared tumor antigen.
Materials and Methods
Patient and healthy donor material
The study was performed with the approval of the Institutional Review Board (IRB; 312-1134) and the Research Ethics Committee of Sapporo Medical University (Sapporo, Japan; 29-2-69). Signed informed consent was obtained from all patients with colorectal cancer (n = 36) and healthy donors (n = 3). Patient material was obtained from the primary lesions of patients with pStage I–IIIb colorectal cancer who were histologically diagnosed with adenocarcinoma and underwent surgical resection (n = 6). Cancer lesions (n = 6) and patient-matched normal colorectal mucosa (n = 4) were macroscopically determined by a surgeon and pathologist and sampled immediately after resection with the careful removal of extra connective tissues. Tissues were either used for tumor-infiltrating lymphocyte (TIL) expansion (n = 5) or snap frozen and cryopreserved for proteogenomic and DNA/RNA analysis, as described below. Peripheral blood obtained from patients with colorectal cancer (n = 15) and healthy donors (n = 3) was used for T-cell induction, as described below. All subjects in this study were confirmed to be HLA-A*24:02 positive by PCR of genomic DNA (17). The patient studies were conducted in accordance with the guidelines of the Declaration of Helsinki.
T-cell induction and single-cell cloning
Peripheral blood mononuclear cells (PBMC) were isolated from patients with colorectal cancer and healthy donors using Lymphoprep (Cosmo Bio) according to the manufacturer's instructions and cultured in complete AIM-V medium (Gibco) containing 10% human AB serum (kindly provided by S. Takamoto, Japanese Red Cross Hokkaido Block Blood Center), 1% penicillin/streptomycin (Gibco), 1% GlutaMAX (Gibco), 10 mmol/L HEPES (Gibco), 1 mmol/L sodium pyruvate (Gibco), and 55 μmol/L 2-mercaptoethanol (Gibco). Approximately 5.0 × 10⁶ cells were stimulated with 20 μmol/L of HWNDTRPAHF (HF10) peptide (Sigma) on days 0 and 7, with the addition of rhIL2 (50 U/mL; kindly provided by Takeda Chemical Industries) on day 1. On days 0 and 14, cells were prestained with human FcR blocking reagent (MBL, catalog no. MTG-001) and stained with HF10–HLA-A24 tetramer-PE (MBL) and HIVenv584–592–HLA-A24 tetramer-FITC (MBL, catalog no. TS-M007-3) followed by anti-CD8-PC5 (Beckman Coulter, T8) for 20 minutes at 4°C. Stained cells were analyzed using FACSCanto II (BD) with FACSDiva (BD) or FlowJo (Tree Star). To establish T-cell clones, CD8 and tetramer double-positive cells were isolated using FACS Aria II (BD) and sorted onto U-bottom 96-well plates (1 cell/well). Each single cell was cultured in complete AIM-V medium supplemented with rhIL2 (100 U/mL) and phytohemagglutinin (1 μg/mL; WAKO) in the presence of irradiated human PBMCs (2.0 × 10⁵ cells/well) and expanded for 3 weeks. Culture medium was replaced every 3 days. PBMCs were irradiated with 100 Gy (SOFTEX, M100W). At the end of culture, 10 clone candidates were generated.
Colorectal cancer specimens were minced and enzymatically digested using Liberase (1 μg/mL; Roche) in the presence of DNase I (2 μg/mL; Roche) for 30 minutes at 37°C without mixing and RBC lysis. TILs were obtained by culturing cell suspensions (1.0 × 10⁶ cells/mL) in complete AIM-V medium (Gibco) containing 10% human AB serum, 1% penicillin/streptomycin, 1% GlutaMAX, 10 mmol/L HEPES, 1 mmol/L sodium pyruvate, and 55 μmol/L 2-mercaptoethanol, supplemented with recombinant human (rh)IL2 (6,000 U/mL) for 2 to 4 weeks. Culture medium was replaced every 3 days. For the phenotypic characterization of TILs, cells were prestained with human FcR blocking reagent (MBL, Clear Back catalog no. MTG-001) and stained with an HLA-A24 tetramer conjugated with PE at 4°C for 20 minutes, followed by anti-CD8-FITC (Beckman Coulter, T8) and anti-CD39-PerCP/Cy5.5 (BioLegend, A1) at 4°C for 20 minutes. Stained cells were analyzed using FACSCanto II (BD) with FACSDiva (BD) or FlowJo (Tree Star). An HLA-A24 tetramer complexed with HF10 was generated by MBL. An HLA-A24 tetramer complexed with EBVEBNA3A or CMVpp65 was purchased from MBL (catalog no. TS-M004-1 and catalog no. TS-002-1C, respectively).
Cell culture and cell lines
Cells were cultured in complete RPMI1640 (nacalai tesque) or DMEM (nacalai tesque) supplemented with 10% FBS, 1% penicillin/streptomycin, 1% GlutaMAX, 10 mmol/L HEPES, 1 mmol/L sodium pyruvate, and 55 μmol/L 2-mercaptoethanol unless specifically mentioned. SW480, SW620, HCT15, Colo320, HT29, 293T, and K562 cell lines were purchased from ATCC. Lu99 was obtained from the Cell Resource Center for Biomedical Research, Tohoku University. LHK2 and 293T-A24 cell lines (293T stably expressing HLA-A*24:02) were established in our laboratory (18). The T2-A24 cell line (T2 stably expressing HLA-A*24:02) was a gift from K. Kuzushima (Aichi Cancer Center Research Institute, Nagoya, Japan; ref. 19). We established (20) and used HCT15/β2m throughout the study, which stably expresses the wild-type β2 microglobulin gene. To block TAP transportation, 293T-A24 cells were transiently transfected with ICP47 or UL49.5 minigene (gifts from N. Hirano, University of Toronto, Toronto, Canada) as reported previously (21). Cell lines were frozen in batches on arrival and not cultured over the maximum period of 8 weeks. Cell lines were periodically verified to be Mycoplasma negative using the Mycoplasma Detection Set (TaKaRa). The histologic origins and HLA class I genotypes (TRON Cell Line Portal, http://celllines.tron-mainz.de) were as follows: colorectal adenocarcinoma (Colo320, A*24:02, B*14:02, C*08:02, C*07:01; SW480, A*24:02, A*02:01, B*07:02, B*15:18, C*07:04; SW620, A*24:02, A*02:01, B*07:13, B*37:04, C*07:04; HCT15, A*24:02, A*02:01, B*08:01, B*35:01, C*04:01, C*07:06; HT29, A*24:03, A*01:01, B*44:03, B*35:01, C*04:01), and lung carcinoma (LHK2, A*24:02, A*02:07, B*46:01, B*48:01, C*01:02; Lu99, A*24:02, B*54:01, B*52:01, C*01:02).
Isolation of HLA class I ligands
Hybridomas producing HLA-A24 mAb (C7709A2, a gift from P.G. Coulie, Ludwig Institute for Cancer Research) were cultured in Hybridoma serum-free medium (SFM; Gibco) supplemented with 1% penicillin/streptomycin in CELLine Bioreactor Flasks (Corning, CL1000). Condensed mAb was collected through a semipermeable membrane during cell culture and purified using HiTrap Protein G HP (GE Healthcare). To isolate HLA-A24 ligands, we followed established procedures with slight modifications (18). Approximately 0.5 to 2.0 g of cancer or matched normal tissues, or 1.0 × 10⁹ Colo320 cells, were used for analysis. Briefly, frozen tissue or cell pellets were ground under cryogenic conditions and lysed with lysis buffer containing 0.25% sodium deoxycholate (Wako), 0.2 mmol/L iodoacetamide (Wako), 1 mmol/L EDTA (Dojindo), 200× protease inhibitor cocktail (Sigma, catalog no. P8340), 1 mmol/L PMSF (Sigma), and 1% octyl-β-D glucopyranoside (Dojindo) in DPBS (Gibco). Peptide–HLA-A24 complexes were captured using affinity chromatography of C7709A2 mAb coupled to CNBr-activated Sepharose 4B (GE Healthcare). The HLA ligands were then eluted with 0.2% trifluoroacetic acid (TFA) and desalted using Sep-Pak tC18 (Waters) with 28% acetonitrile (ACN) in 0.1% TFA and ZipTip U-C18 (Millipore) with 50% ACN in 1% formic acid (FA). Samples were dried using vacuum centrifugation and resuspended in 5% ACN in 0.1% TFA for LC-MS/MS analysis.
Samples were loaded into a nano-flow LC (Thermo Fisher Scientific, Easy-nLC 1000 system) online-coupled to an Orbitrap mass spectrometer equipped with a nanospray ion source (Thermo Fisher Scientific, Q Exactive Plus). Nano-flow LC separation was performed with a linear gradient ranging from 3% to 30% buffer B (100% ACN and 0.1% FA) at a flow rate of 300 nL/minute for 80 minutes using a 75 μm × 20 cm capillary column with a particle size of 3 μm (Nikkyo Technos, NTCC-360). In mass spectrometry (MS) analysis, survey scan spectra were acquired at a resolution of 70,000 at 200 m/z with an automatic gain control (AGC) target value of 3 × 10⁶ ions and a maximum injection time (IT) of 100 ms, ranging from 350 to 2,000 m/z with charge states between 1+ and 4+. We employed a data-dependent top 10 method, which generated high-energy collision dissociation fragments for the 10 most intense precursor ions per survey scan. MS/MS resolution was 17,500 at 200 m/z with an AGC target value of 1 × 10⁵ ions and a maximum IT of 120 ms.
Total RNA was isolated from cancer and matched normal tissues, as well as Colo320 cells, using the RNeasy Mini Kit or Allprep DNA/RNA/Protein Kit (Qiagen). RNA quality was validated on the basis of the RNA integrity number (RIN > 7). Poly A–selected libraries were prepared using the TruSeq Stranded mRNA LT Sample Prep Kit (Illumina). The libraries were sequenced on a NovaSeq 6000 (Illumina) with 100-bp paired-end reads, yielding approximately 20 Gb (200 M reads) data per sample. Low-quality reads, adapter sequences, contaminant DNA, or PCR duplicates were removed using FastQC (v. 0.11.7) and Trimmomatic (v. 0.38). Reads were mapped to the human reference genome (hg38) using HISAT2 (v. 2.1.0) and Bowtie2 (v. 2.3.4.1). The genes and transcripts were annotated using the comprehensive gene annotation file provided by GENCODE (v. 31). The abundance of genes or transcripts was calculated as transcripts per million (TPM) using StringTie (v. 1.3.4d), and only transcripts with TPM values > 0 were considered to be expressed. Library preparation and sequencing were performed by Macrogen (Korea). To estimate the population abundance of tumor-infiltrating immune cells, RNA sequencing (RNA-seq) data (six colorectal cancer and five normal tissues) were processed using the MCP-counter algorithm with a default setting as reported previously (22).
Cap analysis of gene expression
cDNA library preparation and cap analysis of gene expression (CAGE) sequencing were carried out by DNAFORM as reported previously (23). Experimental conditions were described in detail elsewhere (24). Briefly, cDNAs were synthesized from the total RNA of Colo320. Ribose diols in the 5′ cap structures of 4 μg of RNAs were oxidized and biotinylated, and the biotinylated RNA/cDNAs were selected using streptavidin beads (cap-trapping). Double-stranded cDNA libraries (CAGE libraries) were constructed after RNA digestion by RNase ONE/H (Promega) and adapter ligation to both ends of cDNA. CAGE libraries were sequenced on a NextSeq 500 (Illumina) with 75-bp single-end reads. Obtained reads (CAGE tags) were mapped to the human genome (GRCh38) using BWA (v. 0.7.12), and the remaining unmapped reads were mapped using HISAT2 (v. 2.0.5). CAGE tag clustering, detection of differentially expressed genes, and motif discovery were performed using RECLU. Tag count data were clustered using the modified Paraclu program. Clusters longer than 200 bp and clusters with a minimum count per million < 0.1 or an irreproducible discovery rate ≥ 0.1 were discarded.
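The cluster-discarding rule at the end of this paragraph can be sketched as a simple predicate. This is an illustration only, not the RECLU implementation: the `TagCluster` record, its field names, and the half-open coordinate convention are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TagCluster:
    """Hypothetical record for one CAGE tag cluster."""
    start: int      # 0-based start coordinate (assumed convention)
    end: int        # exclusive end coordinate (assumed convention)
    min_cpm: float  # minimum counts per million across replicates
    idr: float      # irreproducible discovery rate


def keep_cluster(cluster, max_len=200, min_cpm=0.1, max_idr=0.1):
    """Return True unless the cluster meets any discard criterion in the text.

    Discarded: length > 200 bp, minimum CPM < 0.1, or IDR >= 0.1.
    """
    length = cluster.end - cluster.start
    return (length <= max_len
            and cluster.min_cpm >= min_cpm
            and cluster.idr < max_idr)


clusters = [
    TagCluster(100, 150, 0.5, 0.01),   # kept
    TagCluster(0, 201, 0.5, 0.01),     # too long
    TagCluster(10, 60, 0.05, 0.01),    # too weakly expressed
    TagCluster(10, 60, 0.5, 0.1),      # not reproducible enough
]
retained = [c for c in clusters if keep_cluster(c)]
```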
Colo320 cDNA was prepared using the SMARTer cDNA synthesis kit (Clontech) and converted into a SMRTbell library using the SMRTbell Template Prep Kit 2.0 (Pacific Biosciences) according to the manufacturer's instructions. The samples were sequenced on a PacBio Sequel I (Pacific Biosciences). Obtained data were analyzed using the Iso-Seq3 in the PacBio SMRT Analysis (v. 6.0) as described elsewhere (25). Briefly, circular consensus sequences with both 5′ and 3′ primers were regarded as full-length reads, and they were pooled for isoform-level clustering analysis. The Arrow algorithm in the SMRT Link software called high-quality sequences (predicted consensus accuracy ≥ 99%), which were then mapped to GRCh38.p13 using minimap2 (v. 2.1). The full-length isoforms were ultimately annotated and characterized using a SQANTI2 pipeline (26). Library preparation and sequencing were performed by Macrogen (Korea).
Personalized database construction
For the proteogenomic MS database search of cryptic, as well as canonical, translation products, we built personalized FASTA reference databases for six colorectal cancer, four matched normal tissues, and Colo320, using Python scripts. Each database contained every translation product derived from all possible ORFs in the three translation frames of the expressed transcripts. Possible ORFs were defined as the sequences that began from the ATG codon and ended with stop codons or reached the transcript end, yielding at least eight amino acid–long polypeptides. Transcript expression was determined based on the RNA-seq data (TPM > 0). The transcript sequences provided by GENCODE (v. 31) were used for the database construction.
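The ORF definition above can be illustrated with a short sketch. It is not the authors' script: the function name `enumerate_orfs` is hypothetical, and it assumes that every ATG on the transcript strand starts a candidate ORF (so in-frame downstream ATGs yield nested, shorter products) and that the methionine is counted toward the eight-residue minimum.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

MIN_LENGTH = 8  # minimum polypeptide length stated in the text


def enumerate_orfs(transcript, min_length=MIN_LENGTH):
    """Yield (start, frame, peptide) for each ATG-initiated ORF.

    Translation runs from the ATG to the first in-frame stop codon or,
    if none is found, to the last complete codon of the transcript.
    """
    seq = transcript.upper().replace("U", "T")
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        residues = []
        for pos in range(start, len(seq) - 2, 3):
            aa = CODON_TABLE.get(seq[pos:pos + 3], "X")  # X for ambiguous bases
            if aa == "*":
                break
            residues.append(aa)
        if len(residues) >= min_length:
            yield start, start % 3, "".join(residues)


# Toy transcript with no stop codon: the ORF runs to the transcript end.
toy = "ATG" + "GCT" * 7
orfs = list(enumerate_orfs(toy))
```

Writing each yielded peptide as one FASTA entry per expressed transcript would give a database of the kind described, although header format and redundancy handling are left open here.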
MS database search and cryptic HLA ligand identification
Experimental MS/MS data were searched against the personalized databases using the Sequest HT and Percolator algorithms on the Proteome Discoverer 2.3 platform (Thermo Fisher Scientific). For database searching, the tolerance of precursor and fragment ions was set at 10 ppm and 0.02 Da, respectively. The oxidation of methionine (+15.995 Da) was selected as the dynamic modification. No specific enzymes were selected for this search. For the following analysis using Percolator (Thermo Fisher Scientific), concatenated target-decoy selection was validated based on the q-values. Annotated peptide-spectrum matches with the highest score (Delta correlation, ΔCn ≤ 0.05) were selected for each MS spectrum.
For colorectal cancer tissue samples, we performed label-free quantification (LFQ) on Proteome Discoverer 2.3 (six colorectal cancer vs. four matched normal tissues). The relative abundance of a given peptide was estimated based on its precursor ion intensity, which corresponded to the peak height of the chromatogram traces of the LC-MS features derived from the precursor ion. The relative abundance was normalized across the samples and scored between 0 and 1,000 (scaled abundance), wherein the mean of the peptide abundance was set to 100. Undetected peptides were scored as 0. Sequences that were identified as LFQ peaks, were 8 to 11 amino acids in length, and had % rank scores of less than 2.0 (HLA-A*24:02, NetMHCpan4.1) were counted as HLA-A24 ligands.
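One reading of the scaled abundance, assumed here rather than confirmed by the source, is that each peptide's intensities are rescaled so that their mean across the ten samples equals 100. Under that assumption the stated bounds follow directly: a peptide detected in a single sample scores 100 × 10 = 1,000 there and 0 elsewhere. The function name below is hypothetical, and any additional between-sample normalization performed by the software is not modeled.

```python
def scale_abundances(raw, target_mean=100.0):
    """Rescale one peptide's per-sample intensities to a mean of target_mean.

    Undetected samples are passed as None and scored as 0.
    """
    values = [0.0 if v is None else float(v) for v in raw]
    mean = sum(values) / len(values)
    if mean == 0:
        return [0.0] * len(values)  # peptide undetected in every sample
    return [v / mean * target_mean for v in values]


# Peptide detected in only one of ten samples: the upper bound is reached.
single = scale_abundances([None] * 9 + [5.0])
# Peptide detected equally in every sample: each sample scores the mean.
uniform = scale_abundances([2.0, 2.0, 2.0, 2.0])
```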
For Colo320 cells, MS/MS data were searched against the Colo320-specific database using the Sequest HT along with the Percolator algorithm on the Proteome Discoverer 2.3 platform. We applied an FDR of 0.01 as the detection threshold, and the sequences that were 8 to 11 amino acids in length and had % rank scores of less than 2.0 (HLA-A*24:02, NetMHCpan4.1) were counted as HLA-A24 ligands.
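The three acceptance criteria for Colo320 ligands can be written as a single predicate. This is a sketch with hypothetical names, and it assumes that the % rank and q-value for each sequence have already been obtained from NetMHCpan and Percolator; whether the FDR boundary itself is inclusive is also an assumption.

```python
def is_hla_a24_ligand(sequence, percent_rank, q_value,
                      min_len=8, max_len=11, rank_cutoff=2.0, fdr=0.01):
    """Apply the length, % rank, and FDR criteria described in the text."""
    return (min_len <= len(sequence) <= max_len
            and percent_rank < rank_cutoff   # % rank strictly below 2.0
            and q_value <= fdr)              # inclusive boundary is assumed


# The scores below are invented inputs for illustration, not measured values.
candidates = [
    ("HWNDTRPAHF", 0.5, 0.005),    # 10-mer passing all three criteria
    ("GYISPYFINTSK", 0.5, 0.005),  # 12-mer, outside the length window
    ("RYLRDQQLL", 2.0, 0.005),     # % rank not below the cutoff
]
accepted = [seq for seq, rank, q in candidates if is_hla_a24_ligand(seq, rank, q)]
```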
In both patient samples and Colo320 cells, the identified HLA-A24 ligand sequences were classified as either canonical or cryptic HLA-A24 ligands, according to their presence or absence in the human Swiss-Prot database (42,331 protein sequences, with isoforms, downloaded on March 14, 2018), respectively. During the comparison with the Swiss-Prot database, every isoleucine (“Ile”) was converted into leucine (“Leu”) and the two amino acids were treated as equivalent because they are indistinguishable by MS. To validate the search results, the MS data were re-searched against a reference proteome database (Swiss-Prot). This database did not contain cryptic sequences and the search was performed without personalization (without considering transcript expression) and quantification. We applied an FDR of 0.05 as the detection threshold in patient samples.
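The canonical versus cryptic classification can be sketched as follows. The function names are hypothetical, and "presence in Swiss-Prot" is interpreted here as an exact substring match after collapsing isoleucine into leucine, which is an assumption about how the comparison was done.

```python
def normalize_il(sequence):
    """Collapse isoleucine into leucine; the two are isobaric in MS."""
    return sequence.upper().replace("I", "L")


def classify_ligands(ligands, reference_proteins):
    """Label a ligand 'canonical' if it occurs in any reference protein, else 'cryptic'."""
    reference = [normalize_il(p) for p in reference_proteins]
    labels = {}
    for ligand in ligands:
        query = normalize_il(ligand)
        found = any(query in protein for protein in reference)
        labels[ligand] = "canonical" if found else "cryptic"
    return labels


# Toy reference with one invented protein sequence, for illustration only.
toy_reference = ["MKTAYIAKQRQISFVK"]
labels = classify_ligands(["AYLAKQRQL", "HWNDTRPAHF"], toy_reference)
```

For a database of tens of thousands of proteins, a linear scan per ligand is adequate for a few thousand ligands; an indexed search would be a reasonable substitute at larger scale.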
MS raw data and personalized FASTA files used for MS searches have been deposited to the ProteomeXchange Consortium via the jPOSTrepo partner repository (https://repository.jpostdb.org) with the dataset identifier PXD024533. RNA-seq data have been deposited to the Japanese Genotype-phenotype Archive (JGA) via the NBDC human database (https://humandbs.biosciencedbc.jp/en/) with the dataset identifier JGAS000280.
Total RNA was isolated from cancer tissues and cancer cell lines (Colo320, SW480, SW620, HT29, LHK2, Lu99, HCT15/β2m) using the RNeasy Mini Kit (Qiagen) or AllPrep DNA/RNA Mini Kit (Qiagen). cDNA was synthesized from 2 μg of total RNA by reverse transcription using SuperScript III (Invitrogen). A panel of cDNAs from human fetal and adult tissues was purchased from Clontech and Bio Chain. Gene expression was measured using the StepOne Real-Time PCR System (Applied Biosystems) with PowerUp SYBR Green Master Mix (Thermo Fisher Scientific). Primer pairs were as follows: PVT1, 5′-TGCATGGAGCTTCGTTCAAGT-3′ and 5′-GAGATCTCAACCCTCTCAGCC-3′ (product size 232 bp, designed to encompass the HF10-encoding ORF); G3PDH, 5′-ACCACAGTCCATGCCATCAC-3′ and 5′-TCCACCACCCTGTTGCTGTA-3′ (product size 452 bp). An initial denaturation step of 95°C for 10 minutes was followed by 40 cycles of denaturation at 95°C for 15 seconds and annealing/extension at 60°C for 60 seconds. Each sample was analyzed in triplicate and the threshold cycle values (Ct) of PVT1 were normalized according to those of G3PDH.
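The normalization of PVT1 to G3PDH can be sketched with the common 2^(−ΔCt) convention. The text states only that Ct values were normalized, so the exponential transform, the assumption of perfect amplification efficiency, and the function name are all assumptions made for illustration.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference):
    """Return 2^-(mean Ct of target - mean Ct of reference) from replicate Ct values.

    Assumes a doubling of product per cycle for both primer pairs.
    """
    delta_ct = mean(ct_target) - mean(ct_reference)
    return 2.0 ** (-delta_ct)


# Invented triplicate Ct values, for illustration only.
pvt1_ct = [25.0, 25.0, 25.0]
g3pdh_ct = [20.0, 20.0, 20.0]
expression = relative_expression(pvt1_ct, g3pdh_ct)  # 2^-5
```

A fold change between two samples would then be the ratio of their relative expression values.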
Peptide–HLA class I stability assay
TAP-deficient T2-A24 cells were precultured overnight at room temperature. The next day, peptides (HF10, HIVenv584–592, GK12) in a range of indicated concentrations were pulsed onto T2-A24 cells and incubated for 1 hour at room temperature followed by incubation for 3 hours at 37°C. Cells were stained with HLA-A24 mAb (C7709A2), followed by goat anti-mouse IgG-FITC (KPL) and analyzed using a FACSCanto II (BD). On the basis of the difference in mean fluorescence intensity (MFI) values between samples pulsed with and without indicated peptides, ΔMFI was calculated and represented the stability of the corresponding peptide–HLA-A24 complexes on the surface. Synthetic peptides (HWNDTRPAHF, HF10; RYLRDQQLL, HIVenv584–592; GYISPYFINTSK, GK12) with >80% purity were purchased from Sigma and Cosmo Bio.
We used the human IFNγ ELISPOT set (BD Biosciences, catalog no. 551849) according to the manufacturer's instructions. Cancer cells (Colo320, SW480, SW620, HT29, LHK2, Lu99, or HCT15/β2m), or antigen-presenting cells (T2, T2-A24, or 293T-A24) preincubated with or without 20 μmol/L synthetic peptides (HF10 or HIVenv584–592) for 2 hours at room temperature served as target cells. T-cell clones (H3, A10, or E10) were cultured with target cells at a 1:1 ratio (10,000 cells per well) for 24 hours at 37°C. Cultures were incubated with a biotinylated anti-human IFNγ (250×) for 2 hours at room temperature, followed by the ELISPOT Streptavidin-HRP (horseradish peroxidase) for 1 hour at room temperature, and positive spots were visualized using the ELISPOT AEC Substrate Set (BD). To block the activity of the proteasomes, target cells were preincubated with bortezomib (Selleck) or carfilzomib (ApexBio Technology) for 48 hours at the indicated concentrations. For comparison of functional avidity, an HF10-specific CD8+ T–cell clone, as well as the established T-cell clones with different antigen specificities (AKF9, IV9, or RF8), were cultured with T2-A24 cells pulsed with the indicated range of the corresponding peptides (27, 28). The peptide concentration of a half-maximal IFNγ production was calculated as EC50.
Lactate dehydrogenase cytotoxicity assay
Cancer cells (Colo320 or K562) or T2-A24 cells preincubated with or without 20 μmol/L synthetic peptides (HF10 or HIVenv584–592) for 2 hours at room temperature served as target cells. Target cells (1.0 × 10⁵) were cultured with an HF10-specific T-cell clone at the indicated effector/target ratios (E/T) for 6 hours at 37°C. The amount of lactate dehydrogenase (LDH) released from lysed target cells was measured using the LDH Cytotoxicity Detection Kit (Takara Bio) according to the manufacturer's instructions. The percentage of LDH released was calculated as follows:
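The expression itself is not reproduced in this text. A form consistent with the three controls defined in the next sentence, and with the usual convention for effector/target LDH assays, would be the following; it is a reconstruction under those assumptions rather than the authors' confirmed formula:

$$
\text{LDH release (\%)} = \frac{\text{experimental} - \text{spontaneous} - \text{minimal}}{\text{maximal} - \text{minimal}} \times 100
$$

Here, "experimental" denotes the LDH signal measured in the coculture of T cells and target cells.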
Spontaneous, minimal, and maximal LDH releases were determined for T cells alone, target cells alone, and target cells treated with 2% NP40 (Sigma-Aldrich, IGEPAL CA-630), respectively.
PVT1 minigene and gene knockdown
The HF10-encoding ORF, which extends over exons 1 and 2, was amplified from Colo320 cDNA using a pair of PCR primers concatenated with a FLAG tag (5′-CCCGGATCCCTCCGGGCAGAGCGCGTGTG-3′ and 5′-CCGCTCGAGTCACTTATCGTCGTCATCCTTGTAATCGCCGCCAGCTGCAGTCCTTCGTC-3′). The purified PCR product was digested with restriction enzymes (BamHI and XhoI; New England Biolabs) and ligated into an empty vector (pcDNA3.1, Invitrogen). For overexpression, the PVT1 minigene construct was transiently transfected into 293T or 293T-A24 cells using Lipofectamine 2000 (Invitrogen) according to the manufacturer's instructions. For gene knockdown experiments, we used a predesigned siRNA located at exon 3 targeting a downstream position of the HF10-encoding ORF (Qiagen, Hs_PVT1_6 FlexiTube siRNA; indicated as siRNA PVT1 #1). Twenty nanomolar PVT1 siRNA or an irrelevant negative control (Qiagen, All Stars Negative Control siRNA; indicated as siRNA ctrl #1) was transfected into Colo320 using Lipofectamine 2000 (Invitrogen) according to the manufacturer's instructions. Cells were harvested 48 hours after transfection and used for assays.
Western blot analysis
Approximately 1 × 10⁷ 293T cells transfected with PVT1 minigene were lysed with a buffer containing 50 mmol/L Tris-HCl (pH 7.5), 150 mmol/L NaCl, 5 mmol/L EDTA, 1% NP40, and a protease inhibitor mixture (1 tablet/50 mL; Roche) for 30 minutes at 4°C. Samples equivalent to 1 × 10⁶ cells were separated by 12% SDS-PAGE, transferred to a polyvinylidene difluoride membrane, and blocked with 5% milk (MEGMILK SNOW BRAND) in PBS for 1 hour at room temperature. The antibodies used for blotting were as follows: mouse anti-FLAG (1:1,000, Sigma-Aldrich), mouse anti–β-actin (1:1,000, Sigma-Aldrich), and HRP-conjugated goat anti-mouse IgG (1:1,000, KPL). The primary and secondary antibodies were incubated with the membrane for 45 minutes each at 4°C. Signals were detected using ECL Western Blotting Detection Reagents (GE Healthcare) and developed according to the manufacturer's instructions.
Immunofluorescence and IHC
293T cells expressing the PVT1 minigene were fixed with 4% paraformaldehyde on a poly L-lysine plate (IWAKI). After preincubation with PBS containing 0.1% Triton X100 (Sigma-Aldrich) and 10% goat serum (Nichirei Biosciences), cells were stained with anti-FLAG (1:500, Sigma-Aldrich) and Hoechst 33342 (1:1,000, Invitrogen) for 1 hour at room temperature, and subsequently stained with goat anti-mouse IgG-Alexa Fluor 488 (1:1,000, Invitrogen) for 1 hour at room temperature. Fluorescence images were obtained using a BZ‐9000 (Keyence Corporation). Six formalin-fixed paraffin-embedded (FFPE) colorectal cancer tissues (CRC071, CRC113, CRC114, CRC117, CRC123, and CRC125) were mounted and stained with hematoxylin–eosin, anti-MSH6 (DAKO, EP49), anti-PMS2 (DAKO, EP51), anti-pan HLA class I (Hokudo, EMR8-5), or anti-CD8 (DAKO, C8/114B) along with corresponding secondary antibodies on DAKO autostainers (magnification, ×100). Stained samples were analyzed using the ECLIPSE E600 (Nikon). Tumor-invasive margins were determined by trained pathologists (shown as red dotted lines). Tumor parenchyma and invasive margin indicate the inside and outside of tumor-invasive margin, respectively.
Cell proliferation and mouse model
Approximately 1 × 10⁶ Colo320 cells were transfected with 100 pmol of a control or the above-mentioned PVT1 siRNA, and used for subsequent in vitro and in vivo experiments 48 hours after transfection. PVT1 gene downregulation was simultaneously validated by RT-PCR. For in vitro cell proliferation, equal numbers (5,000 cells/well) of cells were plated in a 96-well plate and cultured for 0–96 hours in a CO2 incubator. Cell growth was measured by absorbance at 450 nm using a microplate reader (Thermo Fisher Scientific, MULTISKAN FC) at indicated timepoints after the addition of 10× WST-8 reagent (Dojindo) followed by incubation for 1 hour at 37°C. For in vivo experiments, 5.0 × 10⁶ control and PVT1 knockdown Colo320 cells were subcutaneously injected on the left and right flanks of immunodeficient NSG mice, respectively (10 NSG mice; 20 injections total). Tumor volume was calculated as xy²/2, where x and y represent the major and minor axes of tumors, respectively. NSG mice were purchased from the Jackson Laboratory. Mice were maintained in the animal facility of Sapporo Medical University (Sapporo, Japan), and all procedures were performed in accordance with the institutional animal care guidelines and procedures approved by the animal study committee of Sapporo Medical University (Sapporo, Japan). Tumor sizes were measured every 2 to 3 days using a digital caliper. Endpoint was set as major axes of 20 mm.
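The tumor volume formula and the endpoint rule can be expressed in a few lines. The function names are hypothetical, and the sketch assumes caliper readings in millimeters, giving volumes in cubic millimeters.

```python
ENDPOINT_MAJOR_AXIS_MM = 20.0  # endpoint stated in the text


def tumor_volume(major_axis_mm, minor_axis_mm):
    """Return x * y^2 / 2, with x the major and y the minor axis."""
    return major_axis_mm * minor_axis_mm ** 2 / 2.0


def reached_endpoint(major_axis_mm, limit_mm=ENDPOINT_MAJOR_AXIS_MM):
    """True once the major axis reaches the stated endpoint."""
    return major_axis_mm >= limit_mm


# Invented measurement: a 20 mm x 10 mm tumor.
volume = tumor_volume(20.0, 10.0)   # 20 * 100 / 2 = 1000 mm^3
at_endpoint = reached_endpoint(20.0)
```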
We used an R package (DESeq2) for differential gene expression analysis. Other statistics were performed using GraphPad Prism (v. 8.4.3; GraphPad Software, LLC). P values below 0.05 were considered significant.
Discovery of a tumor-enriched HLA ligand encoded by the lncRNA PVT1
We investigated the HLA class I immunopeptidome of colorectal cancer tissues and patient-matched normal colorectal mucosa using MS-based proteogenomic HLA ligandome analysis. Conventional proteomics relies on predefined protein sequences for peptide sequencing from MS spectra and is therefore unable to detect sequences that are not annotated in the reference database. To overcome this limitation, we virtually translated all the potential ORFs found in both the coding and noncoding transcripts expressed in each sample, constructing personalized reference databases that consisted of both known and hypothetical protein sequences (7, 8, 29). The proteogenomic approach was combined with an MS LFQ method to determine the relative abundance of HLA-A24–bound ligands. Here, we examined the HLA-A24 ligands of six mismatch repair–proficient (pMMR) colorectal cancer and four patient-matched normal colorectal mucosa tissues, identifying an average of 2,667 nonredundant peptides per sample (Fig. 1; Supplementary Fig. S1; Supplementary Tables S1–S10). In accordance with previous reports, both colorectal cancer and normal HLA-A24 ligandomes comprised approximately 95% canonical peptides and 5% cryptic peptides, the latter encoded by lncRNA genes, pseudogenes, 5′ and 3′ untranslated regions (UTR), or the unannotated ORFs of protein-coding genes (Fig. 2A; ref. 14). Canonical and cryptic ligands showed similar trends in length distribution, the conservation of anchor residues for HLA-A24 binding, and source gene expression profiles (Supplementary Fig. S2A and S2B). Significantly shorter ORF size and translation from second or later ORFs underscored the characteristics unique to cryptic ligands (Supplementary Fig. S2C).
Next, LFQ profiling of eluted ligands showed the natural presentation landscape across colorectal cancer and normal tissues, indicating that a set of peptides were enriched in colorectal cancer tissues (Supplementary Fig. S3). Further comparison between colorectal cancer and normal samples demonstrated that a cryptic HLA-A24 ligand (HWNDTRPAHF, HF10), which is encoded by the lncRNA, PVT1, significantly increased 15.9-fold in colorectal cancer tissues (*, P < 0.05; Fig. 2B). Likewise, PVT1 gene expression significantly increased 7.5-fold in colorectal cancer tissues (***, P < 0.001; Fig. 2C). The LFQ method detected the 10-mer HF10 peptide in four of six colorectal cancer tissues, despite its scarce presentation in normal tissues, indicating that HF10 presentation was tumor associated and was shared across individuals (Fig. 2D and E).
Patient immune surveillance against the PVT1 antigen
The 8q24 locus is the most frequently amplified genomic region in human cancers, encompassing the transcription factor MYC, as well as PVT1 genes (30). Amplification of MYC alone is not sufficient for tumor development in vivo; coamplification of the downstream noncoding PVT1 gene is also necessary (31, 32). In colorectal cancers, single nucleotide polymorphisms in the PVT1 locus associate with increased cancer risk, supporting the oncogenic property of the PVT1 gene (33, 34). In our colorectal cancer samples, PVT1 gene expression increased 10-fold or more in 14 of 20 colorectal cancers compared with that of normal colon tissue, whereas the expression was barely observed in a panel of normal tissues (Fig. 3A). Because both the quantitative HLA ligandome and genomic data suggested tumor-associated HLA presentation of HF10, we then analyzed patient T-cell responses. CD8+ T–cell infiltration into cancer lesions is a surrogate of inflammation at the site and is acknowledged as a predictor of favorable colorectal cancer patient prognosis (35). Gene expression profiling indicated an enrichment of CD8+ T–cell subsets in colorectal cancer lesions, and IHC showed an accumulation of CD8+ T cells at the tumor invasive margins (Supplementary Fig. S4). In five of six colorectal cancer tissues analyzed by MS, TILs were successfully expanded and used for subsequent analysis. We observed a CD8+ T–cell subset that specifically recognized the HF10–HLA-A*24:02 complex in two of five TIL samples (Fig. 3B). Although it is known that TILs are a heterogeneous mixture of varied specificity, which may consist of a minority of tumor-reactive CD8+ T cells and irrelevant bystander cells, both HF10-responding TILs were CD39+, supporting the conclusion that they were autologous tumor-reactive TILs (Fig. 3C; ref. 36). We further asked whether responding T cells were also found in the circulation of the patients with colorectal cancer.
PBMCs from patient CRC113, in whom natural presentation of HF10 and TIL recognition were observed, contained HF10-reactive CD8+ T cells (Fig. 3D). An additional experiment using another set of PBMCs demonstrated that HF10-reactive CD8+ T cells expanded more than 5-fold in response to HF10 stimulation in vitro in 5 of 10 patients with colorectal cancer (Fig. 3E). These data indicate that HF10 is immunogenic and induces T-cell responses in multiple patients with colorectal cancer, suggesting that patient immune surveillance does not tolerate the PVT1 antigen.
Functional T-cell responses and HLA class I processing of the PVT1 antigen
HF10 formed a stable peptide–HLA I complex on T2 cells expressing HLA-A*24:02 (Fig. 4A). To characterize the immune response to the PVT1 antigen, we established three T-cell clones (H3, A10, and E10) specific to HF10 from healthy donor PBMCs. All three CD8+ T-cell clones recognized HF10 presented by HLA-A24 equally well and produced IFNγ (Fig. 4B). A comparison of functional avidity among T-cell clones responding to a set of naturally presented antigens demonstrated the high avidity of HF10-reactive clones, second only to that of neoantigen-specific clones with considerably high cytotoxicity (Fig. 4C; refs. 27, 28). To further analyze the properties of the PVT1 antigen and reactive T cells, we screened colorectal cancer cell lines using proteogenomic HLA ligandome analysis and found that Colo320 cells naturally displayed HF10 on HLA-A24 (Supplementary Fig. S5A; Supplementary Table S11). The HLA-A24 ligandome of Colo320 cells contained cryptic ligands besides HF10, with proportions and profiles similar to those of clinical colorectal cancer tissues (Supplementary Fig. S5B–S5D). As anticipated, established HF10 T-cell clones readily recognized and lysed Colo320 cells, as well as T2-A24 cells pulsed with HF10 (Fig. 4D). Thus, HF10 was able to induce functional and cytotoxic CD8+ T–cell responses that specifically targeted colorectal cancer cells presenting the PVT1 antigen.
We also asked how the cryptic PVT1 antigen was processed by the intracellular antigen processing machinery. In the classical pathway, endogenous proteins are digested by the proteasomes in the cytosol and subsequently transported into the endoplasmic reticulum (ER) through the transporter associated with antigen processing (TAP; refs. 37–39). In the ER, pMHC I complexes are formed, optimized, and subsequently sorted onto the cell surface. Given that the response of a T–cell clone to Colo320 cells was significantly attenuated in the presence of either proteasome (bortezomib or carfilzomib) or TAP inhibitors (ICP47 or UL49.5), whereas pulsing HF10 peptide recovered the reduced responses, we considered that the translated PVT1 antigen was processed in a conventional processing-dependent manner (Fig. 4E and F).
PVT1 transcripts with a cryptic ORF are responsible for T-cell recognition
Tens of thousands of lncRNAs are modestly conserved across species, and most lncRNAs yield multiple transcript variants. Differential transcript expression in normal and tumor tissues implies their dysregulation in tumors; however, the complexity of transcript variants makes functional characterization a challenge (3, 40). At present, more than 180 transcript variants of the PVT1 gene have been registered in the Ensembl database (https://asia.ensembl.org/index.html). We aimed to clarify the link between PVT1 transcripts and HF10 generation, as well as its role in oncogenesis, using the Colo320 model. To this end, we combined several transcriptome analyses to characterize the PVT1 transcript variants located downstream of the MYC gene and their expression (Fig. 5A). RNA-seq showed PVT1 gene expression and the dominant usage of exons 1, 2, 3, and 5. Isoform sequencing (Iso-seq; ref. 41) revealed 26,343 unique full-length isoforms expressed in Colo320 cells, including PVT1 transcripts, of which 11 of 27 variants encoded the HF10 peptide (Supplementary Fig. S6A). Cap analysis gene expression sequencing (CAGE) indicated a single dominant transcriptional start site at the beginning of exon 1. The 11 variants that encoded the HF10 peptide shared a unique ORF located across exons 1 and 2 (Fig. 5B; Supplementary Fig. S6B). The HF10-encoding ORF begins with the first ATG codon of the mRNA; however, in contrast to canonical ORFs, the termination codon is located 130 nucleotides upstream of an exon–exon junction, suggesting that the HF10-encoding ORF concludes with a premature termination codon (PTC; ref. 42). Although the presence of a PTC precludes mature protein synthesis, it has been postulated that the pioneer round of mRNA translation potentially yields MHC ligands (43).
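The reasoning in this paragraph (translate from the first ATG, check whether the product contains HF10, and ask whether the termination codon lies far enough upstream of an exon–exon junction to be considered premature) can be sketched in a few lines of Python. This is an illustrative sketch only, not the analysis used in the study. The 50-nucleotide threshold is an assumption based on the commonly cited "50–55 nucleotide" rule of thumb for nonsense-mediated decay and is not stated in the text; the function names are hypothetical; and the toy transcript is a synthetic back-translation of HF10, not the PVT1 sequence.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def first_atg_orf(transcript):
    """Translate from the first ATG to the first in-frame stop codon.

    Expects an uppercase DNA string containing only A, C, G, and T.
    Returns (peptide, stop_start), where stop_start is the 0-based index of
    the stop codon, or None if there is no ATG or no in-frame stop codon.
    """
    start = transcript.find("ATG")
    if start < 0:
        return None
    residues = []
    for i in range(start, len(transcript) - 2, 3):
        aa = CODON_TABLE[transcript[i:i + 3]]
        if aa == "*":
            return "".join(residues), i
        residues.append(aa)
    return None


def looks_premature(stop_start, junctions, min_gap=50):
    """True if any exon-exon junction lies more than min_gap nt past the stop.

    junctions holds 0-based transcript coordinates of exon-exon junctions.
    min_gap=50 is an assumed rule-of-thumb threshold, not a value from the paper.
    """
    stop_end = stop_start + 3
    return any(junction - stop_end > min_gap for junction in junctions)


# Synthetic toy transcript: 5' leader, ATG, one Ala codon, HF10 codons, stop, tail.
HF10 = "HWNDTRPAHF"
HF10_CODONS = "CATTGGAACGATACTCGTCCTGCTCATTTT"
toy_transcript = "GGCACC" + "ATG" + "GCT" + HF10_CODONS + "TAA" + "C" * 200

peptide, stop_start = first_atg_orf(toy_transcript)
toy_junction = stop_start + 3 + 130  # junction placed 130 nt downstream of the stop

print(HF10 in peptide)
print(looks_premature(stop_start, [toy_junction]))
```

Applied to real data, the same check would be run on each full-length isoform with its annotated junction coordinates; isoforms whose first-ATG ORF contains HF10 correspond to the HF10-encoding variants described above.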
Overexpression of a minigene harboring the HF10-encoding ORF concatenated with a FLAG tag showed accumulation of the translation product in the cytosol and allowed HF10-specific T cells to recognize 293T cells expressing HLA-A*24:02 (Fig. 5C). siRNA-based gene knockdown targeting a position downstream of the HF10-encoding ORF diminished T-cell responses to Colo320 cells (Fig. 5D; Supplementary Fig. S7A and S7B). Experiments using a panel of colorectal cancer and lung cancer lines with comparable HLA-A24 surface expression further demonstrated the positive correlation between the expression of PVT1 transcripts with the ORF in target cells and IFNγ production of HF10-specific T cells (Fig. 5E). These data suggest a role for the cryptic PVT1 ORF in HLA-A24 presentation, as well as T-cell recognition of the PVT1 antigen.
The oncogenic properties of the lncRNA PVT1
In patients with colorectal cancer, TILs often recognize neoantigens that arise from somatic mutations. However, most neoantigens are unique to individuals, and they originate from passenger mutations irrelevant to the development of tumors. In contrast, PVT1 antigen presentation was observed in four of the six colorectal cancer tissues we tested. To confirm the oncogenic properties of PVT1, we asked how our PVT1 knockdown model influences cell growth and the tumorigenicity of Colo320 cells in vitro and in vivo. The efficacy of PVT1 downregulation in a MYC-driven colorectal cancer model has been reported (31). As expected, transient downregulation using siRNA reduced tumor cell proliferation in culture, although not completely, compared with that of control cells with intact PVT1 expression (Fig. 6A; Supplementary Fig. S7C). The PVT1 knockdown significantly compromised Colo320 cell tumor growth in immunodeficient NSG mice (Fig. 6B; Supplementary Fig. S7D). Together, these data support the oncogenic property of PVT1 and its necessary involvement in tumor development. Thus, HF10 is an immunogenic antigen shared across patients and is encoded by the oncogenic lncRNA PVT1.
We found a novel class of tumor antigens encoded by the lncRNA PVT1. The PVT1 antigen is immunogenic and induces CD8+ T–cell responses in patients with colorectal cancer, and responding T cells accumulate in tumor lesions. Previous bioinformatic analyses using ribosome profiling data have demonstrated that lncRNAs can be translated; however, most encoded polypeptides are unstable and nonfunctional (44). Nevertheless, cryptic translation products could be an optimal source of MHC class I ligands because the antigen processing pathway preferentially samples rapidly degraded polypeptides so that T cells promptly sense cellular aberration (45, 46). Here, we found the unconventional HF10-encoding ORF followed by a PTC. The possibility of translation from this ORF was once predicted by the human genome project (GenBank accession no. EAW92103; ref. 47). Although its presence had not been confirmed at the protein level, we showed its translation and HLA presentation in the current study. We assume that the formation of stable pMHC I allows cryptic peptides to avoid further degradation (48, 49). In fact, proteogenomic analyses have demonstrated that the HLA class I ligandome contains a certain proportion of cryptic ligands originating from outside of canonical coding genes, suggesting a qualitative difference between the HLA ligandome and the human proteome (14–16). We also note the limitation of MS database searches. When the MS data were re-searched against a reference proteome database (Swiss-Prot) instead of personalized databases, the spectra for 21.4% of the cryptic peptides were assigned, leaving open the possibility of alternative explanations for these spectra.
The tumor specificity of cryptic HLA ligands and their roles in tumor immunity remain elusive. Our HLA ligandome data show that HLA presentation of cryptic ligands is found in both colorectal cancer and normal tissues, suggesting that cryptic ligand presentation as a whole may not be a tumor-specific event. The molecular mechanism leading to the tumor-dominant presentation of the PVT1 antigen remains an issue of interest. This could be due to upregulated PVT1 gene expression in tumors; however, we also allow for the possibility of other underlying mechanisms. For example, the HF10-encoding ORF harbors a PTC that can be targeted by nonsense-mediated decay, which may be compromised in tumors (50). Alternatively, cryptic translation for MHC presentation may employ initiation factors distinct from those of canonical translation, whose activity may be enhanced under pathologic conditions (51).
The discovery of the PVT1 lncRNA antigen is consistent with a previous scenario elucidated with a mouse model, in which the majority of tumor antigens responsible for antitumor T-cell responses were found to arise from noncoding regions rather than coding genes (52). PVT1 has oncogenic properties with regard to MYC function. Although somatic gene mutations give rise to neoantigens that elicit host T-cell responses, tumors may evolve by editing neoantigens given that most corresponding mutations are irrelevant to tumor growth (53). In sharp contrast, the necessity of PVT1 in tumor development suggests the potential targeting of the PVT1 antigen, which is conserved across patients with colorectal cancer independent of the tumor mutation burden, as a surrogate target for MYC in immunotherapy. At present, the clinical relevance of immune surveillance against the PVT1 antigen with regard to patient prognosis, as well as therapeutic efficacy, is unknown. Our findings highlight the diversity of tumor-associated antigen sources, providing a rationale for an immunotherapy targeting a novel class of lncRNA-derived tumor antigens.
T. Kanaseki reports a patent for PCT/JP2016/88904 issued. T. Torigoe reports a patent for PCT/JP2016/88904 issued. No disclosures were reported by the other authors.
Y. Kikuchi: Validation, investigation, visualization, writing–original draft. S. Tokita: Data curation, software, formal analysis, validation, investigation, visualization. T. Hirama: Resources. V. Kochin: Methodology. M. Nakatsugawa: Software, methodology. T. Shinkawa: Investigation. Y. Hirohashi: Investigation. T. Tsukahara: Investigation. F. Hata: Resources. I. Takemasa: Resources. N. Sato: Supervision. T. Kanaseki: Conceptualization, supervision, funding acquisition, writing–original draft, writing–review and editing. T. Torigoe: Supervision, funding acquisition.
The authors thank T. Suzuki (Sapporo Medical University) for technical support on immunofluorescence imaging, K. Matsuo (Sapporo Clinical Laboratory) for technical support with IHC, and M. Kudo and M. Hasegawa (Sapporo Medical University) for handling patient material and technical support for data processing.
This work was supported by a Japan Agency for Medical Research and Development (AMED) Grant, to T. Kanaseki (21cm0106352h0003); Grant-in-Aid for Scientific Research from the Japan Society for the Promotion of Science (JSPS) KAKENHI, to T. Kanaseki (JP19H03490 and JP20K21528); Takeda Science Foundation Grant, to T. Kanaseki; AMED Grant, to T. Torigoe (20cm0106309h0005); and Grant-in-Aid for Scientific Research from JSPS, to T. Torigoe (JP17H01540).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.