A pan-cancer database of Alternative Splicing for Cancer Molecular Classification



About AS-CMC

AS-CMC has a user-friendly interface that allows researchers to explore alternative splicing (AS) events in The Cancer Genome Atlas (TCGA) molecular subtypes. Our web service consists of two parts: “Single-cancer AS” and “Pan-cancer AS.” In “Single-cancer AS,” users first select a cancer type and then get the list of AS events with the statistical analysis results. In “Pan-cancer AS,” users can obtain pan-cancer views for a selected AS event.


Citation:

AS-CMC: a pan-cancer database of alternative splicing for molecular classification of cancer. Sci Rep. 2022 Dec 6;12(1):21074.




Home

AS-CMC workflow

AS-CMC provides three analysis modules. In the “Subtype-specific AS” module (top left), differential regulation of AS PSI values was tested among molecular subtypes provided by TCGAbiolinks. In the “Phenotype association” module (bottom left), each AS event was evaluated in association with patient-level (clinical outcomes), tissue-level (microenvironment), and gene-level (gene-expression) data. In the “Pan-cancer comparison” module (right), the analyzed data pertaining to each AS event is displayed in a panoramic view across cancer types.


TCGA cancer types included in AS-CMC


The number of samples in each cancer type is shown in parentheses.



Statistics of TCGA cancers in AS-CMC

(a) Number of patient samples for each cancer. Each bar consists of the molecular subtypes. Subtype names can be displayed above the graph.

(b) The number of subtype-specific AS events by splice type. The subtype-specific AS events were selected based on analysis of variance (ANOVA) (p < 0.001 and adjusted R² > 0.1).

(c) The fraction of survival-associated AS events among subtype-specific AS events. The fraction is marked in dark red and is also shown as a percentage to the right of each bar. The x-axis indicates the number of AS events.





  Goal in Single-cancer AS: Identification of subtype-specific AS events for each cancer type

  Selection


  Priority : You can select AS events by using the following criteria.

  






  Goal in Pan-cancer AS: Investigation of the prevalence of subtype-specific AS events across cancer types

  Priority : You can select AS events by using the following criteria.


 






Introduction




AS-CMC (Alternative Splicing for Cancer Molecular Classification) is a web-based database that lets users browse subtype-specific changes in AS, along with their phenotypic associations, for each cancer type, and compare regulation patterns across diverse cancer types. For AS in TCGA samples, we used the percent-spliced-in (PSI) values from the TCGASpliceSeq database. We obtained the cancer molecular subtype information from the TCGAbiolinks R package.




TCGASpliceseq data

Alternative Splicing Events





For AS in TCGA samples, we downloaded the PSI values of 27,682 AS events in 24 cancer types from the TCGASpliceSeq database. AS-CMC provides the PSI values of five splicing types: exon skip (ES), retained intron (RI), alternate acceptor site (AA), alternate donor site (AD), and mutually exclusive exons (ME).



Quantification of AS Events

Percent spliced in (PSI), which ranges from 0 to 1, is a commonly used ratio that quantifies the usage of an alternative exon. For each splice event, the PSI value is the ratio of the reads supporting inclusion to the total reads for that event (inclusion plus exclusion reads).
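The ratio described above can be written as a short function. This is a minimal sketch, not the TCGASpliceSeq implementation: the function name, the argument names, and the choice to return None when an event has no reads are assumptions made for illustration.

```python
from typing import Optional


def percent_spliced_in(inclusion_reads: int, exclusion_reads: int) -> Optional[float]:
    """Return PSI = inclusion / (inclusion + exclusion).

    Returns None when the event has no supporting reads at all,
    because the ratio is undefined in that case (an assumption of this sketch).
    """
    if inclusion_reads < 0 or exclusion_reads < 0:
        raise ValueError("read counts must be non-negative")
    total_reads = inclusion_reads + exclusion_reads
    if total_reads == 0:
        return None
    return inclusion_reads / total_reads


# 30 reads support inclusion and 10 support exclusion of the exon.
print(percent_spliced_in(30, 10))
```

A PSI close to 1 means the exon is included in most transcripts of that sample; a PSI close to 0 means it is mostly skipped.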



AS ID Definition

The AS ID is derived directly from the gene model. It consists of the gene name, the AS type, and exon numbers, joined by the delimiting character '_'.
(AS ID : Gene.Name_AS.Type_Regulated.Exon_From.Exon_To.Exon)
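As an illustration of the AS ID format above, the following sketch splits an ID into its fields. The field order follows the format line, and the five AS type codes are the ones listed on this page. Several points are assumptions: the names `ASEvent` and `parse_as_id` are hypothetical, gene names are assumed to contain no '_', the From/To exon fields are treated as optional because the case study on this page uses the short form MAP3K7_ES_11, and `GENE1_ES_3_2_4` is a made-up ID used only to show the five-field form.

```python
from typing import NamedTuple, Optional

# The five splice type codes listed on this page.
SPLICE_TYPES = {"ES", "RI", "AA", "AD", "ME"}


class ASEvent(NamedTuple):
    gene: str
    splice_type: str
    regulated_exon: str
    from_exon: Optional[str]
    to_exon: Optional[str]


def parse_as_id(as_id: str) -> ASEvent:
    """Split an AS ID on '_' into gene, AS type, and exon fields.

    Exon fields are kept as strings because their exact notation
    is not specified here.
    """
    parts = as_id.split("_")
    if len(parts) not in (3, 5):
        raise ValueError(f"expected 3 or 5 '_'-separated fields, got {len(parts)}")
    gene, splice_type, regulated_exon = parts[:3]
    if splice_type not in SPLICE_TYPES:
        raise ValueError(f"unknown AS type: {splice_type!r}")
    from_exon, to_exon = (parts[3], parts[4]) if len(parts) == 5 else (None, None)
    return ASEvent(gene, splice_type, regulated_exon, from_exon, to_exon)


print(parse_as_id("MAP3K7_ES_11"))
print(parse_as_id("GENE1_ES_3_2_4"))  # made-up ID in the five-field form
```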




Reference: SpliceSeq: a resource for analysis and visualization of RNA-Seq data on alternative splicing and its functional impacts. Bioinformatics. 2012;28(18):2385-7.









TCGASpliceseq data

TCGA molecular subtype information



The cancer molecular subtype information was obtained from the TCGAbiolinks R package. We used the most prominent subtype classification, a curated annotation named ‘Subtype_Selected’. For more information on the molecular subtypes of each cancer, see the literature listed in the table below.


Cancer | Molecular data | Reference | # Types | Molecular subtypes
ACC | DNAmeth | Cancer Cell 2016 | 3 | ACC.CIMP-high, ACC.CIMP-intermediate, ACC.CIMP-low
BLCA | mRNA | Nature 2014 | 4 | BLCA.4, BLCA.3, BLCA.1, BLCA.2
BRCA | PAM50(mRNA) | Nature 2012 | 5 | BRCA.Normal, BRCA.Her2, BRCA.Basal, BRCA.LumB, BRCA.LumA
COAD | Molecular_Subtype | Cancer Cell 2018 | 4 | GI.HM-SNV, GI.GS, GI.MSI, GI.CIN
ESCA | Molecular_Subtype | Cancer Cell 2018 | 5 | GI.GS, GI.HM-SNV, GI.MSI, GI.CIN, GI.ESCC
GBM | Supervised_DNAmeth | Cell 2016 | 6 | GBM_LGG.G-CIMP-high, GBM_LGG.G-CIMP-low, GBM_LGG.LGm6-GBM, GBM_LGG.NA, GBM_LGG.Classic-like, GBM_LGG.Mesenchymal-like
HNSC | mRNA | Nature 2015 | 4 | HNSC.Classical, HNSC.Atypical, HNSC.Mesenchymal, HNSC.Basal
KICH | Eosinophilic | Cancer Cell 2014 | 2 | KICH.Eosin.1, KICH.Eosin.0
KIRC | mRNA | Nature 2013 | 5 | KIRC.NA, KIRC.4, KIRC.2, KIRC.3, KIRC.1
KIRP | COC | NEJM 2015 | 4 | KIRP.C2c - CIMP, KIRP.C2b, KIRP.C2a, KIRP.C1
LAML | mRNA | NEJM 2013 | 7 | AML.1, AML.3, AML.7, AML.2, AML.5, AML.6, AML.4
LGG | Supervised_DNAmeth | Cell 2016 | 7 | GBM_LGG.NA, GBM_LGG.G-CIMP-low, GBM_LGG.Classic-like, GBM_LGG.PA-like, GBM_LGG.Mesenchymal-like, GBM_LGG.Codel, GBM_LGG.G-CIMP-high
LIHC | iCluster | Cell 2017 | 4 | LIHC.NA, LIHC.iCluster:2, LIHC.iCluster:3, LIHC.iCluster:1
LUAD | iCluster | Nature 2014 | 6 | LUAD.1, LUAD.2, LUAD.4, LUAD.6, LUAD.3, LUAD.5
LUSC | mRNA | Nature 2012 | 4 | LUSC.primitive, LUSC.basal, LUSC.secretory, LUSC.classical
OV | mRNA | Nature 2011 | 4 | OVCA.Immunoreactive, OVCA.Mesenchymal, OVCA.Differentiated, OVCA.Proliferative
PCPG | mRNA | Cancer Cell 2017 | 5 | PCPG.NA, PCPG.Cortical admixture, PCPG.Wnt-altered, PCPG.Pseudohypoxia, PCPG.Kinase signaling
PRAD | Mutation/Fusion | Cell 2015 | 8 | PRAD.7-IDH1, PRAD.4-FLI1, PRAD.6-FOXA1, PRAD.3-ETV4, PRAD.2-ETV1, PRAD.5-SPOP, PRAD.8-other, PRAD.1-ERG
READ | Molecular_Subtype | Cancer Cell 2018 | 4 | GI.MSI, GI.HM-SNV, GI.GS, GI.CIN
SKCM | Mutation | Cell 2015 | 5 | SKCM.NF1_Any_Mutants, SKCM.Triple_WT, SKCM.- , SKCM.RAS_Hotspot_Mutants, SKCM.BRAF_Hotspot_Mutants
STAD | Molecular_Subtype | Cancer Cell 2018 | 5 | GI.HM-SNV, GI.EBV, GI.GS, GI.MSI, GI.CIN
THCA | mRNA | Cell 2014 | 6 | THCA.NA, THCA.2, THCA.3, THCA.5, THCA.4, THCA.1
UCEC | iCluster - updated according to Pan-Gyne/Pathways groups | Nature 2013 | 5 | UCEC.NA, UCEC.POLE, UCEC.MSI, UCEC.CN_LOW, UCEC.CN_HIGH
UCS | mRNA | Cancer Cell 2017 | 2 | UCS.1, UCS.2

Reference TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Res. 2016 May 5;44(8):e71





TCGA clinical data

Significant Molecular Subtype-Specific AS Events




ANOVA test

We used analysis of variance (ANOVA) to select subtype-specific AS events (p-value < 1 × 10⁻³) for each cancer type. For visualization, AS-CMC displays boxplots for each cancer and a scatter plot across cancers.
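The selection step can be sketched with a plain one-way ANOVA on the PSI values of one AS event, grouped by molecular subtype. This is an illustrative sketch under stated assumptions, not the AS-CMC pipeline: the function names and the toy PSI values are hypothetical, and the adjusted R² cut-off (> 0.1) is taken from the statistics figure legend on the Home tab. The sketch computes the F statistic and adjusted R² only; the p-value would come from the upper tail of the F distribution with (k - 1, N - k) degrees of freedom, for example from a statistics library.

```python
import math
from typing import Dict, List, Tuple


def one_way_anova(groups: Dict[str, List[float]]) -> Tuple[float, float]:
    """Return (F statistic, adjusted R^2) for PSI values grouped by subtype."""
    values = [v for group in groups.values() for v in group]
    n_total, n_groups = len(values), len(groups)
    if n_groups < 2 or n_total <= n_groups or any(not g for g in groups.values()):
        raise ValueError("need at least two non-empty groups and more samples than groups")
    grand_mean = sum(values) / n_total
    ss_between = 0.0  # variation of subtype means around the grand mean
    ss_within = 0.0   # variation of samples around their own subtype mean
    for group in groups.values():
        group_mean = sum(group) / len(group)
        ss_between += len(group) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in group)
    ss_total = ss_between + ss_within
    if ss_total == 0:
        return 0.0, 0.0  # all PSI values identical: no subtype effect
    df_between, df_within = n_groups - 1, n_total - n_groups
    if ss_within == 0:
        f_stat = math.inf
    else:
        f_stat = (ss_between / df_between) / (ss_within / df_within)
    r_squared = ss_between / ss_total
    adjusted_r_squared = 1 - (1 - r_squared) * (n_total - 1) / df_within
    return f_stat, adjusted_r_squared


def is_subtype_specific(p_value: float, adjusted_r_squared: float) -> bool:
    """Apply the cut-offs stated on this page: p < 0.001 and adjusted R^2 > 0.1."""
    return p_value < 1e-3 and adjusted_r_squared > 0.1


# Toy PSI values for three hypothetical subtypes.
toy = {"A": [0.1, 0.2, 0.3], "B": [0.5, 0.6, 0.7], "C": [0.8, 0.9, 1.0]}
print(one_way_anova(toy))
```

Requiring both a small p-value and a minimum adjusted R² filters out events that are statistically significant only because of large sample sizes but explain little of the PSI variation.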



Each cancer


Pan cancers



TCGA clinical data

Phenotype association (Tissue-level & Gene-level)



The “Phenotype association” module enables prioritization of AS events by relevance in terms of clinical outcomes (patient-level), cancer microenvironment scores (tissue-level), and gene expression levels (gene-level).

This section shows two workflows of the module: tissue-level and gene-level.

Tissue-level: In the cancer microenvironment part, users can investigate the correlations between changes in AS and predefined scores indicating the status of immune and stromal cells, epithelial-to-mesenchymal transition (EMT), and hypoxia.

Gene-level: Users can also examine the correlations of each AS event with the expression levels of all genes. Genes whose expression levels correlate with AS PSI values help determine the biological pathways underlying AS changes. The relationship between the host gene and the AS event helps assess how far the AS event depends on transcriptional regulation. If the AS event is independent of gene expression, it is likely regulated solely by the splicing machinery.





TCGA cancer

Single-cancer AS



This page describes the single-cancer approach. The method for finding significant molecular subtype-specific AS events in a given cancer type is as follows.



a. List of subtype-specific AS events for a selected cancer type. Once a cancer type is selected, the subtype-specific AS events are shown in a table with the relevant statistics. Users can filter the results using defined cut-offs for survival and for the correlation between the AS event and the expression level of its host gene. b. Visualization panel showing the analysis results of an AS event. Once an AS event is selected (by clicking the hyperlinked text "Click"), a window with various plots pops up.





AS-level

AS-level (Significant AS events)



We performed survival analysis between high-PSI and low-PSI cohorts. Three thresholds on the PSI distribution (10%, 25%, and 50%) were each used to define the two groups to be tested. The significance of the difference in survival rates was evaluated with the log-rank test, which yields the reported p-value. A pan-cancer view of the survival analysis results is provided for the queried AS events.



We compared exon-level PSI values and gene expression levels across cancer molecular subtypes. We also checked the correlation between the PSI value and the expression of the spliced gene across cancer molecular subtypes.


Reference TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Res. 2016 May 5;44(8):e71



Patient-level

Patient-level (Association of survival with AS events)


To identify potential clinical associations of a significant AS event, we performed survival analysis using TCGA survival data.


Survival Analysis

We performed survival analysis between high-PSI and low-PSI cohorts. Three thresholds on the PSI distribution (10%, 25%, and 50%) were each used to define the two groups to be tested. The significance of the difference in survival rates was evaluated with the log-rank test, which yields the reported p-value. A pan-cancer view of the survival analysis results is provided for the queried AS events.
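The grouping and test described above can be sketched as follows. This is a minimal illustration, not the code behind AS-CMC. It assumes the reading given in the case-study figure legend on this page (upper 10% vs lower 10%, upper 25% vs lower 25%, upper 50% vs lower 50%), uses the standard two-group log-rank statistic with a chi-square approximation on one degree of freedom, and invents the sample layout `(psi, time, event_observed)` and all function names for the example.

```python
import math
from typing import List, Tuple

Sample = Tuple[float, float, bool]  # (PSI, survival time, event observed?)


def split_by_psi(samples: List[Sample], fraction: float) -> Tuple[List[Sample], List[Sample]]:
    """Return (low, high): the bottom and top `fraction` of samples ranked by PSI."""
    ordered = sorted(samples, key=lambda s: s[0])
    k = int(len(ordered) * fraction)
    if k == 0:
        raise ValueError("too few samples for this cut-off")
    return ordered[:k], ordered[-k:]


def log_rank_p_value(group_a: List[Sample], group_b: List[Sample]) -> float:
    """Two-group log-rank test; returns the chi-square (1 df) p-value."""
    pooled = group_a + group_b
    event_times = sorted({time for _, time, event in pooled if event})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        at_risk_a = sum(1 for _, time, _ in group_a if time >= t)
        at_risk_b = sum(1 for _, time, _ in group_b if time >= t)
        deaths_a = sum(1 for _, time, event in group_a if event and time == t)
        deaths_b = sum(1 for _, time, event in group_b if event and time == t)
        at_risk, deaths = at_risk_a + at_risk_b, deaths_a + deaths_b
        # Expected deaths in group A if both groups shared one hazard.
        observed_minus_expected += deaths_a - deaths * at_risk_a / at_risk
        if at_risk > 1:
            variance += (deaths * (at_risk_a / at_risk) * (at_risk_b / at_risk)
                         * (at_risk - deaths) / (at_risk - 1))
    if variance == 0:
        return 1.0
    chi_square = observed_minus_expected ** 2 / variance
    # Upper tail of the chi-square distribution with one degree of freedom.
    return math.erfc(math.sqrt(chi_square / 2))


# Toy cohort: 20 samples, higher PSI paired with longer survival.
cohort = [(i / 20, 10.0 + i, True) for i in range(20)]
for cutoff in (0.10, 0.25, 0.50):
    low, high = split_by_psi(cohort, cutoff)
    print(cutoff, len(low), len(high), log_rank_p_value(low, high))
```

With the 10% and 25% cut-offs the samples in the middle of the PSI distribution are left out, so those comparisons contrast only the extremes and use fewer patients.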



Reference TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Res. 2016 May 5;44(8):e71





Tissue-level

Tissue-level (Association of tissue-level score)



To test the relationship of each AS event with the tumor microenvironment, we used immune infiltration levels, tumor purity, hypoxia, and EMT scores that were previously published in pan-cancer analyses. First, we checked whether PSI values are related to the tumor microenvironment using cellular fraction estimates (leukocyte fraction and CIBERSORT immune fractions) [2]. Next, we assessed the association of PSI values with tumor hypoxia scores, which reflect the level of molecular oxygen in a tumor sample [3]. Finally, we provide the association of PSI values with EMT scores [4].
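An association of this kind can be measured with a rank correlation between the PSI values of an AS event and a tissue-level score across samples. The sketch below computes Spearman's coefficient, which is the statistic named in the case-study figure legend on this page for EMT scores; whether the same statistic is used for every tissue-level score is an assumption. The function names and the toy values are hypothetical.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(spread_x * spread_y)


def spearman(psi: Sequence[float], score: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation computed on the ranks."""
    if len(psi) != len(score) or len(psi) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(psi), average_ranks(score))


# Toy data: PSI falls as the (hypothetical) tissue-level score rises.
psi_values = [0.9, 0.7, 0.6, 0.3, 0.1]
tissue_scores = [-1.2, -0.4, 0.1, 0.8, 1.5]
print(spearman(psi_values, tissue_scores))
```

A rank-based coefficient is a reasonable fit here because PSI is bounded between 0 and 1 and the scores are on arbitrary scales, so only the ordering of samples is compared.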





Reference

1) The spliceosome pathway activity correlates with reduced anti-tumor immunity and immunotherapy response, and unfavorable clinical outcomes in pan-cancer. Comput Struct Biotechnol J 2021;19:5428-5442

2) The Immune Landscape of Cancer. Immunity 2018;48:812-30 e14

3) Molecular landmarks of tumor hypoxia across cancer types. Nat Genet 2019;51:308-18

4) Pan-cancer survey of epithelial-mesenchymal transition markers across the Cancer Genome Atlas. Dev Dyn 2018;247:555-64










Gene-level

Gene-level (Association of Gene Expression with AS Events)

We explored which genes have mRNA levels that correlate with the queried AS event.




TCGA cancer

Pan-cancer AS



This page describes the pan-cancer approach. The method for examining a significant molecular subtype-specific AS event across cancer types is as follows.



a. Selection panel for AS events. Clicking the hyperlinked text "Click" for an AS event takes users to the pan-cancer views. b-d. Pan-cancer information pertaining to MAP3K7 exon 11 as an example. b. Comparison of subtype-specificity across cancer types. ANOVA results are displayed using two different y-axes: -log10.





AS-level

AS-level (Significant AS events)



We used analysis of variance (ANOVA) to select subtype-specific AS events for each cancer type, and then examined the distribution of the resulting p- and r-values across cancers.



Reference TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Res. 2016 May 5;44(8):e71







Patient-level

Patient-level (Association of Survival with AS Events)



Survival Analysis

We performed survival analysis between high-PSI and low-PSI cohorts. Three thresholds on the PSI distribution (10%, 25%, and 50%) were each used to define the two groups to be tested. The significance of the difference in survival rates was evaluated with the log-rank test, which yields the reported p-value. A pan-cancer view of the survival analysis results is provided for the queried AS events.



Reference TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Res. 2016 May 5;44(8):e71





Gene-level

Gene-level (Association of gene expression with AS events)







Genes related to biological pathways



The table lists genes with significant AS events across cancers, grouped by pathway.

Pathway | Genes
Cell Cycle Control (3 genes) | CDK2, E2F5, E2F6
Notch signaling (11 genes) | APH1A, ARRDC1, DLL3, DTX2, DTX3, HES4, ITCH, JAG2, NCOR2, NUMB, RBPJ
DNA Damage Response (4 genes) | CHEK1, CHEK2, RAD51, MLH1
Other growth/proliferation signaling (4 genes) | FGFR1, CSF1, PLAGL1, AURKA
Survival/cell death regulation signaling (4 genes) | BCL2L1, BCL2, CASP10, CASP3
RTK signaling family (3 genes) | FGFR1, VEGFA, PDGFA
PI3K-AKT-mTOR signaling (4 genes) | TSC2, MLST8, AKT1, RHEB
Ras-Raf-MEK-Erk/JNK signaling (8 genes) | MAPK9, HRAS, MAP3K4, KRAS, MAPK3, MAP3K3, DAB2, MAPK8
Regulation of ribosomal protein synthesis and cell growth (3 genes) | RPS6, RPS6KB2, RPS6KB1
Angiogenesis (1 gene) | VEGFA
Invasion and metastasis (4 genes) | MMP23B, PTK2, MMP19, WFDC2
TGF-beta Pathway (2 genes) | TGFBR3, SMAD5





Reference https://www.cbioportal.org/








Case

Case study: MAP3K7_ES_11



As an example, we used AS-CMC to select a notable subtype-specific AS event that may serve as a pan-cancer AS biomarker. An ES event in exon 11 of the MAP3K7 (mitogen-activated protein kinase kinase kinase 7) gene was chosen because it showed significant subtype-specificity in 10 cancer types (BRCA, LGG, ESCA, STAD, HNSC, BLCA, OV, THCA, LUAD, UCS). In the survival analysis, skipping of the exon was strongly associated with poor clinical outcome in stomach adenocarcinoma (STAD). The survival difference was largest at the most stringent cut-off (upper 10% vs lower 10%).

The MAP3K7 AS event was analyzed in depth in STAD because of its significant association with survival. Among the five molecular subtypes of STAD, only the GI.GS subtype showed a significantly different distribution of PSI values compared with the other subtypes. The PSI values were correlated with EMT scores (r = -0.71). Taken together, these data suggest that an ES event in exon 11 of MAP3K7 may play a role in regulating subtypes across diverse cancers, and that this event may be particularly important in STAD, where its function has not been reported previously.



An example of a potential pan-cancer AS biomarker. (a) Location of MAP3K7 exon with AS on the corresponding gene and chromosome. (b) Distribution of PSI values for five molecular subtypes in STAD. (c) Correlations between the AS PSIs and EMT scores in STAD. Spearman’s correlation coefficient is shown on the top. Each dot represents an individual patient sample. (d) Survival plots comparing survival rates between two patient groups with high- and low-PSI values in STAD. In the survival plots, AS-CMC provides three plots representing the survival difference between the groups for an AS event according to the following PSI cut-offs: 50% (upper 50% vs lower 50%), 25% (upper 25% vs lower 25%), and 10% (upper 10% vs lower 10%). The significance of differential survival rates was evaluated using log rank test.





Contact




If you have any questions or suggestions on this database, please feel free to contact us.


Jiyeon Park PhD
E-mail: parkji7@gmail.com
Precision Medicine Research Center, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
Integrated Research Center for Genome Polymorphism, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea



Jin-Ok Lee MSc
E-mail: jinoklee.01@gmail.com
Precision Medicine Research Center, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
Integrated Research Center for Genome Polymorphism, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea



Yeun-Jun Chung MD, PhD
E-mail: yejun@catholic.ac.kr
Precision Medicine Research Center, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
Integrated Research Center for Genome Polymorphism, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
Department of Microbiology, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea