Constructions and Applications of Alternative Splicing Databases



Page 1: Constructions and Applications of Alternative Splicing Databases

Feng Chia University, Bioinformatics Research Center

Constructions and Applications of Alternative Splicing Databases

Speaker: Fang-Rong Hsu (許芳榮)

Page 2: Constructions and Applications of Alternative Splicing Databases

Outline

Introduction
Construction of alternative splicing database
Survey of existing solutions
Applications

Page 3: Constructions and Applications of Alternative Splicing Databases

Introduction

Page 4: Constructions and Applications of Alternative Splicing Databases

RNA Splicing

Page 5: Constructions and Applications of Alternative Splicing Databases

Alternative Splicing

Definitions
Splicing the same pre-mRNA in two or more ways to yield two or more different mRNAs that produce two or more different protein products

Page 6: Constructions and Applications of Alternative Splicing Databases

Types of alternative splicing

Page 7: Constructions and Applications of Alternative Splicing Databases

The Troponin T (muscle protein) pre-mRNA is alternatively spliced to give rise to 64 different isoforms of the protein

Constitutively spliced exons (exons 1-3, 9-15, and 18)

Mutually exclusive exons (exons 16 and 17)

Alternatively spliced exons (exons 4-8)

Exons 4-8 are spliced in every possible way, giving rise to 32 different possibilities

Exons 16 and 17, which are mutually exclusive, double the possibilities; hence 64 isoforms
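
The isoform count follows from simple combinatorics: each of the five alternatively spliced exons (4-8) is either included or skipped, and exactly one of the mutually exclusive exons 16 and 17 is used. A minimal Python sketch that enumerates the combinations, assuming the exons behave independently as the slide states (the function name is illustrative):

```python
from itertools import product

def troponin_t_isoforms():
    """Enumerate the exon combinations described on the slide."""
    constitutive = [1, 2, 3, 9, 10, 11, 12, 13, 14, 15, 18]
    cassette = [4, 5, 6, 7, 8]        # each may be included or skipped
    mutually_exclusive = [16, 17]     # exactly one of the two is used

    isoforms = []
    for mask in product([False, True], repeat=len(cassette)):
        chosen = [exon for exon, keep in zip(cassette, mask) if keep]
        for mx in mutually_exclusive:
            isoforms.append(tuple(sorted(constitutive + chosen + [mx])))
    return isoforms

isoforms = troponin_t_isoforms()
print(len(isoforms))  # 2**5 * 2 = 64
```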

Page 8: Constructions and Applications of Alternative Splicing Databases

Expressed Sequence Tags (ESTs)

Page 9: Constructions and Applications of Alternative Splicing Databases

What are the relationships of Genome, mRNA and ESTs?

Page 10: Constructions and Applications of Alternative Splicing Databases

[Figure: two ESTs shown against the genome]

Page 11: Constructions and Applications of Alternative Splicing Databases

[Figure: gene structure from 5' to 3' (Exon 1, Intron 1, Exon 2, Intron 2, Exon 3, Intron 3, Exon 4) with a poly-A tail (AAA...), and ESTs 1-7 aligned to it]

Page 12: Constructions and Applications of Alternative Splicing Databases

Construction of alternative splicing database

Page 13: Constructions and Applications of Alternative Splicing Databases

[Diagram: dbEST (5 million ESTs) and genome sequences (3 billion bp) are aligned to build an exons and introns database, which supports gene discovery, SNP, and alternative splicing analysis]

Page 14: Constructions and Applications of Alternative Splicing Databases

Methods of Alternative Splicing Detection

mRNA-EST alignment (or EST consensus): without knowledge of genomic sequence
Genomic sequence to EST alignment: informative

Page 15: Constructions and Applications of Alternative Splicing Databases

How to cluster ESTs?

UniGene cluster
Consider the ESTs in the same UniGene cluster
Saves time but is not informative

Genome template
Genomic sequence to EST alignment
Informative but time consuming

Page 16: Constructions and Applications of Alternative Splicing Databases

The Approaches of EST Clustering

UniGene-like approach
1. Overlapping ESTs are grouped into a cluster, as in UniGene.
2. Generate a consensus sequence for each cluster.
3. Align the consensus sequences to the genome sequence.

Genome template
1. Cut the human genome sequence into 20k base-pair segments.
2. Screen ESTs for similarity with BLAST.
3. Detect exons with sim4.

Direct alignment

Page 17: Constructions and Applications of Alternative Splicing Databases

UniGene-like approach

1. Overlapping ESTs are grouped into a cluster, as in UniGene.
2. Generate a consensus sequence for each cluster.
3. Align the consensus sequences to the genome sequence.

[Diagram labels: genomic seq, candidates of gene location, STS, BLAST, gene, consensus sequence, report exons]
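
Step 1 of the UniGene-like approach amounts to single-linkage grouping: any two ESTs that overlap end up in the same cluster. A minimal sketch using union-find, assuming the pairwise overlaps have already been detected by some sequence comparison. The EST names and overlap pairs are made up for illustration, and this is not UniGene's actual procedure.

```python
def cluster_by_overlap(est_ids, overlaps):
    """Group ESTs so that any chain of pairwise overlaps shares one cluster."""
    parent = {e: e for e in est_ids}

    def find(x):
        # Walk up to the cluster representative, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in overlaps:
        parent[find(a)] = find(b)

    clusters = {}
    for e in est_ids:
        clusters.setdefault(find(e), []).append(e)
    return sorted(sorted(members) for members in clusters.values())

# Hypothetical input: EST1-EST2 and EST2-EST3 overlap, EST4-EST5 overlap.
ests = ["EST1", "EST2", "EST3", "EST4", "EST5", "EST6"]
pairs = [("EST1", "EST2"), ("EST2", "EST3"), ("EST4", "EST5")]
clusters = cluster_by_overlap(ests, pairs)
print(clusters)
```

Each resulting cluster would then be collapsed into a consensus sequence (step 2) before alignment to the genome (step 3); those steps depend on external tools and are not shown.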

Page 18: Constructions and Applications of Alternative Splicing Databases

Genome template

1. Cut the human genome sequence into 20k base-pair segments.
2. Screen ESTs for similarity with BLAST.
3. Detect exons with sim4.

[Diagram: genomic template and EST DB -> WU-BLAST -> ESTs with similarity -> sim4 -> exons]
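
Step 1 of the genome template approach can be sketched as follows. The slide specifies only the 20k base-pair segment size; the non-overlapping layout, the function name, and the toy sequence are assumptions. BLAST screening and sim4 run as external programs and are not shown.

```python
def genome_windows(sequence, size=20_000):
    """Yield (start, end, subsequence) templates of at most `size` bases."""
    for start in range(0, len(sequence), size):
        end = min(start + size, len(sequence))
        yield start, end, sequence[start:end]

# Toy stand-in for a chromosome sequence (45,000 bases).
toy_genome = "ACGT" * 11_250
windows = list(genome_windows(toy_genome))
for start, end, _ in windows:
    print(start, end)
```

A real pipeline would probably need overlapping segments so that a gene spanning a boundary is not split; the slide does not say how this was handled.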

Page 19: Constructions and Applications of Alternative Splicing Databases

Direct alignment

Page 20: Constructions and Applications of Alternative Splicing Databases

Using UniGene Cluster is not Informative

Many ESTs in different UniGene clusters are aligned to the same genome area.

UniGene cluster IDs 101131, 100437, 100738, 101182 and 100143 should be grouped together to detect alternative splicing
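
One way to act on this observation is to merge clusters whose genomic alignments overlap. The sketch below assumes each cluster has already been mapped to a (chromosome, start, end) interval. The coordinates are invented for illustration and are not the real positions of these UniGene clusters, and cluster 999999 is a made-up unrelated entry.

```python
def merge_clusters_by_locus(cluster_loci):
    """Merge cluster IDs whose genomic intervals overlap on the same chromosome."""
    groups = []
    # Sort by (chromosome, start, end) so overlapping intervals become adjacent.
    for cid, (chrom, start, end) in sorted(cluster_loci.items(), key=lambda kv: kv[1]):
        last = groups[-1] if groups else None
        if last and last["chrom"] == chrom and start <= last["end"]:
            last["ids"].append(cid)
            last["end"] = max(last["end"], end)
        else:
            groups.append({"chrom": chrom, "start": start, "end": end, "ids": [cid]})
    return groups

# Hypothetical coordinates, for illustration only.
loci = {
    101131: ("chr1", 1_000, 5_000),
    100437: ("chr1", 4_500, 9_000),
    100738: ("chr1", 8_000, 12_000),
    999999: ("chr2", 1_000, 2_000),
}
merged = merge_clusters_by_locus(loci)
for group in merged:
    print(group["chrom"], group["start"], group["end"], group["ids"])
```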

Page 21: Constructions and Applications of Alternative Splicing Databases

Resource: ASAP
Description: Performed genome-wide detection of human alternative splicing and …
Approach: UniGene-like
Quantity: 6201 A.S. sites in all genes
Species: Human
Institute: UCLA

Resource: TAP
Description: Performed an EST-based gene structure prediction in genomic sequences and also collected splicing information
Approach: Genome template
Quantity: 669 A.S. sites in 365 of the 1007 multiexon genes
Species: Human
Institute: WU

Resource: PALSdb
Description: PALSdb is a collection of Putative Alternative Splicing information. Alternative splicing sites were predicted by using the longest mRNA sequence in each UniGene cluster as the reference sequence
Approach: UniGene-like
Quantity: 9,952/19,936
Species: Human, mouse
Institute: Yang Ming

Resource: Compugen
Description: ESTs and all mRNA sequences were aligned with the human genome sequence using LEADS, Compugen's alternative splicing modeling platform.
Approach: UniGene-like
Species: Human
Institute: Compugen

Resource: SpliceNest
Description: Web-based graphical tool to explore gene structure, including alternative splicing, based on a mapping of the EST consensus from GeneNest to the complete human genome
Approach: UniGene-like
Quantity: 26880 introns, 32348 exons from 5468 genes
Species: Human, mouse, arabidopsis, zebrafish
Institute: Max-Planck-Gesellschaft

Resource: STACK
Description: STACK can provide putative tissue-specific transcripts for each gene
Approach: UniGene-like
Species: Human
Institute: Egenetic, South Africa

Page 22: Constructions and Applications of Alternative Splicing Databases

Avatar: a value added transcriptome database

Align entire dbEST to genome using PCs

Page 23: Constructions and Applications of Alternative Splicing Databases

Number of alternative splicing events

Organism                  5' AS    3' AS    Exon skipping  Mutually exclusive  Intron retention
Homo sapiens              14,989   22,969   11,188         330                 7,481
Mus musculus              7,479    13,075   4,850          127                 3,493
Rattus norvegicus         531      900      401            4                   373
Caenorhabditis elegans    162      28       263            5                   174
Drosophila melanogaster   351      117      221            6                   221
Arabidopsis thaliana      83       4        77             1                   32
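
The per-organism figures can be totalled and compared with a short script. This is a small sketch that assumes the five numbers in each row follow the header order (5' AS, 3' AS, exon skipping, mutually exclusive, intron retention); the variable and function names are illustrative.

```python
EVENT_TYPES = ["5' AS", "3' AS", "exon skipping", "mutually exclusive", "intron retention"]

# Counts transcribed from the slide, in the assumed column order.
EVENTS = {
    "Homo sapiens": [14_989, 22_969, 11_188, 330, 7_481],
    "Mus musculus": [7_479, 13_075, 4_850, 127, 3_493],
    "Rattus norvegicus": [531, 900, 401, 4, 373],
    "Caenorhabditis elegans": [162, 28, 263, 5, 174],
    "Drosophila melanogaster": [351, 117, 221, 6, 221],
    "Arabidopsis thaliana": [83, 4, 77, 1, 32],
}

def summarize(counts):
    """Return the total event count and the share of each event type."""
    total = sum(counts)
    shares = {name: n / total for name, n in zip(EVENT_TYPES, counts)}
    return total, shares

for organism, counts in EVENTS.items():
    total, shares = summarize(counts)
    top = max(shares, key=shares.get)
    print(f"{organism}: {total} events, most frequent type: {top}")
```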

Page 24: Constructions and Applications of Alternative Splicing Databases

Applications

Cross-species analysis
Tissue specific analysis
SNP and alternative splicing
Quantity analysis
Splicing enhancer
Gene prediction through dbEST
SNP finding through dbEST

Page 25: Constructions and Applications of Alternative Splicing Databases

Tissue distributions of 51 tumor-specific alternative splicing sites

Bone marrow, 1
Brain, 15
Eye, 1
Liver, 15
Lung, 5
Lymph node, 1
Mammary gland, 1
Placenta, 4
Stomach, 5
Testis, 1
Whole blood, 2
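
The per-tissue counts can be checked against the stated total of 51 and converted to shares with a few lines of Python. This is a small sketch; the dictionary simply transcribes the chart labels.

```python
# Counts transcribed from the chart labels on the slide.
tissue_counts = {
    "BoneMarrow": 1, "Brain": 15, "Eye": 1, "Liver": 15, "Lung": 5,
    "LymphNode": 1, "MammaryGland": 1, "Placenta": 4, "Stomach": 5,
    "Testis": 1, "WholeBlood": 2,
}

total = sum(tissue_counts.values())
print("total sites:", total)

# List tissues from most to fewest sites with their share of the total.
for tissue, n in sorted(tissue_counts.items(), key=lambda kv: -kv[1]):
    print(f"{tissue:<13} {n:>2}  {n / total:.1%}")
```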

Page 26: Constructions and Applications of Alternative Splicing Databases

1,598 SNP-dependent alternative splicing events

Page 27: Constructions and Applications of Alternative Splicing Databases

Comparison of human and mouse

Page 28: Constructions and Applications of Alternative Splicing Databases

Exon skipping

Conserved alternative splicing events (CES events)

[Figure: pairs of isoforms labeled F1 and F2, shown four times]

Non-conserved alternative splicing events (NCES events)

If NCES.F1 > K and NCES.F2 == 0
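
The NCES condition can be read as a simple threshold test on EST support. The sketch below is one possible interpretation, assuming F1 and F2 are the EST counts supporting the two isoforms of an exon-skipping event and K is a minimum-support threshold. The slide does not define these terms, so the names, the default K, and the example values are placeholders.

```python
def is_nces(f1, f2, k=10):
    """Literal rendering of the slide's rule: F1 > K and F2 == 0."""
    return f1 > k and f2 == 0

# Hypothetical EST support values (F1, F2) for three events.
events = {"event_a": (25, 0), "event_b": (25, 3), "event_c": (4, 0)}
nces = [name for name, (f1, f2) in events.items() if is_nces(f1, f2)]
print(nces)
```

Under this reading, an event with any support for F2, or with too little support for F1, is not flagged.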

Page 29: Constructions and Applications of Alternative Splicing Databases

Discovering the different constitutive splicing events

[Figure labels, in slide order: 94, 91; ME12713588-1, ME12751459-1; Human SNX3; MR12705131-1; EST support: 41; ME2231614-2; Mouse Snx3; ME2238811-1; EST support: 90; +]

Page 30: Constructions and Applications of Alternative Splicing Databases

EST frequency >=10

EST frequency >=1

Page 31: Constructions and Applications of Alternative Splicing Databases

PSMD13 (human) and Psmd13 (mouse), isoforms F2 and F1

GGTGAACCCTTTGTCCCTCGTGGAAATCATTCTTCATGTAGTTAGACAGATGACTG

GGTAAACCCTCTGTCCCTGGTAGAAATAATTCTCCATGTGGTTAGACAGATGACCG

MR178998-1 ME184041-1 ME184161-1

ME579264-1 ME582152-1 ME582275-1

184167,C,T,D,2,2,48,0,0.00452488687782805

184171,T,C,D,2,2,48,0,0.00452488687782805

[Figure: human exon and mouse exon for F1; labels 86, C/T, T/C, 48, 2, 2, 0]
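
The two exon sequences on this slide can be compared position by position. The sketch below assumes the first sequence is the human exon and the second the mouse exon, and that they are already aligned without gaps (both are the same length); the function name is illustrative.

```python
human = "GGTGAACCCTTTGTCCCTCGTGGAAATCATTCTTCATGTAGTTAGACAGATGACTG"
mouse = "GGTAAACCCTCTGTCCCTGGTAGAAATAATTCTCCATGTGGTTAGACAGATGACCG"

def compare(a, b):
    """Return 1-based mismatch positions and fractional identity (ungapped)."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length for an ungapped comparison")
    mismatches = [i for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]
    identity = 1 - len(mismatches) / len(a)
    return mismatches, identity

mismatches, identity = compare(human, mouse)
print(f"length={len(human)} mismatches={len(mismatches)} identity={identity:.1%}")
print("mismatch positions:", mismatches)
```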

Page 32: Constructions and Applications of Alternative Splicing Databases

Finding SNP from dbEST

[Figure: gene structure from 5' to 3' (Exon 1, Intron 1, Exon 2, Intron 2, Exon 3, Intron 3, Exon 4) with a poly-A tail (AAA...), and ESTs 1-7 aligned to it]

Page 33: Constructions and Applications of Alternative Splicing Databases

EST to genome alignment with profile

[Figure: gene structure from 5' to 3' (Exon 1, Intron 1, Exon 2, Intron 2, Exon 3, Intron 3, Exon 4) with ESTs 3-7 aligned to it; EST 7 carries a poly-A tail (AAA...)]

Page 34: Constructions and Applications of Alternative Splicing Databases

Translocation

Page 35: Constructions and Applications of Alternative Splicing Databases

Finding gene from dbEST

[Figure: gene structure from 5' to 3' (Exon 1, Intron 1, Exon 2, Intron 2, Exon 3, Intron 3, Exon 4) with a poly-A tail (AAA...), and ESTs 1-7 aligned to it]

Page 36: Constructions and Applications of Alternative Splicing Databases

Transcriptome Genomics

Where
What
Why
How

Page 37: Constructions and Applications of Alternative Splicing Databases

Conclusion