01 nc rna-intro

Introduction to ncRNAs Paul Gardner August 10, 2015 Paul Gardner intro2ncRNA

Transcript of 01 nc rna-intro

Page 1: 01 nc rna-intro

Introduction to ncRNAs

Paul Gardner

August 10, 2015

Paul Gardner intro2ncRNA

Page 2: 01 nc rna-intro

Where on the bio/math spectrum do you lie?

Page 3: 01 nc rna-intro

What is RNA?

[Figure: chemical structures of RNA and DNA nucleotides. Substituent R1 is −OH in RNA and −H in DNA; substituent R2 is −H in RNA and −CH3 in DNA.]

IUPAC ambiguity chars:
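
A minimal sketch of the ambiguity characters, assuming the standard IUPAC nucleotide codes written for RNA (U in place of T). The names `IUPAC_RNA`, `iupac_to_regex` and `matches` are illustrative and do not come from the slides.

```python
import re

# Standard IUPAC nucleotide ambiguity codes, written for RNA (U instead of T).
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU",
    "K": "GU", "M": "AC",
    "B": "CGU", "D": "AGU", "H": "ACU", "V": "ACG",
    "N": "ACGU",
}


def iupac_to_regex(pattern: str) -> str:
    """Translate an IUPAC pattern such as 'GNRA' into a regular expression."""
    parts = []
    for code in pattern.upper():
        bases = IUPAC_RNA[code]
        # Unambiguous codes stay literal; ambiguous ones become a character class.
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)


def matches(pattern: str, sequence: str) -> bool:
    """Return True if the whole sequence is compatible with the IUPAC pattern."""
    return re.fullmatch(iupac_to_regex(pattern), sequence.upper()) is not None
```

For example, `matches("GNRA", "GAAA")` checks a four-nucleotide loop against the GNRA consensus, where N is any base and R is A or G.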

Page 4: 01 nc rna-intro

Crick’s “central dogma of molecular biology”

Page 5: 01 nc rna-intro

RNA: why is this stuff interesting?

- The RNA world was an essential step towards modern protein- and DNA-based life (under current reasonable models).
- Which came first, DNA or protein?
- RNA has catalytic potential (like protein) and carries hereditary information (like DNA).

Image by James W. Brown, www.mbio.ncsu.edu/JWB/soup.html

Page 6: 01 nc rna-intro

RNA: why is this stuff interesting?

- RNA fulfills all essential biochemical functions of life:
  1. Information storage and replication
  2. Enzymatic activity: catalyze biochemical reactions
  3. Regulator: sense and react to environment
- RNA World originally proposed in 1986

Page 7: 01 nc rna-intro

RNA: representations

C A C G G C G U A A U G C A G G U C G G G G A U A U U G U A U G C U C C A G G C U C G U U G U C U A U G G G G A U U C G U C A C U

[Figure: graphical drawings of the same sequence]
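
One widely used text representation of RNA secondary structure is dot-bracket notation, where matching parentheses mark base pairs and dots mark unpaired positions. The sketch below converts such a string into a table of pairing partners. The toy hairpin and the function name `pair_table` are hypothetical illustrations; the structure is not that of the sequence shown on this slide.

```python
def pair_table(structure: str) -> dict[int, int]:
    """Map each paired position (0-based) to its partner in a dot-bracket string."""
    stack, pairs = [], {}
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)  # remember the opening position
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            j = stack.pop()  # most recent unmatched opening position
            pairs[i], pairs[j] = j, i
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs


# Hypothetical toy hairpin, chosen only to illustrate the notation.
toy_sequence = "GGGAAAUCCC"
toy_structure = "(((....)))"

for i, j in sorted(pair_table(toy_structure).items()):
    if i < j:
        # Report each pair once, using 1-based positions.
        print(f"{toy_sequence[i]}{i + 1}-{toy_sequence[j]}{j + 1}")
```

A stack is enough here because nested parentheses cannot describe crossing pairs (pseudoknots); those need an extended notation.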

Page 8: 01 nc rna-intro

Types of RNA

Hoeppner, Gardner & Poole (2012) Comparative Analysis of RNA Families Reveals Distinct Repertoires for Each Domain of Life. PLOS Computational Biology.

Page 9: 01 nc rna-intro

Molecular fossils

Hoeppner, Gardner & Poole (2012) Comparative Analysis of RNA Families Reveals Distinct Repertoires for Each Domain of Life. PLOS Computational Biology.

Page 10: 01 nc rna-intro

Non-coding RNAs are often poorly conserved

- Poor homology search tools or genuine evolutionary turnover?

[Figure: "Conservation of RNAs & Proteins in bacterial genomes", panels A, B and C. Axis labels: "Conserved families (%)" from 0 to 100, "Freq." and "RNA-seq species" from 0 to 10, and "Phylogenetic distance" from 0.0 to 0.6, annotated with the taxonomic ranks Kingdom, Phylum, Class, Order, Family and Genus. Series: Pfam (N=6671) and Rfam (N=331).]

Lindgreen et al. (2014) Robust Identification of Noncoding RNA from Transcriptomes Requires Phylogenetically-Informed Sampling. PLOS Computational Biology.

Page 11: 01 nc rna-intro

Some example mechanisms for RNA function

- RNA sponges
  - RNA:protein (6S RNA, CsrB)
  - RNA:RNA (circRNAs, spacer)
- Guide RNAs
  - CRISPRs
  - snoRNAs
  - microRNA
  - spliceosomal RNAs
- Attenuators
  - Riboswitches
  - Thermosensors
  - Peptide leaders

[Figure: cycle of Gather data, Analyze-Classify, Hypotheses-Predictions and Experiment, shown with example sequence, consensus-structure and sequence-logo graphics.]

Page 12: 01 nc rna-intro

RNA sponges: 6S RNA, a bacterial sRNA

- One of the first ncRNAs to be identified & sequenced (Hindley 1967 & Brownlee 1971)
- Function wasn’t determined until 2002 (Wasserman 2002)
- Structure mimics an open promoter
- A major regulator of gene expression
- Expressed at high levels during stationary growth phase, binds σ70-containing RNA polymerase
- Found in Proteobacteria, Firmicutes, Cyanobacteria, Bacteroidetes, Chlamydiae, ...

Barrick et al. (2005) 6S RNA is a widespread regulator of eubacterial RNA polymerase that resembles an open promoter. RNA.

Page 13: 01 nc rna-intro

RNA sponges: CsrB RNA and CsrA

[Figure: consensus secondary structure of CsrB RNA, drawn from the 5´ end, with the repeated AGGA-containing CsrA_binding motifs marked.]

Toledo-Arana et al. (2007) Small noncoding RNAs controlling pathogenesis. Current Opinion in Microbiology.

Page 14: 01 nc rna-intro

Other RNA sponges: sRNA & miRNAs

- Lalaouna et al. (2015) A 3′ External Transcribed Spacer in a tRNA Transcript Acts as a Sponge for Small RNAs to Prevent Transcriptional Noise. Molecular Cell.
- Hansen et al. (2013) Natural RNA circles function as efficient microRNA sponges. Nature.

Hentze & Preiss (2013) Circular RNAs: splicing’s enigma variations. The EMBO Journal.

Page 15: 01 nc rna-intro

Guide RNAs: snoRNAs, miRNAs, snRNAs & CRISPRs

Page 16: 01 nc rna-intro

snoRNAs: small nucleolar RNAs

- Identified and characterised in the late 1990s
- They guide covalent modifications on rRNA, spliceosomal RNAs and some mRNAs
- Two main classes: C/D box & H/ACA box
- C/D box snoRNAs guide methylation reactions
- H/ACA box snoRNAs guide pseudouridylation reactions

Gardner, Bateman & Poole (2010) SnoPatrol: how many snoRNA genes are there? Journal of Biology.

Page 17: 01 nc rna-intro

miRNAs

Page 18: 01 nc rna-intro

miRNAs

Page 19: 01 nc rna-intro

Exercise: Rfam database

- Navigate to: http://rfam.xfam.org/family/mir-10
- Correct any errors you find in the Wikipedia entry
- What species is miR-10 found in?
- How many sequences are in the seed alignment? How many regions are annotated as miR-10?
- Who added this family to the Rfam database? How was the consensus structure produced?

Page 20: 01 nc rna-intro

snRNAs and the spliceosome

- Spliceosome core: snRNAs (small nuclear RNAs) U1, U2, ..., U6
- Small RNAs: both recognition (base pairing) and catalysis
- Recognize splice sites, interact with each other and with proteins

Figure from: https://commons.wikimedia.org/wiki/File:Spliceosome ball cycle new2.jpg

Page 21: 01 nc rna-intro

snRNAs and the spliceosome

Figure from: https://www.mpibpc.mpg.de/luehrmann

Page 22: 01 nc rna-intro

Zimmerly & Semper (2015) Evolution of group II introns. Mobile DNA

Page 23: 01 nc rna-intro

CRISPR/CAS background

- CRISPR-Cas: clustered, regularly interspaced short palindromic repeats
- Adaptive immunity for about 1/2 of bacteria & most archaea

Westra, Buckling & Fineran (2014) CRISPR-Cas systems: beyond adaptive immunity. Nature Reviews Microbiology.
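
The "clustered, regularly interspaced" part of the name can be illustrated with a toy repeat finder: collect every k-mer that recurs several times and look at the distances between consecutive copies. This is only a sketch under stated assumptions. The repeat and spacer sequences are made up and far shorter than real ones, the repeat length is taken as known, and the helper names `repeated_kmers` and `spacings` are hypothetical. Dedicated CRISPR detection tools handle imperfect repeats and unknown lengths.

```python
from collections import defaultdict


def repeated_kmers(sequence: str, k: int, min_copies: int = 3) -> dict[str, list[int]]:
    """Return k-mers seen at least min_copies times, with their 0-based start positions."""
    positions = defaultdict(list)
    for start in range(len(sequence) - k + 1):
        positions[sequence[start:start + k]].append(start)
    return {kmer: starts for kmer, starts in positions.items()
            if len(starts) >= min_copies}


def spacings(starts: list[int]) -> list[int]:
    """Distances between consecutive copies of a repeat."""
    return [b - a for a, b in zip(starts, starts[1:])]


# Hypothetical toy array: one 8-nt repeat separated by three unique 6-nt spacers.
repeat = "GUUCCAAC"
spacers = ["AAAGGG", "CUCUGA", "UGAUCG"]
array = repeat + "".join(spacer + repeat for spacer in spacers)

hits = repeated_kmers(array, k=len(repeat), min_copies=4)
for kmer, starts in hits.items():
    print(kmer, starts, spacings(starts))
```

Equal spacings between copies correspond to "regularly interspaced"; the unique sequence between copies plays the role of the spacers.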

Page 24: 01 nc rna-intro

Exercise: inside-out genes and mir-sno clusters

- Check out the following genes in the UCSC genome browser:
  - GAS5, SNHG1, TRG11
  - SNORD116, miR-17, miR-200
- Are these genes conserved?
- Find your favourite RNA sequences in RNAcentral
  - Explore the different sources of information

Page 25: 01 nc rna-intro

Example 2: bacterial attenuators

- Riboswitches
- Thermoregulators
- Peptide leaders

Page 26: 01 nc rna-intro

Riboswitches

Page 27: 01 nc rna-intro

RNA discovery & characterisation is accelerating

- Applications
  - Genome annotation (mRNAs, ncRNAs, splice forms, UTRs)
  - Quantification (Listen to Alicia Oshlack)
- Extensions
  - Infer RNA structure (SHAPE) (Lucks et al. (2011))
  - RNA:RNA (CLASH) (Travis et al. (2014))
  - RNA:protein (RIP-seq) (Cook et al. (2015))

Page 28: 01 nc rna-intro

RNA-seq identifies 1,000s of new RNAs

[Figure: three RNAs of unknown function (RUFs) identified from RNA-seq data. Panels A and D: Enterobacteriaceae RUF; panels B and E: Pseudomonas RUF; panels C and F: Xanthomonadaceae RUF. The panels show consensus secondary structures (annotated with a terminator, a GNRA tetraloop and a variable region of 0-48 BPs) and genomic alignments coloured by number of RNA-seq reads (0-9, 10-99, 100-999, 1,000-4,999, >5,000).
Gene-context labels: SraB, yceD, rpmF, E.RUF, plsX; rmf RNA motif, rmf, P.RUF, pyrD; secY, X.RUF, rpsM.
Species: E. coli K12, E. coli E24377A, C. rodentium, S. enterica, K. pneumoniae; P. putida, P. aeruginosa PAO1, P. aeruginosa PA14; S. maltophilia, X. axonopodis.
Legend: nucleotide present and nucleotide identity thresholds (70%, 80%, 90%); base pair annotations: covarying mutations, compatible mutations, no mutations observed; R = A or G, Y = C or U.]

Lindgreen et al. (2014) Robust identification of noncoding RNA from transcriptomes requires phylogenetically-informed sampling. PLOS Computational Biology.
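
The figure legend on this slide distinguishes covarying mutations, compatible mutations and no mutations observed for annotated base pairs. The slide does not define these terms, so the sketch below assumes one common reading: a pair of alignment columns is "covarying" when two observed canonical pairs differ at both positions (for example G-C and A-U), and "compatible" when only one side changes but pairing is kept (for example G-C and G-U). The function name `classify_pair` and the example columns are hypothetical.

```python
# Watson-Crick pairs plus the G-U wobble pair.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def classify_pair(column_i: str, column_j: str) -> str:
    """Classify two alignment columns that are annotated as a base pair.

    column_i and column_j hold one nucleotide per aligned sequence.
    Gaps and non-canonical combinations are ignored in this sketch.
    """
    pairs = {(a, b) for a, b in zip(column_i.upper(), column_j.upper())
             if (a, b) in CANONICAL}
    if len(pairs) <= 1:
        return "no mutations observed"
    # Covariation: some two observed pairs differ at both positions.
    for a1, b1 in pairs:
        for a2, b2 in pairs:
            if a1 != a2 and b1 != b2:
                return "covarying mutations"
    # Otherwise only one side varies while pairing is preserved.
    return "compatible mutations"


print(classify_pair("GGA", "CCU"))
print(classify_pair("GGG", "CCU"))
print(classify_pair("GGG", "CCC"))
```

Covarying pairs are the stronger evidence for a conserved structure, because both sides had to change to keep the pair intact.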

Page 29: 01 nc rna-intro

Non-coding RNA resources

Page 30: 01 nc rna-intro

Opportunities

- Rfam/RNAcentral are hiring:
  - Project Leader
  - Software Developer
  - Database Biocurator

Page 31: 01 nc rna-intro

Some open questions

- How much transcription is “functional”?
- What’s a good negative control for transcriptome experiments?
- What causes variation in [protein]:[mRNA] ratios?

Lu, Vogel et al. (2007) Absolute protein expression profiling estimates the relative contributions of transcriptional and translational regulation. Nature Biotechnology.

Page 32: 01 nc rna-intro

Thanks
