Introduction to ncRNAs
Paul Gardner
August 10, 2015
Paul Gardner intro2ncRNA
Where on the bio/math spectrum do you lie?
What is RNA?
[Figure: chemical structures of an RNA and a DNA nucleotide. Substituents: R1 = −OH (RNA), −H (DNA); R2 = −H (RNA), −CH3 (DNA).]
IUPAC ambiguity chars:
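The standard IUPAC nucleotide ambiguity codes can be written as a small lookup table. The sketch below does that for RNA (U in place of T) and checks a sequence against a consensus string; the helper name `matches_consensus` and the example sequences are hypothetical and serve only as illustration.

```python
# Standard IUPAC nucleotide ambiguity codes, written for RNA (U instead of T).
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG",    # puRine
    "Y": "CU",    # pYrimidine
    "S": "CG",    # Strong pairing bases
    "W": "AU",    # Weak pairing bases
    "K": "GU",    # Keto
    "M": "AC",    # aMino
    "B": "CGU",   # not A
    "D": "AGU",   # not C
    "H": "ACU",   # not G
    "V": "ACG",   # not U
    "N": "ACGU",  # any base
}


def matches_consensus(sequence: str, consensus: str) -> bool:
    """Return True if every base in `sequence` is allowed by the
    IUPAC symbol at the same position in `consensus`."""
    if len(sequence) != len(consensus):
        return False
    return all(base in IUPAC_RNA[symbol]
               for base, symbol in zip(sequence.upper(), consensus.upper()))


print(matches_consensus("GAAAAGCU", "RAAAARCY"))  # True
print(matches_consensus("CAAAAGCU", "RAAAARCY"))  # False: C is not a purine
```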
Crick’s “central dogma of molecular biology”
RNA: why is this stuff interesting?
- The RNA world was an essential step towards modern protein- and DNA-based life (under current reasonable models).
- Which came first, DNA or protein?
- RNA has catalytic potential (like protein) and carries hereditary information (like DNA).
Image by James W. Brown, www.mbio.ncsu.edu/JWB/soup.html
RNA: why is this stuff interesting?
- RNA fulfills all essential biochemical functions of life:
  1. Information storage and replication
  2. Enzymatic activity: catalyze biochemical reactions
  3. Regulator: sense and react to environment
- RNA World originally proposed in 1986
RNA: representations
C A C G G C G U A A U G C A G G U C G G G G A U A U U G U A U G C U C C A G G C U C G U U G U C U A U G G G G A U U C G U C A C U
[Figure: the same sequence drawn in alternative structure representations.]
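A common plain-text representation of RNA secondary structure is dot-bracket notation, in which matched parentheses mark base pairs and dots mark unpaired bases. The sketch below reads such a string into a list of pairs; the toy hairpin sequence and structure are made up for illustration and are not taken from the slides.

```python
def parse_dot_bracket(structure: str) -> list[tuple[int, int]]:
    """Convert a dot-bracket string into a sorted list of 0-based (i, j) base pairs."""
    stack = []   # indices of '(' still waiting for a partner
    pairs = []
    for j, char in enumerate(structure):
        if char == "(":
            stack.append(j)
        elif char == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {j}")
            pairs.append((stack.pop(), j))
        elif char != ".":
            raise ValueError(f"unexpected character {char!r} at position {j}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


# Hypothetical toy hairpin: a 4-bp stem closed by a GAAA loop.
sequence = "GGCCGAAAGGCC"
structure = "((((....))))"

# Print each pair with 1-based positions, e.g. G1-C12.
for i, j in parse_dot_bracket(structure):
    print(f"{sequence[i]}{i + 1}-{sequence[j]}{j + 1}")
```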
Types of RNA
Hoeppner, Gardner & Poole (2012) Comparative Analysis of RNA Families Reveals Distinct Repertoires for Each
Domain of Life. PLOS Computational Biology.
Molecular fossils
Hoeppner, Gardner & Poole (2012) Comparative Analysis of RNA Families Reveals Distinct Repertoires for Each
Domain of Life. PLOS Computational Biology.
Non-coding RNAs are often poorly conserved
- Poor homology search tools or genuine evolutionary turnover?
[Figure: Conservation of RNAs & Proteins in bacterial genomes, panels A, B and C. Series: Pfam (N=6671), Rfam (N=331). Axes: conserved families (%), frequency of RNA-seq species, phylogenetic distance (0.0 to 0.6), taxonomic rank (Kingdom, Phylum, Class, Order, Family, Genus).]
Lindgreen et al. (2014) Robust Identification of Noncoding RNA from Transcriptomes Requires
Phylogenetically-Informed Sampling. PLOS Computational Biology.
Some example mechanisms for RNA function
- RNA sponges
  - RNA:protein (6S RNA, CsrB)
  - RNA:RNA (circRNAs, spacer)
- Guide RNAs
  - CRISPRs
  - snoRNAs
  - microRNA
  - spliceosomal RNAs
- Attenuators
  - Riboswitches
  - Thermosensors
  - Peptide leaders
[Figure: research cycle (Gather data, Analyze-Classify, Hypotheses-Predictions, Experiment), shown alongside an example RNA sequence, consensus structure and sequence logo.]
RNA sponges: 6S RNA, a bacterial sRNA
- One of the first ncRNAs to be identified & sequenced (Hindley 1967 & Brownlee 1971)
- Function wasn’t determined until 2002 (Wasserman 2002)
- Structure mimics an open promoter
- A major regulator of gene expression
- Expressed at high levels during stationary growth phase; binds σ70 with RNA polymerase
- Found in Proteobacteria, Firmicutes, Cyanobacteria, Bacteroidetes, Chlamydiae, ...
Barrick et al. (2005) 6S RNA is a widespread regulator of eubacterial RNA polymerase that resembles an open
promoter. RNA.
RNA sponges: CsrB RNA and CsrA
[Figure: consensus secondary structure of CsrB RNA with its repeated CsrA_binding motifs (recurring AGGA elements).]
Toledo-Arana et al. (2007) Small noncoding RNAs controlling pathogenesis. Current Opinion in Microbiology.
Other RNA sponges: sRNA & miRNAs
- Lalaouna et al. (2015) A 3′ External Transcribed Spacer in a tRNA Transcript Acts as a Sponge for Small RNAs to Prevent Transcriptional Noise. Molecular Cell.
- Hansen et al. (2013) Natural RNA circles function as efficient microRNA sponges. Nature.
Hentze & Preiss (2013) Circular RNAs: splicing’s enigma variations. The EMBO Journal.
Guide RNAs: snoRNAs, miRNAs, snRNAs & CRISPRs
snoRNAs: small nucleolar RNAs
- Identified and characterised in the late 1990s
- They guide covalent modifications on rRNA, spliceosomal RNAs and some mRNAs
- Two main classes: C/D box & H/ACA box
- C/D box snoRNAs guide methylation reactions
- H/ACA box snoRNAs guide pseudouridylation reactions
Gardner, Bateman & Poole (2010) SnoPatrol: how many snoRNA genes are there? Journal of Biology.
miRNAs
Exercise: Rfam database
- Navigate to: http://rfam.xfam.org/family/mir-10
- Correct any errors you find in the Wikipedia entry
- What species is miR-10 found in?
- How many sequences are in the seed alignment? How many regions are annotated as miR-10?
- Who added this family to the Rfam database? How was the consensus structure produced?
snRNAs and the spliceosome
- Spliceosome core: snRNAs (small nuclear RNAs) U1, U2, ... U6
- Small RNAs: both recognition (base pairing) and catalysis
- Recognize splice sites, interact with each other and with other proteins
Figure from: https://commons.wikimedia.org/wiki/File:Spliceosome_ball_cycle_new2.jpg
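Recognition by base pairing can be illustrated with a small check of whether a guide strand can pair, in antiparallel orientation, with a target site using Watson-Crick and G-U wobble pairs. This is a toy sketch: the function name `can_pair` and the example sequences are hypothetical, and real target prediction also has to account for thermodynamics and structure.

```python
# Canonical RNA base pairs: Watson-Crick plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(guide: str, target: str) -> bool:
    """Return True if `guide` (5'->3') can pair along its full length
    with `target` (5'->3') in antiparallel orientation."""
    if len(guide) != len(target):
        return False
    # Antiparallel: guide position i faces target position len-1-i.
    return all((g, t) in PAIRS for g, t in zip(guide, reversed(target)))


print(can_pair("GUACCU", "AGGUAC"))  # True
print(can_pair("GUACCU", "AGGAAC"))  # False: the third position would be an A-A mismatch
```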
snRNAs and the spliceosome
Figure from: https://www.mpibpc.mpg.de/luehrmann
Zimmerly & Semper (2015) Evolution of group II introns. Mobile DNA
CRISPR/Cas background
- CRISPR-Cas: clustered, regularly interspaced short palindromic repeats
- Adaptive immunity for 1/2 of bacteria & most archaea
Westra, Buckling & Fineran (2014) CRISPR-Cas systems: beyond adaptive immunity. Nature Reviews Microbiology.
Exercise: inside-out genes and mir-sno clusters
- Check out the following genes in the UCSC genome browser:
  - GAS5, SNHG1, TRG11
  - SNORD116, miR-17, miR-200
- Are these genes conserved?
- Find your favourite RNA sequences in RNAcentral
- Explore the different sources of information
Example 2: bacterial attenuators
- Riboswitches
- Thermoregulators
- Peptide leaders
Riboswitches
RNA discovery & characterisation is accelerating
- Applications
  - Genome annotation (mRNAs, ncRNAs, splice forms, UTRs)
  - Quantification (Listen to Alicia Oshlack)
- Extensions
  - Infer RNA structure (SHAPE) (Lucks et al. (2011))
  - RNA:RNA (CLASH) (Travis et al. (2014))
  - RNA:protein (RIP-seq) (Cook et al. (2015))
RNA-seq identifies 1,000s of new RNAs
[Figure: RUFs identified from RNA-seq data.
Panels A-C: consensus secondary structures of A. Enterobacteriaceae RUF, B. Pseudomonas RUF, C. Xanthomonadaceae RUF (labelled features: terminator, GNRA tetraloop, 0-48 BPs).
Panels D-F: genomic context and alignments for D. Enterobacteriaceae RUF (gene order SraB, yceD, rpmF, E.RUF, plsX; E. coli K12, E. coli E24377A, C. rodentium, S. enterica, K. pneumoniae), E. Pseudomonas RUF (rmf RNA motif, rmf, P.RUF, pyrD; P. putida, P. aeruginosa PAO1, P. aeruginosa PA14), F. Xanthomonadaceae RUF (secY, X.RUF, rpsM; S. maltophilia, X. axonopodis).
Legend: R = A or G, Y = C or U; nucleotide present and nucleotide identity thresholds; base-pair annotations (covarying mutations, compatible mutations, no mutations observed); gaps; number of RNA-seq reads (0-9, 10-99, 100-999, 1,000-4,999, >5,000).]
Lindgreen et al. (2014) Robust identification of noncoding RNA from transcriptomes requires phylogenetically-informed sampling. PLOS Computational Biology.
Non-coding RNA resources
Opportunities
- Rfam/RNAcentral are hiring:
  - Project Leader
  - Software Developer
  - Database Biocurator
Some open questions
- How much transcription is "functional"?
- What’s a good negative control for transcriptome experiments?
- What causes variation in [protein]:[mRNA] ratios?
Lu, Vogel et al. (2007) Absolute protein expression profiling estimates the relative contributions of transcriptional and translational regulation. Nature Biotechnology.
Thanks