A Brief Overview of RNA Bioinformatics
Transcript of A Brief Overview of RNA Bioinformatics
RNA Bioinformatics · S. Will
A Brief Overview of RNA Bioinformatics
Sebastian Will, University of Vienna
Freiburg sRNA Meeting 2019
The Central Dogma (of RNA Bioinformatics)
Sequence ⇒ Structure ⇒ Function
Structure as Proxy of Function
RNA Structure is Emergent
[Figure: secondary structure drawings of three closely related sequences; the mutated positions are labeled inconsistent, consistent, and compensatory.]
Almost identical sequences: very different structures
Very different sequences: same structure
No Shortcut “Sequence ⇒ Function”
More Dogmas
The world is simple!∗
∗in first approximation
Viable Shortcut
Sequence ⇒ 2D Structure ⇒ Function
[Figure: two drawings of a tRNA with the D-loop, T-loop, and acceptor stem labeled.]
Energies of RNA structures can be calculated
Nearest Neighbor Model (NNM)
• Free energies = sum of loop energies
• Loop energies measured experimentally (based on UV melting curves)
• Loop energies depend on loop type, size, and base composition
⇒ large energy parameter tables
[Figure: example structure with loop energy contributions −3.4, −3.3, +3.5, +1.2, and −2.4 kcal/mol]
Total energy: −4.4 kcal/mol
+ distinguish RNA structures by free energy
+ define minimum free energy (MFE)
+ basis for entire tool set for RNA structure
- limitations (stay tuned)
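The additivity rule of the nearest neighbor model can be checked on the example from this slide. The following is a minimal sketch, assuming only the five loop contributions shown in the figure; the function name `total_free_energy` is illustrative, and the slide does not say which value belongs to which loop type, so the values are kept as a plain list.

```python
# Loop energy contributions (kcal/mol) as shown in the slide's example.
# Which value belongs to which loop is not recoverable from the transcript,
# so the list is left unlabeled.
loop_energies = [-3.4, -3.3, +3.5, +1.2, -2.4]


def total_free_energy(energies):
    """Nearest neighbor model: free energy = sum of the loop energies."""
    return sum(energies)


total = total_free_energy(loop_energies)
print(f"Total energy: {total:.1f} kcal/mol")  # Total energy: -4.4 kcal/mol
```

Stabilizing loops (stacked base pairs) contribute negative terms, destabilizing loops positive terms; the structure's energy is just their sum.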
From NNM to Structure Prediction
GGGCUAUUAGCUCAGUUGGUUAGAGCGCACCCCUGAUAAGGGUGAGGUCGCUGAUUCGAAUUCAGCAUAGCCCA
[Figure: grid of alternative secondary structures for this sequence. Their free energies in kcal/mol:]
-25.90 -25.90 -25.90 -26.70 -26.30 -27.00 -27.00 -27.00 -27.80 -26.10 -26.20 -26.20
-26.20 -26.20 -25.90 -25.90 -25.90 -25.90 -26.70 -26.10 -26.60 -26.60 -26.60 -26.00
-27.40 -25.90 -26.50 -26.40 -28.10 -26.40 -28.10 -26.40 -28.10 -26.40 -26.40 -28.90
-27.20 -27.30 -25.90 -26.50 -26.50 -26.20 -26.20 -26.20 -26.20 -27.00 -25.90 -26.00
-26.10 -26.10 -26.60 -26.10 -26.10 -26.60 -27.00 -27.00 -26.70 -26.70 -26.70 -26.70
-27.50 -26.10 -26.50 -26.40 -26.40 -26.10 -26.10 -26.10 -26.10 -26.90
MFE: the structure with the lowest free energy (−28.90 kcal/mol among the values listed above)
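Given the energies of candidate structures, the MFE is simply the minimum. The sketch below uses the 70 values listed on this slide; the variable names are illustrative, and the sketch does not reproduce how the candidate structures were generated.

```python
# Free energies (kcal/mol) of the candidate structures listed on the slide.
ENERGY_TEXT = """
-25.90 -25.90 -25.90 -26.70 -26.30 -27.00 -27.00 -27.00 -27.80 -26.10 -26.20 -26.20
-26.20 -26.20 -25.90 -25.90 -25.90 -25.90 -26.70 -26.10 -26.60 -26.60 -26.60 -26.00
-27.40 -25.90 -26.50 -26.40 -28.10 -26.40 -28.10 -26.40 -28.10 -26.40 -26.40 -28.90
-27.20 -27.30 -25.90 -26.50 -26.50 -26.20 -26.20 -26.20 -26.20 -27.00 -25.90 -26.00
-26.10 -26.10 -26.60 -26.10 -26.10 -26.60 -27.00 -27.00 -26.70 -26.70 -26.70 -26.70
-27.50 -26.10 -26.50 -26.40 -26.40 -26.10 -26.10 -26.10 -26.10 -26.90
"""

energies = [float(tok) for tok in ENERGY_TEXT.split()]

# The MFE structure is the candidate with the lowest free energy.
mfe = min(energies)
mfe_index = energies.index(mfe)

print(f"{len(energies)} candidates, MFE = {mfe:.2f} kcal/mol (candidate #{mfe_index})")
```

Many candidates lie close to the minimum, which is why the later slides look beyond the single MFE structure.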
Structure Prediction: Fast and Accurate
Performance of RNAfold (Vienna RNA Package 2.0)
[Figure: log-log plot of run time [s] against sequence length (10^2 to 10^4), annotated with length 100 in 0.01 s and length 1000 in 1 s.]
[adapted from Vienna RNA Package 2.0, ALMOB 2011]
• Very fast folding algorithms: 0.01 seconds at length 100
• Very useful accuracy: ∼ 70% predicted base pairs correct
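The two annotated points of the runtime plot give a rough empirical growth rate. This is a back-of-the-envelope sketch from only those two points, assuming a power law t ∝ n^k between them; it is not a statement about the algorithm's worst-case complexity.

```python
import math

# (sequence length, run time in seconds), as annotated on the slide
n1, t1 = 100, 0.01
n2, t2 = 1000, 1.0

# For t = c * n**k, the exponent k is the slope in the log-log plot.
exponent = math.log(t2 / t1) / math.log(n2 / n1)

print(f"empirical exponent between the two points: {exponent:.1f}")  # 2.0
```

A tenfold increase in length costs about a hundredfold in time over this range.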
[Figure: sensitivity (0 to 1) per RNA class: 16S rRNA, 23S rRNA, 5S rRNA, 7SK RNA, Cili. Telo. RNA, Cis-reg. element, GII Intron, GI Intron, Hairp. Ribozyme, Ham. Ribozyme, IRES, Other Ribozyme, Other RNA, Other rRNA, RNAIII, RNase E 5′ UTR, RNase MRP RNA, RNase P RNA, snRNA, SRP RNA, Synthetic RNA, tmRNA, tRNA, Viral & Phage, Y RNA.]
[Figure: positive predictive value (PPV, 0 to 1) for the same RNA classes as in the sensitivity plot.]
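Sensitivity and PPV of a predicted structure are computed per base pair against a reference structure. The sketch below assumes structures in dot-bracket notation without pseudoknots; the two example strings are hypothetical and not taken from the benchmark.

```python
def base_pairs(dot_bracket):
    """Return the set of base pairs (i, j) of a nested dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def sensitivity_ppv(predicted, reference):
    """Sensitivity = TP / |reference pairs|, PPV = TP / |predicted pairs|."""
    pred, ref = base_pairs(predicted), base_pairs(reference)
    true_positives = len(pred & ref)
    sensitivity = true_positives / len(ref) if ref else 0.0
    ppv = true_positives / len(pred) if pred else 0.0
    return sensitivity, ppv


# Hypothetical example: the prediction misses the innermost reference pair.
reference = "((((...))))"
predicted = "(((.....)))"
sens, ppv = sensitivity_ppv(predicted, reference)
print(f"sensitivity = {sens:.2f}, PPV = {ppv:.2f}")  # sensitivity = 0.75, PPV = 1.00
```

Sensitivity penalizes missed reference pairs, PPV penalizes predicted pairs that are not in the reference, so the two plots are read together.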
Limitations
• Modified bases
• Non-canonical base pairs
• Pseudoknots
[Figure: example structure drawings, including a pseudoknot.]
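A pseudoknot is a structure in which two base pairs cross, that is, pairs (i, j) and (k, l) with i < k < j < l. The sketch below checks a set of base pairs for such a crossing; the function name and the example pairs are illustrative.

```python
def has_pseudoknot(pairs):
    """True if any two base pairs cross: i < k < j < l."""
    # Normalize each pair to (smaller, larger) and sort by opening position.
    ordered = sorted((min(i, j), max(i, j)) for i, j in pairs)
    for a, (i, j) in enumerate(ordered):
        for k, l in ordered[a + 1:]:
            if i < k < j < l:
                return True
    return False


nested = [(0, 9), (1, 8), (2, 7)]    # a plain hairpin stem
crossing = [(0, 5), (1, 4), (3, 8)]  # (0, 5) and (3, 8) cross

print(has_pseudoknot(nested))    # False
print(has_pseudoknot(crossing))  # True
```

Nested structures can be written with a single kind of bracket in dot-bracket notation; crossing pairs cannot.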
RNAs Refold at 'Room Temperature'
[Figure: the MFE structure and alternative structures of the same sequence, with energy labels +0.8, +1.1, +1.7, and +2.5.]
The MFE misleads! Look at
• suboptimal structures
• structure ensembles
• kinetics (co-transcriptional!)
Suboptimals and Probabilities
GGGCUAUUAGCUCAGUUGGUUAGAGCGCACCCCUGAUAAGGGUGAGGUCGCUGAUUCGAAUUCAGCAUAGCCCA
Energies → Structure Probabilities
Structure Probabilities → Base Pair Probabilities
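Both arrows can be illustrated with Boltzmann weights: a structure with energy E gets probability exp(−E/RT)/Z, and the probability of a base pair is the summed probability of all structures containing it. The sketch below is a toy illustration under stated assumptions: the three structures and their energies are hypothetical, the temperature is assumed to be 37 °C, and the ensemble is enumerated explicitly rather than covering all structures of a sequence.

```python
import math
from collections import defaultdict

# Gas constant in kcal/(mol*K) times temperature; 37 degrees C is an assumption.
RT = 0.0019872 * 310.15


def base_pairs(dot_bracket):
    """Return the set of base pairs (i, j) of a nested dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def structure_probabilities(energies):
    """Energies -> structure probabilities via Boltzmann weights."""
    weights = [math.exp(-e / RT) for e in energies]
    z = sum(weights)  # partition function of this (toy) ensemble
    return [w / z for w in weights]


def base_pair_probabilities(structures, probabilities):
    """Structure probabilities -> base pair probabilities."""
    bpp = defaultdict(float)
    for structure, p in zip(structures, probabilities):
        for pair in base_pairs(structure):
            bpp[pair] += p
    return dict(bpp)


# Hypothetical toy ensemble: structure -> free energy in kcal/mol.
ensemble = {"((...))": -2.0, "(.....)": -1.0, ".......": 0.0}
structures = list(ensemble)
probs = structure_probabilities([ensemble[s] for s in structures])
bpp = base_pair_probabilities(structures, probs)

for pair, p in sorted(bpp.items()):
    print(pair, round(p, 3))
```

The outer pair (0, 6) occurs in two of the three structures, so its probability exceeds that of the MFE structure alone; this is the kind of information the single MFE structure hides.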
[Figure: a series of suboptimal secondary structures of this sequence.]
UCGCU G
AUU
CGAAUUC
AGCAUAGCCCA
GGGCUAUUAGCU
CAGUUGGU
UAGA
GCGCACCCCU
GA U
AAGGGUGAG
GUCGCU G
AUU
CGAAUUC
AGCAUAGCCCA
GGGCUAUUAGCU
CAGUUGGU
UAGA
GCGCACCCC
UG A U
AA
GGGUGAGGUC
GCU G
AUU
CGAAUUC
AGCAUAGCCCA
GGGCUAUUAGCU
CAGUUG
GUUA
G AGCGCACCCCU
GA U
AAGGGUGAGG
UCGCU G
AUU
CGA
AUUC
AGCAUAGCCCA
GGGCUAUUAGCU
CAGUUG
GUUA
G AGCGCACCCCU
GA U
AAGGGUGAG
GUCGCU G
AUU
CGA
AUUC
AGCAUAGCCCA
GGGCUAUUAGCU
CAGUUG
GUUA
G AGCGCACCCCU
GA U A
AGGGUGAG
GUCGCU G
AUU
CGA
AUUC
AGCAUAGCCCA
GGGCUAUUAGCU
CAGUUG
GUUA
G AGCGCACCCCU
GA U
AAGGGUGAGGUC
GCU G
AUU
CGA
AUUC
AGCAUAGCCCA
GGGCUAUUAGCU
CAGUUG
GUUA
G AGCGCACCCCU
GA U
AAGGGUGAGGUC
GCU G
AUU
CGA
AUUC
AGCAUAGCCCA
GGGCUAUUAGCU
CAGUUG
GUUA
G AGCGCACCCCU
GA U
AAGGGUGAGGUC
GCU G
AUU
CGA
AUUC
AGCAUAGCCCA
GGGCUAUUAGCU
CAGUUG
GUUA
G AGCGCACCCC
UG A U
AA
GGGUGAGGUC
GCU G
AUU
CGA
AUUC
AGCAUAGCCCA
GGGCUAUUAGCU
CAGUU
GGUUAG
AGC
GCACCCC
UG A U
AA
GGGUGA
GGUC
GCUGAUUCGA
AUUC
AGCAUAGCCCA
GGGCUAUUAGCU
CAGUUGGUUAG
AGC
GCACCCC
UG A U
AA
GGGUGA
GGUC
GCUGAUUCGAAUUC
AGCAUAGCCCA
GGGCUAUUAGCU
CAGUUGGU
UAGA
GCGCACCCCU
GA U
AAGGGUGAGG
UCGCU G
AUU
CGAAUUC
AGCAUAGCCCA
GGGCUAUUAGCU
CAGUUGGU
UAGA
GCGCACCCCU
GA U
AAGGGUGAG
GUCGCU G
AUU
CGAAUUC
AGCAUAGCCCA
GGGCUAUUAGCU
CAGUUGGU
UAGA
GCGCACCCCU
GA U A
AGGGUGAG
GUCGCU G
AUU
CGAAUUC
AGCAUAGCCCA
GGGCUAUUAGCU
CAGUUGGU
UAGA
GCGCACCCCU
GA U
AAGGGUGAGGUC
GCU G
AUU
CGAAUUC
AGCAUAGCCCA
GGGCUAUUAGCU
CAGUUGGU
UAGA
GCGCACCCCU
GA U
AAGGGUGAGGUC
GCU G
AUU
CGAAUUC
AGCAUAGCCCA
GGGCUAUUAGCU
CAGUUGGU
UAGA
GCGCACCCCU
GA U
AAGGGUGAGGUC
GCU G
AUU
CGAAUUC
AGCAUAGCCCA
GGGCUAUUAGCU
CAGUUGGU
UAGA
GCGCACCCC
UG A U
AA
GGGUGAGGUC
GCU G
AUU
CGAAUUC
AGCAUAGCCCA
-25.90 -25.90 -25.90 -26.70 -26.30 -27.00 -27.00 -27.00 -27.80 -26.10 -26.20 -26.20
-26.20 -26.20 -25.90 -25.90 -25.90 -25.90 -26.70 -26.10 -26.60 -26.60 -26.60 -26.00
-27.40 -25.90 -26.50 -26.40 -28.10 -26.40 -28.10 -26.40 -28.10 -26.40 -26.40 -28.90
-27.20 -27.30 -25.90 -26.50 -26.50 -26.20 -26.20 -26.20 -26.20 -27.00 -25.90 -26.00
-26.10 -26.10 -26.60 -26.10 -26.10 -26.60 -27.00 -27.00 -26.70 -26.70 -26.70 -26.70
-27.50 -26.10 -26.50 -26.40 -26.40 -26.10 -26.10 -26.10 -26.10 -26.90
Energies → Structure Probabilities
Structure Probabilities → Base Pair Probabilities
RN
AB
ioin
form
ati
cs·
S.
Wil
l·
12
Suboptimals and Probabilities
GGGCUAUUAGCUCAGUUGGUUAGAGCGCACCCCUGAUAAGGGUGAGGUCGCUGAUUCGAAUUCAGCAUAGCCCA
[Figure: 70 suboptimal secondary structures of this sequence, drawn with their free energies, which range from -28.90 to -25.90 kcal/mol]
Energies → Structure Probabilities
Structure Probabilities → Base Pair Probabilities
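The two arrows above can be illustrated with a small sketch. It assumes that structure probabilities follow a Boltzmann distribution over free energies, p(S) = exp(-E(S)/RT) / Z, and that the probability of a base pair is the sum of the probabilities of all structures containing it. The three dot-bracket structures and their energies are hypothetical toy values, not data from the slide, and the function names are illustrative. Restricting the sum to an enumerated list of structures is only an approximation; partition function algorithms sum over the full ensemble.

```python
import math

# RT at 37 degrees C in kcal/mol: gas constant (kcal/(mol*K)) times temperature (K).
RT = 0.0019872 * 310.15


def boltzmann_probabilities(energies):
    """Convert free energies (kcal/mol) into probabilities exp(-E/RT) / Z."""
    e_min = min(energies)  # shift by the minimum for numerical stability
    weights = [math.exp(-(e - e_min) / RT) for e in energies]
    z = sum(weights)  # partition function restricted to the given structures
    return [w / z for w in weights]


def base_pairs(dot_bracket):
    """Return the set of 0-based base pairs (i, j) of a dot-bracket string."""
    stack, pairs = [], set()
    for j, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            pairs.add((stack.pop(), j))
    return pairs


def base_pair_probabilities(structures, energies):
    """Sum the probability of every structure that contains a given pair."""
    probabilities = boltzmann_probabilities(energies)
    bpp = {}
    for structure, p in zip(structures, probabilities):
        for pair in base_pairs(structure):
            bpp[pair] = bpp.get(pair, 0.0) + p
    return bpp


# Hypothetical toy ensemble (not taken from the slide).
structures = ["((...))", "(.....)", "......."]
energies = [-2.0, -1.0, 0.0]

probs = boltzmann_probabilities(energies)
bpp = base_pair_probabilities(structures, energies)
for pair in sorted(bpp):
    print(pair, round(bpp[pair], 3))
```

In this toy ensemble the outer pair (0, 6) occurs in two structures, so its probability is the sum of both structure probabilities, while the inner pair (1, 5) inherits the probability of the first structure only.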
Dotplots and Reliabilities
[Figure: base pair probability dot plot of the sequence GGGCUAUUAGCUCAGUUGGUUAGAGCGCACCCCUGAUAAGGGUGAGGUCGCUGAUUCGAAUUCAGCAUAGCCCA (the sequence labels both axes), shown with a secondary structure drawing of the same sequence]
Integrating Prior Knowledge
• Knowledge on base pairing:
[Figure: three alternative secondary structure drawings of the sequence GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGAUCCACAGAAUUCGCACCA, linked by arrows]
• Structure probing experiments (e.g. SHAPE)
• Homology
Examp1 ----------CCGG-AAA-CCGAACGCAGCACCGCGG------AU-CUGGAACGC--
Examp2 ----------CGCU-AG--AACAAC-------UAUCU------GU-AGCGCGAAAAC
Examp3 ---------AUUGUGUA--GCAUU------AGUUUGC-------GUGCAAAGAACGC
Examp4 -------UGCCAUCGCAUUAGCACC---U-AGCCGCAUUUUCUGGCGAUGAUG----
Examp5 AGCACCGAACCGCAU----GCGAACUGAG-AA--CGCAACC----AUGCGCGCACC-
Comparative Analysis with Alifold
Examp1 ----------CCGG-AAA-CCGAACGCAGCACCGCGG------AU-CUGGAACGC--
Examp2 ----------CGCU-AG--AACAAC-------UAUCU------GU-AGCGCGAAAAC
Examp3 ---------AUUGUGUA--GCAUU------AGUUUGC-------GUGCAAAGAACGC
Examp4 -------UGCCAUCGCAUUAGCACC---U-AGCCGCAUUUUCUGGCGAUGAUG----
Examp5 AGCACCGAACCGCAU----GCGAACUGAG-AA--CGCAACC----AUGCGCGCACC-
⇓ Alifold
[Figure: Alifold result for the alignment: dot plot with the consensus sequence __________CGCU_AA__ACCAAC_____AGC_CGC_______G_GGCGAGAAC__ on both axes, and a drawing of the consensus secondary structure]
       ..........((((((...(((............))).......)))))).......
Examp1 ----------CCGG-AAA-CCGAACGCAGCACCGCGG------AU-CUGGAACGC--
Examp2 ----------CGCU-AG--AACAAC-------UAUCU------GU-AGCGCGAAAAC
Examp3 ---------AUUGUGUA--GCAUU------AGUUUGC-------GUGCAAAGAACGC
Examp4 -------UGCCAUCGCAUUAGCACC---U-AGCCGCAUUUUCUGGCGAUGAUG----
Examp5 AGCACCGAACCGCAU----GCGAACUGAG-AA--CGCAACC----AUGCGCGCACC-
[Fig. adapted from Vienna RNA Package 2.0, ALMOB 2011, alifold example]
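To make the link between the alignment and its consensus structure concrete, the following sketch counts, for every pair of columns that the consensus structure pairs, how many sequences can form a canonical base pair there and how many distinct pair types occur. This is only a counting illustration, not RNAalifold's actual scoring, which combines thermodynamic energies with a covariation term. The names CANONICAL, paired_columns and column_support are illustrative choices, not part of any library.

```python
# Watson-Crick and G-U wobble pairs count as compatible with a base pair.
CANONICAL = {"AU", "UA", "GC", "CG", "GU", "UG"}

# Alignment rows Examp1..Examp5 and the consensus structure from the slide.
ALIGNMENT = [
    "----------CCGG-AAA-CCGAACGCAGCACCGCGG------AU-CUGGAACGC--",
    "----------CGCU-AG--AACAAC-------UAUCU------GU-AGCGCGAAAAC",
    "---------AUUGUGUA--GCAUU------AGUUUGC-------GUGCAAAGAACGC",
    "-------UGCCAUCGCAUUAGCACC---U-AGCCGCAUUUUCUGGCGAUGAUG----",
    "AGCACCGAACCGCAU----GCGAACUGAG-AA--CGCAACC----AUGCGCGCACC-",
]
CONSENSUS = "..........((((((...(((............))).......))))))......."


def paired_columns(dot_bracket):
    """Return the sorted 0-based column pairs (i, j) of a dot-bracket string."""
    stack, pairs = [], []
    for j, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            pairs.append((stack.pop(), j))
    return sorted(pairs)


def column_support(rows, i, j):
    """Count rows with a canonical pair at columns i, j and the distinct pair types."""
    observed = [row[i] + row[j] for row in rows]
    compatible = [pair for pair in observed if pair in CANONICAL]
    return len(compatible), len(set(compatible))


support = {
    (i, j): column_support(ALIGNMENT, i, j) for i, j in paired_columns(CONSENSUS)
}
for (i, j), (n_compatible, n_types) in support.items():
    print(
        f"columns {i + 1:2d}-{j + 1:2d}: "
        f"{n_compatible}/{len(ALIGNMENT)} sequences pair, {n_types} pair type(s)"
    )
```

Column pairs where all sequences pair but with several different pair types are the compensatory mutations that support a conserved structure; columns where a sequence shows a gap or a non-canonical combination lower the support.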
Simultaneous Alignment and Folding (with LocARNA)
[Figure: secondary structure drawings and base pair probability dot plots of the two input sequences AC021639.5_181586-181505 and U67517.1_7511-7582]
Simultaneous Alignment and Folding (with LocARNA)
[Figure: secondary structure drawings and base pair probability dot plots of the seven input sequences AC021639.5_181586-181505, AP000063.1_59179-59095, AP000397.1_114390-114319, M10217.1_5910-5978, U67517.1_7511-7582, X03715.1_388-461 and X99256.1_11558-11626]
                         (((((((..(((.............))).(((((.......)))))..............
AC021639.5_181586-181505 GCAGUCGUGGCCGAGU---GGUUAAGGCGUCUGACUCGAAAUCAGAUUCCCUCUGGGAGC 57
AP000063.1_59179-59095   GCGGGGGUGCCCGAGCCUGGCCAAAGGGGUCGGGCUCAGGACCCGAUGGCGUAGGCCUGC 60
AP000397.1_114390-114319 UGGAGUAUAGCCAAG--UGG--UAAGGCAUCGGUUUUUGGUACCG---------GCAUGC 47
X03715.1_388-461         CGGAAAGUAGCUUAGCUUGG--UAGAGCACUCGGUUUGGGACCGA---------GGGGUC 49
U67517.1_7511-7582       GCCGGGGUGGGGUAGUGGCCAUCCUGG---GGGACUGUGGAUCCC----------CUGAC 47
X99256.1_11558-11626     GUAAACAUAGUUUA------AUCAAAACAUUAGAUUGUGAAUCUAA----------CAAU 44
M10217.1_5910-5978       AGUAAAGUCAGCUA------AAAAAGCUUUUGGGCCCAUACCCCAA----------ACAU 44
                         .........10........20........30........40........50........6

                         (((((.......)))))))))))).
AC021639.5_181586-181505 GUAGGUUCGAAUCCUACCGGCUGCG 82
AP000063.1_59179-59095   GUGGGUUCAAAUCCCACCCCCCGCA 85
AP000397.1_114390-114319 AAAGGUUCGAAUCCUUUUACUCCAG 72
X03715.1_388-461         GCAGGUUCGAAUCCUGUCUUUCCGA 74
U67517.1_7511-7582       CCGGGUUCAAUUCCCGGUCCCGGCC 72
X99256.1_11558-11626     AGAGGCUCGAAACCUCUUGCUUACC 69
M10217.1_5910-5978       GUUGGUUAAACCCCUUCCUUUACUA 69
                         0........70........80....
[Figure: consensus secondary structure drawing of the alignment, labeled with IUPAC consensus symbols]
RN
AB
ioin
form
ati
cs·
S.
Wil
l·
17
Simultaneous Alignment and Folding(with LocARNA)
g
c
a
g
u
c
gu
g
gcc
gagu
g
g
uu a a
g g c
gu
cu
gac
u
cg a
a
a
ucagau u
cc
c
u c
ug
gg
ag
c
g u a g gu u
c
gaa
u
ccuacc
g
g
c
u
g
c
g
g
c
g
g
g
g
gu
g
cccgagccuggcc
aa
ag g
g g u c g g g c u c ag g
acccgaug
gc
gu
a
ggc
cugcg u g g g
u uc
aaa
u
cccacc
c
c
c
c
g
c
a
u
g
g
a
g
u
aua
gccaa
gu g g u
aa g
g
c
a
u
c
g
g
uu
u
uu g
g
ua
c
c
ggca
u
g
ca a a g g
u uc
g
aau
ccuuuu
a
c
u
c
c
a
g
a
g
u
a
a
a
gu
c
agcuaa
a
aa
a g c uu
u
u
g
g
gc
c
ca u
a
cc
c
c
a
a
a c a uguug g u
ua
aacc
cc
uucc
u
u
u
a
c
u
a
g
ccggggugg
ggu
a
g
ug g
c c a u c c u g gg
gg
ac
ugug
ga
uc c
cc
ug a
c
ccg
gguu
caau
uc
cc
gg
uc
cc
g
g
cc
c
g
g
a
a
a
guagcu
uagcuu
gg
ua
g a g ca
c
u
c
g
g
u
u
ug
g
g
a
c
c
g
a
g g ggucg c a g g
u uc
g
aau
ccuguc
u
u
u
c
c
g
a
gu
aa
a
cauaguuuaauca
a
aa c
a u u a g a u u g u g
aa
uc u a a
ca
a
u
a g a g gc u
c
g
aaa
ccucu
ug
cu
uacc
AC021639.5_181586-181505
g c a g u c g u g g c c g a g u g g u u a a g g c g u c u g a c u c g a a a u c a g a u u c c c u c u g g g a g c g u a g g u u c g a a u c c u a c c g g c u g c g
g c a g u c g u g g c c g a g u g g u u a a g g c g u c u g a c u c g a a a u c a g a u u c c c u c u g g g a g c g u a g g u u c g a a u c c u a c c g g c u g c ggc
ag
uc
gu
gg
cc
ga
gu
gg
uu
aa
gg
cg
uc
ug
ac
uc
ga
aa
uc
ag
au
uc
cc
uc
ug
gg
ag
cg
ua
gg
uu
cg
aa
uc
cu
ac
cg
gc
ug
cg
gc
ag
uc
gu
gg
cc
ga
gu
gg
uu
aa
gg
cg
uc
ug
ac
uc
ga
aa
uc
ag
au
uc
cc
uc
ug
gg
ag
cg
ua
gg
uu
cg
aa
uc
cu
ac
cg
gc
ug
cg
AP000063.1_59179-59095
g c g g g g g u g c c c g a g c c u g g c c a a a g g g g u c g g g c u c a g g a c c c g a u g g c g u a g g c c u g c g u g g g u u c a a a u c c c a c c c c c c g c a
g c g g g g g u g c c c g a g c c u g g c c a a a g g g g u c g g g c u c a g g a c c c g a u g g c g u a g g c c u g c g u g g g u u c a a a u c c c a c c c c c c g c agc
gg
gg
gu
gc
cc
ga
gc
cu
gg
cc
aa
ag
gg
gu
cg
gg
cu
ca
gg
ac
cc
ga
ug
gc
gu
ag
gc
cu
gc
gu
gg
gu
uc
aa
au
cc
ca
cc
cc
cc
gc
a
gc
gg
gg
gu
gc
cc
ga
gc
cu
gg
cc
aa
ag
gg
gu
cg
gg
cu
ca
gg
ac
cc
ga
ug
gc
gu
ag
gc
cu
gc
gu
gg
gu
uc
aa
au
cc
ca
cc
cc
cc
gc
a
AP000397.1_114390-114319
u g g a g u a u a g c c a a g u g g u a a g g c a u c g g u u u u u g g u a c c g g c a u g c a a a g g u u c g a a u c c u u u u a c u c c a g
u g g a g u a u a g c c a a g u g g u a a g g c a u c g g u u u u u g g u a c c g g c a u g c a a a g g u u c g a a u c c u u u u a c u c c a gug
ga
gu
au
ag
cc
aa
gu
gg
ua
ag
gc
au
cg
gu
uu
uu
gg
ua
cc
gg
ca
ug
ca
aa
gg
uu
cg
aa
uc
cu
uu
ua
cu
cc
ag
ug
ga
gu
au
ag
cc
aa
gu
gg
ua
ag
gc
au
cg
gu
uu
uu
gg
ua
cc
gg
ca
ug
ca
aa
gg
uu
cg
aa
uc
cu
uu
ua
cu
cc
ag
M10217.1_5910-5978
a g u a a a g u c a g c u a a a a a a g c u u u u g g g c c c a u a c c c c a a a c a u g u u g g u u a a a c c c c u u c c u u u a c u a
a g u a a a g u c a g c u a a a a a a g c u u u u g g g c c c a u a c c c c a a a c a u g u u g g u u a a a c c c c u u c c u u u a c u aag
ua
aa
gu
ca
gc
ua
aa
aa
ag
cu
uu
ug
gg
cc
ca
ua
cc
cc
aa
ac
au
gu
ug
gu
ua
aa
cc
cc
uu
cc
uu
ua
cu
a
ag
ua
aa
gu
ca
gc
ua
aa
aa
ag
cu
uu
ug
gg
cc
ca
ua
cc
cc
aa
ac
au
gu
ug
gu
ua
aa
cc
cc
uu
cc
uu
ua
cu
a
U67517.1_7511-7582
g c c g g g g u g g g g u a g u g g c c a u c c u g g g g g a c u g u g g a u c c c c u g a c c c g g g u u c a a u u c c c g g u c c c g g c c
g c c g g g g u g g g g u a g u g g c c a u c c u g g g g g a c u g u g g a u c c c c u g a c c c g g g u u c a a u u c c c g g u c c c g g c cgc
cg
gg
gu
gg
gg
ua
gu
gg
cc
au
cc
ug
gg
gg
ac
ug
ug
ga
uc
cc
cu
ga
cc
cg
gg
uu
ca
au
uc
cc
gg
uc
cc
gg
cc
gc
cg
gg
gu
gg
gg
ua
gu
gg
cc
au
cc
ug
gg
gg
ac
ug
ug
ga
uc
cc
cu
ga
cc
cg
gg
uu
ca
au
uc
cc
gg
uc
cc
gg
cc
X03715.1_388-461
c g g a a a g u a g c u u a g c u u g g u a g a g c a c u c g g u u u g g g a c c g a g g g g u c g c a g g u u c g a a u c c u g u c u u u c c g a
c g g a a a g u a g c u u a g c u u g g u a g a g c a c u c g g u u u g g g a c c g a g g g g u c g c a g g u u c g a a u c c u g u c u u u c c g acg
ga
aa
gu
ag
cu
ua
gc
uu
gg
ua
ga
gc
ac
uc
gg
uu
ug
gg
ac
cg
ag
gg
gu
cg
ca
gg
uu
cg
aa
uc
cu
gu
cu
uu
cc
ga
cg
ga
aa
gu
ag
cu
ua
gc
uu
gg
ua
ga
gc
ac
uc
gg
uu
ug
gg
ac
cg
ag
gg
gu
cg
ca
gg
uu
cg
aa
uc
cu
gu
cu
uu
cc
ga
X99256.1_11558-11626
g u a a a c a u a g u u u a a u c a a a a c a u u a g a u u g u g a a u c u a a c a a u a g a g g c u c g a a a c c u c u u g c u u a c c
g u a a a c a u a g u u u a a u c a a a a c a u u a g a u u g u g a a u c u a a c a a u a g a g g c u c g a a a c c u c u u g c u u a c cgu
aa
ac
au
ag
uu
ua
au
ca
aa
ac
au
ua
ga
uu
gu
ga
au
cu
aa
ca
au
ag
ag
gc
uc
ga
aa
cc
uc
uu
gc
uu
ac
c
gu
aa
ac
au
ag
uu
ua
au
ca
aa
ac
au
ua
ga
uu
gu
ga
au
cu
aa
ca
au
ag
ag
gc
uc
ga
aa
cc
uc
uu
gc
uu
ac
c
(((((((..(((.............))).(((((.......)))))..............
AC021639.5_181586-181505 GCAGUCGUGGCCGAGU---GGUUAAGGCGUCUGACUCGAAAUCAGAUUCCCUCUGGGAGC 57AP000063.1_59179-59095 GCGGGGGUGCCCGAGCCUGGCCAAAGGGGUCGGGCUCAGGACCCGAUGGCGUAGGCCUGC 60AP000397.1_114390-114319 UGGAGUAUAGCCAAG--UGG--UAAGGCAUCGGUUUUUGGUACCG---------GCAUGC 47X03715.1_388-461 CGGAAAGUAGCUUAGCUUGG--UAGAGCACUCGGUUUGGGACCGA---------GGGGUC 49U67517.1_7511-7582 GCCGGGGUGGGGUAGUGGCCAUCCUGG---GGGACUGUGGAUCCC----------CUGAC 47X99256.1_11558-11626 GUAAACAUAGUUUA------AUCAAAACAUUAGAUUGUGAAUCUAA----------CAAU 44M10217.1_5910-5978 AGUAAAGUCAGCUA------AAAAAGCUUUUGGGCCCAUACCCCAA----------ACAU 44
.........10........20........30........40........50........6
(((((.......)))))))))))).
AC021639.5_181586-181505 GUAGGUUCGAAUCCUACCGGCUGCG 82AP000063.1_59179-59095 GUGGGUUCAAAUCCCACCCCCCGCA 85AP000397.1_114390-114319 AAAGGUUCGAAUCCUUUUACUCCAG 72X03715.1_388-461 GCAGGUUCGAAUCCUGUCUUUCCGA 74U67517.1_7511-7582 CCGGGUUCAAUUCCCGGUCCCGGCC 72X99256.1_11558-11626 AGAGGCUCGAAACCUCUUGCUUACC 69M10217.1_5910-5978 GUUGGUUAAACCCCUUCCUUUACUA 69
0........70........80....
GSRRRVR
URGSY
KA
gy-u
gga
u H AA
R G c
ru
YG
GRY
UB
D GRA
YCCRa
u--
c - u --gs
VD
RYR Y R G GU U
CR
AAU
CCYDYYBYYYSC
V
YR
cG
GS
RY
au
DY
YR
RN
AB
ioin
form
ati
cs·
S.
Wil
l·
17
Simultaneous Alignment and Folding(with LocARNA)
[Figure: secondary structure drawings and dot plots, one panel per input sequence: AC021639.5_181586-181505, AP000063.1_59179-59095, AP000397.1_114390-114319, M10217.1_5910-5978, U67517.1_7511-7582, X03715.1_388-461, X99256.1_11558-11626]
                         (((((((..(((.............))).(((((.......)))))..............
AC021639.5_181586-181505 GCAGUCGUGGCCGAGU---GGUUAAGGCGUCUGACUCGAAAUCAGAUUCCCUCUGGGAGC 57
AP000063.1_59179-59095   GCGGGGGUGCCCGAGCCUGGCCAAAGGGGUCGGGCUCAGGACCCGAUGGCGUAGGCCUGC 60
AP000397.1_114390-114319 UGGAGUAUAGCCAAG--UGG--UAAGGCAUCGGUUUUUGGUACCG---------GCAUGC 47
X03715.1_388-461         CGGAAAGUAGCUUAGCUUGG--UAGAGCACUCGGUUUGGGACCGA---------GGGGUC 49
U67517.1_7511-7582       GCCGGGGUGGGGUAGUGGCCAUCCUGG---GGGACUGUGGAUCCC----------CUGAC 47
X99256.1_11558-11626     GUAAACAUAGUUUA------AUCAAAACAUUAGAUUGUGAAUCUAA----------CAAU 44
M10217.1_5910-5978       AGUAAAGUCAGCUA------AAAAAGCUUUUGGGCCCAUACCCCAA----------ACAU 44
                         .........10........20........30........40........50........6

                         (((((.......)))))))))))).
AC021639.5_181586-181505 GUAGGUUCGAAUCCUACCGGCUGCG 82
AP000063.1_59179-59095   GUGGGUUCAAAUCCCACCCCCCGCA 85
AP000397.1_114390-114319 AAAGGUUCGAAUCCUUUUACUCCAG 72
X03715.1_388-461         GCAGGUUCGAAUCCUGUCUUUCCGA 74
U67517.1_7511-7582       CCGGGUUCAAUUCCCGGUCCCGGCC 72
X99256.1_11558-11626     AGAGGCUCGAAACCUCUUGCUUACC 69
M10217.1_5910-5978       GUUGGUUAAACCCCUUCCUUUACUA 69
                         0........70........80....
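The line above each alignment block is the consensus structure in dot-bracket notation: matching parentheses mark alignment columns that pair with each other. As a minimal sketch (this is not LocARNA code, and the function names are made up for illustration), the following Python reads the consensus pairs from that line and counts how many of them one aligned row supports with a Watson-Crick or G-U pair.

```python
# Base pairs counted as "supporting" a consensus pair (Watson-Crick plus G-U wobble).
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def parse_dot_bracket(structure):
    """Return the sorted list of (i, j) column pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at column {j}")
            pairs.append((stack.pop(), j))
    if stack:
        raise ValueError(f"unbalanced '(' at column {stack[-1]}")
    return sorted(pairs)


def count_supported_pairs(aligned_row, pairs):
    """Count consensus pairs whose two columns hold a canonical pair in this row."""
    return sum((aligned_row[i], aligned_row[j]) in CANONICAL for i, j in pairs)


# Consensus structure and the first row (AC021639.5_181586-181505),
# each concatenated from the two alignment blocks on the slide.
CONSENSUS = (
    "(((((((..(((.............))).(((((.......))))).............."
    "(((((.......))))))))))))."
)
ROW = (
    "GCAGUCGUGGCCGAGU---GGUUAAGGCGUCUGACUCGAAAUCAGAUUCCCUCUGGGAGC"
    "GUAGGUUCGAAUCCUACCGGCUGCG"
)

pairs = parse_dot_bracket(CONSENSUS)
print(len(pairs), count_supported_pairs(ROW, pairs))
```

Running the same count over all rows shows which sequences deviate from the consensus at which stems; gap characters never match a canonical pair, so a gapped column counts as unsupported.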
[Figure: consensus secondary structure drawing of the alignment, labeled with IUPAC consensus nucleotides]
Interaction Prediction
• Similar to structure prediction: use NNM!
• Predict intra- and inter-molecular structure
  • strong restrictions (cofold), no KHP → fast
  • more freedom (Alkan et al.), KHP → slow
• IntaRNA: reasonable abstraction → fast
  • Use unpairing probabilities
  • E.g. genome-wide prediction of sRNA targets
[Cofold example figure adapted from Vienna RNA Package 2.0, ALMOB 2011]
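The role of unpairing probabilities can be made concrete with a small sketch. Accessibility-based approaches such as IntaRNA add to the hybridization energy a penalty for opening each interaction site, ED = -RT ln(PU), where PU is the probability that the site is unpaired. The function names and the example numbers below are illustrative assumptions; this is not IntaRNA's API or output.

```python
import math

GAS_CONSTANT = 0.0019872  # kcal/(mol*K)
TEMPERATURE = 310.15      # 37 degrees Celsius in kelvin


def opening_energy(p_unpaired):
    """Energy in kcal/mol needed to make a site accessible: ED = -RT ln(PU)."""
    if not 0.0 < p_unpaired <= 1.0:
        raise ValueError("p_unpaired must be in (0, 1]")
    return -GAS_CONSTANT * TEMPERATURE * math.log(p_unpaired)


def interaction_energy(hybrid_energy, pu_site1, pu_site2):
    """Hybridization energy plus the opening penalties of both interaction sites."""
    return hybrid_energy + opening_energy(pu_site1) + opening_energy(pu_site2)


# Hypothetical numbers: the same duplex energy, once with an accessible target
# site and once with a target site that is almost always base paired.
accessible = interaction_energy(-12.0, 0.8, 0.5)
blocked = interaction_energy(-12.0, 0.8, 1e-6)
print(f"accessible: {accessible:.2f} kcal/mol, blocked: {blocked:.2f} kcal/mol")
```

A site that is rarely unpaired pays a large opening penalty, so a strong duplex at an inaccessible site can score worse than a weaker duplex at an open one.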
RNA Bioinformatics—Take home
• Secondary structure as proxy for RNA function
• Nearest neighbor model (NNM) enables prediction of MFE structures and probabilities
• Solid foundation to construct methods for
  • Integrating prior knowledge
  • Simultaneous alignment and folding
  • Prediction of RNA interactions
  • ... pseudoknots, modifications, non-canonical base pairs, 3D structure, kinetics, design
• Building blocks of pipelines to learn about RNA function, e.g. sRNA target prediction
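To illustrate the dynamic-programming idea behind MFE prediction, here is a deliberately simplified sketch. It maximizes the number of base pairs (Nussinov-style) instead of minimizing nearest neighbor free energy, so it is a stand-in for NNM folding, not an implementation of it. The function name and the minimum hairpin loop length of 3 are assumptions made for the example.

```python
# Pairs allowed in this toy model (Watson-Crick plus G-U wobble).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs, with at least min_loop unpaired
    bases enclosed by every hairpin-closing pair."""
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = maximum number of pairs in the subsequence seq[i..j]
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j stays unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with some k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


print(max_base_pairs("GGGAAAUCC"))
```

The recursion has the same shape as energy-based folding (decompose at the pairing partner of the last base); the nearest neighbor model replaces the "+1 per pair" score with loop-based free energies.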