A C G T A A T G G T T A AC T A G T T A G G A A T C G C G C A T T A T G T C C A C G T T A G G T T G A...
A C G G T A
A T G G T T A A C T A G T T A G G A A T C G C G C A T T A T G T C C
A C G T T A G G T T G A A C G G C A G G T T T A A A T C G A T T C C
A C G T T A T G A A A T T G G G G C A G G T T T A A C G C G C C C
C A G A T
A U G G U U A A C U A G U U A G G A A U C G C G C A U U A U G U C C
A C G U U A G G U U G A A C G G C A G G U U U A A A U C G A U U C C
A C G G U A
C A G A U
A C G U U A U G A A A U U G G G G C A G G U U U A A C G C G C C C
Methionine (M)
Valine (V)
Asparagine (N)
STOP
Serine (S)
Threonine (T)
Proline (P)
Lysine (K)
Leucine (L)
Glycine (G)
Glutamine (Q)
M V N M S
T V P
M K L G Q V
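The transcription (T replaced by U) and translation (codons read in triplets from the first AUG up to a stop codon) shown above can be sketched in Python as follows. The function names `transcribe` and `translate` are illustrative and do not come from the slides; the codon table is the standard genetic code, written compactly in U, C, A, G order.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order U, C, A, G.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def transcribe(dna):
    """Return the RNA string obtained by replacing every T with U."""
    return dna.replace("T", "U")


def translate(rna):
    """Translate from the first AUG until a stop codon or the end of the string."""
    start = rna.find("AUG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(rna) - 2, 3):
        amino = CODON_TABLE[rna[i:i + 3]]
        if amino == "*":  # stop codon
            break
        protein.append(amino)
    return "".join(protein)


rna = transcribe("ATGGTTAACTAGTTAGGAATCGCGCATTATGTCC")
print(rna, translate(rna))
```

This sketch reads only the first open reading frame; finding later start codons, such as the one that yields M S in the first sequence, would need a loop over every AUG position.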
ATTACGGCCATGCGGAGCCGGAAG
CCATG
Is CCATG present in ATTACGGCCATGCGGAGCCGGAAG?
There is an algorithm that requires a number of comparisons proportional to the length of the sequence.
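The slides do not name the algorithm. One well-known method with this property is Knuth-Morris-Pratt, sketched below as an assumption about what is meant. The names `failure_table` and `kmp_find` are illustrative.

```python
def failure_table(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i+1] that is also its suffix."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_find(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    if not pattern:
        return 0
    fail = failure_table(pattern)
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k and ch != pattern[k]:
            k = fail[k - 1]  # fall back without re-reading the text
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            return i - k + 1
    return -1


print(kmp_find("ATTACGGCCATGCGGAGCCGGAAG", "CCATG"))
```

The text is scanned once from left to right and never re-read, which is why the number of character comparisons grows linearly with its length.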
approximate string matching
ALIGNMENT
T G T A C G G A A T C G G A
T C T C C G A C C A T C G G A

T G - T A - C G G A - - A T C G G A
T - C T - C C G - A C C A T C G G A
4 + 3 = 7

T G C T A C C G G A C C A T C G G A

T C - T - C C - G A C C A T C G G A
T - G T A C - G G A - - A T C G G A
4 + 3 = 7
shortest path
how many operations?
N.B.: the number of paths is very large
$\binom{n+m}{n}$: $n = m = 10 \Rightarrow 184{,}756$; $n = m = 20 \Rightarrow 137{,}846{,}528{,}820$
explicit enumeration is impossible!
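The two path counts can be checked directly with the binomial coefficient. The helper name `count_paths` is illustrative.

```python
from math import comb


def count_paths(n, m):
    """Number of monotone grid paths made of n steps in one direction and m in the other."""
    return comb(n + m, n)


print(count_paths(10, 10))  # 184756
print(count_paths(20, 20))  # 137846528820
```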
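RECURSION!
$V(n,m) = \min\{\, V(n-1,m-1),\ V(n,m-1)+1,\ V(n-1,m)+1 \,\}$, with $V(0,0) = 0$
(the diagonal term $V(n-1,m-1)$ is available only when the two characters match; the grid nodes involved are $(n-1,m-1)$, $(n,m-1)$, $(n-1,m)$ and $(n,m)$)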
each arc is considered exactly once
number of operations = number of arcs, of order $mn$
two sequences of 1000 bases require about one million operations
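A minimal sketch of the recursion as a table-filling procedure, assuming the insertion/deletion model used so far: each gap costs 1 and the diagonal arc exists only where the two characters are equal. The function name `indel_distance` is illustrative.

```python
def indel_distance(s, t):
    """Minimum number of gaps needed to align s and t when only matches and gaps are allowed."""
    n, m = len(s), len(t)
    V = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue  # V(0,0) = 0
            candidates = []
            if i > 0:
                candidates.append(V[i - 1][j] + 1)   # gap in t
            if j > 0:
                candidates.append(V[i][j - 1] + 1)   # gap in s
            if i > 0 and j > 0 and s[i - 1] == t[j - 1]:
                candidates.append(V[i - 1][j - 1])   # free diagonal arc on a match
            V[i][j] = min(candidates)
    return V[n][m]


print(indel_distance("TGTACGGAATCGGA", "TCTCCGACCATCGGA"))  # 7
```

Each of the (n+1)(m+1) cells is filled once from at most three neighbours, which is the "each arc exactly once" argument in code form.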
Different model: substitutions allowed
T G T A C G G A A T C G G A
T C T C C G A C C A T C G G A
T G T A C G G A - - A T C G G A
T C T C C G - A C C A T C G G A
4
2
2
8
T G T A C G G A A T C G G A
T C T C C G A C C A T C G G A
14
6
T G T A C G G A - A T C G G A
T C T C C G A C C A T C G G A
T G T A C G G A - - A T C G G A
T C T C C G - A C C A T C G G A
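When substitutions are allowed, the diagonal arc is always present and carries a cost on a mismatch. The sketch below takes the gap and substitution costs as parameters; the defaults of 1 are an assumption, since the cost scheme behind the slide's numbers is not stated. The name `alignment_cost` is illustrative.

```python
def alignment_cost(s, t, gap=1, sub=1):
    """Minimum alignment cost with per-gap cost `gap` and per-mismatch cost `sub`."""
    n, m = len(s), len(t)
    V = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        V[i][0] = i * gap
    for j in range(1, m + 1):
        V[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diagonal = V[i - 1][j - 1] + (0 if s[i - 1] == t[j - 1] else sub)
            V[i][j] = min(diagonal, V[i - 1][j] + gap, V[i][j - 1] + gap)
    return V[n][m]


print(alignment_cost("TGTACGGAATCGGA", "TCTCCGACCATCGGA"))
```

With unit costs, both alignments shown above cost 5: two substitutions plus three gaps in one, four substitutions plus one gap in the other. Setting `sub` to twice `gap` reproduces the gap-only model.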
T G T A C G G A A T C G G A
T C T C C G A C C A T C G G A
A C T C A G A C A A T G A
T G T A C G - G A A T C G G A
T C T C C G A C C A T C G G A
A C T C A G A C A A T - - G A
MULTIPLE ALIGNMENT
Number of comparisons = product of the string lengths
3 strings of length 1000
one billion operations!
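The growth can be made concrete with a short calculation. The helper name `table_cells` is illustrative, and it counts only the cells of the dynamic-programming table, ignoring the constant amount of work per cell.

```python
from math import prod


def table_cells(lengths):
    """Number of cells in the alignment table for strings of the given lengths."""
    return prod(lengths)


print(table_cells([1000, 1000]))        # two strings: one million
print(table_cells([1000, 1000, 1000]))  # three strings: one billion
```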
TAGA CTGA
CTAGA ATGA
ATAGA
TACA TAGA
AGGA ATGA
?
?
?
?
TAGA
ATGA
CTGA
CTAGA
A G - G A
- T A C A
- T A G A
C T - G A
A T - G A
AUGCCGAUUCAACGGUCCUACUCGGACUUUACC
M P I Q R S Y S D F T
M R I S R S D S D Y T
score (M<->M, P<->R, ...) based on mutation probabilities
FRAGMENT RECONSTRUCTION
ACGTTACGTTACGGATCGGATTCACGGCGATT
AACAAGCTTCGGAATCGTTACCGGATCGGTTAGG
CGAATTAGTGGCGAA
GGCCTTAAACGACGATGCATTCGAATATCGATCGCGCGAATGTGCATA
ACCGGACTGTCGCGACGCGCGATGTGTAGAGCTTGATCTCGGATATACGCGATATTGTGAATA
ACGTTACGTTACGAATCGGATTCACGGCGATT
AACCAGCTTCGGAATCG
TTACCGGATCGGTTAGG
CGAATTAGTGGCGAA
AGCCTTAAACGACGATGCATTCGAATATCGATCGCGCGAATGTGCATA
ACCGGACTGTCGCGACGCGCGATGTGCAGAGCTTGATCTCGGATATACGCGATATTGTGAATA
ACGTTACGTTACGGATCGGATTTACGGCGATT
AACAAGCTTCGGAATCGTTACCGGATCGGTTAGG
AGAATTAGTGGCGAA
GGCCTTAAACGACGATGCATTCGAATATCGATCGCGCGAATGTGCATA
ACCGGACTCTCGCGACGCGCGATGTGTAGAGCTTGATCTCGGATATACGCGCTATTGTGAATA
ACATTACGTTACGGATCGGATTCACGGCGACT
AACAAGCTTCGGAATCGTTACCGGATCGGTTAAG
CGAATTAGTGGCGAA
GGCCTTAAACGACGTTGCATTCGAATATCGATCGCGCGAATGTGCATA
ACCGGACTGTCGCGACGCGCGATTTGTAGAGCTTGATCTCGGATATACGCAATATTGTGAATA
ACGTTACGTTACTGATCGGATTCACGGCGATT
AACAAGCGTCGGAATCGTTACCGGATCGGTTAGG
AGAATTAGTGGCGAA
GGCCTTAAACGACGATGCATTGGAATATCGATCGCGCGAATGTGCATA
AACGGACTGTCGCGACGCGCGATGTGTAGAGCTTGTTCTCGGATATACGCGATATTGTGAATA
ACGTTACGTTACGGATCGGATTCACGGCAATT
AACAAGCTTCGGAATAGTTACCGGATCGGTTAGG
CGAATTAGTGGCGAA
GGCCTTAAACGACGATGTATTCGAATATCGATCGCGCGAATGTGCATA
ACCGGACTGTCGCGACGCTCGATGTGTAGAGCTTGATCTAGGATATACGCGATATTGTGAATA
ACCGTCGTGCTTACTACCGT
- - ACCGT - -
- - - - CGTGC
TTAC - - - - -
- TACCGT - -

TTAC - - - - -
- TACCGT - -
- - ACCGT - -
- - - - CGTGC
1 + 1 + 2 = 4
TTACCGTGC
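For a handful of fragments, the shortest string containing all of them can be found by trying every ordering and merging consecutive fragments on their longest overlap. This brute-force sketch is only meant to illustrate the problem; it is not presented as the method used in the slides, and the names `overlap` and `shortest_superstring` are illustrative.

```python
from itertools import permutations


def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def shortest_superstring(fragments):
    """Try all orderings; exponential in the number of fragments, fine for tiny inputs."""
    unique = list(dict.fromkeys(fragments))
    # A fragment contained in another one adds nothing to the superstring.
    frags = [f for f in unique if not any(f != g and f in g for g in unique)]
    if not frags:
        return ""
    best = None
    for order in permutations(frags):
        merged = order[0]
        for nxt in order[1:]:
            merged += nxt[overlap(merged, nxt):]
        if best is None or len(merged) < len(best):
            best = merged
    return best


print(shortest_superstring(["ACCGT", "CGTGC", "TTAC", "TACCGT"]))  # TTACCGTGC
```

The number of orderings grows factorially, which is why realistic fragment sets are handled through an overlap graph rather than by enumeration.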
TAGG AGGT CGTC GTCG
TAGG followed by AGGT: 1
AGGT followed by TAGG: 3
[Figure: graph with one node per fragment (TAGG, AGGT, CGTC, GTCG); arc weights as extracted: 1, 3, 4, 4, 4, 4, 1, 2, 4, 2, 4, 4]
CGTC
- GTCG
- - - - TAGG
- AGGT
CGTCGTAGGT
length 10
TAGG
- AGGT
- - GTCG
- - CGTC
TAGGTCGTC
length 9
CGTCGTAGGT
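The two orderings above can be compared by summing, along the path, the number of new characters each fragment contributes (its length minus its overlap with the previous fragment). The names `shift_cost` and `assemble` are illustrative, and reading the arc weights as this shift cost is an assumption based on the two worked examples.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def shift_cost(a, b):
    """Number of characters b adds when it is placed right after a."""
    return len(b) - overlap(a, b)


def assemble(order):
    """Merge the fragments in the given order; return the string and its length."""
    merged = order[0]
    length = len(order[0])
    for prev, nxt in zip(order, order[1:]):
        merged += nxt[overlap(prev, nxt):]
        length += shift_cost(prev, nxt)
    return merged, length


print(assemble(["CGTC", "GTCG", "TAGG", "AGGT"]))  # ('CGTCGTAGGT', 10)
print(assemble(["TAGG", "AGGT", "GTCG", "CGTC"]))  # ('TAGGTCGTC', 9)
```

Finding the best ordering is then a shortest Hamiltonian path problem on the graph whose arc weights are these shift costs.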
PHYLOGENETIC TREES
[Table: binary character matrix, species A, B, C, D, E, F (rows) by characters a, b, c, d, e (columns)]
[Figure: tree whose nodes are labelled with 5-bit character vectors: 00110, 00010, 00100, 10010, 00011, 00101, 10011, 01011]
Does a perfect phylogenetic tree exist with A, B, C, D, E, F as nodes?
2 leaves
3 leaves
4 leaves
A B
A
AB
BC CA BC
5 leaves
12 6 18
60 30 30 120
[Table: binary character matrix, species A to F (rows) by characters a to e (columns)]
ordered characters: only the change 0 --> 1 is allowed
easy problem
[Table: binary character matrix, species A to F by characters a to e, shown in both orientations (species as rows, then species as columns)]
  a b c d e
E 0 0 0 0 0
C 1 1 0 1 0
B 1 0 0 0 0
F 1 0 0 1 0
D 0 0 0 0 1
A 0 0 1 0 1
[Figure: perfect phylogeny tree for this matrix, with edges labelled a, b, c, d, e and nodes A, B, C, D, E, F]
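For ordered characters, one simple way to test whether a perfect phylogeny exists is to check that the sets of species carrying each character are pairwise either disjoint or nested. The sketch below uses that pairwise test, which is quadratic in the number of characters; it is offered as an illustration of why the problem is easy, not as the procedure used in the slides. The function name is illustrative.

```python
def has_directed_perfect_phylogeny(matrix):
    """matrix maps each species to a tuple of 0/1 values, one per character."""
    species = list(matrix)
    n_chars = len(next(iter(matrix.values()), ()))
    # For every character, the set of species in which it has value 1.
    carriers = [frozenset(s for s in species if matrix[s][c]) for c in range(n_chars)]
    for i in range(n_chars):
        for j in range(i + 1, n_chars):
            a, b = carriers[i], carriers[j]
            if a & b and not (a <= b or b <= a):
                return False  # the two sets overlap without being nested
    return True


matrix = {
    "E": (0, 0, 0, 0, 0),
    "C": (1, 1, 0, 1, 0),
    "B": (1, 0, 0, 0, 0),
    "F": (1, 0, 0, 1, 0),
    "D": (0, 0, 0, 0, 1),
    "A": (0, 0, 1, 0, 1),
}
print(has_directed_perfect_phylogeny(matrix))  # True
```

Here the carrier sets are a = {B, C, F}, d = {C, F}, b = {C}, e = {A, D}, c = {A}: b is nested in d, d in a, c in e, and the a and e families are disjoint, so a tree exists.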
[Tables: two binary character matrices, species A to F (rows) by characters a to g (columns)]
unordered characters (perfect phylogeny)
[Figure: tree with nodes A, B, C, D, E, F, labelled with 7-bit character vectors: 1001011, 1101011, 1001010, 0101011, 0100011, 1101001]
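For unordered binary characters, a standard check is the four-gamete test: a perfect phylogeny exists exactly when no pair of characters shows all four combinations 00, 01, 10, 11 across the species. The sketch below applies it to the six distinct 7-bit vectors that appear as node labels in the tree figure; treating those vectors as the species rows is an assumption, and the function name is illustrative.

```python
from itertools import combinations


def passes_four_gamete_test(rows):
    """rows is a list of equal-length strings of '0' and '1', one per species."""
    n_chars = len(rows[0]) if rows else 0
    for i, j in combinations(range(n_chars), 2):
        gametes = {(row[i], row[j]) for row in rows}
        if len(gametes) == 4:
            return False  # all of 00, 01, 10, 11 occur for this pair of characters
    return True


rows = ["1001011", "1101011", "1001010", "0101011", "0100011", "1101001"]
print(passes_four_gamete_test(rows))  # True
```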