1

Multiple Sequence Alignment

National Chi Nan University, Department of Computer Science and Information Engineering, 黃光璿, 2004/05/31

2

What is a multiple alignment?

4

An alignment of ten I-set immunoglobulin superfamily domains

5

Motivation

A multiple alignment may suggest a common structure of the protein products, a common function, or a common evolutionary source.

6

Issues

How to define a meaningful scoring function for an alignment? Two possible criteria are an evolutionarily correct alignment (more difficult!) and a structural alignment.

How to find the best alignment? By algorithms.
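One common choice of scoring function is the sum-of-pairs (SP) score, which a later slide refers to. The following is a minimal sketch, not the scoring used in the lecture: the values of `MATCH`, `MISMATCH`, and `GAP` and the function names are illustrative assumptions, and a real protein alignment would use a substitution matrix instead.

```python
from itertools import combinations
from typing import Sequence

# Assumed toy scoring parameters (not taken from the slides).
MATCH, MISMATCH, GAP = 1, -1, -2


def pair_score(a: str, b: str) -> int:
    """Score one aligned pair of symbols; '-' denotes a gap."""
    if a == "-" and b == "-":
        return 0  # a gap aligned with a gap contributes nothing
    if a == "-" or b == "-":
        return GAP
    return MATCH if a == b else MISMATCH


def sum_of_pairs(alignment: Sequence[str]) -> int:
    """Add pair_score over every column and every pair of rows."""
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all rows of an alignment must have equal length")
    total = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            total += pair_score(a, b)
    return total
```

For example, `sum_of_pairs(["AC-T", "ACGT", "A-GT"])` scores the four columns separately and adds them up. Because the score decomposes over pairs of rows, it equals the sum of the scores of the three induced pairwise alignments.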

7

Three types of alignment problems:

DNA

protein (joined by disulfide bonds)

RNA (more difficult due to long-range correlation)

We focus on alignment problems of sequences of DNAs or proteins.

12

To prove that a computational problem is NP-hard, we need to give a polynomial-time reduction from a known NP-complete (or NP-hard) problem to this problem.

13

When a computational problem is NP-hard, we deal with it by:

heuristics: convince other people by experiments

approximation: how to analyze the performance?

randomization: how to design a reasonable algorithm?

20

Branch & bound heuristic for the DP algorithm for the sum-of-pairs score: Carrillo & Lipman (1988).

The idea was implemented in the well-known program MSA (Lipman, Altschul & Kececioglu, 1989).

MSA can align 6 sequences of length ~200 in reasonable time.
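The bounding argument can be sketched as follows. Suppose some heuristic multiple alignment has sum-of-pairs score L, so the optimal alignment scores at least L. Since each induced pairwise alignment can score at most the pairwise optimum, the projection of an optimal multiple alignment onto sequences k and l must score at least L minus the sum of the pairwise optima of all other pairs. The code below only computes these thresholds; it is not the MSA program itself, and the scoring parameters and function names are assumptions made for illustration.

```python
from itertools import combinations

# Assumed toy scoring parameters (not taken from the slides).
MATCH, MISMATCH, GAP = 1, -1, -2


def pairwise_optimum(x: str, y: str) -> int:
    """Best global alignment score of x and y (Needleman-Wunsch, linear gaps)."""
    prev = [j * GAP for j in range(len(y) + 1)]
    for i in range(1, len(x) + 1):
        cur = [i * GAP]
        for j in range(1, len(y) + 1):
            sub = MATCH if x[i - 1] == y[j - 1] else MISMATCH
            cur.append(max(prev[j - 1] + sub, prev[j] + GAP, cur[j - 1] + GAP))
        prev = cur
    return prev[-1]


def carrillo_lipman_thresholds(seqs, lower_bound):
    """For each pair (k, l), return the minimum score that the projection
    of an optimal multiple alignment onto sequences k and l can have."""
    pairs = list(combinations(range(len(seqs)), 2))
    best = {p: pairwise_optimum(seqs[p[0]], seqs[p[1]]) for p in pairs}
    total = sum(best.values())
    # threshold(k, l) = L - (sum of pairwise optima of all other pairs)
    return {p: lower_bound - (total - best[p]) for p in pairs}
```

A full branch & bound search would then restrict the multidimensional DP to cells whose projection onto every pair (k, l) lies on some pairwise alignment scoring at least the corresponding threshold. The better the heuristic lower bound, the smaller that region becomes.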

35

References and image sources

1. R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Cambridge University Press, 1998.