About BoostThreader

LEE, JUYONG, 2009. 08. 26


Page 1: About BoostThreader

LEE, JUYONG, 2009. 08. 26

About BoostThreader

Page 2: About BoostThreader

What is BoostThreader?

A Sequence-Structure threading program

Published by J. Xu’s group

Known to be good for hard cases

Does not work…… for me……

Page 3: About BoostThreader

Let’s thread!

What you need: a sequence, a protein structure, a scoring function, an algorithm

[Figure: threading example with positions labelled A to G and annotations Good, Bad, Deletion, and Match]

Page 4: About BoostThreader

Three algorithms for Alignment!

Dynamic programming: traditional

Hidden Markov Chain (generative model): not that old

Conditional Random Field: up to date

I’m your father

I’m Andrei Andreyevich Markov.

Page 5: About BoostThreader

Dynamic programming

Finding the best scoring path on the alignment matrix

[Figure: alignment matrix with a path drawn from the Initial cell to the Final cell]

The path is the alignment!

Page 6: About BoostThreader

More about Dynamic Programming

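[Figure: alignment matrix with SEQUENCE along one axis and STRUCTURE along the other]

Three moves from each cell: deletion (A over ―), match (A over a), insertion (― over a)

g = Gap penalty = -1

h = Gap penalty = -1

F(i+1, j+1) is reached by one of three moves, labelled f, g, and h in the figure

Follow the maximum scoring path!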
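The matrix fill described on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: `f` is taken to be a match score supplied by the caller (the `identity` function below is hypothetical), the borders are initialised as for a global alignment, and which of `g` and `h` belongs to which gap direction is a guess, since the slide gives both the same value of -1.

```python
def align_score(seq, struct, f, g=-1, h=-1):
    """Fill the alignment matrix F and return the best total score."""
    n, m = len(seq), len(struct)
    F = [[0] * (m + 1) for _ in range(n + 1)]
    # Borders: a prefix aligned only against gaps (assumed global alignment).
    for i in range(1, n + 1):
        F[i][0] = F[i - 1][0] + g
    for j in range(1, m + 1):
        F[0][j] = F[0][j - 1] + h
    for i in range(n):
        for j in range(m):
            F[i + 1][j + 1] = max(
                F[i][j] + f(seq[i], struct[j]),  # match move
                F[i][j + 1] + g,                 # seq[i] aligned to a gap
                F[i + 1][j] + h,                 # struct[j] aligned to a gap
            )
    return F[n][m]


def identity(a, b):
    """Hypothetical match score: +1 for identical symbols, -1 otherwise."""
    return 1 if a == b else -1


best = align_score("ABDE", "ABCD", identity)
```

Recovering the alignment itself would additionally require remembering which of the three moves won in each cell and tracing back from the final cell.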

Page 7: About BoostThreader

In Conventional seq.-str. alignment

Linear sum of similarities of properties

Only functions for the Match and Gap cases are needed!

Fmatch = w1 * predicted SS * real SS + w2 * predicted SA * real SA + w3 * predicted residue depth * real depth + …

Fgap = Opening penalty + # of gaps * Extension penalty

(SS: secondary structure, SA: solvent accessibility)

Only the next step is considered!
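As a rough illustration of the two conventional scoring functions, the sketch below writes them out in Python. The property names, weights, and penalty values are hypothetical placeholders chosen for the example; the slide does not give numbers.

```python
def f_match(predicted, observed, weights):
    """Weighted linear sum of (predicted property * observed property)."""
    return sum(w * predicted[k] * observed[k] for k, w in weights.items())


def f_gap(n_gaps, opening=-10.0, extension=-1.0):
    """Opening penalty plus number of gaps times extension penalty."""
    return opening + n_gaps * extension


# Hypothetical weights and property values, for illustration only.
weights = {"ss": 2.0, "sa": 4.0}
predicted = {"ss": 1.0, "sa": 0.5}
observed = {"ss": 1.0, "sa": 0.5}

match_score = f_match(predicted, observed, weights)
gap_score = f_gap(3)
```

Because the score is a plain weighted sum, each property contributes independently; an interaction between properties cannot be expressed without adding explicit product terms.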

Page 8: About BoostThreader

What’s different in BoostThreader?

Depends on both the current and the next step!

Nine scoring functions are necessary (one per pair of current and next state: match, insertion, deletion)

Gap penalty is context-dependent

Trained from reference alignments (DALI, TM-align, etc.)

Regression trees are used as scoring functions, not a linear function!

Page 9: About BoostThreader

So what is a Regression Tree?

Page 10: About BoostThreader

Intermission page

Hey Nature, not all flies are Drosophila

Page 11: About BoostThreader

Regression Tree!

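[Figure: regression tree built from 100 used cars. Questions (Yes / No): "Is the engine over 1500 cc?", "Has it been driven 200,000 km or more?", "Is it more than 5 years old?". Leaves: average 8 million won, average 5 million won, average 15 million won, average 11 million won.]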

Training!
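The used-car example can be sketched as follows. The exact branch layout of the tree is not recoverable from the transcript, so the routing in `leaf_of` is an assumption, and the five cars are hypothetical toy data chosen so that the leaf averages match the four values on the slide. The point is only that training fills each leaf with the mean of the examples that reach it.

```python
from collections import defaultdict


def leaf_of(car):
    """Route a car through the tree (branch layout is an assumption)."""
    if car["cc"] > 1500:
        return "big, old" if car["age_years"] > 5 else "big, new"
    return "small, high km" if car["km"] >= 200_000 else "small, low km"


def train(cars):
    """Each leaf stores the mean price of the training cars that reach it."""
    buckets = defaultdict(list)
    for car in cars:
        buckets[leaf_of(car)].append(car["price"])
    return {leaf: sum(p) / len(p) for leaf, p in buckets.items()}


def predict(tree, car):
    """Predict the price of a car as the mean stored in its leaf."""
    return tree[leaf_of(car)]


# Hypothetical toy data; prices in millions of won.
cars = [
    {"cc": 2000, "age_years": 3, "km": 50_000, "price": 16},
    {"cc": 2000, "age_years": 2, "km": 30_000, "price": 14},
    {"cc": 1800, "age_years": 7, "km": 90_000, "price": 11},
    {"cc": 1300, "age_years": 4, "km": 250_000, "price": 5},
    {"cc": 1400, "age_years": 6, "km": 80_000, "price": 8},
]
tree = train(cars)
estimate = predict(tree, {"cc": 1600, "age_years": 1, "km": 10_000})
```

A real regression-tree learner would also choose the questions and thresholds from the data; here they are fixed by hand to keep the sketch short.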

Page 12: About BoostThreader

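Example in Threading

Sequence: predicted properties

Structure: observed properties

[Figure: regression tree. Questions (Yes / No): "Is the SS the same?", then "Is the SA level the same?" on each branch. Leaves: probability 0.1 (1 out of 10), probability 0.3 (3 out of 10), probability 0.6 (6 out of 10), probability 0.9 (9 out of 10).]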

Estimate Prob. from examples

Page 13: About BoostThreader

Advantage of Tree

Fast

Interaction between variables can be easily considered

Page 14: About BoostThreader

What’s really happening in BoostThreader?

Initial setting: set all F0(uv, seq(i), str(j)) = 0, where P ~ exp(F)

30 reference (correct) sequence-structure alignments!

Calculate the probability of every possible state transition!

Probabilities of all examples!

Forward-backward algorithm
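To make the initial setting concrete, here is a deliberately simplified sketch of P ~ exp(F). It normalises over the nine transition types (M, I, D for match, insertion, deletion) at a single step, which is an assumption made for brevity; the slide indicates that the real probabilities come from the forward-backward algorithm over whole alignments.

```python
import math


def transition_probs(scores):
    """Turn scores F into probabilities proportional to exp(F)."""
    z = sum(math.exp(f) for f in scores.values())
    return {t: math.exp(f) / z for t, f in scores.items()}


states = "MID"  # match, insertion, deletion
# Initial setting: every score F0 is zero.
f0 = {u + v: 0.0 for u in states for v in states}
probs = transition_probs(f0)
```

With all scores at zero, every transition is equally likely, so training starts from an uninformative model.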

Page 15: About BoostThreader

“All Possible” Transitions?

A B – D E
a b c d –
m m i m d

For MM, generate examples!

ABab, ABbc, ABcd, BDab, BDbc, BDcd, DEab, DEbc, DEcd

Page 16: About BoostThreader

Examples(2)

A B – D E
a b c d –
m m i m d

For MI, generate examples!

A-ab, A-bc, A-cd, B-ab, B-bc, B-cd, D-ab, D-bc, D-cd, E-ab, E-bc, E-cd
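The two example lists on pages 15 and 16 follow a simple pattern, sketched below. The function names are hypothetical, and the rule is inferred from the listed examples: for MM, every adjacent residue pair of the upper row is combined with every adjacent residue pair of the lower row; for MI, every single upper residue is combined with every adjacent lower pair. Gaps are written as an ASCII `-` in the code.

```python
from itertools import product


def adjacent_pairs(row):
    """Adjacent residue pairs of a row, ignoring gap characters."""
    residues = [c for c in row if c != "-"]
    return [a + b for a, b in zip(residues, residues[1:])]


def mm_examples(upper, lower):
    """Match followed by match: upper pair combined with lower pair."""
    pairs = product(adjacent_pairs(upper), adjacent_pairs(lower))
    return [u + l for u, l in pairs]


def mi_examples(upper, lower):
    """Match followed by insertion: one upper residue, then a gap."""
    singles = [c for c in upper if c != "-"]
    return [u + "-" + l for u, l in product(singles, adjacent_pairs(lower))]


upper, lower = "AB-DE", "abcd-"
mm = mm_examples(upper, lower)
mi = mi_examples(upper, lower)
```

For this alignment the sketch yields 9 MM examples and 12 MI examples, matching the counts listed on the slides.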

Page 17: About BoostThreader

Inside BoostThreader

Examples and their probabilities

Calculated with the current scoring functions

Modify scoring functions

If the example is correct, increase F: F1 = F0 + (1 - P)

If the example is incorrect, decrease F: F1 = F0 - P

Add trees until prediction quality stops improving

F = F0 + F1 + F2 + F3 + F4 + F5 + ……
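The update rule on this slide can be sketched as follows. The function names are hypothetical; the sketch only computes the change applied to F for each example and shows the final score as a sum of trees. Fitting a regression tree to these changes is the step that would produce each new tree, and is left out here.

```python
def score_changes(examples):
    """Change applied to F for each (is_correct, probability) example."""
    # Correct example: raise F by (1 - P). Incorrect example: lower F by P.
    return [(1.0 - p) if is_correct else -p for is_correct, p in examples]


def total_score(trees, x):
    """F = F0 + F1 + F2 + ...: the sum of all trees evaluated on x."""
    return sum(tree(x) for tree in trees)


# Two hypothetical examples, both currently predicted with P = 0.25.
changes = score_changes([(True, 0.25), (False, 0.25)])

# Two stand-in "trees": the all-zero F0 and a constant correction.
trees = [lambda x: 0.0, lambda x: 0.75]
score = total_score(trees, None)
```

The change is large when the current model is badly wrong about an example and shrinks toward zero as P approaches the right answer, which is why adding trees eventually stops helping.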

Page 18: About BoostThreader

Performance

Page 19: About BoostThreader

Summary

BoostThreader considers “Current” and “Next” step

Scoring function consists of Regression Trees

Trees are trained based on examples

Page 20: About BoostThreader

Thank you!