
Optimized Virtual Screening

Miklós Vargyas, Zsuzsanna Szabó, György Pirok, Ferenc Csizmadia

ChemAxon Ltd.

Matthias Steger, Modest von Korff

AXOVAN AG, Allschwil, Switzerland

(Axovan is now Actelion.)

Drug research

structures found

corporate database

Is it searching for a needle in a haystack?

structures found (virtual hits)

query structures (known actives)

corporate database (targets)

Find something similar to a fistful of needles

Drug research

Molecular similarity

Chemical, pharmacological or biological properties of two compounds match.

The more the common features, the higher the similarity between two molecules.

Chemical

Pharmacophore

What is it?

Molecular similarity

How to calculate it?

$$T(x, y) = \frac{B(x \,\&\, y)}{B(x) + B(y) - B(x \,\&\, y)}$$

$$E(x, y) = \sqrt{\sum_i (x_i - y_i)^2}$$
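A minimal sketch of the two standard measures named on this slide: the Tanimoto coefficient over bit strings and the Euclidean distance over numeric vectors. The function names and the convention for two all-zero fingerprints are assumptions of this sketch, not taken from the slides.

```python
import math


def tanimoto(x: str, y: str) -> float:
    """Tanimoto coefficient of two equal-length bit strings such as '0101'."""
    if len(x) != len(y):
        raise ValueError("fingerprints must have the same length")
    # B(x & y): number of positions where both fingerprints have a bit set
    common = sum(1 for a, b in zip(x, y) if a == "1" and b == "1")
    # B(x) + B(y) - B(x & y): number of positions set in at least one of them
    union = x.count("1") + y.count("1") - common
    # Assumption: two all-zero fingerprints are treated as identical.
    return common / union if union else 1.0


def euclidean(x, y) -> float:
    """Euclidean distance of two equal-length numeric vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


print(tanimoto("1100", "1010"))   # one common bit out of three bits set overall
print(euclidean([3, 0], [0, 4]))  # 5.0
```

Tanimoto is a similarity (higher means closer), while Euclidean is a distance (lower means closer); 1 - Tanimoto can be used when a dissimilarity is needed.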

Sequences/vectors of bits, or numeric values that can be compared by distance functions, similarity metrics.

Quantitative assessment of similarity/dissimilarity of structures

need a numerically tractable form

molecular descriptors, fingerprints, structural keys

hashed binary fingerprint

encodes topological properties of the chemical graph: connectivity, edge label (bond type), node label (atom type)

allows the comparison of two molecules with respect to their chemical structure

Molecular descriptors

Example 1: chemical fingerprint

Construction

1. find all 0, 1, …, n step walks in the chemical graph
2. generate a bit array for each walk with a given number of bits set
3. merge the bit arrays with logical OR operation

Example: CH3 – CH2 – OH

walks from the first carbon atom

length  walk           bit array
0       C              1010000000
1       C – H          0001010000
1       C – C          0001000100
2       C – C – H      0001000010
2       C – C – O      0100010000
3       C – C – O – H  0000011000

merge bit arrays for the first carbon atom: 1111011110
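The merge step can be reproduced with a few lines of Python. This sketch takes the per-walk bit arrays exactly as listed for the first carbon atom and combines them with a logical OR; how each walk is hashed to its bit positions is not given on the slide and is not modelled here. The function name `merge_bit_arrays` is hypothetical.

```python
def merge_bit_arrays(bit_arrays):
    """OR together equal-width bit strings and return the merged bit string."""
    width = len(bit_arrays[0])
    merged = 0
    for bits in bit_arrays:
        if len(bits) != width:
            raise ValueError("all bit arrays must have the same width")
        merged |= int(bits, 2)  # logical OR of the walk's bits into the result
    return format(merged, f"0{width}b")


# Bit arrays of the walks starting from the first carbon atom of CH3-CH2-OH,
# copied from the example above.
walks = {
    "C": "1010000000",
    "C-H": "0001010000",
    "C-C": "0001000100",
    "C-C-H": "0001000010",
    "C-C-O": "0100010000",
    "C-C-O-H": "0000011000",
}

fingerprint = merge_bit_arrays(list(walks.values()))
print(fingerprint)  # 1111011110
```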

0100010100010100010000000001101010011010100000010100000000100000

0100010100010100010000000001101010011010100000000100000000100000

Example 2: pharmacophore fingerprint

encodes pharmacophore properties of molecules as frequency counts of pharmacophore point pairs at given topological distance

allows the comparison of two molecules with respect to their pharmacophore

Construction

1. map pharmacophore point type to atoms
2. calculate length of shortest path between each pair of atoms
3. assign a histogram to every pharmacophore point pair and count the frequency of the pair with respect to its distance
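The three construction steps can be sketched as follows. The molecule is a hypothetical five-atom chain, not a structure from the slides, and the point types are assigned by hand rather than by any perception rules; the function names and the `max_distance` cutoff are assumptions of this sketch. Topological distance is the number of bonds on the shortest path, found by breadth-first search.

```python
from collections import Counter, deque


def shortest_path_lengths(adjacency, start):
    """Breadth-first search: bond-count distance from start to every atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbour in adjacency[atom]:
            if neighbour not in dist:
                dist[neighbour] = dist[atom] + 1
                queue.append(neighbour)
    return dist


def pharmacophore_pair_counts(adjacency, point_types, max_distance):
    """Count (type, type, distance) triples over all typed atom pairs."""
    counts = Counter()
    typed_atoms = sorted(point_types)  # atoms of type 'none' are simply absent
    for a in typed_atoms:
        dist = shortest_path_lengths(adjacency, a)
        for b in typed_atoms:
            if b <= a or b not in dist or dist[b] > max_distance:
                continue  # count each unordered pair once, within the cutoff
            first, second = sorted((point_types[a], point_types[b]))
            counts[(first, second, dist[b])] += 1
    return counts


# Hypothetical chain 0-1-2-3-4 with three pharmacophore points.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
point_types = {0: "donor", 2: "hydrophobic", 4: "acceptor"}

counts = pharmacophore_pair_counts(adjacency, point_types, max_distance=6)
for key in sorted(counts):
    print(key, counts[key])
```

Laying the counts out in a fixed order of (pair type, distance) gives the numeric fingerprint vector that the metrics operate on.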

Example 2: pharmacophore fingerprint

Pharmacophore point type based coloring of atoms: acceptor, donor, hydrophobic, none.

Virtual screening using fingerprints

Individual query structure

[Figure: a query fingerprint is compared by proximity with the target fingerprints of the targets.]
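Screening with a single query can be sketched as ranking the targets by their dissimilarity to the query fingerprint and keeping those within a cutoff. The fingerprints below are short made-up integers, and the `threshold` parameter and function names are assumptions of this sketch; the slides only speak of "proximity".

```python
def tanimoto_dissimilarity(x: int, y: int) -> float:
    """1 - Tanimoto for binary fingerprints stored as Python integers."""
    common = bin(x & y).count("1")  # bits set in both fingerprints
    union = bin(x | y).count("1")   # bits set in at least one fingerprint
    # Assumption: two empty fingerprints count as identical.
    return 1.0 - (common / union if union else 1.0)


def screen(query, targets, threshold):
    """Return (name, dissimilarity) of targets within threshold, closest first."""
    scored = ((name, tanimoto_dissimilarity(query, fp)) for name, fp in targets.items())
    hits = [(name, d) for name, d in scored if d <= threshold]
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical 8-bit fingerprints, for illustration only.
query = 0b11110000
targets = {"t1": 0b11110000, "t2": 0b11100000, "t3": 0b00001111}

hits = screen(query, targets, threshold=0.5)
print(hits)  # [('t1', 0.0), ('t2', 0.25)]
```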

Virtual screening using fingerprints

Multiple query structures

[Figure: the fingerprints of the queries are combined into a hypothesis fingerprint, which is compared by proximity with the target fingerprints of the targets.]

Hypothesis fingerprints

allows faster operation

compiles features common to the individual actives

Active 1       0     2     7     1     0     1     6     4     0     0     9     0
Active 2       1     6     0     4     3     3     1     2     2     0     5     1
Active 3       2     4     4     1     0     2     5     3     4     3     4     5
Minimum        0     2     0     1     0     1     1     2     0     0     4     0
Average        1     4  3.67     2     1     2     4     3     2  1.33     6     2
Median       1.5     4   5.5     1     0     2     5     3     3     0     5     3
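The three hypothesis types can be sketched column by column over the three actives listed above. The Minimum and Average rules are the plain definitions. The Median rule is an assumption inferred from the table: it reproduces the Median row if a feature that is absent (count 0) in most actives gives 0, and otherwise the median is taken over the non-zero counts only. The slides do not state this rule, and the tie case (exactly half of the actives lacking a feature) is a guess.

```python
from statistics import mean, median

# Feature counts of the three actives, copied from the table above.
actives = [
    [0, 2, 7, 1, 0, 1, 6, 4, 0, 0, 9, 0],
    [1, 6, 0, 4, 3, 3, 1, 2, 2, 0, 5, 1],
    [2, 4, 4, 1, 0, 2, 5, 3, 4, 3, 4, 5],
]


def minimum_hypothesis(fingerprints):
    return [min(column) for column in zip(*fingerprints)]


def average_hypothesis(fingerprints):
    return [round(mean(column), 2) for column in zip(*fingerprints)]


def median_hypothesis(fingerprints):
    result = []
    for column in zip(*fingerprints):
        present = [value for value in column if value != 0]
        if 2 * len(present) < len(column):
            result.append(0)  # assumed rule: feature absent in most actives
        else:
            result.append(median(present))  # assumed rule: ignore the zeros
    return result


print(minimum_hypothesis(actives))
print(average_hypothesis(actives))
print(median_hypothesis(actives))
```

With these inputs the Minimum and Median rows of the table are reproduced exactly. For the tenth feature the plain arithmetic mean of the listed counts (0, 0, 3) is 1, whereas the table shows 1.33; the sketch reports the arithmetic result.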

Hypothesis fingerprints

Hypothesis types: advantages and disadvantages

Minimum
Advantages:
• strict conditions for hits if actives are fairly similar
Disadvantages:
• false results with asymmetric metrics
• misses common features of highly diverse sets
• very sensitive to one missing feature

Average
Advantages:
• captures common features of more diverse active sets
Disadvantages:
• less selective if actives are very similar

Median
Advantages:
• captures common features of more diverse active sets
• specific treatment of the absence of a feature
• less sensitive to outliers
Disadvantages:
• less selective if actives are very similar

Does this work?

Active set           Pharmacophore fingerprint    Chemical fingerprint
name          size   Tanimoto    Euclidean        Tanimoto    Euclidean
5-HT3           12      20.14        12.55          776.19       461.44
ACE             89       1.99         1.42            3.71         1.74
Angiotensin2    10      22.80        27.81          183.45       173.91
Beta2           50       3.59         1.52            7.52         2.65
D2              13      61.25        27.64          302.52       155.61
delta           20     109.53        11.66          114.48        56.22
Ftp             35      50.92        46.88          571.50       575.16
mGluR1          18      70.47         5.59          347.72       130.14
NPY-5          139       1.09         1.00            1.46         1.44
Thrombin         8       2.46         2.56            3.71         1.67

Then why do we need optimization?

Too many hits

Then why do we need optimization?

0.47 0.55

Inconsistent dissimilarity values

What can be optimized?

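$$D^{\text{weighted, asymmetric}}_{\text{Euclidean}}(x, y) = \sqrt{\alpha \sum_{x_i > y_i} w_i (x_i - y_i)^2 + (1 - \alpha) \sum_{x_i \le y_i} w_i (x_i - y_i)^2}$$

$$D^{\text{scaled, asymmetric}}_{\text{Tanimoto}}(x, y) = 1 - \frac{\sum_i s_i \min(x_i, y_i)}{\alpha \left( \sum_i x_i - \sum_i s_i \min(x_i, y_i) \right) + (1 - \alpha) \left( \sum_i y_i - \sum_i s_i \min(x_i, y_i) \right) + \sum_i s_i \min(x_i, y_i)}$$

$\alpha \in [0, 1]$: asymmetry factor

$s_i \in \mathbb{N}$: scaling factor

$w_i \in [0, 1]$: weights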

Parameterized metrics
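A minimal sketch of the two parameterized dissimilarities, written for count-vector fingerprints. It assumes the weighted asymmetric Euclidean form splits the squared differences by whether the query count exceeds the target count (weighted by α and 1 - α respectively), and that the scaled asymmetric Tanimoto form has the Tversky-like shape with per-feature scaling factors. Which side of the split carries α, and the function names, are assumptions of this sketch.

```python
import math


def weighted_asymmetric_euclidean(x, y, weights, alpha):
    """sqrt(alpha * sum_{x_i > y_i} w_i d_i^2 + (1 - alpha) * sum_{x_i <= y_i} w_i d_i^2)."""
    total = 0.0
    for xi, yi, wi in zip(x, y, weights):
        factor = alpha if xi > yi else 1.0 - alpha  # assumed assignment of alpha
        total += factor * wi * (xi - yi) ** 2
    return math.sqrt(total)


def scaled_asymmetric_tanimoto(x, y, scales, alpha):
    """1 - c / (alpha * (sum(x) - c) + (1 - alpha) * (sum(y) - c) + c),
    where c = sum_i s_i * min(x_i, y_i)."""
    common = sum(s * min(xi, yi) for xi, yi, s in zip(x, y, scales))
    denominator = alpha * (sum(x) - common) + (1.0 - alpha) * (sum(y) - common) + common
    # Assumption: a zero denominator (no features at all) means zero dissimilarity.
    return 1.0 - common / denominator if denominator else 0.0


x, y = [2, 0], [0, 1]
print(weighted_asymmetric_euclidean(x, y, [1, 1], alpha=1.0))  # 2.0: only x_i > y_i counts
print(weighted_asymmetric_euclidean(x, y, [1, 1], alpha=0.0))  # 1.0: only x_i <= y_i counts

q, t = [1, 1, 0], [1, 0, 0]
print(scaled_asymmetric_tanimoto(q, t, [1, 1, 1], alpha=1.0))  # 0.5
print(scaled_asymmetric_tanimoto(q, t, [1, 1, 1], alpha=0.0))  # 0.0
```

In the second example the target's features are a subset of the query's: with α = 0 the features found only in the query are ignored and the pair is reported as identical, which illustrates what the asymmetry factor controls.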

Optimization of metrics

[Diagram: the selected targets are split into a training set and a test set; the known actives are split into a query set, a training set and a test set.]

Step 1: optimize parameters for maximum enrichment

Step 2: validate metrics over an independent test set

Step 1: optimize parameters for maximum enrichment

[Diagram: the query fingerprint from the query set is screened against the training set, giving target hits and active hits.]

One step of the algorithm

[Figure legend: potential variable value, temporarily fixed value, running variable value, final value]
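The legend suggests that one parameter at a time runs through its potential values while the others stay temporarily fixed. The slides do not name the optimizer, so the following coordinate-wise grid search is only one plausible reading, and every name in it is hypothetical. The toy objective stands in for the enrichment measured on the training set, which the slides do not define in closed form.

```python
def coordinate_search(objective, grids, start, sweeps=3):
    """Maximize objective(params) by varying one parameter at a time.

    grids[i] lists the potential values of parameter i; all other
    parameters stay at their current (temporarily fixed) values.
    """
    params = list(start)
    best = objective(params)
    for _ in range(sweeps):
        improved = False
        for i, candidates in enumerate(grids):
            for value in candidates:  # the running variable
                trial = params[:i] + [value] + params[i + 1:]
                score = objective(trial)
                if score > best:
                    best, params, improved = score, trial, True
        if not improved:
            break  # a full sweep changed nothing: keep the final values
    return params, best


def toy_objective(params):
    """Stand-in for enrichment: peaks at (0.3, 0.7)."""
    a, b = params
    return -((a - 0.3) ** 2 + (b - 0.7) ** 2)


grid = [i / 10 for i in range(11)]  # potential values 0.0, 0.1, ..., 1.0
best_params, best_score = coordinate_search(toy_objective, [grid, grid], start=[0.0, 0.0])
print(best_params)  # [0.3, 0.7]
```

A search of this kind only finds a local optimum, so the validation on an independent test set in step 2 remains necessary.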

Step 2: validate metrics over an independent test set

[Diagram: the query fingerprint from the query set is screened against the test set, giving target hits and active hits.]

Results

0.47 0.55

Similar structures get closer

Results

Hit set size reduction

Active set: 18 mGlu-R1 antagonists

Target set: 10000 randomly selected drug-like structures + 7 spikes

Metric                             Enrichment   Test hits   Random hits
Tanimoto
  Basic                                 70.47        5.43        172.00
  Scaled                                 7.63        6.00       1101.71
  Asymmetric                            99.36        5.29        106.00
  Scaled Asymmetric                     11.94        5.86        731.14
Euclidean
  Basic                                  5.59        5.43       1456.57
  Normalized                            11.33        5.14        791.29
  Asymmetric Normalized                 18.58        4.71        368.71
  Weighted Normalized                  296.30        4.14         27.57
  Weighted Asymmetric Normalized       281.30        3.43         17.00

Results

Improvement by optimization

Active set    size   Euclidean   Optimized   Improvement ratio
5-HT3           12       12.55      239.24               49.26
ACE             89        1.42        6.50                4.64
Angiotensin2    10       27.81       85.45               11.15
Beta2           50        1.52       24.70               17.42
D2              13       27.64      123.25               11.19
delta           20       11.66      243.57               69.11
Ftp             35       46.88       71.54                5.35
mGluR1          18        5.59      296.30               70.93
NPY-5          139        1.00        3.22                3.25
Thrombin         8        2.56        4.57                2.62

Results

Active Hit Distribution

offers a more intuitive way to evaluate the efficiency of screening

based on sorting random set hits and known actives on dissimilarity values and counting the number of random set hits preceding each active in the sorted list
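The counting described above can be sketched directly. The dissimilarity values in the example are illustrative, and the assignment of values to actives and random set hits is invented here; the function name and the tie rule (a random hit precedes an active only if its dissimilarity is strictly smaller) are assumptions of this sketch.

```python
from bisect import bisect_left


def active_hit_distribution(active_scores, random_scores):
    """For each active (closest first), count the random set hits that precede it."""
    randoms = sorted(random_scores)
    # bisect_left gives the number of random hits with strictly smaller dissimilarity.
    return [bisect_left(randoms, score) for score in sorted(active_scores)]


# Invented split of illustrative dissimilarity values into actives and random hits.
active_scores = [0.015, 0.022, 0.041]
random_scores = [0.014, 0.017, 0.020, 0.023, 0.027, 0.043]

distribution = active_hit_distribution(active_scores, random_scores)
print(distribution)  # [1, 3, 5]
```

An ideal screen would give all zeros: every known active is retrieved before any random set hit.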

[Figure: sorted dissimilarity values 0.014, 0.015, 0.017, 0.020, 0.022, 0.023, 0.027, 0.041, 0.043; labels: "number of actives", "number of virtual"]

Results

ACE (pharmacophore similarity)

[Plot: number of actives among the hits (1 to 16); series: Euclidean, Optimized Euclidean]

Results

NPY-5 (pharmacophore similarity)

[Plot: number of active hits (1 to 49); series: Tanimoto, Euclidean, Optimized, Ideal]

Results

β2-adrenoceptor (pharmacophore similarity)

[Plot: number of active hits (1 to 18); series: Tanimoto, Euclidean, Optimized, Ideal]

Results

Structural or pharmacophore fingerprint?

Active set    size   chemical   pharmacophore   diversity*
5-HT3           12     692.21          239.24         0.30
ACE             89       4.29            6.50         0.56
Angiotensin2    10     190.76           85.45         0.40
Beta2           50      10.98           24.70         0.50
D2              13     358.10          123.25         0.30
delta           20     249.40          243.57         0.32
Ftp             35     575.16           71.54         0.30
mGluR1          18     350.86          296.30         0.37
NPY-5          139       1.52            3.22         0.47
Thrombin         8       3.59            4.57         0.46

* Average 1-Tanimoto coefficient between each pair of compounds in the active set, based on chemical fingerprint.

Results

Scaffold hopping

Acknowledgements

Contributors:

Nóra Máté, Szilárd Dóránt

Bernard Przybylski (Axovan)

The research was supported by

(Axovan is now part of Actelion.)

