Post on 30-Jun-2020
MOLECULAR BIOTECHNOLOGY
Design and Synthesis of Bioactive Compounds
Luigi Vitagliano
Istituto di Biostrutture e Bioimmagini, Consiglio Nazionale delle Ricerche, Via Mezzocannone 16, 80134 Napoli
Lecturer : Luigi Vitagliano
Phone: 0812534506
E-mail: luigi.vitagliano@cnr.it
Slides of the lectures can be downloaded from the page
Additional material for further reading will be provided
throughout the course
Synthetic program of the course
Structural Biology
The state of the art
Experimental approaches
X-ray crystallography
Electron microscopy
Computational approaches
Protein structure prediction methods
Molecular dynamics simulation
Structure validation
Successful examples: The ribosome machinery and the DNA structure
Rational Design of bioactive molecules
Properties of protein structures
Ligand-protein interactions
Structure-based drug discovery
Ligand-based drug discovery
Fragment-based drug discovery
Classical and recent successful examples of rational drug design
Lesson 1
Part a
Structural Biology
• The state of the art
• Experimental approaches
• X-ray crystallography
Overview of the Structure-based drug discovery
A key factor in the success of this strategy is the accurate structural
characterization of the target that is frequently a protein or a nucleic acid
Structure-based drug design
Biomolecules
‘Small’ molecules: Water, Vitamins, Hormones, Lipids, Metabolites…
Macromolecules: Proteins, Nucleic acids (DNA/RNA), Polysaccharides
Molecules found in living organisms
Hemoglobin: C2952H4664O832N812S8Fe4
Proteins are characterized by a remarkable structural complexity, being made of thousands of atoms
Water: H2O
Glucose: C6H12O6
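The difference in size between these molecules can be made concrete by counting atoms and summing atomic masses from the formulas above. The following is a minimal sketch, not part of the original slides: the helper names are illustrative and the atomic masses are rounded standard values.

```python
import re

# Rounded standard atomic masses in g/mol (only the elements used on this slide).
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007,
               "O": 15.999, "S": 32.06, "Fe": 55.845}

def parse_formula(formula):
    """Return {element: count} for a flat formula such as 'C6H12O6'."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts

def atom_count(formula):
    """Total number of atoms in the formula."""
    return sum(parse_formula(formula).values())

def molar_mass(formula):
    """Approximate molar mass in g/mol."""
    return sum(ATOMIC_MASS[el] * n for el, n in parse_formula(formula).items())

HEMOGLOBIN = "C2952H4664O832N812S8Fe4"
for name, formula in [("Water", "H2O"), ("Glucose", "C6H12O6"),
                      ("Hemoglobin", HEMOGLOBIN)]:
    print(f"{name}: {atom_count(formula)} atoms, "
          f"about {molar_mass(formula):.0f} g/mol")
```

The parser handles only flat formulas without parentheses, which is enough for the three examples on this slide.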
Complexity of nucleic acids: DNA and RNA
Chains of DNA are made of millions of nucleotides
The current record for sequence decoding is 2.3 million bases of human DNA in one step
Formula Cxxxxxxxx Oxxxxxxx Hxxxxxxxx Pxxxxxx Nxxxxxxxx
Chimie dans l’espace - Discovery and importance of
stereochemistry (1875)
The definition of the three-dimensional structure is crucial for macromolecules since they may adopt, in principle, an enormous number of distinct states
Structure – function relationships in water
H2O
The central oxygen atom binds two hydrogen atoms
and has two lone pairs of electrons
VSEPR rules
Bent structure
Negative pole
Positive pole
Structure – function relationships in water
[Figure: water dipoles (+/-) surrounding the ions of an ionic crystal]
Ionic crystal
Hydration
A biochemical/biological problem
Identification of the key molecular players
Preparation of the samples for the structural investigation
Experiment(s)
Analysis of the biomolecule structure
If everything works, you end up with ……
…..many additional biochemical/biological problems
The structural biological approach
Protein Chemistry
Linear polypeptide chain
N Nitrogen
O Oxygen
C Carbon
H Hydrogen
R Side Chain
Protein three-dimensional structure
Structure - Function relationship
Fibrous proteins: These proteins are characterized by very repetitive sequence motifs (examples: collagen, wool, silk…). They frequently have a structural (mechanical) role and share significant analogies with synthetic polymers.
Globular proteins: These are characterized by a roughly globular shape and present a large variability in amino acid composition and sequence.
Intrinsically disordered proteins (IDPs): These are proteins that do not present an intrinsic well-defined three-dimensional structure in their biologically active state. They tend to become structured upon interaction with other biological partners. They often have a low-complexity sequence (repetitions of specific amino acid stretches).
All types of proteins play crucial roles in biological processes. From a chronological point of view, fibrous proteins were the first to be structurally characterized. They were followed by globular proteins which, because of their huge diversity and complexity, are far from being fully characterized.
IDPs represent an emerging subject for the scientific community.
Proteins are often classified according to their structural properties
The first structural characterization of a globular protein was accomplished by
Kendrew (Myoglobin 1958)
Kendrew’s model of myoglobin was a horrible, visceral-looking object, since he had used a long sausage of plasticine
What about the structure of globular proteins?
Upon seeing the structure of myoglobin (Fig. 1) at 6 Å resolution, Kendrew et al. said, “Perhaps the most remarkable features of the molecule are its complexity and its lack of symmetry. The arrangement seems to be almost totally lacking in the kind of regularities which one instinctively anticipates, and it is more complicated than has been predicated by any theory of protein structure. Though the detailed principles of construction do not yet emerge, we may hope that they will do so at a later stage of the analysis.”
Enchanting protein shapes of globular proteins
90°
Scaring protein shapes
Resuscitation promoting factor B from M. tuberculosis
A classical example: hemoglobin and sickle cell anaemia
The mutation is genetically transmitted because in heterozygotes it confers resistance against malaria. The mutated gene is widespread in Africa.
Due to its high concentration, mutated Hb may form fibers
A single residue mutation may cause a disease
Mutation: Glu → Val
In erythrocytes hemoglobin is abundant: 340 mg/ml
Deoxy form
Normal and mutated Hb
Fructose intolerance: mutations in Aldolase B
Incidence: 1/20000.
Eliminating fructose from the diet is essential
A single residue mutation may cause a disease
Misfolded proteins
Globular proteins in specific conditions may aggregate to form insoluble species: amyloid fibers
Beta-rich structures
Misfolding of proteins
Transmissible encephalopathies
Cow
Human disease: vCJD, variant Creutzfeldt-Jakob disease (known since 1996)
Determination of the three-dimensional structure of
biological macromolecules
The state of the art
Length scale: 1 Å, 1 nm, 10 nm, 100 nm, 1 µm, 10 µm, 0.1 mm
NMR
X-ray crystallography
AFM
single particle Cryo-EM
Electron tomography
Light microscopy
atoms proteins viruses bacteria cells
Techniques generally used for the structural
characterization of Bio-macromolecules
The paradigm of function-structure relationships
Structure - Function
Experiments
X – Ray crystallography
NMR Nuclear Magnetic resonance
Cryo-EM Electron Microscopy
‘Modeling’
Molecular dynamics
‘Docking’
Computations
Number of protein structures reported in the
Protein Data Bank
The vast majority of these structures (85%) have so far been
determined by X-ray crystallography
Some classical triumphs of X-ray crystallography
a) Ribosome
b) Rhodopsin
c) Reovirus core particle
d) HMG-CoA reductase
e) RNA polymerase II
f) ATP synthase
g) Nucleosome core particle
h) Mismatch repair protein
i) HIV envelope glycoprotein gp120
Not only large macromolecules……..
Examples of recent fundamental peptide structural studies
Amyloid-forming peptides (4 to 7 residues long)
These structures became models for fibers of proteins involved in neurodegenerative diseases
X-ray structures
David Eisenberg group
Nelson et al. 2005, Nature
Sawaya et al 2007 Nature
Wiltzius et al. 2009 Nature Structural & Molecular Biology
Sawaya et al. 2009 PNAS
Apostol et al. 2010 JBC
X-ray crystallography and proteins: a happy marriage
Since 1901, 26 Nobel prizes have been awarded to 43 scientists in disciplines related to crystallography. A significant portion of these were for the characterization of large biomolecules
Recent Nobel prizes awarded to studies conducted using X-ray diffraction
2012 Chemistry
Lefkowitz and Kobilka – G-protein-coupled receptors – Protein crystallography
2011 Chemistry
Shechtman – Discovery of quasicrystals – Basic crystallography
2010 Physics
Geim and Novoselov – Graphene discovery and characterization – Materials crystallography
2009 Chemistry
Ramakrishnan, Steitz and Yonath – Ribosome structure – Protein crystallography
2006 Chemistry
Kornberg – Transcription factors – Protein crystallography
2003 Chemistry
MacKinnon – Potassium channels – Protein crystallography
2017 Cryo-electron microscopy, 2013 Computational biology, 2002 NMR of proteins
This scenario is going to change soon…
The coming era of cryo-electron microscopy
After fifty years (and thousands of publications)
the mechanism is still debated, Sept 2013
Protein functionality: not just structure
Importance of protein dynamics
How does haemoglobin work?
Determination of the three-dimensional structure of
biological macromolecules by X-ray diffraction
techniques
Basic experiment
Main components
X-ray
Sample
Detector
Properties of X-rays
Electromagnetic radiation with wavelength
between 0.1 and 1000 Å
Conventional X-ray generator
X-ray sources for structural studies on proteins
Conventional
Synchrotron
Synchrotrons are devices in which charged particles (electrons or
positrons) circulate at a speed close to that of light
The acceleration of these particles generates radiation
Components of a synchrotron
Bending magnets
Focus
Diffraction
Image
Scheme of diffraction and imaging
[Figure labels: object, lens, detector, diffraction pattern]
Diffraction
[Figure labels: single emitter, constructive interference, destructive interference]
Diffraction & Interference
Scattering from multiple sources
Scattering from disordered scatterers
Scattering from ordered scatterers
A diffraction image
Take home messages
The scatterers in these experiments are electrons, therefore we get
direct information on them
Sharp diffraction is a result of the ordered organization of the
electrons in the sample
All electrons of the sample contribute to the scattering in each
direction: each spot in the diffraction pattern depends on the
location of all electrons
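The second and third messages can be illustrated with a one-dimensional toy model. The sketch below is an assumption-laden illustration, not taken from the slides: identical point scatterers are placed on a line, the scattered amplitude in a direction q is the sum of the phase factors of all scatterers, and the intensity is its squared modulus. The positions, the number of scatterers and the function name are hypothetical.

```python
import cmath
import random

def intensity(positions, q):
    """Intensity |sum_j exp(2*pi*i*q*x_j)|^2 for identical point scatterers.

    Every scatterer contributes to the amplitude in every direction q.
    """
    amplitude = sum(cmath.exp(2j * cmath.pi * q * x) for x in positions)
    return abs(amplitude) ** 2

n = 50
# Ordered scatterers: regular spacing of one (arbitrary) length unit.
ordered = [float(j) for j in range(n)]
# Disordered scatterers: the same number, at random positions.
rng = random.Random(0)
disordered = [rng.uniform(0, n) for _ in range(n)]

for q in (0.5, 1.0, 1.5, 2.0):
    print(f"q = {q}: ordered {intensity(ordered, q):10.2f}   "
          f"disordered {intensity(disordered, q):10.2f}")
```

For the ordered row the waves add in phase only at integer q, where the intensity reaches n squared, and cancel in between. The disordered set gives weak intensity spread over all directions, without sharp maxima.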
Crystallography versus Spectroscopy
Single crystal diffraction: for systems with well-defined shapes that can generate ordered
crystals (organic molecules, oligonucleotides and globular proteins)
Fiber diffraction analyses: fibrous systems
X-ray diffraction analyses
Three-dimensional crystals
A diffraction image
In fibers the molecular entities are ordered along the so-called
fiber axis
Fiber X-ray diffraction analysis
Theory
Fiber diffraction
The discovery of the alpha-helix (1951)
The fiber diffraction
pattern
The 3D model
by Pauling and Corey
Wool diffraction pattern:
meridional reflection (5.1 Å)
equatorial reflection (10 Å)
Successful modelling of other biomacromolecules came using
the same approach
Beta-sheet: Pauling and Corey, 1951
Collagen triple helix: Ramachandran and Kartha, 1954; Rich and Crick, 1961
The most famous achievement of X-ray fiber diffraction analysis:
The DNA structure
Single Crystal X-ray diffraction analysis
Data Collection
Data Analysis
PROTEIN
CRYSTALLOGRAPHY
Main steps
Single crystal X-ray crystallography:
The experimental setup
Example of protein crystals
X-ray sources for protein crystallography
Synchrotron
Conventional
Crystallization
Phase transition: liquid - solid
Aqueous solution of a protein
Amorphous solid: isotropic solid characterized by an irregular organization of the atoms/molecules (for example glass)
Crystal: homogeneous and anisotropic solid characterized by a regular organization of the atoms/molecules in space
Crystals are characterized by the repetition in the
three-dimensional space of a specific volume
that is defined as unit cell
a
b
c
Unit cell
The repeating structural motif may be:
an ion, an atom, a molecule or an ensemble
of molecules.
Three-dimensional lattice
The seven crystal systems
The unit cell is defined by three
axes a, b, c and three angles.
Different combinations are
possible
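Given the three axes and the three angles, the volume of the unit cell follows from the standard formula for a general (triclinic) cell. The sketch below is an illustration added for clarity; the cell parameters in the example are hypothetical and the function name is not from the slides.

```python
import math

def unit_cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Volume of a unit cell from axes (in Å) and angles (in degrees).

    General triclinic formula:
    V = a*b*c*sqrt(1 - cos^2(alpha) - cos^2(beta) - cos^2(gamma)
                   + 2*cos(alpha)*cos(beta)*cos(gamma))
    """
    ca, cb, cg = (math.cos(math.radians(angle)) for angle in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)

# Hypothetical cells, for illustration only.
print(unit_cell_volume(10.0, 10.0, 10.0))               # cubic: all angles 90 degrees
print(unit_cell_volume(10.0, 10.0, 20.0, gamma=120.0))  # hexagonal: gamma = 120 degrees
```

When all angles are 90° the expression reduces to the product a·b·c, which is a quick check that the formula has been typed correctly.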
Examples of protein crystals
Due to their internal symmetry, crystals are solids with planar
faces, like polyhedra
Main features of protein crystals
• High solvent (water) content (20-80%)
• Large solvent channels – the interactions between
mates in the crystals are weak
• Limited level of order – Limited size of protein crystals
• High fragility and sensitivity to external conditions
[Linear dimensions 0.1-1.0 mm (= 10^15-10^20 molecules);
Inorganic crystals may be seen at macroscopic level (for example quartz)]
[Salt bridges, hydrogen bonds, van der Waals interactions – a small fraction
of the protein atoms are involved in packing contacts]
[pH variations, temperature, ionic strength…]
Crystal packing:
Elongation factor G
Solvent channels
• the protein may
be active
• it is possible to
diffuse small
molecules in the
crystals as ligands
and inhibitors
(soaking)
Cα trace
Needs for crystallization
• Purity = absence of contaminants
• Homogeneity = absence of conformational (inter-domain flexibility,
equilibrium between conformers, aggregation, partial
denaturation…) and/or sequence (fragmentation, incomplete
post-translational modifications…) heterogeneity
• Quantity
• Biological activity
• Purity
Effects of contaminants:
They may prevent crystal formation
They may perturb the crystal growth
They may favor the growth of disordered crystals
SDS gel: occurrence and
size of peptide/protein
contaminants
Needs for crystallization
How can crystallization be induced?
Phase transition: liquid - solid
Solubility (cs): maximum solute concentration at a given
temperature
Supersaturation (cp/cs): non-equilibrium state in which the concentration
of the solute (cp) is larger than its solubility (cs)
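Phase diagram
[Figure: phase diagram with protein concentration on the vertical axis and precipitant concentration on the horizontal axis; regions labelled solubility, supersaturation and precipitation]
Solubility curve = solid-liquid equilibrium
Precipitant: a parameter that decreases the protein solubility (temperature, pH, salt, …)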
1) Nucleation: formation of stable aggregates; it requires high supersaturation
2) Growth: it may occur in the nucleation zone but is more favoured in the metastable region
3) End of the growth
Studies carried out on lysozyme indicate that critical aggregates are made of 30-40 molecules.
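The supersaturation ratio cp/cs defined earlier can be used to place a sample in the phase diagram. The following is only a rough sketch: the boundary between the metastable and the nucleation zone depends on the protein and on the conditions, so `nucleation_threshold` is a hypothetical parameter with an arbitrary default, not a value from the slides.

```python
def supersaturation(cp, cs):
    """Supersaturation ratio cp/cs (solute concentration over solubility)."""
    if cs <= 0:
        raise ValueError("solubility cs must be positive")
    return cp / cs

def zone(cp, cs, nucleation_threshold=3.0):
    """Classify a sample by its supersaturation ratio.

    nucleation_threshold is a hypothetical, system-dependent value
    chosen here only for illustration.
    """
    s = supersaturation(cp, cs)
    if s <= 1.0:
        return "undersaturated"   # at or below the solubility curve: no crystals form
    if s < nucleation_threshold:
        return "metastable"       # crystals grow, new nuclei rarely form
    return "nucleation"           # high supersaturation: stable aggregates form

# Concentrations in mg/ml, hypothetical values.
for cp in (5.0, 20.0, 50.0):
    print(cp, supersaturation(cp, 10.0), zone(cp, 10.0))
```

In practice the aim is to reach the nucleation zone briefly and then let the system relax into the metastable region, where the growth of a few crystals is favoured.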
Main chemico-physical parameters that affect
protein solubility
• Protein concentration
• Concentration and nature of the precipitating agent:
Salts: protein dehydration
Organic solvents: for example alcohols may reduce the
dielectric constant
Organic polymers (PEG, polyamines)
• Temperature
• pH of the buffer
• Ionic strength
Vapour diffusion techniques(Hanging - Sitting Drop)
Closed system [drop of (protein + precipitant) + precipitant reservoir + air]: it
evolves toward equilibrium through the evaporation of water
from the drop to the reservoir
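The effect of this equilibration on the drop can be estimated with simple mass balance. The sketch below rests on stated assumptions that are not in the slides: only water moves through the vapour phase, the reservoir is so much larger than the drop that its concentration does not change, and the protein stock contains no precipitant. All names and numbers are illustrative.

```python
def vapour_diffusion_equilibrium(protein_conc, v_protein, v_reservoir_in_drop,
                                 reservoir_precipitant):
    """Estimate drop composition before and after equilibration.

    protein_conc          protein stock concentration (e.g. mg/ml)
    v_protein             volume of protein stock in the drop (e.g. µl)
    v_reservoir_in_drop   volume of reservoir solution added to the drop
    reservoir_precipitant precipitant concentration in the reservoir (e.g. M)
    """
    v_initial = v_protein + v_reservoir_in_drop
    precipitant_initial = reservoir_precipitant * v_reservoir_in_drop / v_initial
    protein_initial = protein_conc * v_protein / v_initial
    # Water leaves the drop until its precipitant concentration
    # matches that of the reservoir.
    v_final = v_initial * precipitant_initial / reservoir_precipitant
    protein_final = protein_initial * v_initial / v_final
    return {
        "precipitant_initial": precipitant_initial,
        "protein_initial": protein_initial,
        "v_final": v_final,
        "protein_final": protein_final,
    }

# Hypothetical drop: 1 µl of protein at 10 mg/ml + 1 µl of 2.0 M reservoir solution.
result = vapour_diffusion_equilibrium(10.0, 1.0, 1.0, 2.0)
print(result)
```

With equal volumes of protein and reservoir solution the drop shrinks to about half its initial volume, so both protein and precipitant concentrations roughly double. This slow increase is what drives the sample toward supersaturation.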
Hanging Sitting
Examples of experimental settings
Mother liquor
Factors that (may) affect protein crystallization
• Nature and concentration of the precipitant
• Protein concentration
• Buffer and pH
• Temperature
• Ionic strength
• Purity and homogeneity of the sample
• Additives, effectors, ligands
• Source of the macromolecule (organism, recombinant)
• Redox status of the environment
• Ions
• Detergents
• Gravity, convection, sedimentation
Protein crystallization facilities
Robots
Crystallization under microgravity conditions
“Beauty is not enough”
What is important is the diffraction pattern!
A real diffraction image
A more real case – resolution is limited
Diffraction spot magnification
Intensities of the pixels that constitute the spot
Integration of the spot intensity
Measure of the spot intensities
These intensities are proportional to the square of the
modulus of the structure factor F(hkl)
The structure factor is linked to the electron density
through an operation called Fourier transform
\rho(x,y,z) = \frac{1}{V} \sum_{h} \sum_{k} \sum_{l} F(hkl) \, \exp\left[-2\pi i (hx + ky + lz)\right]
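The relation between structure factors and electron density can be tried out in one dimension. This is a minimal sketch under simplifying assumptions that are not in the slides: point atoms on a line, fractional coordinates, hypothetical scattering weights, and the sign convention F(h) = Σ f·exp(+2πihx) paired with exp(−2πihx) in the synthesis. The phases of F are known here because F is computed from the model.

```python
import cmath

def structure_factor(atoms, h):
    """F(h) = sum_j f_j * exp(2*pi*i*h*x_j) for atoms given as (x, f) pairs."""
    return sum(f * cmath.exp(2j * cmath.pi * h * x) for x, f in atoms)

def density(atoms, x, h_max, cell_length=1.0):
    """rho(x) = (1/L) * sum_h F(h) * exp(-2*pi*i*h*x), with h from -h_max to h_max."""
    total = sum(structure_factor(atoms, h) * cmath.exp(-2j * cmath.pi * h * x)
                for h in range(-h_max, h_max + 1))
    return total.real / cell_length

# Hypothetical 1D 'crystal': two point atoms (fractional coordinate, weight).
atoms = [(0.20, 6.0), (0.55, 8.0)]

# Measured intensities are proportional to |F(h)|^2.
intensities = {h: abs(structure_factor(atoms, h)) ** 2 for h in range(0, 4)}

# Fourier synthesis of the density on a grid of 100 points.
grid = [i / 100 for i in range(100)]
rho = [density(atoms, x, h_max=20) for x in grid]
peak = max(range(len(grid)), key=lambda i: rho[i])
print("highest density at x =", grid[peak])
```

Truncating the sum at a smaller `h_max` broadens the peaks, which is the one-dimensional analogue of limited resolution. In a real experiment only the intensities are measured, so the phases of F(hkl) are not directly available.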
Protein crystallography: basic concepts
Electron density
map
model
fitting
Model building in the electron density maps
Main chain
Side chains
Resolution and structural details
[Figure panels: maps at 3.5 Å and 4 Å]
Low resolution, 6 Å: only coarse features of the model can be traced, e.g. α-helices
3 Å: polypeptide chain trace
2.5 Å: medium resolution
2 Å: backbone conformation
1.5 Å: individual atoms in the maps; detailed description of the solvent
< 1.2 Å: hydrogen atoms (may) become visible
1.0 Å: very high resolution
Atomic resolution: better than 1.2 Å
“ultra-high resolution”: RNase A 0.87 Å
Individual atoms
Double conformations
Hydrogen atoms
JMB, 297, 713-732, 2000
ATOM 1 N VAL A 1 10.720 19.523 6.163 1.00 21.36
ATOM 2 CA VAL A 1 10.228 20.761 6.807 1.00 24.26
ATOM 3 C VAL A 1 8.705 20.714 6.878 1.00 18.62
ATOM 5 CB VAL A 1 10.602 22.000 5.966 1.00 27.19
ATOM 6 CG1 VAL A 1 10.307 23.296 6.700 1.00 31.86
ATOM 7 CG2 VAL A 1 12.065 21.951 5.544 1.00 31.74
ATOM 8 N LEU A 2 8.091 21.453 7.775 1.00 16.19
ATOM 9 CA LEU A 2 6.624 21.451 7.763 1.00 17.31
ATOM 10 C LEU A 2 6.176 22.578 6.821 1.00 18.55
ATOM 11 O LEU A 2 6.567 23.730 7.022 1.00 18.72
ATOM 12 CB LEU A 2 6.020 21.707 9.129 1.00 18.34
ATOM 13 CG LEU A 2 6.386 20.649 10.198 1.00 17.39
ATOM 14 CD1 LEU A 2 5.998 21.119 11.577 1.00 17.99
ATOM 15 CD2 LEU A 2 5.730 19.337 9.795 1.00 16.96
ATOM 16 N SER A 3 5.380 22.237 5.852 1.00 15.02
ATOM 17 CA SER A 3 4.831 23.237 4.928 1.00 16.59
ATOM 18 C SER A 3 3.725 24.027 5.568 1.00 14.84
ATOM 19 O SER A 3 3.095 23.717 6.591 1.00 14.40…
Columns: atom, residue, X Y Z coordinates, occupancy, B-factor
A typical PDB file
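The records listed above can be read with a few lines of code. The sketch below splits each ATOM line on whitespace, which works for the lines as they appear on this slide; real PDB files use fixed-width columns, so a column-based parser or a dedicated library is safer in practice. The class and function names are illustrative.

```python
import math
from typing import NamedTuple

class Atom(NamedTuple):
    serial: int
    name: str
    residue: str
    chain: str
    residue_number: int
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float

def parse_atom_line(line):
    """Parse one whitespace-separated ATOM record as shown on the slide."""
    fields = line.split()
    if len(fields) != 11 or fields[0] != "ATOM":
        raise ValueError(f"not a recognised ATOM record: {line!r}")
    return Atom(int(fields[1]), fields[2], fields[3], fields[4], int(fields[5]),
                float(fields[6]), float(fields[7]), float(fields[8]),
                float(fields[9]), float(fields[10]))

# First two records from the slide.
lines = [
    "ATOM 1 N VAL A 1 10.720 19.523 6.163 1.00 21.36",
    "ATOM 2 CA VAL A 1 10.228 20.761 6.807 1.00 24.26",
]
atoms = [parse_atom_line(line) for line in lines]

# Distance between N and CA of Val 1, in Å.
n_atom, ca_atom = atoms
n_ca_distance = math.dist((n_atom.x, n_atom.y, n_atom.z),
                          (ca_atom.x, ca_atom.y, ca_atom.z))
mean_b = sum(a.b_factor for a in atoms) / len(atoms)
print(f"N-CA distance: {n_ca_distance:.2f} Å, mean B-factor: {mean_b:.2f}")
```

The computed N-CA distance is close to the length expected for a covalent bond between these atoms, a simple sanity check on the coordinates.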