Post on 20-Feb-2019
01/06/2017 v1.0.0
NMRANNOTATION
RémiServienPatrickTardivel
MarieTremblay-FrancoCécileCanlet
WHYDOWENEEDANAUTOMATEDNMRANNOTATIONTOOL
2
Metabolomic workflow
QuickTime™ et undécompresseur
sont requis pour visionner cette image.
QuickTime™ et undécompresseur
sont requis pour visionner cette image.
QuickTime™ et undécompresseur
sont requis pour visionner cette image.
QuickTime™ et undécompresseur
sont requis pour visionner cette image.
-0,25
-0,20
-0,15
-0,10
-0,05
-0,00
0,05
0,10
0,15
0,20
0,25
-0,30 -0,25 -0,20 -0,15 -0,10 -0,05 0,00 0,05 0,10 0,15 0,20 0,25 0,30
t[2]
t[1]SIMCA-P+ 12.0.1 - 2011-11-02 16:45:27 (UTC+1)
Urine, plasma, tissues…
Analytical analysis(NMR, MS…)
NMR and MS spectra
Data ReductionMultivariate
Statistical Analysis
GlucoseTaurineGlycine
Glutamate LactateLysineLeucine
Metabolites Identification
Control mice
Exposedmice
1HNMRspectrum ofreference compound
Glycine
9 8 7 6 5 4 3 2 1 0Chemical Shift (ppm)
3.57
2 1OH
NH23
O
§ Each group of protons gives one signal§ A singlet for Glycine
9.0 8.5 8.0 7.5 7.0 6.5 6.0 5.5 5.0 4.5 4.0Chemical Shift (ppm)
4.21
4.21
4.22
4.23
4.23
4.26
4.27
4.37
4.38
4.39
4.44
4.49
4.52
4.53
4.55
4.55
4.56
4.77
4.79
4.79
6.03
6.04
6.106.118.15
8.21
8.22
8.41
8.84
8.84
8.85
8.85
9.15
9.16
9.35
NAD
10
11
O
1213
14a,14b
O OHOHP
15
O
OH
O
O- O
OP16
17a,17b
1819
O20
21
N+
OH
OH
26
25
22
24
23
27
NH2
O
N9
5
4
6
N3
N12
8
N7
NH2
• 1 signal is observed for each type of proton• 18 different signals for this compound• Complex spectrum
1HNMRspectrum ofreference compound
1HNMRspectrum ofbiological matrices
5.5 5.0 4.5 4.0 3.5 3.0 2.5 2.0 1.5 1.0Chemical Shift (ppm)
• No separation of compounds beforeNMR analysis
• A metabolite can have many signalsat different chemical shifts
• Many signals overlap
Identification of metabolites in mixtures is really complex
Metabolite identification(manually)
• Based on1Hchemical shift,coupling patternandcouplingconstants
• Comparison with annotated spectra ofsimilar matricesinliterature• Spiking :areference compoundis added inthesample• Comparison with NMRspectra ofreference compounds
– database in-house– Chenomx (commercial)– HMDB(freeware)
• Multi-dimensional NMR
2DNMRexperiments mostly used
• 1H-1HCOSY(Correlation SpectroscopY)experiment :correlations 1H-1Hvia3bonds
• 1H-1HTOCSY(TOtal Correlation SpectroscopY)experiment :correlations between allprotonswithinagiven spinsystem
• 1H-13CHSQC(Heteronuclear SingleQuantumCoherence)experiment :correlations 1H-13Cvia1bond
• 1H-13CHMBC(Heteronuclear MultipleBondCoherence)experiment :correlations 1H-13Cvia2,3or4bonds
C C
H H
C H
C C
H
C
C C
H H
C
H
Informationis splitted intwo dimensions
Informationonhydrocarbon skeletal
Example :1HNMRspectrum ofaqueous pup extract (PND2)ofrat
2D1H-1HCOSYNMRspectrum ofaqueous pup extract (PND2)
CH3-CH(OH)COOH
1,33 ppm
4,11 ppm lactatethreonine
alanine
glutamate
glutamineglutathion
glutathion
leucineIle valine
valine
lysine lysine
citrate
choline
taurine
b-glucose
glucose
3.78 ppm
1.33 ppm
4.11 ppm
TMSP
taurine
taurine
glucose
glucose
glycinecitrate
leucine
arginine
lysine
choline
lactate
threonine
uridine
inosine
lactatemethionine
betaine
1,33 ppm4,11 ppm 3,25 ppm
22,35 ppm
71,25 ppm
2D1H-13CHSQCNMRspectrum ofaqueous pup extract (PND2)ofrat
www.hmdb.ca/
Comparison with NMRspectra ofreference compounds:Human Metabolome database
Many responses (299) : the expert must check all the spectra to find the right metabolite
Comparison with NMRspectra ofreference compounds:Human Metabolome database
Comparison with NMRspectra ofreference compounds:Human Metabolome database
Comparison with NMRspectra ofreference compounds:Human Metabolome database
13 results
Comparison with NMRspectra ofreference compounds:Database in-house
Aqueous pupextract
Lactate
CreatineTaurine
Alanine
Valine
Glutamate
• Time-consuming• Complicated• Some resonances arenotidentified• Anautomated NMRannotationtool will be helpful :
– PackageRBatman:very complex andnotuser-friendly– BAYESIL(http://bayesil.ca/):webinterface(limited toserum,plasma,andCSFbiofluids)
– Chenomx :commercialsoftwareHao J,Liebeke M,Astle W,DeIorio M,Bundy JG,Ebbels TMD.Bayesian deconvolution andquantificationofmetabolites incomplex1DNMRspectra using BATMAN,2014,NatureProtocols,9(6):1416-1427.Hao,J.,Astle W,DeIorio M,Ebbels TMD.BATMANanRpackagefortheautomated quantificationofmetabolites from NMRusing aBayesian Model.Bioinformatics,2012,28(15):2088-2090.Ravanbakhsh S,LiuP,Bjordahl TC,Mandal R,GrantJR,WilsonM,EisnerR,Sinelnikov I,HuX,Luchinat C,greiner R,Wishart DS.(2015)Accurate,Fully-Automated NMRSpectralProfiling forMetabolomics.PLoS ONE 10(5):e0124219.
Metabolite identification(manually)
We developed a new tool for the annotation of NMR spectra : ASICS (Automatic Statistical Identification in Complex Spectra)
Objectives
Ø Automated identification of compounds in NMR spectra of complex mixtures
Ø Estimation of the proportion of the metabolites in the mixture
Modeling
Mixture Y spectrum Library Z1, …., Zn
Problems : noise and peaks warping
NoiseModeling
Several NMRspectra ofglucoseallow tomodelthenoise
Warping problems
Ø Metabolite concentration is different in pure compound spectrum and in mixture spectrum : peaks do not have necessary the same shape
Ø Experimental conditions are different between pure compound and mixture (pH, ionic strength, analyticalvariability…) : peaks can be shifted
Warping modeling
Two NMRspectra ofasame mixtureobtained indifferentconditions
Warping modeling
Before modeling After modeling
NMRspectra ofcholinepureorinthemixture
We used awarping function :F
Warping
Ø Shift depend on pH : shift is not the same for all compounds
Ø Deformations are localized
Ø Peaks can be shifted, but also expanded or contracted
Ø Maximum deformation given by the expert
Ø NMR spectra are recorded at pH 7 to limit the shifts
Metabolite spectrum modeling
• f1(t) :NMRspectrum ofpurecompound• NMRspectrum observed with noise
• Library{Z1,….,Z200}
Mixturespectrum modeling
Ø Amixturegis modeled asapositivefunction g(t):
Ø Signalis noised andwarped :
Ø a is therelativeproportionofmetabolite inthemixturespectrum :areaunderthecurve ofthemetabolite divided bythetotalareaofthespectrum
Ø Determine non-zero proportionsØ Estimate theproportions:quantification
Method
Ø Afalsedetection is ametabolite identified butthis metabolite isnotpresent inthemixture
Ø Thedetection threshold is theproportionfrom which we aresurethat themetabolite is identified
Ø Compromisebetween low probability offalsedetection andlowthreshold
Ø Inthis method :Ø Controltheprobability toobtain oneorseveral falsedetectionsØ Give thedetection threshold ofeach metaboliteØ Minimize thedetection threshold
Realdataset
A mixture of 6 metabolites :
Metabolites ProportionsD-Glucose 5.8 %Creatinine 21.0 %
L-Phenylalanine 3.3 %L-Proline 6.7 %
Benzoic acid 8.4 %Choline chloride 54.6 %
Library of 36 metabolites
Automatic cleaning
Black:mixturespectrum;Red :hydroxybenzoic spectrum
Eliminationof18metabolites
Identification
Metabolites Identification DetectionThreshold
Benzoic acid Yes 1.6 %
Choline Yes 2.3 %
Creatinine Yes 1.1 %
Glucose Yes 3.1 %
Phenylalanine Yes 2.1 %
Proline Yes 2.6 %
Acetic acid No 0.9 %
Galactose No 2.5 %
Mannitol No 1.6 %
Mannose No 2.4 %
Autres No < 2.5 %
Quantification
Metabolites Estimatedproportions
Real proportions
Benzoic acid 5.7 % 8.4 %Choline 55.8 % 54.6 %
Creatinine 13.3 % 21 %Glucose 5.5 % 5.8 %
Phenylalanine 3.9 % 3.3 %Proline 5.7 % 5.9 %
Detection thresholds are too high
Improvement ofdetection thresholds
Metabolites Identification New thresholds Previousthresholds
Acetic acid No 0.16 % 0.9 %Galactose No 0.19 % 2.5 %Mannitol No 0.24 % 1.6 %Mannose No 0.17 % 2.4 %Others No < 0.68% < 2.5 %
Synthesis Urine
• Mixin500mlofwater:- 3,8gofpotassiumchloride- 8,5gofsodiumchloride- 24,5gofurea- 1,03gofcitric acid- 0,34gofascorbic acid- 1,18gofpotassiumdihydrogenophosphate- 1,4gofcreatinine- 0,64gofsodiumhydroxide- 0,47gofsodiumbicarbonate- 0,28mlofphosphoric acid
Synthesis Urine
• Additionofmetabolites:- Hippuric acid :8.2mM- TMAO:7mM- acetic acid :0.8mM- Alanine:1mM- Betaine :1mM- Carnitine :0.7mM- Dimethylamine :1mM- Ethanolamine :1.5mM- Formate:0.4mM- Glucose:1mM- Glutamine:2mM- Glycine:0.6mM- Guanidinoacetate :0.8mM- Lactate:0.6mM- Lysine:1mM- Malonate :1.7mM- Trigonelline :0.6mM- Tyrosine:0.3mM
21metabolites detectables inNMR
1HNMRspectrum
Annotation:comparison with other tools
True Positive
False Positive
False Negative
True Negative
Accuracy (%)
Compounds in library
Computing Time
ASICS 17 10 4 145 92 176 2 mn 38s
MetaboHunter 4 51 17 795 92 867 < 1 mn
Batman 21 125 0 1 18 147 74 hours
Bayesil 12 17 7 53 73 89 10 mn 48s
Chenomx 15 48 6 269 54 338 < 1 mn
ASICS detects 17 metabolites among 21, but there are 10 false positive. The metabolites must be checked by the NMR expert.
The accuracy is acceptable compared to other tools
Quantification:comparison with other toolsCompound Real proportion ASICS’s
proportionBayesil’sproportion
Chenomx’sproportion(automatic fit)
BATMAN’sproportion
Creatinine 1 1 1 1 1Citric acid 0.434 0.693 12.38 Not identified 0.089Hippuric acid 0.338 0.344 Not identified 0.312 0.072Trimethylamine-N-Oxide
0.286 0.439 Not in library 0.311 1.311
Ascorbic acid 0.156 Not identified Not identified Not identified 0.568Malonic acid 0.073 Not identified Not identified 0.015 0.058Ethanolamine 0.062 0.044 Not identified Not identified 0.259L-Lysine 0.044 0.076 0.512 Not identified 0.214Dimethylamine 0.047 0.054 0.079 Not identified 0.025Betaine 0.042 0.053 0.246 0.055 0.754L-Alanine 0.042 0.056 0.219 0.045 0.230D-Glucose 0.041 0.059 0.705 0.046 0.023Guanidinoacetic acid 0.033 0.052 Not identified 0.022 0.226
L-Carnitine 0.033 0.045 0.159 0.023 0.029L-Glutamine 0.032 Not identified 4.100 Not identified 0.342Acetic acid 0.032 0.031 Not identified 0.035 0.129Glycine 0.031 0.036 0.089 0.032 0.304Lactic acid 0.028 0.027 0.307 0.018 0.025Trigonelline 0.026 0.011 Not in library 0.018 0.044Formic acid 0.017 Not identified 0.006 0.029 0.007L-Tyrosine 0.012 0.024 Not identified 0.014 0.571
The quantifications provided by ASICS or by Chenomx both fit quite well the order of magnitude of the real proportion of the different metabolites, but these quantifications are not accurate
Conclusion
• The program is powerful for detecting the presence of compounds in mixture
• The program is not very powerful for estimating the trueproportion, but we are working on this problem
• At the moment, pH of compounds in the library is 7, but wecan import spectra at different pH
• At the moment, we have 175 compounds in the library but wewill record more compounds (MetaboHub)
• More tests are needed (other mixtures)• This tool can be used to help the expert to annotate the
metabolites in a complex mixture • Publication has been submitted
HOWTODOWITHGALAXY?
39
FORM
Name of the zip file
Exclusion zone
Maximum shiftExecute
MATRIXRESULT
Metabolites names
Relative concentrations
GRAPHRESULT
GRAPHRESULT
GRAPHRESULT
Example :Synthesis Urine
• Upload the1HNMRspectrum ofsynthesis urine• Region toexclude :6.5-4.5ppmforurine• Determine themetabolites present inthis sample• Displaythespectra• Comparetheresults with thelist ofmetabolitespresent inthemixture
Example :Synthesis Urine• Listofmetabolites present inthesample :
– Creatinine (24.8mM;0.378)– Citric acid (10.7mM;0.163)– Hippuric acid (8.2mM;0.125)– TMAO(7mM;0.107)– Glutamine(2mM;0.030)– Malonate (1.7mM;0.026)– Ethanolamine (1.5mM;0.023)– Alanine(1mM;0.015)– Betaine (1mM;0.015)– Dimethylamine (1mM;0.015)– D-Glucose(1mM;0.015)– Lysine(1mM;0.015)– Acetic acid (0.8mM;0.012)– Guanidinoacetate (0.8mM;0.012)– Carnitine (0.7mM;0.011)– Glycine(0.6mM;0.009)– Lactate(0.6mM;0.009)– Trigonelline (0.6mM;0.009)– Formate(0.4mM;0.006)– Tyrosine(0.3mM;0.004)