Speech course of 9 March 2005, instructors: Dr. Dijana Petrovska-Delacrétaz and Gérard Chollet...
Speech course of 9 March 2005
Instructors: Dr. Dijana Petrovska-Delacrétaz and Gérard Chollet
Speaker Recognition
1. Introduction, history, application domains
2. Identity cues in speech
3. Speaker verification
   1. Decision theory
   2. Text-dependent / text-independent
4. Voice imposture
5. Audio-visual identity verification
6. Evaluations
7. Conclusions
Why should a computer recognize who is speaking?
• Protection of individual property (home, bank account, personal data, messages, mobile phone, PDA, ...)
• Limited access (secured areas, databases)
• Personalization (only respond to its master’s voice)
• Locate a particular person in an audio-visual document (information retrieval)
• Who is speaking in a meeting?
• Is a suspect the criminal? (forensic applications)
Tasks in Automatic Speaker Recognition
• Speaker verification (voice biometrics): Are you really who you claim to be?
• Identification (speaker ID): Is this speech segment coming from a known speaker? How large is the set of speakers (population of the world)?
• Speaker detection, segmentation, indexing, retrieval, tracking: looking for recordings of a particular speaker
• Combining speech and speaker recognition: adaptation to a new speaker, speaker typology, personalization in dialogue systems
Applications
• Access control: physical facilities, computer networks, websites
• Transaction authentication: telephone banking, e-commerce
• Speech data management: voice messaging, search engines
• Law enforcement: forensics, home incarceration
Voice Biometric
• Advantages
  - Often the only modality available over the telephone
  - Low cost (microphone, A/D), ubiquity
  - Possible integration on a smart (SIM) card
  - Natural bimodal fusion: speaking face
• Disadvantages
  - Lack of discretion
  - Possibility of imitation and electronic imposture
  - Lack of robustness to noise, distortion, ...
  - Temporal drift
Speaker Identity in Speech
• Differences in
  - Vocal tract shapes and muscular control
  - Fundamental frequency (typical values): 100 Hz (male), 200 Hz (female), 300 Hz (child)
  - Glottal waveform
  - Phonotactics
  - Lexical usage
• The difference between the voices of twins is a limit case
• Voices can also be imitated or disguised
Speaker Identity
[Figure: spectral envelope of /i:/ for Speaker A and Speaker B; axis labels f and A]
• Segmental factors (~30 ms)
  - Glottal excitation: fundamental frequency, amplitude, voice quality (e.g., breathiness)
  - Vocal tract: characterized by its transfer function and represented by MFCCs (Mel-Frequency Cepstral Coefficients)
• Suprasegmental factors
  - Speaking speed (timing and rhythm of speech units)
  - Intonation patterns
  - Dialect, accent, pronunciation habits
What are the sources of difficulty?
• Intra-speaker variability of the speech signal (due to stress, pathologies, environmental conditions,…)
• Recording conditions (filtering, noise,…)
• Channel mismatch between enrolment and testing
• Temporal drift
• Intentional imposture
• Voice disguise
Acoustic features
• Short term spectral analysis
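A minimal sketch of short-term spectral analysis, assuming NumPy: the signal is cut into overlapping frames, each frame is multiplied by a Hamming window, and the magnitude of its FFT is kept. The 30 ms frame length follows the segmental time scale quoted earlier in the course; the 10 ms hop, the Hamming window, and the function name `short_term_spectrum` are illustrative assumptions, not values taken from the slides.

```python
import numpy as np

def short_term_spectrum(signal, sample_rate, frame_ms=30.0, hop_ms=10.0):
    """Return one magnitude spectrum per overlapping, Hamming-windowed frame."""
    signal = np.asarray(signal, dtype=float)
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    window = np.hamming(frame_len)
    # Stack the windowed frames into an array of shape (n_frames, frame_len).
    frames = np.stack([
        signal[i * hop_len: i * hop_len + frame_len] * window
        for i in range(n_frames)
    ])
    # Real FFT of each frame; keep only the magnitude.
    return np.abs(np.fft.rfft(frames, axis=1))

# Usage: one second of a 200 Hz tone sampled at 8 kHz (synthetic test signal).
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
spectra = short_term_spectrum(np.sin(2 * np.pi * 200.0 * t), sample_rate)
```

Cepstral features such as MFCCs are typically derived from spectra of this kind by applying a mel filterbank, a logarithm, and a discrete cosine transform.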
Intra- and Inter-speaker variability
Speaker Verification
• Typology of approaches (EAGLES Handbook)
  - Text dependent: public password, private password, customized password, text prompted
  - Text independent
• Incremental enrolment
• Evaluation
History of Speaker Recognition
Current approaches
Dynamic Time Warping (DTW)
$d(X, Y) = \sum_{\text{best path}} d^2(x_i, y_j)$
[Figure: the utterance “Bonjour” from test speaker Y is time-aligned, along the best path, with the reference utterance “Bonjour” of each speaker X = 1, 2, ..., n]
DODDINGTON 1974, ROSENBERG 1976, FURUI 1981, etc.
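The DTW comparison can be sketched as follows, assuming NumPy and utterances stored as arrays of shape (frames, dimensions). The function accumulates squared Euclidean distances between frames along the cheapest monotonic alignment path. The three allowed steps (insertion, deletion, match) and the names `dtw_distance` and `closest_reference` are assumptions made for illustration; the slides do not specify local path constraints.

```python
import numpy as np

def dtw_distance(X, Y):
    """Accumulated squared Euclidean distance along the best warping path."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n, m = len(X), len(Y)
    # local[i, j] is the squared distance between frame i of X and frame j of Y.
    local = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Extend the cheapest of the three predecessor cells.
            acc[i, j] = local[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return float(acc[n, m])

def closest_reference(references, Y):
    """Label of the reference template with the smallest DTW distance to Y."""
    return min(references, key=lambda name: dtw_distance(references[name], Y))
```

In a verification setting the distance to the claimed speaker's template would be compared with a threshold instead of taking the minimum over all speakers.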
Vector Quantization (VQ)
$d(X, Y) = \sum_{\text{best quant.}} d^2(C_i^X, y_j)$
[Figure: the vectors of the utterance “Bonjour” from test speaker Y are quantized with the codebook (dictionary) of each speaker X = 1, 2, ..., n]
SOONG, ROSENBERG 1987
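A minimal sketch of VQ-based scoring, assuming NumPy: each test vector is matched to its nearest codeword in a speaker's codebook, and the squared distances are summed. Codebook training (for example with k-means) is not shown, and the names `vq_distortion` and `identify` are hypothetical.

```python
import numpy as np

def vq_distortion(codebook, Y):
    """Sum over test vectors of the squared distance to the nearest codeword."""
    codebook = np.asarray(codebook, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # d2[j, i] is the squared distance between test vector j and codeword i.
    d2 = ((Y[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return float(d2.min(axis=1).sum())

def identify(codebooks, Y):
    """Label of the speaker whose codebook quantizes Y with least distortion."""
    return min(codebooks, key=lambda name: vq_distortion(codebooks[name], Y))
```

Dividing the distortion by the number of test vectors is a common variant that makes scores comparable across utterances of different lengths.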
Hidden Markov Models (HMM)
$\mathcal{L}(X, Y) = \sum_{\text{best path}} \log P(y_j \mid S_i^X)$
[Figure: the utterance “Bonjour” from test speaker Y is aligned, along the best path, with the states of the “Bonjour” HMM of each speaker X = 1, 2, ..., n]
ROSENBERG 1990, TSENG 1992
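The best-path score of an HMM can be computed with the Viterbi recursion. The sketch below assumes NumPy and takes log-probabilities that have already been computed; unlike the summation on the slide, it also adds the log transition and initial-state probabilities, which is the usual full form. The argument names are illustrative. The model topology lives entirely in `log_trans`: a left-to-right model sets forbidden transitions to `-inf`, whereas an ergodic model allows every transition.

```python
import numpy as np

def viterbi_log_score(log_emis, log_trans, log_init):
    """Log-probability of the single best state path.

    log_emis[t, s]  : log P(observation t | state s)
    log_trans[r, s] : log P(state s at time t + 1 | state r at time t)
    log_init[s]     : log P(state s at time 0)
    """
    log_emis = np.asarray(log_emis, dtype=float)
    log_trans = np.asarray(log_trans, dtype=float)
    delta = np.asarray(log_init, dtype=float) + log_emis[0]
    for t in range(1, len(log_emis)):
        # Best predecessor for each state, then add the emission term.
        delta = (delta[:, None] + log_trans).max(axis=0) + log_emis[t]
    return float(delta.max())
```

Scoring a test utterance against each speaker's model and keeping the highest score gives identification; comparing the claimed speaker's score with a threshold gives verification.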
Ergodic HMM
$\mathcal{L}(X, Y) = \sum_{\text{best path}} \log P(y_j \mid S_i^X)$
[Figure: the utterance “Bonjour” from test speaker Y is scored, along the best path, against the ergodic HMM of each speaker X = 1, 2, ..., n]
PORITZ 1982, SAVIC 1990
Gaussian Mixture Models (GMM)
REYNOLDS 1995
HMM structure depends on the application
Some issues in Text-dependent Speaker Verification Systems :
The CAVE and PICASSO projects
• Sequences of digits
  - Speaker-independent HMM of each digit
  - Adaptation of these HMMs to the client voice (during enrolment and incremental enrolment)
  - EER of less than 1 % can be achieved
• Customized password
  - The client chooses his password using some feedback from the system
• Deliberate imposture
Gaussian Mixture Model
• Parametric representation of the probability distribution of observations:
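In its standard form, a GMM with M components writes the density of an observation x as a weighted sum of Gaussian densities. The symbols below (M, w_i, μ_i, Σ_i, and λ for the full parameter set) are conventional notation assumed here, not recovered from the slide.

```latex
p(x \mid \lambda) = \sum_{i=1}^{M} w_i \, \mathcal{N}(x;\, \mu_i, \Sigma_i),
\qquad \sum_{i=1}^{M} w_i = 1, \quad w_i \ge 0,
\qquad \lambda = \{ w_i, \mu_i, \Sigma_i \}_{i=1}^{M}
```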
Gaussian Mixture Models
8 Gaussians per mixture
GMM speaker modeling
[Diagram: front-end → GMM modeling → world GMM model]
[Diagram: front-end → GMM model adaptation → target GMM model]
Baseline GMM method
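[Diagram: test speech → front-end → hypothesized target GMM model and world GMM model → LLR score]
$\text{LLR score} = \log \left[ \frac{P(x \mid \lambda_{\text{target}})}{P(x \mid \lambda_{\text{world}})} \right]$
where $\lambda_{\text{target}}$ is the hypothesized target GMM and $\lambda_{\text{world}}$ is the world GMM.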
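The log-likelihood ratio between the hypothesized target model and the world model can be sketched as follows, assuming NumPy and diagonal-covariance GMMs given as (weights, means, variances) tuples. Averaging the log-likelihood over frames is a common normalisation assumed here; the function names are hypothetical and model training or adaptation is not shown.

```python
import numpy as np

def gmm_log_likelihood(X, weights, means, variances):
    """Average per-frame log-likelihood of X under a diagonal-covariance GMM."""
    X = np.asarray(X, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    diff = X[:, None, :] - means[None, :, :]      # (frames, components, dims)
    log_norm = -0.5 * np.log(2.0 * np.pi * variances).sum(axis=1)
    log_comp = log_norm - 0.5 * (diff ** 2 / variances).sum(axis=2)
    log_weighted = np.log(weights) + log_comp
    # Log-sum-exp over components for numerical stability.
    peak = log_weighted.max(axis=1, keepdims=True)
    frame_ll = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return float(frame_ll.mean())

def llr_score(X, target, world):
    """log P(X | target model) - log P(X | world model), averaged over frames."""
    return gmm_log_likelihood(X, *target) - gmm_log_likelihood(X, *world)

# Usage with two hypothetical one-component, one-dimensional models.
target = ([1.0], [[0.0]], [[1.0]])
world = ([1.0], [[3.0]], [[1.0]])
score = llr_score([[0.0]], target, world)
```

A positive score favours the target model; the accept or reject decision is made by comparing the score with a threshold.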
Decision theory for identity verification
• Two types of errors:
  - False rejection (a client is rejected)
  - False acceptance (an impostor is accepted)
• Decision theory: given an observation O and a claimed identity
  - H0 hypothesis: it comes from an impostor
  - H1 hypothesis: it comes from our client
• H1 is chosen if and only if P(H1|O) > P(H0|O), which can be rewritten (using Bayes' law) as
$\frac{P(O \mid H_1)}{P(O \mid H_0)} > \frac{P(H_0)}{P(H_1)}$
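The decision rule can be sketched in the log domain, where the likelihood ratio becomes a difference and the ratio of priors becomes a threshold. The function name `accept_client` and the default prior are illustrative assumptions; unequal error costs, which would shift the threshold further, are not modelled.

```python
import math

def accept_client(log_lik_h1, log_lik_h0, prior_h1=0.5):
    """Choose H1 iff P(O|H1) / P(O|H0) > P(H0) / P(H1), compared in logs."""
    if not 0.0 < prior_h1 < 1.0:
        raise ValueError("prior_h1 must lie strictly between 0 and 1")
    log_threshold = math.log((1.0 - prior_h1) / prior_h1)
    return (log_lik_h1 - log_lik_h0) > log_threshold
```

With equal priors the threshold is zero. When clients are rare (small `prior_h1`), the same observation needs a much larger likelihood ratio to be accepted.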
Signal detection theory
Decision
Distribution of scores
Detection Error Tradeoff (DET) Curve
Evaluation
• Decision cost (FA, FR, priors, costs,…)
• Receiver Operating Characteristic Curve
• Reference systems (open software)
• Evaluations (algorithms, field trials, ergonomy,…)
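The error rates and decision cost listed above can be computed from two sets of scores, as in this sketch (assuming NumPy). Sweeping the threshold and plotting false rejection against false acceptance yields the ROC or DET curve; the equal error rate (EER) is the point where the two rates coincide. The cost function is the usual weighted sum of the two error rates; the default costs and all function names are placeholders, not values from any evaluation plan.

```python
import numpy as np

def error_rates(client_scores, impostor_scores, threshold):
    """False rejection and false acceptance rates at a given threshold."""
    client_scores = np.asarray(client_scores, dtype=float)
    impostor_scores = np.asarray(impostor_scores, dtype=float)
    fr = float(np.mean(client_scores < threshold))      # clients rejected
    fa = float(np.mean(impostor_scores >= threshold))   # impostors accepted
    return fr, fa

def equal_error_rate(client_scores, impostor_scores):
    """Approximate EER: the observed operating point where |FR - FA| is smallest."""
    thresholds = np.unique(np.concatenate([client_scores, impostor_scores]))
    rates = [error_rates(client_scores, impostor_scores, t) for t in thresholds]
    fr, fa = min(rates, key=lambda r: abs(r[0] - r[1]))
    return (fr + fa) / 2.0

def detection_cost(fr, fa, p_target, cost_fr=1.0, cost_fa=1.0):
    """Expected cost: C_FR * P(FR) * P(target) + C_FA * P(FA) * (1 - P(target))."""
    return cost_fr * fr * p_target + cost_fa * fa * (1.0 - p_target)
```

With a finite number of trials the two rates rarely cross exactly, which is why the EER here is taken as the mean of the two rates at the closest point.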
NIST Speaker Verification Evaluations
• A reference standard to compare algorithms and stimulate new developments
• Distribution (via LDC) of development and test databases with:
  - Increasing difficulty (from land line to mobile)
  - Several hundred speakers (2 min of training data per client)
  - Several thousand test accesses (5 to 50 s per access)
• Participation of 15-20 labs every year (MIT, IBM, Nuance, Queensland Univ., ELISA consortium, ...)
• Annual workshop, special issues in journals, ...
National Institute of Standards & Technology (NIST) Speaker Verification Evaluations
• Annual evaluation since 1995
• Common paradigm for comparing technologies
Speaker Verification (text independent)
• The ELISA consortium: ENST, LIA, IRISA, ...
  http://www.lia.univ-avignon.fr/equipes/RAL/elisa/index_en.html
• BECARS: Balamand-ENST CEDRE Automatic Recognition of Speakers
• NIST evaluations
  http://www.nist.gov/speech/tests/spk/index.htm
NIST evaluations : Results
ENST 2003
Evaluations: NIST 2004
Combining Speech Recognition and Speaker Verification.
• Speaker independent phone HMMs
• Selection of segments or segment classes which are speaker specific
• Preliminary evaluations are performed on the NIST extended data set (one hour of training data per speaker)
ALISP: Automatic Language Independent Speech Processing
Data-driven speech segmentation
Searching in client and world speech dictionaries for speaker verification purposes
Fusion
Fusion results
Voice Transformations and Forgery (occasional, dedicated)
• Isolated individuals with few resources or “professional impostors” with a dedicated budget can threaten the security of speaker recognition systems
• Voice transformation technologies (e.g. segmental synthesis using an inventory of client speech data) are nowadays available
• Speaker recognition research should explicitly address this forgery issue and define appropriate countermeasures
Prevention by predicting many different forgery scenarios
Voice Forgery using ALISP
A modification of a source speaker's speech to imitate a target speaker
[Diagram: impostor speech and client speech (the same words or not) feed a transformation block]
Conversion system: ALISP encoder
[Block diagram. Input: speech. Blocks: MFCC analysis (MFCC + delta), HMM recognition (using HMM models), HNM (harmonic envelope, noise envelope), choice of the best representative unit (from the database of HNM representatives). Outputs: symbol index, representative index, DTW path, prosody (energy + pitch)]
Conversion system: ALISP Decoder
[Block diagram. Inputs: symbol index, representative index, DTW path, and pitch, energy, timing. Blocks: concatenation of HNM parameters for each representative, then HNM synthesis. Output: speech signal]
Preliminary results: DET curves
• FA (false acceptance) before forgery: 16 ± 2.0 % (1700 files)
• FA (false acceptance) after forgery: 26 ± 2.0 % (1700 files)
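The slide does not say how the ± 2.0 % margins were obtained. One plausible reading, assumed here and not confirmed by the source, is a 95 % normal-approximation confidence interval for an error rate estimated over 1700 independent trials. The sketch below computes that half-width; the function name is hypothetical.

```python
import math

def binomial_half_width(rate, n_trials, z=1.96):
    """Half-width of the normal-approximation confidence interval for a rate."""
    return z * math.sqrt(rate * (1.0 - rate) / n_trials)

# Half-widths for the two reported false acceptance rates over 1700 files.
before = binomial_half_width(0.16, 1700)
after = binomial_half_width(0.26, 1700)
```

Under this assumption the half-widths come out at roughly 1.7 and 2.1 percentage points, the same order as the quoted margins, and the two intervals do not overlap.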
Preliminary results
True distributions
Multimodal Identity Verification
• M2VTS (face and speech)
  - Front view and profile
  - Pseudo-3D with coherent light
• BIOMET (face, speech, fingerprint, signature, hand shape)
  - Data collection
  - Reuse of the M2VTS and DAVID databases
  - Experiments on the fusion of modalities
Speaking Faces: Motivations
• In many situations a video sequence is acquired
• Fusion of face and speech increases robustness
• Forgery is more difficult
Talking Face Recognition (hybrid verification)
Lip features
• Tracking lip movements
A talking face model
• Using Hidden Markov Models (HMMs)
Acoustic parameters
Visual parameters
Imposture Model
Cloning
Conclusions, Perspectives
• Deliberate imposture is a challenge for speech-only systems
• Verification of identity based on features extracted from talking faces should be developed
• Common databases and evaluation protocols are necessary
• Free access to reference systems will facilitate future developments
BioSecure Residential Workshop
• Aug. 1st - 26th, 2005 at ENST, Paris
• Reference systems for speech, face, talking face, fingerprint, iris, hand, signature, ...
• Comparative evaluations on large databases (BIOMET, BANCA, FVC, ...)
• Fusion of modalities
http://www.biosecure.info