Speaker recognition with embedded systems
-
7/29/2019 Speaker recognition with embedded systems
Master's Thesis
Hanoi University of Science and Technology (Trường Đại học Bách Khoa Hà Nội)
Major: Information Processing and Communications
Design of an embedded speaker recognition system on the T-Engine SH7760
Student: Nguyễn Thành Kiên
Advisor: Dr. Trịnh Văn Loan
-
Outline
1. Introduction to the topic.
2. Speaker recognition.
3. Design of the T-Engine embedded system.
4. Design of the speaker recognition software.
5. Results and future work.
-
1. Introduction to the topic
1.1. Motivation for the topic.
1.2. Objectives of the thesis.
-
1.1. Motivation for the topic
Human-computer interaction increasingly demands a high degree of intuitiveness.
Speech is the most common means of communication used by humans.
Human-machine interaction through voice is therefore an inevitable need.
In addition, dedicated embedded systems are developing rapidly and are widely used, which makes it possible to build small smart devices that can understand human speech.
-
1.2. Objectives of the thesis
Build a speaker recognition program that uses the GMM model and works with arbitrary recognition words.
Design an embedded system based on the SH7760 chip to run the recognition program.
-
2. Overview of speaker recognition
Speaker recognition comes in two forms:
Speaker identification
Speaker verification
[Figure: diagram with blocks labelled 2.1 and 2.2]
-
2.1. Feature extraction
Preprocessing
Framing
Window function
MFCC feature extraction method
-
2.1.1 Preprocessing
Pre-emphasis filter: $H(z) = 1 - a z^{-1}$, with $0.95 \le a < 0.97$
Silence removal: an energy threshold computed over the frames
    Threshold = MinValue + Ratio * (MeanValue - MinValue)    (Ratio ~ 0.3)
Voice activity detection (VAD), based on signal parameters: the short-time energy function
    if ((log10(SP) - log10(NP)) > g_dblNoiseThreshold)
        bSpeechFlag = TRUE;
-
2.1.2 Framing
The speech signal is divided into frames of equal size.
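A minimal sketch of equal-size framing in C is given here. The slide does not say whether frames overlap, so the `hop` parameter is an assumption of this sketch (setting `hop` equal to `frame_len` gives non-overlapping frames); the function names are hypothetical.

```c
#include <stddef.h>

/* Number of complete frames of frame_len samples when the frame start
   advances by hop samples each time. Trailing samples that do not fill
   a whole frame are ignored. */
size_t frame_count(size_t n_samples, size_t frame_len, size_t hop)
{
    if (frame_len == 0 || hop == 0 || n_samples < frame_len)
        return 0;
    return 1 + (n_samples - frame_len) / hop;
}

/* Pointer to the first sample of frame idx (no copy is made). */
const double *frame_start(const double *signal, size_t hop, size_t idx)
{
    return signal + idx * hop;
}
```

For example, 1000 samples with 256-sample frames give 3 frames without overlap and 6 frames with a 128-sample hop.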
-
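2.1.3 Window function
Hamming window: $w(k) = 0.54 - 0.46\cos\left(2\pi k/(K-1)\right)$
Hann (Hanning) window: $w(k) = 0.5 - 0.5\cos\left(2\pi k/(K-1)\right)$
where $K$ is the frame length and $0 \le k \le K-1$
[Figure: Hamming window]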
-
2.1.4 Feature vector extraction
Features in common use today:
LPC coefficients (LPC - Linear Predictive Coding)
PLP coefficients (Perceptual Linear Prediction)
MFCC coefficients (Mel Frequency Cepstral Coefficients)
-
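2.1.4 Feature vector extraction
[Figure: MFCC extraction pipeline]
Speech frame -> preprocessing + windowing (windowed frame) -> |FFT| (magnitude spectrum) -> spectral filtering with the Mel filter bank (Mel-filtered spectrum) -> log( . ) -> DCT -> result: MFCC vector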
View source code: http://extractfeature.c.shs/
-
2.2. Gaussian mixture model - GMM
-
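2.2. Gaussian mixture model - GMM
A Gaussian mixture model is a combination of several components, each of which is a normal (Gaussian) distribution.
A Gaussian mixture density:
$$p(\vec{x} \mid \lambda) = \sum_{i=1}^{M} p_i \, b_i(\vec{x})$$
where $\vec{x}$ is a D-dimensional vector and
$$b_i(\vec{x}) = \frac{1}{(2\pi)^{D/2} \, |\Sigma_i|^{1/2}} \exp\left\{ -\frac{1}{2} (\vec{x} - \vec{\mu}_i)' \, \Sigma_i^{-1} \, (\vec{x} - \vec{\mu}_i) \right\}$$
$\vec{\mu}_i$ is the mean vector
$\Sigma_i$ is the covariance matrix
$p_i$ is the weight of component $i$ in the mixture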
-
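2.2. Gaussian mixture model - GMM
A Gaussian mixture model is described by the following parameters:
(a) the number of Gaussian components
(b) the mean vector and covariance matrix of each component
(c) the weight of each component
The parameter set of one Gaussian mixture model is
$$\lambda = \{ p_i, \vec{\mu}_i, \Sigma_i \}, \quad i = 1, \ldots, M$$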
-
3. Design of the T-Engine embedded system
T-Engine is an open standard for real-time embedded systems that covers both the hardware and the real-time operating system:
Hardware: T-Engine board
Real-time operating system: T-Kernel
-
Block diagram of the embedded board
-
4. Design of the speaker recognition software
-
Model training
The speaker being enrolled reads the training sentence 3 to 5 times.
-
Speaker recognition with arbitrary spoken words
Recognition is performed in two modes:
Real-time recognition
Speaker verification
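The two decision rules can be sketched as follows, reading the first mode as identification (choose the best-scoring enrolled speaker) and the second as verification against a threshold. This reading, the function names and the use of one log-likelihood score per speaker model are assumptions of the sketch rather than details given on the slide.

```c
#include <stddef.h>

/* Identification: return the index of the speaker model with the
   highest score, or -1 if there are no models. */
int identify_speaker(const double *scores, size_t n_speakers)
{
    if (n_speakers == 0)
        return -1;
    size_t best = 0;
    for (size_t s = 1; s < n_speakers; s++)
        if (scores[s] > scores[best])
            best = s;
    return (int)best;
}

/* Verification: accept the claimed identity only if the score of the
   claimed speaker's model reaches that speaker's threshold. */
int verify_speaker(double score, double threshold)
{
    return score >= threshold;
}
```

Identification always names one of the enrolled speakers, while verification can reject an utterance outright, which is why it needs a threshold.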
-
Algorithms for improving recognition quality
Setting a recognition score threshold for each speaker
Generating random words for training
Recognizing with several different words over multiple attempts
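One possible way to combine several recognition attempts is a majority vote, sketched below in C. The slide does not say how the attempts are combined, so the strict-majority rule, the function name `majority_vote` and the convention that a negative value marks a rejected attempt are all assumptions.

```c
#include <stddef.h>

/* decisions[t] is the speaker index chosen in attempt t, or a negative
   value if that attempt was rejected. Returns the speaker chosen in a
   strict majority of attempts, or -1 if there is no such speaker. */
int majority_vote(const int *decisions, size_t n_trials)
{
    int candidate = -1;
    size_t count = 0;

    /* Boyer-Moore pass: find the only possible majority candidate. */
    for (size_t t = 0; t < n_trials; t++) {
        if (count == 0) {
            candidate = decisions[t];
            count = 1;
        } else if (decisions[t] == candidate) {
            count++;
        } else {
            count--;
        }
    }

    /* Confirmation pass: count how often the candidate really occurs. */
    count = 0;
    for (size_t t = 0; t < n_trials; t++)
        if (decisions[t] == candidate)
            count++;

    return (candidate >= 0 && 2 * count > n_trials) ? candidate : -1;
}
```

Requiring a strict majority means that a single mis-recognized word among several attempts does not change the final decision.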
-
5. Results
Successfully built an embedded speaker recognition system that works with arbitrary spoken words.
The recognition accuracy achieved is 97%.
-
Some program screens
Entering information about the speaker being enrolled
Recognition
Setting the threshold for each enrolled speaker
-
Experimental results
The system was tested with 30 people; recordings were sampled at 44100 Hz, 16-bit, mono.
Each person read the training sentence 2 times and was tested 10 times with 10 arbitrary words.
-
Name               Gender  Age  Locality    Training readings  Test trials  Result (correct rate)
Le Hoai Phuong     Male    23   Hà Nội      2                  10           100%
Ngo Chi Minh       Male    23   Hà Nội      2                  10           90%
Nguyen Canh Diep   Male    17   Vĩnh Phú    2                  10           100%
Nguyen Hai Ha      Male    23   Hà Nội      2                  10           100%
Nguyen Ngoc Hung   Male    19   Hải Dương   2                  10           100%
Nguyen Quang Hiep  Male    23   Hà Nội      2                  10           100%
Nguyen Thi Hau     Female  23   Bắc Giang   2                  10           90%
Nguyen Tien Manh   Male    23   Hà Nội      2                  10           100%
Nguyen Xuan Giang  Male    31   Hà Nam      2                  10           100%
Pham Thi Nhan      Female  23   Bắc Ninh    2                  10           80%
Phan Van Diep      Male    23   Nghệ An     2                  10           100%
Tran Manh Linh     Male    23   Hà Nội      2                  10           90%
Vuong Quang Hung   Male    18   Hà Nội      2                  10           100%
Bui Thi Yen        Female  20   Hà Nội      2                  10           100%
Dang Thi May       Female  20   Nam Định    2                  10           90%
Do Dinh Sy         Male    21   Nam Định    2                  10           100%
Pham Hung Duc      Male    21   Phú Thọ     2                  10           100%
Trinh Xuan Kien    Male    21   Hà Nội      2                  10           100%
Average result achieved: 97%
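As a check on the arithmetic, the average of the 18 per-speaker rates listed in the table can be recomputed with the short C sketch below. The array simply copies the Result column; the names `table_rates` and `mean_accuracy` are hypothetical.

```c
#include <stddef.h>

/* Correct-recognition rates (percent) copied from the Result column,
   in the order the 18 speakers are listed. */
static const int table_rates[18] = {
    100,  90, 100, 100, 100, 100,  90, 100, 100,
     80, 100,  90, 100, 100,  90, 100, 100, 100
};

/* Arithmetic mean of n percentage values (0.0 for an empty list). */
double mean_accuracy(const int *percent, size_t n)
{
    if (n == 0)
        return 0.0;
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += percent[i];
    return (double)sum / (double)n;
}
```

The rates sum to 1740, so the mean over the 18 listed speakers is 1740 / 18, about 96.7%, which agrees with the reported 97% after rounding.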
-
Future work
At present the board's audio recording codec module is still noisy; this hardware will be reworked to reduce noise and increase recognition accuracy.
Add a fundamental frequency (F0) parameter for the tones to the model to improve recognition accuracy.
-
Questions from the committee
Thank you very much!