Hybrid multichannel signal separation using supervised nonnegative matrix factorization with...

Hybrid Multichannel Signal Separation Using Supervised Nonnegative Matrix Factorization Daichi Kitamura, (The Graduate University for Advanced Studies, Japan) Hiroshi Saruwatari, (The University of Tokyo, Japan) Satoshi Nakamura, (Nara Institute of Science and Technology, Japan) Yu Takahashi, (Yamaha Corporation, Japan) Kazunobu Kondo, (Yamaha Corporation, Japan) Hirokazu Kameoka, (The University of Tokyo, Japan) Asia-Pacific Signal and Information Processing Association ASC 2014 Special session – Recent Advances in Audio and Acoustic Signal processing

Transcript of Hybrid multichannel signal separation using supervised nonnegative matrix factorization with...

Page 1: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Hybrid Multichannel Signal Separation Using Supervised Nonnegative Matrix Factorization

Daichi Kitamura, (The Graduate University for Advanced Studies, Japan)

Hiroshi Saruwatari, (The University of Tokyo, Japan)

Satoshi Nakamura, (Nara Institute of Science and Technology, Japan)

Yu Takahashi, (Yamaha Corporation, Japan)

Kazunobu Kondo, (Yamaha Corporation, Japan)

Hirokazu Kameoka, (The University of Tokyo, Japan)

Asia-Pacific Signal and Information Processing Association ASC 2014
Special session – Recent Advances in Audio and Acoustic Signal processing

Page 2: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Outline
• 1. Research background
• 2. Conventional methods
– Nonnegative matrix factorization
– Supervised nonnegative matrix factorization
– Multichannel NMF
• 3. Proposed method
– SNMF with spectrogram restoration and its Hybrid method
• 4. Experiments
– Closed data experiment
– Open data experiment
• 5. Conclusions

Page 4: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Research background
• Signal separation has received much attention.
– Applications: automatic music transcription, 3D audio system, etc.
• Music signal separation based on nonnegative matrix factorization (NMF) is a very active research area.
• Supervised NMF (SNMF) achieves the highest separation performance.
• To improve its performance, an SNMF-based multichannel signal separation method is required.

Separate the target signal from multichannel signals with high accuracy.

Page 6: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

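NMF [Lee, et al., 2001]
• NMF can extract significant spectral patterns.
– The basis matrix has frequently appearing spectral patterns in the observed matrix.
• The observed matrix (spectrogram) is approximated by the product of the basis matrix (spectral patterns) and the activation matrix (time-varying gain).

[Figure: the observed matrix (frequency × time, amplitude) decomposed into the basis matrix (frequency × basis) and the activation matrix (basis × time). The matrix sizes are set by the number of frequency bins, the number of time frames, and the number of bases.]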
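The factorization above can be illustrated with a minimal sketch. It assumes the squared Euclidean distance and the standard multiplicative updates; the names `nmf`, `F`, `G`, and the toy data are illustrative and are not taken from the slides.

```python
import numpy as np

def nmf(Y, n_bases, n_iter=500, eps=1e-12, seed=0):
    """Factorize a nonnegative matrix Y (freq x time) as Y ~= F @ G."""
    rng = np.random.default_rng(seed)
    n_freq, n_time = Y.shape
    F = rng.random((n_freq, n_bases)) + eps  # basis matrix (spectral patterns)
    G = rng.random((n_bases, n_time)) + eps  # activation matrix (time-varying gains)
    for _ in range(n_iter):
        # Multiplicative updates for the squared Euclidean distance.
        F *= (Y @ G.T) / (F @ G @ G.T + eps)
        G *= (F.T @ Y) / (F.T @ F @ G + eps)
    return F, G

# Toy "spectrogram" built from two known spectral patterns (assumed data).
rng = np.random.default_rng(1)
true_F = np.array([[1.0, 0.0], [0.5, 0.2], [0.0, 1.0], [0.1, 0.7]])
true_G = rng.random((2, 30))
Y = true_F @ true_G

F, G = nmf(Y, n_bases=2)
rel_err = np.linalg.norm(Y - F @ G) / np.linalg.norm(Y)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Both factors stay nonnegative because each update only multiplies by nonnegative ratios.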

Page 7: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Supervised NMF [Smaragdis, et al., 2007]
• SNMF – Supervised spectral separation method
– Training process: a supervised basis matrix (spectral dictionary) is trained from sample sounds of the target signal.
– Separation process: the supervised basis matrix is fixed, and the remaining matrices are optimized so that the mixed signal is decomposed into the target signal and the other signal.
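The two processes can be sketched as follows. This is a simplified illustration that assumes KL-divergence multiplicative updates and a mixture model of the form F G + H U, where F is the fixed supervised basis matrix; the function names, the symbols, and the toy data are assumptions, not the authors' implementation.

```python
import numpy as np

EPS = 1e-12

def train_bases(Y_train, n_bases, n_iter=200, seed=0):
    """Training process: learn the supervised basis matrix F from sample sounds."""
    rng = np.random.default_rng(seed)
    F = rng.random((Y_train.shape[0], n_bases)) + EPS
    G = rng.random((n_bases, Y_train.shape[1])) + EPS
    ones = np.ones_like(Y_train)
    for _ in range(n_iter):
        # Multiplicative updates for the KL-divergence.
        R = Y_train / (F @ G + EPS)
        F *= (R @ G.T) / (ones @ G.T + EPS)
        R = Y_train / (F @ G + EPS)
        G *= (F.T @ R) / (F.T @ ones + EPS)
    return F

def separate(Y_mix, F, n_other, n_iter=200, seed=0):
    """Separation process: F stays fixed; only G, H, and U are updated."""
    rng = np.random.default_rng(seed)
    G = rng.random((F.shape[1], Y_mix.shape[1])) + EPS
    H = rng.random((Y_mix.shape[0], n_other)) + EPS
    U = rng.random((n_other, Y_mix.shape[1])) + EPS
    ones = np.ones_like(Y_mix)
    for _ in range(n_iter):
        R = Y_mix / (F @ G + H @ U + EPS)
        G *= (F.T @ R) / (F.T @ ones + EPS)
        R = Y_mix / (F @ G + H @ U + EPS)
        H *= (R @ U.T) / (ones @ U.T + EPS)
        R = Y_mix / (F @ G + H @ U + EPS)
        U *= (H.T @ R) / (H.T @ ones + EPS)
    return F @ G, H @ U  # target estimate, other-signal estimate

# Toy data (assumed): random spectral patterns for the target and the other source.
rng = np.random.default_rng(1)
F_target = rng.random((6, 2))
F_other = rng.random((6, 2))
Y_train = F_target @ rng.random((2, 40))
Y_mix = F_target @ rng.random((2, 50)) + F_other @ rng.random((2, 50))

F = train_bases(Y_train, n_bases=2)
target_est, other_est = separate(Y_mix, F, n_other=2)
rel_err = np.linalg.norm(Y_mix - target_est - other_est) / np.linalg.norm(Y_mix)
print(f"relative model error: {rel_err:.4f}")
```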

Page 8: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Problems of SNMF
• SNMF is only for a single-channel signal.
– For a multichannel signal, SNMF cannot use information between channels.
• When many interference sources exist, the separation performance of SNMF markedly degrades.

[Figure: after separation, residual components remain in the separated signal.]

Page 9: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Multichannel NMF [Sawada, et al., 2013]
• Multichannel NMF
– is a natural extension of NMF for a multichannel signal (observed with a microphone array)
– uses spatial information for the clustering of bases to achieve the unsupervised separation task.

Problems: Multichannel NMF involves strong dependence on initial values and lacks robustness.

Page 10: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Outline
• 1. Research background
• 2. Conventional methods
– Nonnegative matrix factorization
– Supervised nonnegative matrix factorization
– Multichannel NMF
• 3. Proposed method
– Motivation and strategy
– SNMF with spectrogram restoration and its Hybrid method
• 4. Experiments
– Closed data experiment
– Open data experiment
• 5. Conclusions

Page 11: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Motivation and strategy
• Sawada's multichannel NMF
– is a unified method to solve spatial and spectral separations.
– maximizes a likelihood defined over the observed spectrograms, the spatial direction of the target signal, and the source components of all signals (target and other).
– For the supervised situation, the target spectral patterns are given.
– Far too difficult to solve (lacks robustness)
– Computationally inefficient (long computational time)

Page 12: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Motivation and strategy
• Proposed hybrid method
– divides the problem, as an approximation, into unsupervised spatial separation (classical D.O.A. estimation) and supervised spectral separation (SNMF-based method).
– The spatial separation should be carried out with classical D.O.A. estimation methods.
• These methods are very efficient and stable.
– Divide and conquer method

Page 13: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Directional clustering [Araki, et al., 2007]
• Directional clustering
– Unsupervised spatial separation method
– k-means clustering (fast and stable)
• Problems
– Artificial distortion arises owing to the binary masking.

[Figure: the input stereo signal (L, R) contains left, center, and right sources. Each time-frequency bin of the spectrogram is labeled L, C, or R by clustering. The binary mask is 1 for the bins labeled C and 0 elsewhere, and the separated signal (center) is the entry-wise product of the spectrogram and the binary mask.]
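The clustering and masking steps can be sketched as below. This is a simplified illustration that clusters only the amplitude panning angle of each time-frequency bin with a one-dimensional k-means; the feature choice, the function names, and the toy data are assumptions and may differ from the method of [Araki, et al., 2007].

```python
import numpy as np

def directional_clustering(L, R, n_clusters=3, n_iter=50):
    """Cluster the time-frequency bins of a stereo spectrogram by panning angle.

    L, R: nonnegative amplitude spectrograms (freq x time) of the two channels.
    Returns integer labels (freq x time) and the cluster centers in radians.
    """
    # Panning angle in [0, pi/2]: 0 is fully left, pi/2 is fully right.
    angle = np.arctan2(R, L).ravel()
    # Initialize the centers evenly over the observed range of angles.
    centers = np.linspace(angle.min(), angle.max(), n_clusters)
    for _ in range(n_iter):
        # Assign each bin to the nearest center, then recompute the centers.
        labels = np.argmin(np.abs(angle[:, None] - centers[None, :]), axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = angle[labels == c].mean()
    return labels.reshape(L.shape), centers

def binary_mask(labels, cluster):
    """1 where a bin belongs to the chosen cluster, 0 elsewhere."""
    return (labels == cluster).astype(float)

# Toy stereo spectrogram (assumed): each bin is dominated by one direction.
rng = np.random.default_rng(0)
truth = rng.integers(0, 3, size=(8, 20))      # 0 = left, 1 = center, 2 = right
gain_L = np.array([1.0, 0.7, 0.1])[truth]
gain_R = np.array([0.1, 0.7, 1.0])[truth]
amplitude = rng.random((8, 20)) + 0.1
L, R = gain_L * amplitude, gain_R * amplitude

labels, centers = directional_clustering(L, R)
mask_center = binary_mask(labels, cluster=1)
separated_L = mask_center * L                 # entry-wise product
separated_R = mask_center * R
```

The zeros that the mask leaves in `separated_L` and `separated_R` are the source of the artificial distortion mentioned above.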

Page 14: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Proposed method: hybrid separation
• Hybrid separation method
– Input stereo signal (L, R)
– Spatial separation method (Directional clustering)
– SNMF-based separation method (SNMF with spectrogram restoration)
– Separated signal

Page 15: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

SNMF with spectrogram restoration
• The separated cluster contains spectral holes (lost components).
• The proposed SNMF treats these holes as unseen observations.
• The fittest bases of the supervised basis (dictionary of target signal) are extrapolated to fix up the holes.

[Figure: time-frequency grid of the separated cluster with the holes marked.]

Page 16: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

SNMF with spectrogram restoration

[Figure: frequency of source component versus direction (left, center, right). (a) Input signal, with the target at the center. (b) After directional clustering (binary masking). (c) After super-resolution-based SNMF, showing the extrapolated components.]

[Figure: processing flow. Directional clustering turns the observed spectrogram (target and interference) into the separated cluster; SNMF with spectrogram restoration then extrapolates the supervised spectral bases to obtain the reconstructed data.]

Page 20: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Decomposition model and cost function
• The divergence is defined at all grids except for the holes by using the binary mask matrix.

Decomposition model: the supervised bases are fixed.

Cost function:
– Main term: divergence weighted by the binary index to exclude the holes
– Regularization term
– Penalty term [Kitamura, et al. 2014]

Notation on the slide: entries of the matrices, weighting parameters, binary complement, Frobenius norm, and the binary masking matrix obtained from directional clustering.
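The equations of this slide did not survive in the transcript. One plausible form, consistent with the surviving annotations (fixed supervised bases, a binary index that excludes the holes, a regularization term using the binary complement, and a Frobenius-norm penalty term), is sketched below. Every symbol here ($Y$, $F$, $G$, $H$, $U$, $i_{\omega,t}$, $\lambda$, $\mu$, $\beta$, $\beta_R$) and the exact shape of each term are assumptions made for illustration and may differ from the authors' notation.

```latex
\begin{align}
  Y &\approx FG + HU
    && \text{($F$: supervised bases, fixed)} \\
  \mathcal{J} &=
    \sum_{\omega,t} i_{\omega,t}\,
      D_{\beta}\!\left( y_{\omega,t} \,\middle\|\,
        \sum_{k} f_{\omega,k}\, g_{k,t} + \sum_{l} h_{\omega,l}\, u_{l,t} \right)
    && \text{(main term, holes excluded)} \notag \\
  &\quad + \lambda \sum_{\omega,t} \bar{i}_{\omega,t}\,
      D_{\beta_R}\!\left( 0 \,\middle\|\, \sum_{k} f_{\omega,k}\, g_{k,t} \right)
    && \text{(regularization term)} \notag \\
  &\quad + \mu \left\| F^{\mathsf{T}} H \right\|_{\mathrm{Fr}}^{2}
    && \text{(penalty term)}
\end{align}
```

Here $i_{\omega,t}$ would be an entry of the binary masking matrix, $\bar{i}_{\omega,t} = 1 - i_{\omega,t}$ its binary complement, and $\lambda$, $\mu$ the weighting parameters.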

Page 21: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Generalized divergence: β-divergence
• D_β: β-divergence [Eguchi, et al., 2001]
– EUC-distance (β = 2)
– KL-divergence (β = 1): the best criterion for signal separation [Kitamura, et al., 2014]
– IS-divergence (β = 0)
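As a minimal sketch, the three special cases can be computed from the commonly used closed form of the β-divergence. The function name `beta_divergence` and the scaling convention (EUC-distance as half the squared error) are assumptions; the slides do not show the formula.

```python
import numpy as np

def beta_divergence(y, x, beta):
    """Sum of element-wise beta-divergences d_beta(y | x) for positive y and x."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    if beta == 1:
        # Generalized KL-divergence (limit of the general form as beta -> 1).
        d = y * np.log(y / x) - y + x
    elif beta == 0:
        # Itakura-Saito divergence (limit of the general form as beta -> 0).
        d = y / x - np.log(y / x) - 1.0
    else:
        # General form; beta = 2 gives half the squared Euclidean distance.
        d = (y**beta + (beta - 1.0) * x**beta - beta * y * x**(beta - 1.0)) / (
            beta * (beta - 1.0)
        )
    return float(np.sum(d))

euc = beta_divergence([3.0], [1.0], beta=2)  # (3 - 1)^2 / 2
kl = beta_divergence([2.0], [1.0], beta=1)   # 2 ln 2 - 2 + 1
is_ = beta_divergence([2.0], [1.0], beta=0)  # 2 - ln 2 - 1
print(euc, kl, is_)
```

All three divergences are zero when the two arguments are equal, and the general form varies continuously with β across the KL and IS cases.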

Page 22: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Decomposition model and cost function
• We used two different β-divergences, one for the main cost and one for the regularization cost.

The decomposition model and the cost function keep the same form, with the supervised bases fixed.

Page 23: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Update rules
• We can obtain the update rules for the optimization of the three variable matrices.
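The update equations themselves are not recoverable from the transcript. The sketch below shows a deliberately simplified variant: multiplicative updates for a KL-divergence that is evaluated only where the binary mask is 1, with the supervised bases F fixed. It omits the regularization and penalty terms, and all names (`snmf_restore`, `F`, `G`, `H`, `U`, `I`) are assumptions, so it is not the authors' update rule.

```python
import numpy as np

EPS = 1e-12

def masked_kl(Y, X, I):
    """Generalized KL-divergence summed over the bins where the mask I is 1."""
    return float(np.sum(I * (Y * np.log((Y + EPS) / (X + EPS)) - Y + X)))

def snmf_restore(Y, I, F, n_other, n_iter=100, seed=0):
    """Fit Y ~= F @ G + H @ U on the unmasked bins only; F is kept fixed."""
    rng = np.random.default_rng(seed)
    G = rng.random((F.shape[1], Y.shape[1])) + EPS
    H = rng.random((Y.shape[0], n_other)) + EPS
    U = rng.random((n_other, Y.shape[1])) + EPS
    costs = []
    for _ in range(n_iter):
        # The mask enters both the numerator and the denominator,
        # so the holes (I == 0) do not influence any update.
        R = I * Y / (F @ G + H @ U + EPS)
        G *= (F.T @ R) / (F.T @ I + EPS)
        R = I * Y / (F @ G + H @ U + EPS)
        H *= (R @ U.T) / (I @ U.T + EPS)
        R = I * Y / (F @ G + H @ U + EPS)
        U *= (H.T @ R) / (H.T @ I + EPS)
        costs.append(masked_kl(Y, F @ G + H @ U, I))
    return G, H, U, costs

# Toy data (assumed): a target built from F plus an interfering source.
rng = np.random.default_rng(1)
F = rng.random((8, 3))
Y = F @ rng.random((3, 40)) + rng.random((8, 2)) @ rng.random((2, 40))
I = (rng.random(Y.shape) < 0.7).astype(float)  # 0 marks a hole

G, H, U, costs = snmf_restore(Y, I, F, n_other=2)
restored_target = F @ G  # defined on every bin, including the holes
print(f"masked cost: {costs[0]:.4f} -> {costs[-1]:.4f}")
```

Because `F @ G` is defined on every bin, the target estimate also has values in the holes, which is the extrapolation idea described earlier in the slides. Without the regularization term those values are constrained only through the shared activations.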

Page 25: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Experimental condition
• Mixed signal includes four melodies (sources).
• Three compositions of instruments
– We evaluated the average score of 36 patterns.
• Supervision signal: 24 notes that cover all the notes in the target melody

Dataset   Melody 1   Melody 2   Midrange      Bass
No. 1     Oboe       Flute      Piano         Trombone
No. 2     Trumpet    Violin     Harpsichord   Fagotto
No. 3     Horn       Clarinet   Piano         Cello

[Figure: spatial arrangement of the sources (numbered positions across left, center, and right), with the target source indicated.]

Page 26: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Experimental result: closed data
• Signal-to-distortion ratio (SDR)
– total quality of the separation, which includes the degree of separation and absence of artificial distortion.

[Figure: SDR [dB] (vertical axis 0 to 14; higher is good, lower is bad) for directional clustering, supervised multichannel NMF [Sawada], conventional SNMF (single-channel SNMF), and the proposed hybrid method. The horizontal axis runs from 0 to 4, with KL-divergence and EUC-distance marked.]
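As a rough illustration of the metric, the sketch below computes a simplified SDR as the energy ratio between a reference signal and the estimation error. The slides do not state how SDR was computed, so this definition is an assumption; evaluation toolkits usually apply a more elaborate decomposition of the error. The name `simple_sdr` is hypothetical.

```python
import numpy as np

def simple_sdr(reference, estimate):
    """Simplified SDR in dB: reference energy divided by error energy."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    error = reference - estimate
    return 10.0 * np.log10(np.sum(reference**2) / np.sum(error**2))

# A 440 Hz tone and an estimate whose error is 10 % of the reference amplitude.
t = np.arange(16000) / 16000.0
reference = np.sin(2.0 * np.pi * 440.0 * t)
estimate = 1.1 * reference
sdr_db = simple_sdr(reference, estimate)
print(f"SDR = {sdr_db:.2f} dB")
```

An error amplitude of one tenth of the reference corresponds to an energy ratio of 100, that is, 20 dB.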

Page 27: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

SNMF with spectrogram restoration
• SNMF with spectrogram restoration has two tasks: source separation and basis extrapolation.
• The optimal divergence for source separation is KL-divergence (β = 1).
• In contrast, a divergence with a higher β value is suitable for the basis extrapolation.

Page 28: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Trade-off: separation and restoration
• The optimal divergence for SNMF with spectrogram restoration and its hybrid method is based on the trade-off between separation and restoration abilities.

[Figure: two spectra, amplitude [dB] (-10 to 0) versus frequency [kHz] (0 to 5), one with strong sparseness and one with weak sparseness.]

[Figure: performance curves over the horizontal axis 0 to 4 for separation, restoration, and the total performance of the hybrid method.]

Page 29: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Experimental condition
• Open data experiment
– used different tone generators for training and test signals
• Supervision signal: 24 notes that cover all the notes in the target melody, provided by tone generator A
• Test signal: provided by tone generator B (more realistic sound) + background noise (SNR = 10 dB)

[Figure: spatial arrangement of the sources (numbered positions across left, center, and right), with the target source indicated.]

Page 30: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Experimental result: open data
• Signal-to-distortion ratio (SDR)
– total quality of the separation, which includes the degree of separation and absence of artificial distortion.

[Figure: SDR [dB] (vertical axis -4 to 10; higher is good, lower is bad) for directional clustering, supervised multichannel NMF [Sawada], conventional SNMF (single-channel SNMF), and the proposed hybrid method. The horizontal axis runs from 0 to 4, with KL-divergence and EUC-distance marked.]

Page 31: Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Conclusions
• We proposed a hybrid multichannel signal separation method combining directional clustering and SNMF with spectrogram restoration.
• There is a trade-off between separation and restoration abilities.

Thank you for your attention!

You can listen to a demonstration on my web page!