Research on RBF Network Training Methods Based on the Localized Generalization Error Bound

Presenter: Liu Xiaoyan (刘晓艳)

Outline

Project origin, background, and significance

Current research and analysis

Work completed

Open problems and future work

References

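Project origin, background, and significance

Choosing the structure of an RBF neural network, that is, determining the number of hidden-layer neurons, has long been a difficult problem. A well-chosen structure improves the generalization ability of the RBF neural network.

The localized generalization error model considers a classifier's generalization ability on local regions of the input space. Examining it quantitatively offers some insight into a network's error tolerance and generalization ability.

The sensitivity of a neural network reflects the variance of the classifier, while the size of the empirical error reflects its bias. Combining the two into a single criterion for evaluating a classifier's generalization ability may work well.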

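Current research and analysis

Overview of the current state of the localized generalization error model

Sensitivity measure (SM)

Definition and computation of sensitivity

Uses of sensitivity

Constructing neural networks

Center selection

Feature / sample / weight-accuracy selection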

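Definition and computation of sensitivity

Definition: a quantitative measure of how much the network output changes when the inputs or weights (or other parameters) are perturbed.

By object of perturbation:

Sensitivity to input perturbation

Sensitivity to weight perturbation

Sensitivity to neuron perturbation

By method of computation:

Partial derivative sensitivity analysis: requires the activation function to be differentiable with respect to the input, and the input perturbation must be very small.

Stochastic sensitivity analysis: examines probabilistic properties of the output change, such as its expectation or variance.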
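The contrast between the two computation styles can be illustrated with a minimal Python sketch. It assumes a single one-dimensional Gaussian RBF neuron and an input perturbation drawn uniformly from [-Q, Q]; the function names, the choice of neuron, and the parameter values are illustrative assumptions rather than anything taken from the cited work. The partial-derivative estimate uses a first-order expansion, E[(Δy)²] ≈ (dy/dx)² · Q²/3, while the stochastic estimate averages (Δy)² over sampled perturbations.

```python
import math
import random


def rbf(x, center=0.0, width=1.0):
    """Output of one Gaussian RBF neuron (illustrative assumption)."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def derivative_sensitivity(x, q, center=0.0, width=1.0):
    """First-order estimate of E[(dy)^2] for dx ~ Uniform(-q, q).

    Uses dy ~= f'(x) * dx and Var(dx) = q^2 / 3, so it is only
    trustworthy when q is small.
    """
    slope = -(x - center) / width ** 2 * rbf(x, center, width)
    return slope ** 2 * q ** 2 / 3.0


def stochastic_sensitivity(x, q, draws=20000, seed=0, center=0.0, width=1.0):
    """Monte Carlo estimate of E[(dy)^2] for dx ~ Uniform(-q, q)."""
    rng = random.Random(seed)
    y0 = rbf(x, center, width)
    total = 0.0
    for _ in range(draws):
        dx = rng.uniform(-q, q)
        total += (rbf(x + dx, center, width) - y0) ** 2
    return total / draws


if __name__ == "__main__":
    # Compare both estimates at x = 0.5 for a small, medium and large Q.
    for q in (0.01, 0.5, 2.0):
        print(q, derivative_sensitivity(0.5, q), stochastic_sensitivity(0.5, q))
```

For small Q the two estimates should agree closely; as Q grows, the first-order estimate drifts away from the sampled value, which is why the derivative-based approach needs small perturbations.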

Applications of sensitivity

Because sensitivity measures how strongly changes in each network parameter affect the network output, its main application is optimizing or tuning the choice of those parameters through sensitivity analysis.

Sensitivity applied to center selection (structure selection) for RBF neural networks:

1. "Sensitivity analysis applied to the construction of radial basis function networks", D. Shi, D.S. Yeung, J. Gao

2. "Localized Generalization Error and Its Application to RBFNN Training", Wing W.Y. Ng, Daniel S. Yeung, De-Feng Wang, Eric C.C. Tsang, Xi-Zhao Wang

3. "Hidden neuron pruning multilayer perceptrons using a sensitivity measure", Daniel S. Yeung, Xiao-Qin Zeng

Sensitivity applied to sample selection (active learning):

"Active Learning Using Localized Generalization Error of Candidate Sample as Criterion", Patrick P.K. Chan, Wing W.Y. Ng, Daniel S. Yeung

Sensitivity applied to feature selection: Wing

Current research and analysis (continued)

Existing theory of the localized generalization error model

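\[
\begin{aligned}
R_{SM}(Q) &= \int_{S_Q} \bigl(f(x) - F(x)\bigr)^2 \, p(x)\, dx \\
&\le \frac{1}{N} \sum_{b=1}^{N} \int_{S_Q(x_b)} \bigl(f(x) - F(x)\bigr)^2 \frac{1}{(2Q)^n}\, dx \\
&\le \Bigl(\sqrt{R_{emp}} + \sqrt{E_{S_Q}\bigl((\Delta y)^2\bigr)} + A\Bigr)^2
\end{aligned}
\]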

A: the difference between the maximum and minimum values of the target output.

In classification problems this difference is at least 1, which causes problems when the model is used for structure selection.

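Current research and analysis (continued)

Idea behind using the existing localized generalization error model for RBFNN structure selection: given two classifiers f1 and f2, suppose there exist Q1 and Q2 such that

R_SM(Q1) = a for f1, R_SM(Q2) = a for f2, and Q1 < Q2.

Then f2 has the better generalization capability.

Under the same error-bound threshold, the classifier is designed to cover as large a Q-neighborhood as possible; the larger the covered neighborhood, the better the classifier's generalization ability is taken to be.

Analysis: it is hard to fix a standard for the bound threshold a. The existing method recommends a = 0.25, which causes a problem when solving the resulting quadratic equation.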

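Current research and analysis (continued)

\[
\Bigl(\sqrt{R_{emp}} + \sqrt{E_{S_Q}\bigl((\Delta y)^2\bigr)} + A\Bigr)^2 = a
\]

In classification problems the left-hand side takes a value of at least 1 (because A is at least 1), whereas the recommended value of a is 0.25.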
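The contradiction can be made explicit with a short worked inequality. It assumes the bound has the form \( (\sqrt{R_{emp}} + \sqrt{E_{S_Q}((\Delta y)^2)} + A)^2 \), that both square-root terms are non-negative, and that \( A \ge 1 \) as in classification problems.

```latex
\[
\Bigl(\sqrt{R_{emp}} + \sqrt{E_{S_Q}\bigl((\Delta y)^2\bigr)} + A\Bigr)^2
\;\ge\; A^2 \;\ge\; 1 \;>\; 0.25 = a
\]
```

Under these assumptions no value of Q can make the left-hand side equal to a = 0.25, so the equation has no solution.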

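1. Because of this contradiction when solving the equation, the model runs into problems when used for RBFNN structure design.

2. In the expression for the bound, the constant A may be too large relative to the first two terms; if it dominates them, the bound loses its meaning.

3. Using the empirical error alone as the criterion for training an RBF classifier leads to overfitting and to classifiers with poor generalization ability.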

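Work completed

Take the sum of the empirical error term and the sensitivity term as a new criterion for evaluating a classifier's generalization ability (QNB, Q-neighborhood balance), and examine whether it is reasonable.

Apply QNB to RBFNN structure selection to design the network structure.

Simplify the analytical expression of the existing localized generalization error model using norms, obtaining a norm-based analytical form of the localized generalization error bound.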

Rationale for QNB as a criterion for evaluating classifiers

\( QNB = R_{emp} + SM \), where SM serves as a measure of classifier complexity.

Figure (1): a "simple" classifier. Low SM, but bad training error.

Figure (2): a "complex" classifier. High SM, poor generalization capability, and possibly overfitting; the VC dimension is large.

Figure (3): a "good fit" classifier, which is what we expect. Good balance between training error and SM.
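The balance described in the three figures can be sketched numerically. The snippet below is a minimal Python illustration: the (R_emp, SM) pairs are hypothetical numbers chosen only to mimic the three cases, and the assumption that the "complex" classifier has a low training error is ours, since the slide states only its high SM.

```python
def qnb(r_emp, sm):
    """QNB criterion: empirical error plus sensitivity measure."""
    return r_emp + sm


# Hypothetical (R_emp, SM) pairs for the three cases sketched in the figures.
candidates = {
    "simple": (0.30, 0.02),    # low SM, bad training error
    "complex": (0.01, 0.40),   # assumed low training error, high SM
    "good fit": (0.05, 0.06),  # both terms moderate
}

scores = {name: qnb(r_emp, sm) for name, (r_emp, sm) in candidates.items()}
best = min(scores, key=scores.get)
print(scores)
print("selected:", best)
```

With these made-up values the classifier with the smallest sum is the "good fit" one; neither a small R_emp nor a small SM alone is enough to win.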

Rationale for QNB as a criterion for evaluating classifiers (experiments)

[Figures: sensitivity measure (SM) as a measure of RBFNN complexity, plotted against the number of hidden neurons (K), for the Iris and Ionosphere datasets.]

Using QNB for RBFNN structure selection (architecture selection)

Algorithm:

Step 1: Start with the number of hidden neurons set to 1.

Step 2: Perform k-means clustering to find the locations of the centers of the hidden neurons.

Step 3: Set the width of each neuron to half of the maximum distance between its own center and the centers of the other neurons.

Step 4: Use the pseudo-inverse method to obtain the weights.

Step 5: For a selected Q value, compute the current network's error bound with the following equation:

\[
ST = R_{emp} + SM = \frac{1}{N}\sum_{i=1}^{N}\bigl(f(x_i) - F(x_i)\bigr)^2 + E_{S_Q}\bigl((\Delta y)^2\bigr)
\]

Step 6: Find the minimum error bound over the numbers of hidden neurons tried, and output the corresponding number of hidden neurons.
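The six steps above can be sketched in plain Python as follows. This is a minimal sketch under several assumptions that are not in the slides: a single-output network with 0/1 targets, Gaussian basis functions, a ridge-regularised normal-equation solve standing in for the pseudo-inverse, a Monte Carlo estimate of E[(Δy)²] with perturbations drawn uniformly from [-Q, Q] in each dimension, a width of 1.0 when there is only one neuron, and a hypothetical `max_hidden` cap on the search. All function names are illustrative.

```python
import math
import random


def kmeans(points, k, rng, iters=20):
    """Step 2: plain k-means; returns k centers."""
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            groups[nearest].append(p)
        for j, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def widths(centers):
    """Step 3: half the largest distance to another center.

    Assumption: a lone neuron gets width 1.0, and widths are floored
    at 1e-6 to avoid division by zero.
    """
    if len(centers) == 1:
        return [1.0]
    result = []
    for i, c in enumerate(centers):
        far = max(math.dist(c, o) for j, o in enumerate(centers) if j != i)
        result.append(max(0.5 * far, 1e-6))
    return result


def hidden(x, centers, ws):
    """Gaussian hidden-layer outputs for one input."""
    return [math.exp(-math.dist(x, c) ** 2 / (2 * w * w)) for c, w in zip(centers, ws)]


def solve_weights(H, targets, ridge=1e-6):
    """Step 4 stand-in: ridge-regularised normal equations, Gauss-Jordan."""
    k = len(H[0])
    A = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0) for j in range(k)]
         + [sum(h[i] * t for h, t in zip(H, targets))] for i in range(k)]
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[pivot] = A[pivot], A[i]
        for r in range(k):
            if r != i:
                factor = A[r][i] / A[i][i]
                A[r] = [a - factor * b for a, b in zip(A[r], A[i])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(x, centers, ws, weights):
    return sum(w * h for w, h in zip(weights, hidden(x, centers, ws)))


def error_bound(points, targets, centers, ws, weights, q, rng, draws=20):
    """Step 5: R_emp plus a Monte Carlo estimate of E[(dy)^2]."""
    n = len(points)
    outputs = [predict(x, centers, ws, weights) for x in points]
    r_emp = sum((o - t) ** 2 for o, t in zip(outputs, targets)) / n
    sm = 0.0
    for x, o in zip(points, outputs):
        for _ in range(draws):
            shifted = [xi + rng.uniform(-q, q) for xi in x]
            sm += (predict(shifted, centers, ws, weights) - o) ** 2
    return r_emp + sm / (n * draws)


def select_architecture(points, targets, q, max_hidden, seed=0):
    """Steps 1 and 6: try 1..max_hidden neurons, keep the smallest bound."""
    rng = random.Random(seed)
    best_k, best_score = None, math.inf
    for k in range(1, max_hidden + 1):
        centers = kmeans(points, k, rng)
        ws = widths(centers)
        H = [hidden(x, centers, ws) for x in points]
        weights = solve_weights(H, targets)
        score = error_bound(points, targets, centers, ws, weights, q, rng)
        if score < best_score:
            best_k, best_score = k, score
    return best_k, best_score


# Toy two-class data (hypothetical): two Gaussian blobs in the plane.
data_rng = random.Random(1)
points = [[data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)] for _ in range(20)]
points += [[data_rng.gauss(3, 0.5), data_rng.gauss(3, 0.5)] for _ in range(20)]
targets = [0.0] * 20 + [1.0] * 20
best_k, best_score = select_architecture(points, targets, q=0.1, max_hidden=4)
print(best_k, best_score)
```

The search simply evaluates every candidate size and keeps the one with the smallest bound. A closed-form expression for the sensitivity term, where available, could replace the sampling loop in `error_bound` without changing the rest of the sketch.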

Preliminary experimental results

Iris (Q = 0.1); data: 4 features × 150 samples, 3 classes

Run             1       2       3       4       5       6       7       8       9       10      Average
Hidden neurons  9       13      8       8       7       10      9       7       7       9       8.7
Train accuracy  0.9619  0.9810  0.9238  0.9714  0.9524  0.9619  0.9714  0.9619  0.9810  0.9714  0.9638
Test accuracy   0.9333  0.9556  0.8667  1       0.9333  0.9778  0.9333  0.9111  0.9556  0.9333  0.9400
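As a quick arithmetic check, the reported Iris averages can be recomputed from the ten listed values. The snippet is a minimal sketch; the variable names are illustrative and the numbers are copied from the Iris results (Q = 0.1).

```python
from statistics import mean

# Values copied from the Iris results (Q = 0.1).
hidden = [9, 13, 8, 8, 7, 10, 9, 7, 7, 9]
train = [0.9619, 0.9810, 0.9238, 0.9714, 0.9524,
         0.9619, 0.9714, 0.9619, 0.9810, 0.9714]
test = [0.9333, 0.9556, 0.8667, 1, 0.9333,
        0.9778, 0.9333, 0.9111, 0.9556, 0.9333]

# Round to four decimals, the precision used in the reported averages.
averages = {name: round(mean(values), 4)
            for name, values in (("hidden", hidden), ("train", train), ("test", test))}
print(averages)
```

The recomputed values agree with the reported averages of 8.7, 0.9638 and 0.9400.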

Pima (Q = 0.1); data: 8 features × 768 samples, 2 classes

Run             1       2       3       4       5       6       7       8       9       10      Average
Hidden neurons  23      22      15      18      17      22      22      26      18      21      20.4
Train accuracy  0.7989  0.7877  0.7914  0.7803  0.7877  0.7952  0.8082  0.8007  0.8007  0.7952  0.7946
Test accuracy   0.7749  0.7706  0.7662  0.7489  0.7662  0.7792  0.7359  0.7489  0.7749  0.7619  0.7628

Preliminary experimental results (continued)

Wine (Q = 1.5); data: 13 features × 178 samples, 3 classes

Run             1       2       3       4       5       6       7       8       9       10      Average
Hidden neurons  7       8       8       7       9       7       9       7       8       8       7.8
Train accuracy  0.9597  0.9919  0.9758  0.9839  0.9839  0.9597  0.9758  0.9597  0.9758  0.9758  0.9742
Test accuracy   0.9259  0.9444  1       0.9074  0.9815  0.9815  0.9444  0.9630  0.9815  0.9630  0.9593

Ionosphere (Q = 1.0); data: 34 features × 351 samples, 2 classes

Run             1       2       3       4       5       6       7       8       9       10      Average
Hidden neurons  17      18      16      15      19      18      14      18      16      16      16.7
Train accuracy  0.9184  0.9388  0.9347  0.9143  0.9469  0.9347  0.9429  0.9510  0.9347  0.9224  0.9339
Test accuracy   0.9245  0.9245  0.9528  0.9340  0.9151  0.9245  0.9245  0.9057  0.9434  0.9340  0.9283

Preliminary experimental results (continued)

Datazhao (Q = 0.5); data: 25 features × 477 samples, 5 classes

Run             1       2       3       4       5       6       7       8       9       10      Average
Hidden neurons  17      18      21      16      18      17      14      17      15      16      16.9
Train accuracy  0.9700  0.9730  0.9790  0.9700  0.9730  0.9700  0.9790  0.9820  0.9790  0.9760  0.9751
Test accuracy   0.9722  0.9722  0.9583  0.9792  0.9722  0.9722  0.9514  0.9444  0.9514  0.9722  0.9646

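Open problems and future work

The experimental results show that the algorithm is feasible: while classification accuracy is maintained, the number of hidden neurons finally selected is fairly small and the network structure is compact.

The relationship between SM (sensitivity measure) and the number of hidden units of the RBFNN can be described as a climb with small oscillations (not strictly monotonic). SM is therefore only a rough estimate when used as a measure of network complexity.

The theoretical basis for QNB as a measure of classifier generalization capability remains to be established.

The two terms in QNB are currently combined linearly. Could the information in these two quantities be fused in some other way into a new criterion, and how would it perform for RBFNN architecture selection?

Can QNB be used to select the center locations of an RBFNN (supervised learning)?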

References

[1] D. Shi, D.S. Yeung, J. Gao, "Sensitivity analysis applied to the construction of radial basis function networks", Neural Networks 18 (2005) 951–957.

[2] Wing W.Y. Ng, Daniel S. Yeung, Ian Cloete, "Quantitative Study on Effect of Center Selection to RBFNN Classification Performance", 2004 IEEE International Conference on Systems, Man and Cybernetics.

[3] Wing W.Y. Ng, Daniel S. Yeung, Xi-Zhao Wang, "Localized Generalization Error and Its Application to RBFNN Training", Proceedings of the Fourth International Conference on Machine Learning and Cybernetics, Guangzhou, 18-21 August 2005.

[4] Friedhelm Schwenker, Hans A. Kestler, Gunther Palm, "Three learning phases for radial-basis-function networks", Neural Networks 14 (2001) 439–458.

[5] Wing W.Y. Ng, Daniel S. Yeung, Xi-Zhao Wang, I. Cloete, "A Study of the Difference Between Partial Derivative and Stochastic Neural Network Sensitivity Analysis for Applications in Supervised Pattern Classification Problems", Proc. of International Conference on Machine Learning and Cybernetics, pp. 4283–4288, 2004.

[6] Wing W.Y. Ng, Daniel S. Yeung, "Selection of Weight Quantization Accuracy for Radial Basis Function Neural Network Using Stochastic Sensitivity Measure", IEE Electronics Letters, vol. 39, pp. 787–789.
