Distance-based neural networks - University of Aizu (web-ext.u-aizu.ac.jp/~qf-zhao/TEACHING/AI/lec12.pdf)


Lecture 12 of Artificial Intelligence

Distance-based neural networks

Topics of today

• Pattern recognition again
• Feature extraction/selection
• Self-organizing neural network
• Winner-take-all learning strategy
• Learning vector quantization
• R4-rule

Produced by Qiangfu Zhao (Since 2008), All rights reserved © AI Lec12/2

Pattern recognition again

• Pattern classification is the process of partitioning a domain into various meaningful concepts; pattern recognition is the process of determining the label of an observed pattern.

• Example:
– Domain: Chinese characters (kanji)
– Concepts: nouns, verbs, adjectives, …
– Given observation: 城 → noun; 走 → verb

• A concept is also called a class, a category, a group, a cluster, etc.


Examples of pattern recognition

• Characters (kanji, kana, digits, etc.)
• Biometrics (fingerprints, vein, iris, etc.)
• Signals (speech, sound, speaker, etc.)
• Images (face, expression, human, etc.)
• Videos (gait, activity, gesture, etc.)
• Objects (human, car, obstacle, etc.)
• Anomalies (product, text, sequence, etc.)
• Time series (risk, chance, etc.)


Vector representation of patterns

• To classify or recognize objects in a computer, it is necessary to represent them numerically.

• We usually transform an object into an n-dimensional vector, which is a point in the n-D Euclidean space, as follows:

• Each element of the vector is called a feature, and the vector itself is called a feature vector. The set of all feature vectors is called the feature space.


x = (x_1, x_2, …, x_n)^t

Feature extraction and selection

(Figure: a handwritten character and its extracted feature vector [3 2 3 4 3 1 2 6 2 4 4 1].)


• The right figure shows an example of handwritten character recognition.

• Instead of using the whole picture, we can extract some features for recognition.

• The method given here is just an illustration.

• Usually, the number of extracted features can be large, and we should select some of the most useful ones.

• Feature extraction / selection is the first step for pattern recognition.


Distance-based learning

• The right figure shows a neural network implementation of the nearest neighbor classifier (NNC).

• There are p neurons, and each of them is a prototype.

• This network is called a Kohonen neural network (KNN), and it is useful both for self-organization (WTA) and for supervised learning (LVQ).


A neuron in a KNN

y = Σ_{i=1}^{n} (x_i − w_i)^2

(Figure: a neuron with inputs x_1, …, x_n, weights w_1, …, w_n, and output y.)

The output of the neuron measures the similarity between the input pattern x and the memorized pattern w: the smaller the output, the more similar they are.
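As a small sketch of this computation (the function name is ours, not from the lecture), the output of one KNN neuron can be written as:

```python
import numpy as np

def neuron_output(x, w):
    """Output of one KNN neuron: the squared Euclidean distance between
    the input pattern x and the memorized weight vector w.
    The smaller the output, the more similar x is to w."""
    x, w = np.asarray(x, dtype=float), np.asarray(w, dtype=float)
    return float(((x - w) ** 2).sum())
```

For example, an input identical to the stored weight vector gives output 0, the maximum possible similarity.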

KNN is smarter

• Using the KNN, it is not necessary to remember all training patterns.

• Instead, a small number of representatives can be obtained through learning.

• The representatives are not the training data, but abstracted patterns (e.g. smile face, angry face, etc.).

• There are several ways for obtaining the neurons.
• Here, we first introduce a method called winner-take-all (WTA) that is useful for self-organization.


Winner-take-all Learning勝者独占学習

)( 1 km

km

km wxww −+=+ α

Step 1: Initialize the weights of all neurons at random.Step 2: For any x in the training set, find the most similar

neuron, let it be wm. This neuron is called the winner.Step 3: Update the weights of the winner as follows:

• Where α is a positive number in (0,1). Usually, α is called the learning rate.
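The three steps above can be sketched in Python as follows (the function name, the sample-based initialization, and the shuffled presentation order are our assumptions, not specified in the lecture):

```python
import numpy as np

def wta_learn(X, p, alpha=0.1, epochs=50, seed=0):
    """Winner-take-all learning (sketch): p neurons self-organize so that
    each weight vector drifts toward the center of a group of similar patterns."""
    rng = np.random.default_rng(seed)
    # Step 1: initialize the weights at random (here: p random training samples)
    W = X[rng.choice(len(X), size=p, replace=False)].astype(float)
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:
            # Step 2: the winner is the neuron closest to x
            m = np.argmin(((W - x) ** 2).sum(axis=1))
            # Step 3: move the winner toward the input pattern
            W[m] += alpha * (x - W[m])
    return W
```

After enough passes over the data, each weight vector settles near the center of one group of patterns, which is exactly the self-organizing behavior described above.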


Physical meaning of the WTA

• Using WTA, we move the winner towards the input pattern, so that it can win again if a similar pattern is provided.

• This way, a neuron can become an expert for recognizing a group of “similar patterns”, and the weight of the neuron can be used as an “icon” of this group of patterns.


Example

• The right figure shows patterns generated according to 4 different distributions.

• At the beginning, the neurons cannot represent the patterns well.

• After WTA-based learning, however, the neurons become the centers of the patterns.


Distance-based supervised learning

• In WTA-based learning, teacher signals are not necessary; thus, it is unsupervised learning.

• We can also conduct supervised learning based on distance.

• One of the distance-based supervised learning methods is learning vector quantization (LVQ). Supervised learning can be more accurate (see the figure given below).


Supervised learning can be more efficient

• Using self-organized learning, we can get cluster centers.

• But if we know the class labels of the cluster centers, we can reduce the number of representatives for each class.


Learning vector quantization


Step 1: Initialize the weights of all neurons at random.
Step 2: Take a pattern x from the training set, and find the winner r.
Step 3: If r has the same label as x, update it using Eq. (6.18); otherwise, update it using Eq. (6.19):

r = r + α (x − r)    (6.18)
r = r − α (x − r)    (6.19)

Step 4: Terminate if the given condition is satisfied; otherwise, return to Step 2.
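The LVQ steps can be sketched as follows (function names, the fixed epoch count, and the shuffled presentation order are our assumptions; the update rules follow Eqs. (6.18) and (6.19)):

```python
import numpy as np

def lvq_train(X, labels, W, W_labels, alpha=0.05, epochs=100, seed=0):
    """LVQ training sketch: attract the winner if its label matches the
    pattern's label, repel it otherwise."""
    rng = np.random.default_rng(seed)
    W = W.astype(float).copy()
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            x, cx = X[i], labels[i]
            m = np.argmin(((W - x) ** 2).sum(axis=1))  # winner r
            if W_labels[m] == cx:
                W[m] += alpha * (x - W[m])   # Eq. (6.18): move toward x
            else:
                W[m] -= alpha * (x - W[m])   # Eq. (6.19): move away from x
    return W

def lvq_classify(x, W, W_labels):
    """Nearest-prototype classification: return the winner's label."""
    return W_labels[np.argmin(((W - x) ** 2).sum(axis=1))]
```

The attract/repel pair is what makes LVQ supervised: prototypes are pushed away from regions of the wrong class, so the decision boundary follows the labels rather than just the cluster centers.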

Example 6.6 p. 131


Training data

Class 0: (-0.96, -0.27), (-0.86, 0.51), (0.86, -0.50), (0.65, -0.76)
Class 1: (-0.40, 0.92), (0.80, 0.60), (0.06, 1.00), (0.93, 0.36)

Training results

Initial neurons (x, y, label):
 0.882983  0.469404 0
-0.619131 -0.785288 0
 0.901795 -0.432164 1
-0.151238 -0.988497 1

Neurons after learning (x, y, label):
 0.842607 -0.538529 0
-0.983217 -0.182438 0
 0.665301  0.746575 1
-0.151238 -0.988497 1

Example 6.6


R4-rule: A method for determining the network size


Zhao and Higuchi, IEEE Trans. on Neural Networks, 1996

• 認識 Recognition
• 復習 Review
• 記憶 Remembrance
• 忘却 Reduction

1st R: Recognition

• Recognition is an operation for:
– Evaluating the current system; and
– Evaluating the fitness of each neuron.

• The first result is used to select the next operation, and the second is used to delete useless neurons.


Method for evaluating the neurons

1. Determine the winner: For any given pattern x, a neuron y is the winner if:
– y has the same class label as x;
– y is closer to x than any neuron with a different class label; and
– y has the highest fitness among all neurons satisfying the above two conditions.

2. Increase the fitness of the winner y.
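A minimal sketch of this evaluation step (the function name and the in-place fitness array are our assumptions; returning None when no neuron qualifies models an un-recognizable pattern):

```python
import numpy as np

def r4_winner(x, label, W, W_labels, fitness):
    """Find the R4-rule winner for pattern x.

    The winner must (1) share x's class label, (2) be closer to x than any
    neuron of a different class, and (3) have the highest fitness among the
    neurons satisfying (1) and (2). Its fitness is then increased.
    Returns the winner's index, or None if x is not recognizable."""
    d = ((W - x) ** 2).sum(axis=1)
    other = d[W_labels != label]
    d_other = other.min() if len(other) else np.inf
    candidates = [i for i in range(len(W))
                  if W_labels[i] == label and d[i] < d_other]
    if not candidates:
        return None
    winner = max(candidates, key=lambda i: fitness[i])
    fitness[winner] += 1  # step 2: increase the fitness of the winner
    return winner
```

Patterns for which this returns None are the un-recognizable ones handled by the remembrance operation below, while neurons that rarely win accumulate low fitness and become candidates for reduction.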


2nd R: Remembrance

• Remembrance is an operation for remembering un-recognizable patterns (e.g. new patterns or difficult patterns).

• Usually, only one of the un-recognizable patterns is remembered at a time.
– Method for selection: random.
– Method for remembrance: as is.


3rd R: Reduction

• Reduction is an operation for deleting useless neurons.
• If the current system is good enough, we can select one of the neurons with low fitness, and delete it.


4th R: Review

• Review is an operation for updating the current system.

• We can use any LVQ algorithm to implement “review”.

• We recommend the DSM (decision surface mapping) algorithm because it is more efficient.


Examples


(Figures: the generalized XOR problem and the straight-line class boundaries (SLCB) problem.)

Results

          Generalized XOR          SLCB problem
Method    Neurons  Error (%)      Neurons  Error (%)
NNC         6400     0.91           6400     0.97
CNN          231     1.11            307     1.17
RNN          162     1.27            216     1.14
RCE          183     0.69            243     0.56
R4-rule        4     0.13             10     0.48


Evaluation of algorithms

• To evaluate the performance of a learning algorithm, we usually divide the data into two parts: training data and testing data.

• Training data are used to train a learning model, and testing data are used to test it.

• The performance of the model on the training data can be used to check whether learning is actually conducted or not.

• The performance of the model on the testing data can be used to check the “generalization ability” of the model. This is usually considered more important in machine learning.


Evaluation of algorithms

• If we just test the model using one testing data set, the result might be biased.

• To avoid this problem, we often use n-fold cross validation.

• In n-fold cross validation, we divide the data into n (equal) parts, and use each of them in turn for testing, with the remaining parts for training.

• We can thus train n models, and the average performance is often used to draw a conclusion.

• The standard deviation is often used to measure the confidence of the conclusion.
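The procedure above can be sketched as follows (the function name and the `train_and_test` callback are our assumptions: any scoring routine that trains on one split and returns a test score, e.g. accuracy, would fit):

```python
import numpy as np

def n_fold_cv(X, y, train_and_test, n=5, seed=0):
    """n-fold cross validation: each fold is used once for testing and the
    remaining folds for training. train_and_test(Xtr, ytr, Xte, yte) should
    return a score. Returns the mean and standard deviation of the n scores."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))          # shuffle before splitting
    folds = np.array_split(idx, n)         # n (roughly equal) parts
    scores = []
    for k in range(n):
        te = folds[k]
        tr = np.concatenate([folds[j] for j in range(n) if j != k])
        scores.append(train_and_test(X[tr], y[tr], X[te], y[te]))
    return float(np.mean(scores)), float(np.std(scores))
```

The returned mean is the average performance used to draw a conclusion, and the standard deviation measures the confidence of that conclusion, as described above.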


Homework of today

1. Solve Problem 6.5 on p. 132 of the textbook, and submit the results during the exercise class.

2. Based on the skeleton program:
• Complete a program for WTA-based self-organized learning.
• There is a data file generated from 5 different distributions. Find the representative patterns using this data file, and draw a figure.
• You may use “make plot” to draw the figure.
• Write your comments based on the figure in “summary.txt”.


Quizzes of today

• KNN is short for ____________.

• WTA is short for ____________.

• LVQ is short for ____________.

• R4-rule has the following 4 operations:– R ,

– R ,

– R , and

– R .

• Try to explain the physical meaning of the WTA algorithm.

• Why is the KNN smarter than the NNC that uses all training data?
