Getting Started with Machine Learning

Terry Taewoong Um ([email protected])
University of Waterloo, Department of Electrical & Computer Engineering
Introduction to Machine Learning and Deep Learning
T-robotics.blogspot.com
Facebook.com/TRobotics


Caution
 - I cannot explain everything
 - You cannot get every detail

 - Try to get the big picture
 - Get some useful keywords
 - Connect the material with your research

Contents
 - What is Machine Learning?
 - What is Deep Learning?


What is Machine Learning?

"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E." T. Mitchell (1997)

Example: a program for soccer tactics
 - T : Win the game
 - P : Goals
 - E : (x) Players' movements, (y) Evaluation

What is Machine Learning?

Video: Toward learning robot table tennis, J. Peters et al. (2012) https://youtu.be/SH3bADiB7uQ

Tasks
 - Classification : discrete target values
   x : pixels (28*28), y : 0, 1, 2, 3, ..., 9
 - Regression : real target values
 - Clustering : no target values

Performance
 - Classification : 0-1 loss function
 - Regression : L2 loss function
 - Clustering
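The two named performance measures can be sketched in a few lines. This is a minimal illustration, assuming the losses are averaged over the samples; the function names and the sample values are hypothetical and do not come from the slides.

```python
def zero_one_loss(y_true, y_pred):
    """Mean 0-1 loss: the fraction of misclassified samples."""
    errors = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return errors / len(y_true)


def l2_loss(y_true, y_pred):
    """Mean L2 loss: the average squared difference."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical digit labels and predictions (classification)
digit_labels = [3, 7, 1, 9]
digit_preds = [3, 7, 4, 9]
print(zero_one_loss(digit_labels, digit_preds))

# Hypothetical real-valued targets and predictions (regression)
targets = [80.0, 65.0, 72.0]
preds = [78.0, 66.0, 72.0]
print(l2_loss(targets, preds))
```

The 0-1 loss only counts whether a prediction is right or wrong, while the L2 loss also measures how far off a real-valued prediction is.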

Experience
 - Classification : labeled data, (pixels) -> (number)
 - Regression : labeled data, (x) -> (y)
 - Clustering : unlabeled data, (x1, x2)

A Toy Example

[Figure: Height (cm) [Input X] vs. Weight (kg) [Output Y], with an unknown value marked "?"]

A Toy Example

[Figure: Height (cm) vs. Weight (kg) with the line Y = aX+b; the values 180 and 80 are marked]

Model : Y = aX+b
Parameter : (a, b)

[Goal] Find (a, b) which best fits the given data

A Toy Example

[Analytic Solution] Least squares problem
(from AX = b, X = A#b, where A# is the pseudo-inverse of A)
Not always available

[Numerical Solution]
1. Set a cost function L(a, b)
2. Apply an optimization method (e.g. the Gradient Descent (GD) method)
http://www.yaldex.com/game-development/1592730043_ch18lev1sec4.html

Local minima problem
http://mnemstudio.org/neural-networks-multilayer-perceptron-design.htm
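Both routes can be sketched for the model Y = aX+b. This is a minimal illustration under stated assumptions: the (height, weight) samples are hypothetical, the cost L(a, b) is taken to be the mean squared error, and the inputs are standardized before gradient descent so that a fixed step size stays stable. For this one-input model the closed form below gives the same answer as the pseudo-inverse solution.

```python
# Hypothetical (height cm, weight kg) samples, not taken from the slides
heights = [150.0, 160.0, 170.0, 180.0, 190.0]
weights = [52.0, 61.0, 69.0, 80.0, 88.0]


def fit_analytic(xs, ys):
    """Closed-form least-squares solution for Y = aX + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def cost(a, b, xs, ys):
    """Cost function L(a, b): mean squared error (an assumed choice)."""
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_gradient_descent(xs, ys, lr=0.1, steps=2000):
    """Minimize L(a, b) by gradient descent on standardized inputs."""
    n = len(xs)
    mean_x = sum(xs) / n
    std_x = (sum((x - mean_x) ** 2 for x in xs) / n) ** 0.5
    zs = [(x - mean_x) / std_x for x in xs]  # rescaling keeps GD stable
    a, b = 0.0, 0.0
    for _ in range(steps):
        residuals = [a * z + b - y for z, y in zip(zs, ys)]
        grad_a = 2 * sum(r * z for r, z in zip(residuals, zs)) / n
        grad_b = 2 * sum(residuals) / n
        a -= lr * grad_a
        b -= lr * grad_b
    # Convert the parameters back to the original (unscaled) height axis
    return a / std_x, b - a * mean_x / std_x


a1, b1 = fit_analytic(heights, weights)
a2, b2 = fit_gradient_descent(heights, weights)
print(a1, b1, cost(a1, b1, heights, weights))
print(a2, b2, cost(a2, b2, heights, weights))
```

Because this cost is convex, both methods agree here; the local minima problem only appears for non-convex cost functions.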

[Figure: Age (year) vs. Running Record (min); the values 32 and 140 are marked]

What would be the correct model?
Select a model -> Set a cost function -> Optimization

What would be the correct model?

[Figure: data in the X-Y plane, with the model marked "?"]

Overfitting
1. Regularization
2. Nonparametric model

L2 Regularization
(e.g. w = (a, b) where Y = aX+b)

Avoid a complicated model!

Another interpretation : Maximum a Posteriori (MAP)
http://goo.gl/6GE2ix
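A minimal sketch of L2 regularization for w = (a, b), assuming the common form in which the penalty lam * (a^2 + b^2) is added to the sum of squared errors. The function name, the weight lam, and the data are hypothetical. With a Gaussian likelihood and a zero-mean Gaussian prior on w, minimizing this cost is the MAP estimate, which is the interpretation named above.

```python
def fit_ridge(xs, ys, lam):
    """Minimize sum((a*x + b - y)^2) + lam * (a^2 + b^2) for w = (a, b)."""
    n = len(xs)
    sxx = sum(x * x for x in xs)
    sx = sum(xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sy = sum(ys)
    # Normal equations: (X^T X + lam * I) w = X^T y, with rows of X = [x, 1]
    m11, m12, m22 = sxx + lam, sx, n + lam
    det = m11 * m22 - m12 * m12
    # Solve the 2x2 system with Cramer's rule
    a = (sxy * m22 - m12 * sy) / det
    b = (m11 * sy - m12 * sxy) / det
    return a, b


# Hypothetical samples, roughly following Y = 2X + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]

for lam in (0.0, 1.0, 10.0):
    a, b = fit_ridge(xs, ys, lam)
    print(lam, a, b, a * a + b * b)
```

A larger lam shrinks the squared norm of w, which is the sense in which the penalty discourages a complicated model.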

What would be the correct model?
1. Regularization
2. Nonparametric model

[Figure: error vs. training time, showing the training error and test error curves, with the point marked "we should stop here"]

 - Training set : for training (parameter optimization)
 - Validation set : for early stopping (avoid overfitting)
 - Test set : for evaluation (measure the performance)

Keep watching the validation error
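The three-way split and early stopping can be sketched as follows. This is an illustration under assumptions: the data are synthetic, the model is the linear Y = aX+b, and the `patience` rule (stop after a number of steps without improvement) is one common way to decide that the validation error has stopped improving; none of these specifics come from the slides.

```python
import random


def mse(a, b, data):
    """Mean squared error of Y = aX + b on a list of (x, y) pairs."""
    return sum((a * x + b - y) ** 2 for x, y in data) / len(data)


def train_with_early_stopping(train, val, lr=0.05, max_steps=500, patience=10):
    """Gradient descent on the training set. Training stops when the
    validation error has not improved for `patience` consecutive steps,
    and the parameters with the lowest validation error are returned."""
    a, b = 0.0, 0.0
    best = (mse(a, b, val), a, b, 0)  # (validation error, a, b, step)
    n = len(train)
    for step in range(1, max_steps + 1):
        grad_a = 2 * sum((a * x + b - y) * x for x, y in train) / n
        grad_b = 2 * sum((a * x + b - y) for x, y in train) / n
        a -= lr * grad_a
        b -= lr * grad_b
        val_error = mse(a, b, val)
        if val_error < best[0]:
            best = (val_error, a, b, step)
        elif step - best[3] >= patience:
            break  # the validation error stopped improving
    return best


rng = random.Random(0)  # fixed seed so the sketch is reproducible
points = [(x / 10, 2.0 * (x / 10) + 1.0 + rng.gauss(0, 0.3)) for x in range(30)]
rng.shuffle(points)
train, val, test = points[:18], points[18:24], points[24:]

val_error, a, b, best_step = train_with_early_stopping(train, val)
test_error = mse(a, b, test)  # the test set is used only once, at the end
print(best_step, val_error, test_error)
```

The test set plays no role in choosing the parameters or the stopping point, so its error remains an honest estimate of the final performance.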

Nonparametric Model
 - It does not assume any parametric model (e.g. Y = aX+b, Y = aX^2+bX+c, etc.)
 - It often requires many more samples
 - Kernel methods are frequently applied for modeling the data
 - Gaussian Process Regression (GPR), a kind of kernel method, is a widely used nonparametric regression method
 - Support Vector Machine (SVM), also a kind of kernel method, is a widely used nonparametric classification method

[Figure: a kernel function relating the [Input space] to the [Feature space]]
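The link between a kernel function and a feature space can be checked numerically. This is a small sketch, assuming a degree-2 polynomial kernel on 2-D inputs as the example; the slides do not name a specific kernel, so the kernels and the sample points are illustrative choices. The kernel value computed in the input space equals the inner product of the explicitly mapped points in the feature space.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def feature_map(x):
    """Explicit map from the 2-D input space to a 3-D feature space."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def poly_kernel(x, z):
    """Degree-2 polynomial kernel, computed directly in the input space."""
    return dot(x, z) ** 2


def rbf_kernel(x, z, length_scale=1.0):
    """Gaussian (RBF) kernel; its feature space is infinite-dimensional."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-sq_dist / (2 * length_scale ** 2))


x, z = (1.0, 2.0), (3.0, -1.0)
print(poly_kernel(x, z), dot(feature_map(x), feature_map(z)))
print(rbf_kernel(x, z))
```

Kernel methods rely on this equality: they only ever need kernel values, so the feature map never has to be computed explicitly.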

Support Vector Machine (SVM)

Myo, Thalmic Labs (2013) https://youtu.be/oWu9TFJjHaM

[Figure: [Linear classifiers] and [Maximum margin]]
Support Vector Machine Tutorial, J. Weston, http://goo.gl/19ywcj

[Dual formulation] (expressed with a kernel function)
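A linear maximum-margin classifier can be sketched without the dual formulation. Note the swapped-in technique: the code below runs plain subgradient descent on the regularized hinge loss (the primal soft-margin objective) rather than solving the dual problem shown on the slide. The data, the regularization weight `lam`, and the step size are hypothetical choices.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.01, steps=5000):
    """Minimize lam/2 * |w|^2 + mean(hinge loss) by subgradient descent."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(points)
    for _ in range(steps):
        grad_w = [lam * w[0], lam * w[1]]
        grad_b = 0.0
        for (x1, x2), y in zip(points, labels):
            margin = y * (w[0] * x1 + w[1] * x2 + b)
            if margin < 1:  # the point violates the margin: hinge is active
                grad_w[0] -= y * x1 / n
                grad_w[1] -= y * x2 / n
                grad_b -= y / n
        w[0] -= lr * grad_w[0]
        w[1] -= lr * grad_w[1]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Classify by the sign of the linear decision function."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else -1


# Hypothetical linearly separable 2-D data with labels +1 / -1
points = [(2.0, 2.0), (3.0, 3.0), (2.0, 3.0), (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(points, labels)
print(w, b, [predict(w, b, p) for p in points])
```

Replacing the inner products with kernel evaluations, as the dual formulation does, is what extends this linear classifier to nonlinear decision boundaries.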

Gaussian Process Regression (GPR)
https://youtu.be/YqhLnCm0KXY
https://youtu.be/kvPmArtVoFE

[Equations: Gaussian distribution; multivariate regression with likelihood, prior, and posterior terms]

Prediction : conditioning the joint distribution of the observed & predicted values
https://goo.gl/EO54WN
http://goo.gl/XvOOmf
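Prediction by conditioning can be sketched for one-dimensional inputs. The code assumes a zero-mean GP prior, an RBF kernel with unit variance, and Gaussian observation noise; these choices, the helper names, and the sine-shaped data are illustrative and not taken from the slides. Conditioning the joint Gaussian of observed and predicted values gives the mean k*^T (K + noise*I)^-1 y and the variance k(x*, x*) - k*^T (K + noise*I)^-1 k*.

```python
import math


def rbf(x, z, length_scale=1.0):
    """RBF kernel for scalar inputs."""
    return math.exp(-((x - z) ** 2) / (2 * length_scale ** 2))


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def gp_predict(xs, ys, x_new, noise=1e-2):
    """Posterior mean and variance at x_new; `noise` is the noise variance."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    k_star = [rbf(x, x_new) for x in xs]
    alpha = solve(K, ys)      # (K + noise*I)^-1 y
    v = solve(K, k_star)      # (K + noise*I)^-1 k*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_new, x_new) - sum(k * vi for k, vi in zip(k_star, v))
    return mean, var


# Hypothetical observations of a sine curve
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [math.sin(x) for x in xs]

print(gp_predict(xs, ys, 1.0))   # at an observed input
print(gp_predict(xs, ys, 10.0))  # far away from the data
```

Near the observations the predictive variance is small, and far from them it returns to the prior variance, which is how GPR reports its own uncertainty.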

Dimension reduction

[Figure: mappings between the [Original space] and the [Feature space]: low dim. -> high dim., and high dim. -> low dim.]

Principal Component Analysis
: Find the best orthogonal axes (= principal components) which maximize the variance of the data

Y = P X
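For two-dimensional data, the principal components can be found in closed form from the covariance matrix. The sketch below is an illustration under assumptions: the data are hypothetical, the data are centered before projecting (a step the formula Y = P X leaves implicit), and each `project` call applies one row of P.

```python
import math


def pca_2d(points):
    """Return the mean and the two unit principal axes of 2-D data."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    cxx = sum((p[0] - mean_x) ** 2 for p in points) / n
    cyy = sum((p[1] - mean_y) ** 2 for p in points) / n
    cxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / n
    # Rotation angle that diagonalizes the 2x2 covariance matrix
    theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)
    pc1 = (math.cos(theta), math.sin(theta))   # maximum-variance axis
    pc2 = (-math.sin(theta), math.cos(theta))  # orthogonal axis
    return (mean_x, mean_y), pc1, pc2


def project(points, mean, axis):
    """One row of Y = P X: coordinates of the centered data along `axis`."""
    return [(p[0] - mean[0]) * axis[0] + (p[1] - mean[1]) * axis[1]
            for p in points]


def variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


# Hypothetical 2-D samples lying close to a line
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.0)]
mean, pc1, pc2 = pca_2d(points)
reduced = project(points, mean, pc1)  # 2-D data reduced to 1-D
print(pc1, variance(reduced), variance(project(points, mean, pc2)))
```

Keeping only the coordinate along the first axis reduces the data from two dimensions to one while retaining most of the variance.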

Dimension reduction
http://jbhuang0604.blogspot.kr/2013/04/miss-korea-2013-contestants-face.html

SUMMARY - Part 1

Machine Learning
 - Tasks : Classification, Regression, Clustering, etc.
 - Performance : 0-1 loss, L2 loss, etc.
 - Experience : labeled data, unlabeled data

Machine Learning Process
 (1) Select a parametric / nonparametric model
 (2) Set a performance measure, including a regularization term
 (3) Train on the data (optimizing the parameters) until the validation error increases
 (4) Evaluate the final performance using the test set

Nonparametric models : Support Vector Machine, Gaussian Process Regression
Dimension reduction : used for pre-processing the data