Post on 03-Apr-2018
7/28/2019 Bagging, Boosting
Dealing with Data, Bagging, Boosting
Types of Data: Binary Data
ID   Salary   Male/Female   Mortgage   Car
A 50000 1 0 0
B 85000 0 1 1
C 55000 1 0 1
D 95000 1 1 0
E 75000 0 0 0
F 45000 0 1 1
G 65000 1 1 0
A binary variable has two states, 0 or 1, where 0 means the variable is absent
and 1 means it is present. Thus the variable smoker has the value 1 if the
person smokes and 0 if he does not. A binary variable is symmetric if both of
its states are equally valuable and carry the same weight. A variable denoting
the gender of a person is a symmetric binary variable, as the male and female
values are equally important.
Consider the data above. Here Male/Female, Mortgage and Car are binary
variables, as they take only the values 0 and 1. In this case, how do we find
the distance between A and B?
Types of Data: Binary Data
                       object j
                     1      0      sum
object i      1      q      r      q+r
              0      s      t      s+t
            sum     q+s    r+t      p
We construct a matrix as shown above. The matrix shows the matching
between two objects i and j. In the matrix, q denotes the number of matches
between i and j where both are 1, r denotes the number of matches where
i = 1 and j = 0, and so on.

d(i, j) = (r + s) / (q + r + s + t) = (r + s) / p

The distance between i and j is also called the dissimilarity between i and j.
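The counts q, r, s, t and the resulting distance can be computed directly from the two 0/1 vectors. The function below is a minimal sketch (not part of the slides); the names are illustrative.

```python
def symmetric_binary_dissimilarity(a, b):
    """a, b: equal-length lists of 0/1 values for two objects."""
    q = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)  # both 1
    r = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)  # i=1, j=0
    s = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)  # i=0, j=1
    t = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)  # both 0
    return (r + s) / (q + r + s + t)

# Objects A and B from the table (Male/Female, Mortgage, Car):
A = [1, 0, 0]
B = [0, 1, 1]
print(symmetric_binary_dissimilarity(A, B))  # 1.0, i.e. d(A,B) = 3/3
```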
Calculation of d(A,B), i.e. the dissimilarity between A and B:

              B
            1     0    sum
A      1    0     1     1
       0    2     0     2
     sum    2     1     3

d(A,B) = (r + s) / p = 3/3 = 1

Calculation of d(A,C), i.e. the dissimilarity between A and C:

              C
            1     0    sum
A      1    1     0     1
       0    1     1     2
     sum    2     1     3

d(A,C) = (r + s) / p = 1/3 = .33
Symmetric Data
Asymmetric Binary variable
                       object j
                     1      0      sum
object i      1      q      r      q+r
              0      s      t      s+t
            sum     q+s    r+t      p
A variable is asymmetric if the outcomes of its states are not equally
important, such as the positive and negative outcomes of a disease test. Let
the variable be the HIV status of a person: it is 1 if the disease is present
and 0 if it is absent. Given two asymmetric binary variables, the agreement
of two 1s (a positive match) is considered more significant than that of two
0s. In this case the formula for dissimilarity becomes:

d(i, j) = (r + s) / (q + r + s)

where t is not considered.
name   gender   fever   cough   test-1   test-2   test-3   test-4
Jack     M        Y       N       P        N        N        N
Mary     F        Y       N       P        N        P        N
Jim      M        Y       Y       N        N        N        N

name   gender   fever   cough   test-1   test-2   test-3   test-4
Jack     M        1       0       1        0        0        0
Mary     F        1       0       1        0        1        0
Jim      M        1       1       0        0        0        0
In the above case gender is a symmetric variable and the other attributes are
asymmetric binary. We encode the asymmetric values as 1 for Yes/Positive and
0 for No/Negative.
D(Jack, Mary)= ( 0 + 1) / ( 2+ 0 +1 ) = .33
D(Jack,Jim) = ( 1 + 1) / ( 1 + 1 +1) = .67
D(Mary,Jim) = ( 1 + 2 ) / ( 1 + 1 +2) = .75
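The three pairwise values above can be checked with a short sketch (not from the slides) that applies the asymmetric formula to the six asymmetric attributes, i.e. everything except gender:

```python
def asymmetric_binary_dissimilarity(a, b):
    # t (both 0) is ignored for asymmetric binary variables
    q = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    r = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    s = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    return (r + s) / (q + r + s)

jack = [1, 0, 1, 0, 0, 0]   # fever, cough, test-1 .. test-4
mary = [1, 0, 1, 0, 1, 0]
jim  = [1, 1, 0, 0, 0, 0]
print(round(asymmetric_binary_dissimilarity(jack, mary), 2))  # 0.33
print(round(asymmetric_binary_dissimilarity(jack, jim), 2))   # 0.67
print(round(asymmetric_binary_dissimilarity(mary, jim), 2))   # 0.75
```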
Asymmetric Binary variable
Categorical Variables
A categorical variable is a generalization of the binary variable in that it
can take on more than two states. For example, map_color is a categorical
variable that may take five states: red, yellow, green, pink, and blue.
The dissimilarity between two categorical objects i and j can be computed
based on the ratio of mismatches:

d(i, j) = (p - m) / p

where m is the number of matches (i.e. the number of variables for which i
and j are in the same state), and p is the total number of variables.
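The mismatch ratio is one line of code. A minimal sketch (names are illustrative, not from the slides):

```python
def categorical_dissimilarity(a, b):
    """a, b: equal-length lists of categorical states for two objects."""
    p = len(a)                              # total number of variables
    m = sum(x == y for x, y in zip(a, b))   # number of matching states
    return (p - m) / p

# With p = 1 (a single variable such as map_color):
print(categorical_dissimilarity(["red"], ["yellow"]))  # 1.0 (mismatch)
print(categorical_dissimilarity(["red"], ["red"]))     # 0.0 (match)
```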
Categorical Variables
We take into account object identifier and test-1 only and make the dissimilarity
matrix. We have p=1 since only one variable is considered.
Ordinal Variables
A discrete ordinal variable resembles a categorical variable, except that the M
states of the ordinal value are ordered in a meaningful sequence.
Ordinal Variables
We consider the object identifier and test2 (ordinal variable). We replace each of
the test-2 value by the rank. Since there are three states namely ( excellent, fair
and good) Mf = 3.
Ordinal Variables
object-identifier   test-2 (rank)   normalized value
1                   3               (3-1)/(3-1) = 1
2                   1               (1-1)/(3-1) = 0
3                   2               (2-1)/(3-1) = .5
4                   3               (3-1)/(3-1) = 1

Rank: 1 - fair, 2 - good, 3 - excellent

We next calculate the Euclidean distance between the objects using the
normalized values. The distance between objects 2 and 1 is
((0 - 1)^2)^(1/2) = 1, the distance between objects 3 and 1 is
((.5 - 1)^2)^(1/2) = .5, and so on. This results in the following distance
matrix.
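The normalization and the distances can be sketched as follows (a minimal illustration, not from the slides; the state-to-rank mapping is taken from the rank legend above):

```python
import math

RANK = {"fair": 1, "good": 2, "excellent": 3}
M = 3  # number of ordered states

def normalize(state):
    # z = (rank - 1) / (M - 1) maps the ranks onto [0, 1]
    return (RANK[state] - 1) / (M - 1)

test2 = ["excellent", "fair", "good", "excellent"]  # objects 1..4
z = [normalize(s) for s in test2]                   # [1.0, 0.0, 0.5, 1.0]

def dist(i, j):
    # Euclidean distance on a single normalized variable (0-based indices)
    return math.sqrt((z[i] - z[j]) ** 2)

print(dist(1, 0))  # 1.0  (objects 2 and 1)
print(dist(2, 0))  # 0.5  (objects 3 and 1)
```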
Ratio-Scaled Variables

For ratio-scaled variables we take the log of the values. Consider the
object identifier and the test-3 variable.
Ratio-Scaled Variables

object-identifier   test-3   log value
1                   445      log(445) = 2.65
2                   22       log(22) = 1.34
3                   164      log(164) = 2.21
4                   1210     log(1210) = 3.08

From the values in the last column we calculate the Euclidean distance, and
we get the following distance matrix.
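The log transform and the resulting pairwise distances can be sketched as follows (base-10 logs, which reproduce the table above; not part of the slides):

```python
import math

test3 = {1: 445, 2: 22, 3: 164, 4: 1210}
logs = {k: math.log10(v) for k, v in test3.items()}  # 2.65, 1.34, 2.21, 3.08

def dist(i, j):
    # Euclidean distance on a single log-transformed variable
    return abs(logs[i] - logs[j])

print(round(dist(1, 2), 2))  # 1.31
```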
Bagging

x   0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9   1
y    1     1     1    -1    -1    -1    -1     1     1    1
Bagging, which is also known as bootstrap aggregating, is a technique that
repeatedly samples (with replacement) from a data set. Each bootstrap sample
has the same size as the original data. Because the sampling is done with
replacement, some instances may appear several times in the same training
set, while others may be omitted from it.
Let x denote a one-dimensional attribute and y denote the class label. We
apply a classifier that induces a one-level binary decision tree (a decision
stump) with a test condition x <= k, where k is a split point chosen to
minimize the entropy of the leaf nodes.
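The procedure above can be sketched in a few lines. This is a minimal illustration, not the slides' exact implementation: the stump below picks its split point by minimizing training error, a simpler stand-in for the entropy criterion, and the ensemble combines rounds by majority vote.

```python
import random

X = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
Y = [1, 1, 1, -1, -1, -1, -1, 1, 1, 1]

def train_stump(xs, ys):
    """One-level tree with test x <= k; k chosen to minimize training error."""
    best = None
    for k in xs:
        for left in (1, -1):
            err = sum((left if x <= k else -left) != y for x, y in zip(xs, ys))
            if best is None or err < best[0]:
                best = (err, k, left)
    _, k, left = best
    return lambda x, k=k, left=left: left if x <= k else -left

def bagging(xs, ys, rounds=10, seed=0):
    rng = random.Random(seed)
    stumps = []
    for _ in range(rounds):
        # bootstrap sample: same size as the data, drawn with replacement
        idx = [rng.randrange(len(xs)) for _ in xs]
        stumps.append(train_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    # majority vote over the stumps' +/-1 predictions
    return lambda x: 1 if sum(s(x) for s in stumps) >= 0 else -1

clf = bagging(X, Y)
print([clf(x) for x in X])
```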
Bagging

These values of y are determined by the stump built in each bagging round;
round 1, for instance, assigns y = 1 or y = -1 to each x according to its
split point. The predicted y values in each column are then added across
rounds, and the sign of the sum gives the ensemble prediction.
Boosting

An iterative procedure that adaptively changes the distribution of the
training data by focusing more on previously misclassified records.
Initially, all N records are assigned equal weights.
Unlike bagging, the weights may change at the end of each boosting round.
Boosting

Records that are wrongly classified will have their weights increased.
Records that are classified correctly will have their weights decreased.
Boosting - AdaBoost

AdaBoost Algorithm
1: w = { w_j = 1/N | j = 1, 2, ..., N }  {Initialize the weights for all N examples}
2: Let k be the number of boosting rounds.
3: for i = 1 to k do
4:   Create training set D_i by sampling (with replacement) from D according to w.
5:   Train a base classifier C_i on D_i.
6:   Apply C_i to all examples in the original training set D and calculate
     the weighted error:
       epsilon_i = (1/N) * sum_{j=1}^{N} w_j * delta( C_i(x_j) != y_j )
7:   if epsilon_i > .5 then
       w = { w_j = 1/N | j = 1, 2, ..., N }  {Reset the weights for all N examples}
       Go back to step 4.
8:   end if
9:   Calculate alpha_i = (1/2) ln( (1 - epsilon_i) / epsilon_i ).
10:  Update the weight of each example.
11: end for
Boosting - AdaBoost

Base classifiers: C1, C2, ..., CT

Error rate:
epsilon_i = (1/N) * sum_{j=1}^{N} w_j * delta( C_i(x_j) != y_j )

Importance of a classifier:
alpha_i = (1/2) ln( (1 - epsilon_i) / epsilon_i )
Boosting - AdaBoost

Weight update:
w_i^(j+1) = (w_i^(j) / Z_j) * exp(-alpha_j)   if C_j(x_i) = y_i
w_i^(j+1) = (w_i^(j) / Z_j) * exp(alpha_j)    if C_j(x_i) != y_i
where Z_j is the normalization factor.

If any intermediate round produces an error rate higher than 50%, the
weights are reverted back to 1/N and the resampling procedure is repeated.

Classification:
C*(x) = argmax_y sum_{j=1}^{T} alpha_j * delta( C_j(x) = y )
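The full loop can be sketched as below. This is a minimal illustration, not the slides' exact implementation: the stump minimizes plain training error rather than entropy, and the weighted error is computed as sum w_j * delta (the weights already sum to 1), whereas the slides' formula carries an extra 1/N factor.

```python
import math
import random

X = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
Y = [1, 1, 1, -1, -1, -1, -1, 1, 1, 1]

def train_stump(xs, ys):
    # decision stump with test x <= k, minimizing training error
    best = None
    for k in xs:
        for left in (1, -1):
            err = sum((left if x <= k else -left) != y for x, y in zip(xs, ys))
            if best is None or err < best[0]:
                best = (err, k, left)
    _, k, left = best
    return lambda x, k=k, left=left: left if x <= k else -left

def adaboost(xs, ys, rounds=3, seed=0):
    rng = random.Random(seed)
    n = len(xs)
    w = [1.0 / n] * n          # step 1: equal initial weights
    ensemble = []
    for _ in range(rounds):
        # step 4: sample D_i (with replacement) from D according to w
        idx = rng.choices(range(n), weights=w, k=n)
        C = train_stump([xs[i] for i in idx], [ys[i] for i in idx])
        # step 6: weighted error on the ORIGINAL training set
        eps = sum(wj for wj, x, y in zip(w, xs, ys) if C(x) != y)
        if eps > 0.5:          # step 7: revert the weights and retry
            w = [1.0 / n] * n
            continue
        eps = max(eps, 1e-10)  # avoid taking the log of zero
        alpha = 0.5 * math.log((1 - eps) / eps)
        # step 10: reweight, then divide by Z so the weights sum to 1
        w = [wj * math.exp(-alpha if C(x) == y else alpha)
             for wj, x, y in zip(w, xs, ys)]
        Z = sum(w)
        w = [wj / Z for wj in w]
        ensemble.append((alpha, C))
    # classification: sign of the alpha-weighted vote
    return lambda x: 1 if sum(a * C(x) for a, C in ensemble) >= 0 else -1

clf = adaboost(X, Y)
print([clf(x) for x in X])
```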
Boosting - AdaBoost

x   0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9   1
y    1     1     1    -1    -1    -1    -1     1     1    1

N = 10, the number of elements shown above.
w = 1/N = 1/10 = .1 is the initial weight assigned to each element in the data.
Let k = number of boosting rounds = 3.
Boosting - AdaBoost

The figure above shows the three boosting rounds. The elements are sampled
with replacement, hence an element may appear more than once in a round.
Boosting - AdaBoost
In round 1 all elements are given the same weight = 1 /10 =.1
as shown in the first row above.
The weights of training records are as follows (calculation is shown in subsequent
slides)
7/28/2019 Bagging, Boosting
26/32
Boosting - AdaBoost
The split point for each round is a threshold test of the form x <= k.
Boosting - AdaBoost

We need to calculate the values of epsilon_i and alpha_i so that the new
weights can be calculated according to the equations:

epsilon_i = (1/N) * sum_{j=1}^{N} w_j * delta( C_i(x_j) != y_j )

alpha_i = (1/2) ln( (1 - epsilon_i) / epsilon_i )

w_i^(j+1) = (w_i^(j) / Z_j) * exp(-alpha_j)   if C_j(x_i) = y_i
w_i^(j+1) = (w_i^(j) / Z_j) * exp(alpha_j)    if C_j(x_i) != y_i

where Z_j is the normalization factor.
Boosting - AdaBoost

The calculation is as under:

epsilon_i = (1/N) * sum_{j=1}^{N} w_j * delta( C_i(x_j) != y_j )

delta = 1 if the prediction for a data element does not match the original
label, else it is 0. Thus delta = 1 for the first three data elements in D;
w is the weight assigned to each element, which is equal to .1 in the first
round.

epsilon_i = 1/10 * (.1 x 1 + .1 x 1 + .1 x 1 + 0 + 0 + ...) = .1 x (.3) = .03
Boosting - AdaBoost

We have the value of epsilon_i:

epsilon_i = .1 x (.3) = .03

alpha_i = (1/2) ln( (1 - .03) / .03 ) = 1.738
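The two numbers can be verified with a couple of lines (a quick check, not part of the slides):

```python
import math

# round-1 weights and mismatch indicators (first three elements misclassified)
w = [0.1] * 10
delta = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

eps = (1 / 10) * sum(wj * dj for wj, dj in zip(w, delta))  # epsilon_i
alpha = 0.5 * math.log((1 - eps) / eps)                    # alpha_i
print(round(eps, 2))    # 0.03
print(round(alpha, 3))  # 1.738
```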
Boosting - AdaBoost

We now need to calculate the new weights, given by the equation:

w_i^(j+1) = (w_i^(j) / Z_j) * exp(-alpha_j)   if C_j(x_i) = y_i
w_i^(j+1) = (w_i^(j) / Z_j) * exp(alpha_j)    if C_j(x_i) != y_i

where Z_j is the normalization factor. The normalization factor ensures that
sum_i w_i^(j+1) = 1.

The condition C_j(x_i) = y_i checks the matching or non-matching of values:

x   0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9   1
y    1     1     1    -1    -1    -1    -1     1     1    1

Matching values
Boosting - AdaBoost

We need to calculate the value of Z_j, the normalization factor:

1 = (.1/Z_j)(e^1.738) + (.1/Z_j)(e^1.738) + (.1/Z_j)(e^1.738) + (.1/Z_j)(e^-1.738) + ...
1 = (.1/Z_j)(3 x e^1.738) + (.1/Z_j)(.176 x 7)

The value of Z_j must make the right-hand side equal to 1. Solving the above
equation gives Z_j = 1.82.

For non-matching (misclassified) instances the new weights are:
(.1 / 1.82) x e^1.738 = .31
For matching (correctly classified) instances the new weights are:
(.1 / 1.82) x e^-1.738 = .0096 ~ .01

The whole process is repeated with the new weights.
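The normalization and the two new weight values can be checked numerically (a quick verification, not part of the slides; the exact computation gives Z_j closer to 1.83, so the slide's 1.82 reflects intermediate rounding):

```python
import math

alpha = 1.738
# 3 misclassified elements (weight grows) and 7 correct ones (weight shrinks)
Z = 0.1 * (3 * math.exp(alpha) + 7 * math.exp(-alpha))
w_wrong = 0.1 / Z * math.exp(alpha)    # new weight of a misclassified instance
w_right = 0.1 / Z * math.exp(-alpha)   # new weight of a correct instance
print(round(Z, 2))        # 1.83 (the slide's 1.82, up to rounding)
print(round(w_wrong, 2))  # 0.31
print(round(w_right, 4))  # 0.0096
```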
Boosting - AdaBoost

The final prediction for each x is the sign of the weighted vote
sum_j alpha_j * C_j(x), using the round importances alpha_1 = 1.738,
alpha_2 = 2.7784 and alpha_3 = 4.1195. For one region of x the sum is
-1 x (1.738) + 1 x (2.7784) + 1 x (4.1195), while for another it is
-1 x (1.738) + 1 x (2.7784) + -1 x (4.1195).