Fuzzy Decision Tree (Soft Decision Tree)


Transcript of Fuzzy Decision Tree (Soft Decision Tree)

Page 1: Fuzzy Decision Tree (Soft Decision Tree)

Fuzzy Decision Tree

Soft Decision Tree

[email protected] June 2010

Page 2: Fuzzy Decision Tree (Soft Decision Tree)

Crisp Decision Tree

Data preprocessing: attribute values and classifications are crisp; continuous-valued attribute data must be discretized; each attribute value (linguistic term) is a classical set over the attribute space.

Tree generation: each node of the decision tree is a classical subset of the attribute space; every root-to-leaf path corresponds to one crisp rule; nodes on the same level have empty intersections.

Matching: a test example matches exactly one path.

Applicability: suited to small and medium databases with symbolic attribute values, fairly clear-cut classifications, and little noise.

Page 3: Fuzzy Decision Tree (Soft Decision Tree)

Fuzzy Decision Tree

Data preprocessing: attribute values and classifications are fuzzy; continuous-valued attribute data must be fuzzified; each attribute value (linguistic term) is a fuzzy set over the attribute space.

Tree generation: each node of the decision tree is a fuzzy subset of the attribute space; every root-to-leaf path corresponds to one fuzzy rule; nodes on the same level generally have non-empty intersections.

Matching: a test example can approximately match several paths.

Applicability: suited to databases of all kinds, especially noisy databases whose attributes and classes are strongly fuzzy.

Page 4: Fuzzy Decision Tree (Soft Decision Tree)

Fuzzy set theory

• A fuzzy set is used to describe concepts that are inherently vague. The basic idea is to relax the all-or-nothing membership relation of classical sets into a graded one.

Page 5: Fuzzy Decision Tree (Soft Decision Tree)

Fuzzy set theory

• Let U be a collection of objects denoted generically by {u}. U is called the universe of discourse and u represents the generic element of U.

Page 6: Fuzzy Decision Tree (Soft Decision Tree)

Fuzzy set theory

• Definition 1. A fuzzy set A in a universe of discourse U is characterized by a membership function μA which takes values in the interval [0, 1].

For u ∈ U, μA(u) = 1 means that u is definitely a member of A, μA(u) = 0 means that u is definitely not a member of A, and 0 < μA(u) < 1 means that u is partially a member of A. If either μA(u) = 0 or μA(u) = 1 for all u ∈ U, then A is a crisp set.
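To make the definition concrete, here is a minimal Python sketch of a membership function for a hypothetical fuzzy set "tall"; the piecewise-linear shape and the breakpoints 160 cm and 180 cm are illustrative assumptions, not taken from the slides.

def mu_tall(height_cm):
    """Membership degree of a height in the hypothetical fuzzy set 'tall'."""
    if height_cm <= 160.0:
        return 0.0                      # definitely not a member
    if height_cm >= 180.0:
        return 1.0                      # definitely a member
    return (height_cm - 160.0) / 20.0   # partial membership in (0, 1)

for h in (155, 165, 172, 185):
    print(h, mu_tall(h))                # 0.0, 0.25, 0.6, 1.0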

Page 7: Fuzzy Decision Tree (Soft Decision Tree)

Fuzzy set theory

Page 8: Fuzzy Decision Tree (Soft Decision Tree)

Soft Decision Tree

A variant of classical decision tree inductive learning using fuzzy set theory

Page 9: Fuzzy Decision Tree (Soft Decision Tree)

Soft decision trees vs. crisp regression trees

Page 10: Fuzzy Decision Tree (Soft Decision Tree)

T1->T2->D5

Page 11: Fuzzy Decision Tree (Soft Decision Tree)

Crisp regression tree

• each test node applies a single threshold and has two possible answers: yes or no (left or right)

• the objects are split into two (in our case of binary trees) non-overlapping subregions

Page 12: Fuzzy Decision Tree (Soft Decision Tree)

Leaf L4 with membership degree of 0.43; leaf L5 with membership degree of 0.57.

1.0 × 0.43 × label(L4) + 1.0 × 0.57 × label(L5) = 1.0 × 0.43 × 0.44 + 1.0 × 0.57 × 1.00 ≈ 0.76
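The same computation as a minimal Python sketch; the leaf degrees and labels are the ones from this slide.

def defuzzify(leaves):
    """leaves: (membership degree, leaf label) pairs reached by one object."""
    return sum(degree * label for degree, label in leaves)

print(defuzzify([(0.43, 0.44), (0.57, 1.00)]))   # 0.7592, i.e. about 0.76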

Page 13: Fuzzy Decision Tree (Soft Decision Tree)

Soft decision tree

• discriminator function
– piecewise linear (widely used)
– two parameters (see the sketch below):

• α: corresponds to the split threshold in a test node of a decision or a regression tree

• β: the width, the degree of spread that defines the transition region on the attribute chosen in that node

– the node is split (fuzzy partitioned) into two overlapping subregions

– an object reaches multiple terminal nodes; the output estimations given by all these terminal nodes are aggregated through some defuzzification scheme in order to obtain the final estimated membership degree to the target class
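A minimal Python sketch of such a piecewise-linear discriminator, assuming the transition region is centred on α with width β; the exact parameterisation used by the authors may differ.

def discriminator(value, alpha, beta):
    """Degree in [0, 1] with which an attribute value is sent to the left child."""
    if beta == 0.0:                          # beta = 0 degenerates to a crisp split
        return 1.0 if value < alpha else 0.0
    lo, hi = alpha - beta / 2.0, alpha + beta / 2.0
    if value <= lo:
        return 1.0                           # clearly below the threshold: fully left
    if value >= hi:
        return 0.0                           # clearly above the threshold: fully right
    return (hi - value) / beta               # linear transition inside the region

v = discriminator(5.1, alpha=5.0, beta=1.0)  # an object near the threshold...
print(v, 1.0 - v)                            # ...goes partially left (~0.4) and right (~0.6)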

Page 14: Fuzzy Decision Tree (Soft Decision Tree)

Building a soft decision tree

GS: growing set
PS: pruning set
LS: learning set
TS: test set
LS = GS ∪ PS (see the split sketch below)
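A minimal Python sketch of this sample organisation; the 70/30 proportion and the shuffling are illustrative assumptions.

import random

def split_learning_set(LS, gs_fraction=0.7, seed=0):
    """Split the learning set LS into a growing set GS and a pruning set PS."""
    objects = list(LS)
    random.Random(seed).shuffle(objects)
    cut = int(gs_fraction * len(objects))
    return objects[:cut], objects[cut:]      # GS, PS; LS = GS ∪ PS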

Page 15: Fuzzy Decision Tree (Soft Decision Tree)

Soft tree semantics

Page 16: Fuzzy Decision Tree (Soft Decision Tree)

Soft tree semantics

Page 17: Fuzzy Decision Tree (Soft Decision Tree)

Soft tree semantics

The fuzzy set S is split into two fuzzy subsets: SL, the left one, and SR, the right one.

Discriminator function: v(a(o), α, β, γ, …) → [0, 1]

Page 18: Fuzzy Decision Tree (Soft Decision Tree)

Soft tree semantics

• Membership degree of an object o in the left successor's subset SL:

– μSL(o) = μS(o) · v(a(o), α, β)

(the object is propagated to SL only if this degree is strictly positive)

• Membership degree of an object o in the right successor's subset SR:

– μSR(o) = μS(o) · (1 − v(a(o), α, β))

(the object is propagated to SR only if this degree is strictly positive; see the sketch below)
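These two recursions are all that is needed to push an object through a soft tree. A minimal Python sketch, reusing the discriminator sketched on Page 13 and assuming an illustrative dict-based node layout; feeding the collected pairs to the defuzzify sketch from Page 12 yields the final output.

def propagate(node, o, mu, reached):
    """Collect (membership degree, leaf label) pairs for object o."""
    if node["leaf"]:
        reached.append((mu, node["label"]))
        return
    v = discriminator(o[node["attr"]], node["alpha"], node["beta"])
    if mu * v > 0.0:                         # strictly positive: descend left
        propagate(node["left"], o, mu * v, reached)
    if mu * (1.0 - v) > 0.0:                 # strictly positive: descend right
        propagate(node["right"], o, mu * (1.0 - v), reached)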

Page 19: Fuzzy Decision Tree (Soft Decision Tree)

Soft tree semantics

• j: a node of the tree
• Lj: the numerical value (or label) attached to node j
• Sj: the fuzzy subset corresponding to this node

Page 20: Fuzzy Decision Tree (Soft Decision Tree)

SDT growing

Page 21: Fuzzy Decision Tree (Soft Decision Tree)

SDT growing

• a method to select a (fuzzy) split at every new node of the tree

• a rule for determining when a node should be considered terminal

• a rule for assigning a label to every identified terminal node

Page 22: Fuzzy Decision Tree (Soft Decision Tree)

Automatic fuzzy partitioning of a node

Objective: given S, a fuzzy set in a soft decision tree, find an attribute a(·), a threshold α, and a width β, together with the successor labels LL and LR, so as to minimize the squared error function.
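The slide does not reproduce the formula itself. A plausible reconstruction as a Python sketch, assuming (consistently with the defuzzification on Page 12) that the node predicts v·LL + (1 − v)·LR for each object and that errors are weighted by the object's membership μS(o); y(o) is the target value.

def squared_error(GS, mu_S, a, y, alpha, beta, LL, LR):
    """Weighted squared error of a candidate fuzzy split (sketch)."""
    E = 0.0
    for o in GS:
        v = discriminator(a(o), alpha, beta)   # see the Page 13 sketch
        prediction = v * LL + (1.0 - v) * LR
        E += mu_S(o) * (prediction - y(o)) ** 2
    return E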

Page 23: Fuzzy Decision Tree (Soft Decision Tree)

Automatic fuzzy partitioning of a node

Strategy:
• Searching for the attribute and split location. With a fixed β = 0 (crisp split), we search among all the attributes for the attribute a(·) yielding the smallest crisp ES, its optimal crisp split threshold α, and the corresponding (provisional) successor labels LL and LR, using crisp heuristics adapted from CART regression trees.

Page 24: Fuzzy Decision Tree (Soft Decision Tree)

Automatic fuzzy partitioning of a node

Strategy:
• Fuzzification and labeling. With the optimal attribute a(·) and threshold α kept frozen, we search for the optimal width β by Fibonacci search; for every candidate value of β, the two successor labels LL and LR are automatically updated by explicit linear regression formulas.
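A minimal Python sketch of this inner search, using golden-section search as a stand-in for the Fibonacci search named on the slide (the two methods are close relatives); error_for_beta(β) is assumed to refit LL and LR by linear least squares and return the resulting squared error.

def search_beta(error_for_beta, beta_lo, beta_hi, tol=1e-3):
    """One-dimensional minimisation of the split error over the width beta."""
    phi = (5 ** 0.5 - 1) / 2                 # golden-ratio conjugate
    a, b = beta_lo, beta_hi
    c, d = b - phi * (b - a), a + phi * (b - a)
    while b - a > tol:
        if error_for_beta(c) < error_for_beta(d):
            b, d = d, c                      # minimum lies in [a, d]
            c = b - phi * (b - a)
        else:
            a, c = c, d                      # minimum lies in [c, b]
            d = a + phi * (b - a)
    return (a + b) / 2.0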

Page 25: Fuzzy Decision Tree (Soft Decision Tree)

SDT pruning

Page 26: Fuzzy Decision Tree (Soft Decision Tree)

SDT pruning

• Objective: given a complete SDT and a pruning sample of objects PS, find the subtree of the given SDT with the best mean absolute error (MAE) on the pruning set, among all subtrees that could be generated from the complete SDT.

Page 27: Fuzzy Decision Tree (Soft Decision Tree)

SDT pruning

Page 28: Fuzzy Decision Tree (Soft Decision Tree)

SDT pruning

Strategy:
• Subtree sequence generation. The first node in the list is removed and contracted (its subtree is replaced by a terminal node), and the resulting tree is stored in the tree sequence. We finally obtain a sequence of trees in decreasing order of complexity.

Page 29: Fuzzy Decision Tree (Soft Decision Tree)

SDT pruning

Strategy:
• Best subtree selection (see the sketch below).

– Use the "one-standard-error rule" to select a tree from the pruning sequence.

– Use the PS to get an unbiased estimate of the MAE, together with its standard error estimate.

– Select among the trees not the one of minimal MAE, but rather the smallest tree in the sequence whose MAE is within one standard error of that minimum.
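A minimal Python sketch of this selection rule, assuming each entry of the pruning sequence carries its size, its MAE estimated on PS, and the standard error of that estimate.

def one_se_select(sequence):
    """Pick the smallest subtree whose MAE is within one SE of the best MAE."""
    best = min(sequence, key=lambda t: t["mae"])
    threshold = best["mae"] + best["stderr"]
    eligible = [t for t in sequence if t["mae"] <= threshold]
    return min(eligible, key=lambda t: t["size"])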

Page 30: Fuzzy Decision Tree (Soft Decision Tree)

SDT tuning

• Refitting (sketched below)
– optimizes only the terminal-node parameters
– based on linear least squares

• Backfitting
– optimizes all free parameters of the model
– based on a Levenberg-Marquardt non-linear optimization technique
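Once the tree structure and discriminators are frozen, refitting reduces to ordinary linear least squares, because (as on Page 12) the output is the membership-weighted sum of terminal-node labels. A minimal sketch where M[i, j] is the degree with which object i reaches leaf j.

import numpy as np

def refit_labels(M, y):
    """Least-squares optimal terminal-node labels for fixed memberships M."""
    labels, *_ = np.linalg.lstsq(M, y, rcond=None)
    return labels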

Page 31: Fuzzy Decision Tree (Soft Decision Tree)

Empirical results

Page 32: Fuzzy Decision Tree (Soft Decision Tree)

References

• Cristina Olaru, Louis Wehenkel. A complete fuzzy decision tree technique. Fuzzy Sets and Systems 138 (2003) 221–254.

• Yufei Yuan, Michael J. Shaw. Induction of fuzzy decision trees. Fuzzy Sets and Systems 69 (1995) 125–139.

• 王熙照, 孙娟, 杨宏伟, 赵明华. A comparative study of fuzzy decision tree and crisp decision tree algorithms (模糊决策树算法与清晰决策树算法的比较研究). Computer Engineering and Applications (计算机工程与应用), 2003(21).