Fuzzy Decision Tree (Soft Decision Tree)


Fuzzy Decision Tree

Soft Decision Tree

bookcold@msn.com June 2010

Crisp Decision Tree

Data preprocessing: attribute values and classes are crisp; continuous-valued attributes must be discretized; each attribute value (term) is a classical set over the attribute space.

Tree construction: each node of the decision tree is a classical subset of the attribute space; each root-to-leaf path corresponds to one crisp rule; nodes on the same level have pairwise empty intersections.

Matching: a test example matches exactly one path.

Applicability: suited to small and medium databases with symbolic attributes, fairly clear-cut classes, and little noise.

Fuzzy Decision Tree

Data preprocessing: attribute values and classes are fuzzy; continuous-valued attributes must be fuzzified; each attribute value (term) is a fuzzy set over the attribute space.

Tree construction: each node of the decision tree is a fuzzy subset of the attribute space; each root-to-leaf path corresponds to one fuzzy rule; nodes on the same level generally have non-empty intersections.

Matching: a test example can approximately match several paths.

Applicability: suited to databases of all kinds, especially noisy databases whose attributes and classes are strongly fuzzy.

Fuzzy set theory

• A fuzzy set describes the concept of a vague thing. The basic idea is to relax the absolute membership relation of classical sets into a graded one.

Fuzzy set theory

• Let U be a collection of objects denoted generically by {u}. U is called the universe of discourse and u represents the generic element of U.

Fuzzy set theory

• Definition 1. A fuzzy set A in a universe of discourse U is characterized by a membership function μA which takes values in the interval [0, 1].

For u ∈ U, μA(u) = 1 means that u is definitely a member of A, μA(u) = 0 means that u is definitely not a member of A, and 0 < μA(u) < 1 means that u is partially a member of A. If either μA(u) = 0 or μA(u) = 1 for all u ∈ U, A is a crisp set.
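To make the definition concrete, here is a minimal Python sketch of one common membership function shape (the triangular form and the "warm" example are illustrative assumptions, not part of the definition):

```python
def triangular_membership(u, left, peak, right):
    """Membership degree of u in a triangular fuzzy set.

    Returns 1.0 at the peak, 0.0 outside [left, right],
    and a value in (0, 1) on the slopes (partial membership).
    """
    if u <= left or u >= right:
        return 0.0
    if u <= peak:
        return (u - left) / (peak - left)
    return (right - u) / (right - peak)

# A fuzzy set "warm" on a temperature universe of discourse:
print(triangular_membership(20.0, 15.0, 22.0, 30.0))  # ~0.71, partial member
print(triangular_membership(22.0, 15.0, 22.0, 30.0))  # 1.0, definite member
```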

Soft Decision Tree

A variant of classical decision tree inductive learning using fuzzy set theory

Soft decision trees vs. crisp regression trees

[Figure: an object's path T1 → T2 → D5 through the tree]

Crisp regression tree

• each test node uses a single threshold and has two possible answers: yes or no (left or right)

• each split partitions the objects into two (in our case of binary trees) non-overlapping subregions

Leaf L4 is reached with membership degree 0.43 and leaf L5 with membership degree 0.57. The final output aggregates the leaf labels weighted by these degrees:

1.0 · 0.43 · label(L4) + 1.0 · 0.57 · label(L5) = 1.0 · 0.43 · 0.44 + 1.0 · 0.57 · 1.00 = 0.76
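A quick check of this aggregation in Python (values copied from the example above):

```python
# Weighted aggregation of the two leaf labels by the reached membership degrees
mu_L4, label_L4 = 0.43, 0.44
mu_L5, label_L5 = 0.57, 1.00
output = 1.0 * mu_L4 * label_L4 + 1.0 * mu_L5 * label_L5
print(round(output, 2))  # 0.76
```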

Soft decision tree

• discriminator function: piecewise linear (widely used), with two parameters

• α: corresponds to the split threshold in a test node of a decision or regression tree

• β: the width, the degree of spread that defines the transition region on the attribute chosen in that node

– each node is split (fuzzy partitioned) into two overlapping subregions

– an object reaches multiple terminal nodes; the output estimations given by all these terminal nodes are aggregated through some defuzzification scheme in order to obtain the final estimated membership degree to the target class (a discriminator sketch follows below)
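A minimal sketch of such a piecewise linear discriminator, assuming the transition region of width β is centered on the threshold α and that smaller attribute values map to the left successor (the paper's exact parametrization may differ):

```python
def discriminator(a, alpha, beta):
    """Piecewise linear discriminator v(a; alpha, beta) -> [0, 1].

    Returns the degree with which an attribute value a is sent to the
    left successor: 1 well below the threshold, 0 well above it, and a
    linear transition of width beta centered on alpha.
    """
    if beta == 0.0:                      # crisp split: plain threshold test
        return 1.0 if a <= alpha else 0.0
    lo, hi = alpha - beta / 2.0, alpha + beta / 2.0
    if a <= lo:
        return 1.0
    if a >= hi:
        return 0.0
    return (hi - a) / beta               # linear transition inside [lo, hi]

print(discriminator(4.0, 5.0, 2.0))  # 1.0: well below threshold, fully left
print(discriminator(5.5, 5.0, 2.0))  # 0.25: inside the transition region
```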

Building a soft decision tree

GS: growing set
PS: pruning set
LS: learning set
TS: test set
LS = GS ∪ PS

Soft tree semantics

• Each test node splits a fuzzy set S into two fuzzy subsets: SL, the left one, and SR, the right one.

• Discriminator function: v(a(o), α, β, γ, …) → [0, 1]

Soft tree semantics

• Membership degree of an object to the left successor's subset SL:

– μSL(o) = μS(o) · v(a(o), α, β)

• Membership degree of an object to the right successor's subset SR:

– μSR(o) = μS(o) · (1 − v(a(o), α, β))

• An object is propagated only to successors for which its membership degree is strictly positive (see the sketch below).
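Under these definitions, propagating an object's membership through one test node could be sketched as follows (hypothetical helper; v stands for the discriminator value computed as in the previous sketch):

```python
def propagate(mu_S, v):
    """Split an object's membership mu_S(o) between the two successors.

    v is the discriminator value v(a(o), alpha, beta) for that object.
    The two degrees sum back to mu_S(o); only successors with a strictly
    positive degree need to process the object further.
    """
    mu_left = mu_S * v
    mu_right = mu_S * (1.0 - v)
    return mu_left, mu_right

# Example: an object with full membership at the node, v = 0.43
print(propagate(1.0, 0.43))  # (0.43, 0.57) -> both subtrees are traversed
```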

Soft tree semantics

• j: a node of the tree

• Lj: numerical value (or label) attached to node j

• SLj: the fuzzy subset corresponding to this node

SDT growing

• a method to select a (fuzzy) split at every new node of the tree

• a rule for determining when a node should be considered terminal

• a rule for assigning a label to every identified terminal node

Automatic fuzzy partitioning of a node

Objective: given S, a fuzzy set in a soft decision tree, find the attribute a(·), threshold α, and width β, together with the successor labels LL and LR, so as to minimize the squared error function.

Automatic fuzzy partitioning of a node

Strategy: • Searching for the attribute and split location. With β fixed at 0 (a crisp split), we search among all the attributes for the attribute a(·) yielding the smallest crisp squared error ES, its optimal crisp split threshold α, and its corresponding (provisional) successor labels LL and LR, by using crisp heuristics adapted from CART regression trees (see the sketch below).
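A sketch of this crisp search for a single attribute, assuming a numeric target y and using subset means as the provisional labels, in the spirit of CART regression trees (simplified: growing-set membership weights are ignored):

```python
import numpy as np

def best_crisp_split(a, y):
    """Search midpoints between sorted attribute values for the threshold
    alpha minimizing the total squared error when each side is predicted
    by its mean (the provisional labels L_L and L_R)."""
    order = np.argsort(a)
    a, y = a[order], y[order]
    best = (np.inf, None, None, None)
    for i in range(1, len(a)):
        if a[i] == a[i - 1]:
            continue
        alpha = (a[i] + a[i - 1]) / 2.0
        L_L, L_R = y[:i].mean(), y[i:].mean()
        err = ((y[:i] - L_L) ** 2).sum() + ((y[i:] - L_R) ** 2).sum()
        if err < best[0]:
            best = (err, alpha, L_L, L_R)
    return best  # (squared error, alpha, provisional L_L, L_R)

a = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y = np.array([0.1, 0.2, 0.1, 0.9, 1.0, 0.95])
print(best_crisp_split(a, y))  # alpha = 6.5 separates the two groups
```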

Automatic fuzzy partitioning of a node

Strategy: • Fuzzification and labeling. With the optimal attribute a(·) and threshold α kept frozen, we search for the optimal width β by Fibonacci search; for every candidate value of β, the two successor labels LL and LR are automatically updated by explicit linear regression formulas (see the sketch below).
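A sketch of this one-dimensional search, substituting a golden-section search for the Fibonacci search named above (the two methods are close relatives); err_for_beta is a hypothetical callback that refits LL and LR for a candidate β and returns the resulting squared error:

```python
import math

def golden_section_min(err_for_beta, lo, hi, tol=1e-4):
    """Minimize a unimodal error function of the width beta on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0   # ~0.618
    x1 = hi - inv_phi * (hi - lo)
    x2 = lo + inv_phi * (hi - lo)
    f1, f2 = err_for_beta(x1), err_for_beta(x2)
    while hi - lo > tol:
        if f1 < f2:                           # minimum lies in [lo, x2]
            hi, x2, f2 = x2, x1, f1
            x1 = hi - inv_phi * (hi - lo)
            f1 = err_for_beta(x1)
        else:                                 # minimum lies in [x1, hi]
            lo, x1, f1 = x1, x2, f2
            x2 = lo + inv_phi * (hi - lo)
            f2 = err_for_beta(x2)
    return (lo + hi) / 2.0

# Demo with a toy error function whose minimum is at beta = 0.3:
print(round(golden_section_min(lambda b: (b - 0.3) ** 2, 0.0, 1.0), 3))
```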

SDT pruning

• Objective: given a complete SDT and a pruning sample of objects PS, find the subtree of the given SDT with the best mean absolute error (MAE) on the pruning set among all subtrees that could be generated from the complete SDT.

SDT pruning

Strategy: • Subtree sequence generation. At each step, the first node in the list is removed and contracted, and the resulting tree is stored in the tree sequence. Finally, we obtain a sequence of trees in decreasing order of complexity.

SDT pruning

Strategy: • Best subtree selection.

– Apply the "one-standard-error rule" to select a tree from the pruning sequence.

– Use the PS to get an unbiased estimate of the MAE, together with its standard error estimate.

– Select not the tree of minimal MAE but rather the smallest tree in the sequence whose MAE stays within one standard error of that minimum (see the sketch below).
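A sketch of this selection rule, assuming each candidate subtree is summarized by its size, its MAE on PS, and the standard error of that MAE (hypothetical data layout):

```python
def one_se_select(trees):
    """trees: list of (num_nodes, mae, se) tuples for the pruning sequence.

    Pick the smallest (least complex) tree whose MAE is within one
    standard error of the overall minimum MAE.
    """
    best_mae, best_se = min((mae, se) for _, mae, se in trees)
    threshold = best_mae + best_se
    # scanning from least to most complex, take the first that qualifies
    for num_nodes, mae, se in sorted(trees, key=lambda t: t[0]):
        if mae <= threshold:
            return num_nodes, mae, se

trees = [(15, 0.20, 0.02), (9, 0.21, 0.02), (5, 0.26, 0.03), (3, 0.30, 0.03)]
print(one_se_select(trees))  # (9, 0.21, 0.02): smallest tree within 0.22
```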

SDT tuning

• Refitting

– optimize only the terminal node parameters

– based on linear least squares

• Backfitting

– optimize all free parameters of the model

– based on a Levenberg-Marquardt non-linear optimization technique (both variants are sketched below)
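A sketch of the two variants, assuming NumPy/SciPy; refit_leaf_labels exploits the fact that with split parameters frozen the output is linear in the leaf labels, while backfit hands all packed parameters to a Levenberg-Marquardt solver (residuals_fn and the parameter packing are hypothetical stand-ins for the tree's actual prediction function):

```python
import numpy as np
from scipy.optimize import least_squares

def refit_leaf_labels(memberships, y):
    """Refitting: with all alpha/beta frozen, the model output is linear in
    the leaf labels, so optimal labels follow from linear least squares.
    memberships: (n_objects, n_leaves) matrix of degrees reaching each leaf."""
    labels, *_ = np.linalg.lstsq(memberships, y, rcond=None)
    return labels

def backfit(residuals_fn, theta0):
    """Backfitting: optimize all free parameters (thresholds, widths, and
    labels packed into theta) with Levenberg-Marquardt.
    residuals_fn(theta) must return the vector of prediction residuals."""
    return least_squares(residuals_fn, theta0, method="lm").x
```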

Empirical results

References

• Cristina Olaru, Louis Wehenkel. A complete fuzzy decision tree technique. Fuzzy Sets and Systems 138 (2003) 221–254.

• Yufei Yuan, Michael J. Shaw. Induction of fuzzy decision trees. Fuzzy Sets and Systems 69 (1995) 125–139.

• Wang Xizhao, Sun Juan, Yang Hongwei, Zhao Minghua (王熙照, 孙娟, 杨宏伟, 赵明华). A comparative study of fuzzy decision tree and crisp decision tree algorithms. Computer Engineering and Applications (计算机工程与应用), 2003(21).