Mining Frequent Episodes for relating Financial Events and Stock Trends

date:2004/03/05

Anny Ng and Ada Wai-chee Fu, PAKDD 2003

Presenter: Ming Jing Tsai

Definition

Events: financial news, political events, …

e1, e2, e3, …, ek: event types

Day record Di: {ei1, ei2, ei3, …, eik}

Episode: {e1, e2, e3, …, ek}, with at least two elements, at least one of which (ej) is a stock event type

Window = x days

Definition

Window frequency: number of windows that contain an event type

DB frequency: number of occurrences of an event type in the DB

Frequency of an episode: number of windows that contain the episode and whose first day contains at least one of the event types in the episode

Construct event tree

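Header table: event types in descending order of DB frequency

Event_set pair <(first day), (remaining days)>: each part sorted in descending order of DB frequency

Node <E:C:B>: E = event type, C = count, B = binary bit (in the example trees, 0 marks the first-day part and 1 the remaining-days part)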

Pruning method

Remove event types whose window frequency < min_sup

Remove duplicate event types that appear in both the first-day part and the remaining-days part

An event database (window = 3, min_sup = 3)

Day   Events
1     b
2     a, c
3     b
4     d
5     b
6     c, a
7     d

DB frequencies: <a:2, b:3, c:2, d:2>

Windows (window = 3, min_sup = 3)

Window   Days included   Event_set pair
1        1,2,3           <(b),(a,c)>
2        2,3,4           <(a,c),(b,d)>
3        3,4,5           <(b),(d)>
4        4,5,6           <(d),(b,a,c)>
5        5,6,7           <(b),(a,c,d)>
6        6,7             <(a,c),(d)>
7        7               <(d),()>

Ordered frequent event types: <b,a,c,d>

Window frequencies: <a:5, b:5, c:5, d:6>
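The event_set pairs and the two frequency tables above can be recomputed with a few lines of Python. This is a minimal sketch, not the authors' implementation: the function names are made up for illustration, and ties in DB frequency are assumed to be broken alphabetically, which happens to give the order <b,a,c,d> shown on the slide.

```python
from collections import Counter

# Example event database from the slides: day -> event types on that day.
DAYS = {1: "b", 2: "ac", 3: "b", 4: "d", 5: "b", 6: "ca", 7: "d"}
WINDOW = 3  # window size in days


def db_frequencies(days):
    """Count how many day records each event type occurs in."""
    return Counter(e for events in days.values() for e in set(events))


def event_set_pairs(days, window):
    """Build one <(first day), (remaining days)> pair per window start day."""
    order = db_frequencies(days)
    # Descending DB frequency; ties broken alphabetically (assumption).
    key = lambda e: (-order[e], e)
    last = max(days)
    pairs = []
    for start in sorted(days):
        first = set(days[start])
        rest = set()
        for d in range(start + 1, min(start + window, last + 1)):
            rest |= set(days.get(d, ""))
        rest -= first  # drop types already present on the first day
        pairs.append((sorted(first, key=key), sorted(rest, key=key)))
    return pairs


def window_frequencies(pairs):
    """Count the windows in which each event type appears at least once."""
    return Counter(e for first, rest in pairs for e in set(first) | set(rest))


pairs = event_set_pairs(DAYS, WINDOW)
for i, (first, rest) in enumerate(pairs, start=1):
    print(i, first, rest)
print(dict(db_frequencies(DAYS)))
print(dict(window_frequencies(pairs)))
```

Windows 6 and 7 are shorter than three days because the database ends on day 7, which matches the "Days included" column above.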

[Figure: the event tree after each insertion of the event_set pairs of windows 1 to 7, with header table b, a, c, d. In the final tree the root {null} has three children: {b:3:0}, {a:2:0} and {d:2:0}.]
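The tree construction can be sketched as a prefix tree over the event_set pairs of the seven windows. The code below is an illustrative assumption rather than the paper's data structure: `Node`, `build_event_tree` and `show` are hypothetical names, a node is shared only when both the event type and the bit match, and the header table is kept as a plain dictionary of node lists.

```python
from dataclasses import dataclass, field

# Event_set pairs of windows 1..7 from the slides, already sorted by
# descending DB frequency (header order b, a, c, d).
PAIRS = [
    (["b"], ["a", "c"]),
    (["a", "c"], ["b", "d"]),
    (["b"], ["d"]),
    (["d"], ["b", "a", "c"]),
    (["b"], ["a", "c", "d"]),
    (["a", "c"], ["d"]),
    (["d"], []),
]


@dataclass
class Node:
    event: str = None  # E: event type (None for the root)
    count: int = 0     # C: number of windows sharing this prefix
    bit: int = 0       # B: 0 = first-day part, 1 = remaining-days part
    children: dict = field(default_factory=dict)


def build_event_tree(pairs):
    """Insert every pair as a path; return the root and the header lists."""
    root, header = Node(), {}
    for first, rest in pairs:
        node = root
        for bit, part in ((0, first), (1, rest)):
            for event in part:
                child = node.children.get((event, bit))
                if child is None:
                    child = Node(event, 0, bit)
                    node.children[(event, bit)] = child
                    header.setdefault(event, []).append(child)
                child.count += 1
                node = child
    return root, header


def show(node, depth=0):
    """Return the tree as indented {E:C:B} lines."""
    lines = []
    for child in node.children.values():
        lines.append("  " * depth + f"{{{child.event}:{child.count}:{child.bit}}}")
        lines.extend(show(child, depth + 1))
    return lines


root, header = build_event_tree(PAIRS)
print("\n".join(show(root)))
```

The counts of all nodes linked from one header entry add up to that event type's window frequency, for example 6 for d.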

Mining frequent episodes

Header table {h0, h1, …, hH}

Recursively mine each of the linked lists kept in the header table, from bottom to top

The conditional paths of hi are used to build a conditional event tree

Objective 1: find frequent episodes of the form {a} ∪ {hi}, using the first-part frequencies

Objective 2: find frequent episodes that contain hi and at least two other event types, using the DB frequencies

Traverse conditional path

Remove invalid event types

Adjust the counts of the nodes above hi in the path to be equal to that of hi

If hi is in the firstdays part, then move all event types in the remainingdays part to the firstdays part

Remove hi from the path
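The traversal steps above can be sketched directly on the event_set pairs instead of on tree nodes, so every path carries a count of 1. Two points are assumptions on my part and not stated on the slides: "invalid" event types are read as those that come at or after hi in the header order, and a path whose first part ends up empty is dropped. With that reading, the sketch gives the same DB and first-part frequencies as the worked examples for headers d and c in these slides.

```python
from collections import Counter

ORDER = ["b", "a", "c", "d"]  # header order from the slides

# Event_set pairs of windows 1..7 from the slides.
PAIRS = [
    (["b"], ["a", "c"]),
    (["a", "c"], ["b", "d"]),
    (["b"], ["d"]),
    (["d"], ["b", "a", "c"]),
    (["b"], ["a", "c", "d"]),
    (["a", "c"], ["d"]),
    (["d"], []),
]


def conditional_paths(pairs, h, order=ORDER):
    """Return the conditional <(first), (remaining)> paths for header item h."""
    # Assumption: only event types above h in the header order stay valid.
    valid = set(order[: order.index(h)])
    result = []
    for first, rest in pairs:
        if h not in first and h not in rest:
            continue
        if h in first:
            # h is in the firstdays part: move the remaining part into it.
            first, rest = first + rest, []
        first = [e for e in first if e in valid]  # also removes h itself
        rest = [e for e in rest if e in valid]
        if first:  # assumption: paths with an empty first part are dropped
            result.append((first, rest))
    return result


def frequencies(paths):
    """Return (DB frequency, first-part frequency) over the given paths."""
    db = Counter(e for first, rest in paths for e in first + rest)
    first_part = Counter(e for first, _ in paths for e in first)
    return db, first_part


paths_d = conditional_paths(PAIRS, "d")
paths_cd = conditional_paths(paths_d, "c")  # recursion with base set {c, d}
print(paths_d, frequencies(paths_d))
print(paths_cd, frequencies(paths_cd))
```

Calling `conditional_paths` again on its own output, as done for `paths_cd`, mirrors the recursive mining of a header inside a conditional tree.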

Generate frequent episodes

When a conditional event tree contains only a single path, the frequent episodes are:

Any subset of firstpart ∪ event base set

Any subset of firstpart ∪ any subset of remainingpart ∪ event base set
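The single-path rule can be sketched as a subset enumeration. The sketch assumes that the subset taken from the first part must be non-empty, so that the episode has an event on the first day of the window, and that every node on the path already meets min_sup. The path <(b),(a)> with base set {c} is a hypothetical input chosen for illustration, not a path taken from the slides.

```python
from itertools import chain, combinations


def subsets(items, min_size=0):
    """All subsets of items with at least min_size elements, as tuples."""
    return chain.from_iterable(
        combinations(items, k) for k in range(min_size, len(items) + 1)
    )


def episodes_from_single_path(first_part, remaining_part, base):
    """Enumerate the episodes generated from a single conditional path."""
    episodes = set()
    # Assumption: at least one event type must come from the first part.
    for f in subsets(first_part, min_size=1):
        for r in subsets(remaining_part):
            episodes.add(frozenset(f) | frozenset(r) | frozenset(base))
    return episodes


# Hypothetical single path <(b),(a)> with event base set {c}
result = episodes_from_single_path(["b"], ["a"], {"c"})
for ep in sorted(result, key=sorted):
    print("".join(sorted(ep)))
```

For this input the function returns the two episodes {b, c} and {a, b, c}.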

Mining Header d (min_sup = 3)

Conditional paths: <(a:1,c:1),(b:1)> <(b:1),()> <(b:1,a:1,c:1),()> <(b:1),(a:1,c:1)> <(a:1,c:1),()>

Event base set: {d}

DB frequency: {<b:4,a:4,c:4>}

First_part frequency: {<b:3,a:3,c:3>}

Frequent episodes: {bd, ad, cd}

Recursively Mining Header c (in the conditional tree of d)

Conditional paths: <(a:1,b:1),()> <(b:1,a:1),()> <(b:1),(a:1)> <(a:1),()>

Event base set: {cd}

DB frequency: {<b:3,a:4>}

First_part frequency: {<b:3,a:3>}

Frequent episodes: {bcd, acd}

Recursively Mining Header a (in the conditional tree of cd)

Conditional paths: <(b:1),()> <(b:1),()> <(b:1),()>

Event base set: {acd}

DB frequency: {<b:3>}

First_part frequency: {<b:3>}

Frequent episode: {bacd}

Mining Header c (min_sup = 3)

Conditional paths: <(b:1),(a:1)> <(a:1,b:1),()> <(b:1),(a:1)> <(a:1),()>

Event base set: {c}

DB frequency: {<b:3,a:4>}

First_part frequency: {<b:3,a:2>}

Frequent episode: {bc}

Recursively Mining Header a (in the conditional tree of c, min_sup = 3)

Conditional paths: <(b:1),()> <(b:1),()> <(b:1),()>

Event base set: {ac}

DB frequency: {<b:3>}

First_part frequency: {<b:3>}

Frequent episode: {bac}

Mining Header a (min_sup = 3)

Conditional paths: <(b:1),()> <(b:1),()> <(b:1),()>

Event base set: {a}

DB frequency: {<b:3>}

First_part frequency: {<b:3>}

Frequent episode: {ba}
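As a cross-check on the mining walkthrough, episode frequencies for the seven-day example can also be counted by brute force. The sketch below assumes the reading that a window counts for an episode when it contains every event type of the episode and at least one of them falls on the window's first day. It ignores the requirement that an episode contain a stock event type, because the slides do not say which of a, b, c, d are stock events. The function names are hypothetical.

```python
from itertools import combinations

# Example event database from the slides: day -> event types on that day.
DAYS = {1: "b", 2: "ac", 3: "b", 4: "d", 5: "b", 6: "ca", 7: "d"}
WINDOW, MIN_SUP = 3, 3


def episode_frequency(episode, days=DAYS, window=WINDOW):
    """Count windows that contain the whole episode and have at least one
    of its event types on the first day of the window."""
    episode = set(episode)
    last = max(days)
    freq = 0
    for start in sorted(days):
        first = set(days[start])
        seen = set()
        for d in range(start, min(start + window, last + 1)):
            seen |= set(days.get(d, ""))
        if episode <= seen and episode & first:
            freq += 1
    return freq


def frequent_episodes(event_types="abcd", min_sup=MIN_SUP):
    """Brute force over all episodes with at least two event types."""
    found = {}
    for k in range(2, len(event_types) + 1):
        for combo in combinations(event_types, k):
            f = episode_frequency(combo)
            if f >= min_sup:
                found["".join(combo)] = f
    return found


found = frequent_episodes()
print(found)
```

Under this reading {a, c} reaches only 2 windows, which agrees with the first-part frequency a:2 shown when mining header c. Because the brute force enumerates every candidate, it can also report frequent episodes that the slides do not show.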

Experiment (synthetic data)

Dataset 2: T20, I5, M1000, D3K

Experiment (real data)

News events from the Internet: 121 event types, 757 days

Stock data: Dow Jones, Nasdaq, Hang Seng, 12 top local companies

Experiment (real data)

Episode                                                 Support
Nasdaq downs, PCCW downs                                151
Nasdaq ups, SHK properties flats, HSBC flats            178
China Mobile downs, Nasdaq downs, HK Electric flats     178