Stock Price Prediction Using Reinforcement Learning, 이 재 원

Post on 03-Jan-2016


1

Stock Price Prediction Using Reinforcement Learning

이 재 원

2

Introduction

Analytical methods

– Technical analysis

– Fundamental analysis

– EMH (efficient market hypothesis)

– Traditional time series forecasting

– Chaos theory

Computer techniques

– Neural network

– Fuzzy logic / Expert system

3

“Economic history is a never-ending series of episodes based on falsehoods and lies, not truths. It represents the path to big money. The object is to recognize the trend whose premise is false, ride that trend, and step off before it is discredited.”

- George Soros

The proposed method

– Adopt reinforcement learning

– Suitable for representing delayed rewards as well as immediate rewards

4

Reinforcement Learning

Agent-environment interaction

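[Figure: agent-environment interaction loop. Given state s_t and reward r_t, the agent takes action a_t; the environment returns reward r_{t+1} and next state s_{t+1}.]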

5

Value function

$$V(s_t) = r_{t+1} + \gamma r_{t+2} + \gamma^2 r_{t+3} + \cdots$$

Generalized policy iteration

– evaluation: $V \to V^{\pi}$

– improvement: $\pi \to \mathrm{greedy}(V)$
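As a bridge to the TD update, here is a short worked step for the value function, written for a single trajectory with discount factor $\gamma$ (a sketch added for clarity, not taken from the slides). Factoring $\gamma$ out of every term after the first reward leaves the value of the next state:

$$V(s_t) = r_{t+1} + \gamma r_{t+2} + \gamma^2 r_{t+3} + \cdots = r_{t+1} + \gamma \left( r_{t+2} + \gamma r_{t+3} + \cdots \right) = r_{t+1} + \gamma V(s_{t+1})$$

This recursion is what lets an estimate of $V(s_t)$ be updated from the estimate of $V(s_{t+1})$ instead of waiting for the full sum.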

6

TD Algorithms

Learn from raw experience without a model

Bootstrap

– update in part on an existing estimate

– suitable for continuous tasks

TD(0)

– the simplest TD algorithm

$$V(s_t) \leftarrow V(s_t) + \alpha \left[ r_{t+1} + \gamma V(s_{t+1}) - V(s_t) \right]$$
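The TD(0) update can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: the table `V`, the helper names `td0_update` and `run_random_walk`, the step size `alpha`, the discount `gamma`, and the toy random-walk chain are all assumptions made for the example.

```python
import random


def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """Apply V(s_t) <- V(s_t) + alpha * [r_{t+1} + gamma * V(s_{t+1}) - V(s_t)] in place."""
    td_error = r + gamma * V[s_next] - V[s]
    V[s] += alpha * td_error
    return td_error


def run_random_walk(episodes=2000, n_states=5, seed=0):
    """Learn state values on a hypothetical random-walk chain (toy environment).

    States 0 and n_states + 1 are terminal; reaching the right end pays reward 1.
    """
    rng = random.Random(seed)
    V = [0.0] * (n_states + 2)
    for _ in range(episodes):
        s = (n_states + 1) // 2  # start in the middle of the chain
        while 0 < s < n_states + 1:
            s_next = s + rng.choice((-1, 1))
            r = 1.0 if s_next == n_states + 1 else 0.0
            td0_update(V, s, r, s_next)  # learn from this single raw transition
            s = s_next
    return V


if __name__ == "__main__":
    print([round(v, 2) for v in run_random_walk()])
```

Each update uses only one observed transition and the current estimate of the next state's value, which is the bootstrapping the slide refers to; no model of the environment is needed.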

7

Stock Price Changes in TD View

State vector

– Raw daily data (open price, close price, ...)

– Technical indicators

• Disparities

• Moving averages

• Stochastic oscillator

• etc.

$$s = (s(1), s(2), \ldots, s(n))$$
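One way such a state vector could be assembled from daily price data is sketched below. The slides do not give indicator formulas or window lengths, so everything here is an assumption: the function names, the 5- and 20-day windows, disparity taken as 100 × close / moving average, and the stochastic oscillator taken as the common %K definition.

```python
def moving_average(values, window):
    """Mean of the most recent `window` values."""
    recent = values[-window:]
    return sum(recent) / len(recent)


def state_vector(opens, highs, lows, closes, ma_windows=(5, 20), k_window=5):
    """Build s = (s(1), ..., s(n)) for the latest day from raw data and indicators.

    All windows and indicator definitions are illustrative assumptions.
    """
    s = [opens[-1], closes[-1]]  # raw daily data
    for w in ma_windows:
        ma = moving_average(closes, w)
        s.append(ma)                        # moving average
        s.append(100.0 * closes[-1] / ma)   # disparity: close relative to its average
    highest = max(highs[-k_window:])
    lowest = min(lows[-k_window:])
    span = highest - lowest
    # Stochastic %K: position of the close within the recent high-low range.
    s.append(100.0 * (closes[-1] - lowest) / span if span > 0 else 50.0)
    return s


if __name__ == "__main__":
    n = 20
    print(state_vector([10.0] * n, [11.0] * n, [9.0] * n, [10.0] * n))
```

In practice the components would usually be rescaled to comparable ranges before being fed to a function approximator.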

8

9

Reward

– Relative rate of change in close price

$$r_t = 100 \times \frac{y_c(t) - y_c(t-1)}{y_c(t-1)}$$

– The values of states can be calculated from the rewards using a discounting factor $\gamma$ ($0 < \gamma < 1$)

10

– e.g., the value of stock A at time step 0 is greater than that of stock B

[Figure: closing price(t) of stocks A and B for t = 0 to 4; price axis from 800 to 1400.]
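The reward and the discounted value at time step 0 can be computed as in the sketch below. The two price series are hypothetical numbers chosen for illustration, not values read from the figure, and the function names and `gamma = 0.9` are likewise assumptions.

```python
def rewards(closes):
    """r_t = 100 * (y_c(t) - y_c(t-1)) / y_c(t-1) for t = 1, 2, ...

    Element 0 of the returned list is r_1.
    """
    return [100.0 * (closes[t] - closes[t - 1]) / closes[t - 1]
            for t in range(1, len(closes))]


def discounted_value(rs, gamma=0.9):
    """V(s_0) = r_1 + gamma * r_2 + gamma^2 * r_3 + ..., accumulated backward."""
    value = 0.0
    for r in reversed(rs):
        value = r + gamma * value
    return value


if __name__ == "__main__":
    # Hypothetical closing prices for t = 0..4 (not taken from the slide's figure).
    stock_a = [1000.0, 1100.0, 1200.0, 1300.0, 1400.0]
    stock_b = [1000.0, 900.0, 800.0, 900.0, 1000.0]
    print(discounted_value(rewards(stock_a)), discounted_value(rewards(stock_b)))
```

With these made-up series, stock A rises steadily while stock B falls and then recovers, so A's discounted sum of future rewards at time step 0 is the larger of the two.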

11

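Function Approximation by Neural Network

Parameter vector

– Vector of connection weights of the net

$$\theta_t = (\theta_t(1), \theta_t(2), \ldots, \theta_t(n))^T$$

Gradient descent

$$\Delta\theta_t = \alpha \left[ r_{t+1} + \gamma V_t(s_{t+1}) - V_t(s_t) \right] \nabla_{\theta_t} V_t(s_t)$$

$$\theta_{t+1} = \theta_t + \Delta\theta_t$$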
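The gradient-descent TD update for a neural value function can be sketched as follows. The slides do not describe the network architecture, so the single tanh hidden layer, its size, the initialization range, the learning rate, and the class name `ValueNet` are all assumptions for illustration. The gradient is taken only through V(s_t), as in the update rule on this slide.

```python
import math
import random


class ValueNet:
    """One-hidden-layer network V(s; theta); architecture is an illustrative assumption."""

    def __init__(self, n_inputs, n_hidden=4, seed=0):
        rng = random.Random(seed)
        self.W1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, s):
        """Hidden-layer activations tanh(W1 s + b1)."""
        return [math.tanh(sum(w * x for w, x in zip(row, s)) + b)
                for row, b in zip(self.W1, self.b1)]

    def value(self, s):
        h = self._hidden(s)
        return sum(w * hj for w, hj in zip(self.w2, h)) + self.b2

    def td_update(self, s, r, s_next, alpha=0.1, gamma=0.9):
        """theta <- theta + alpha * [r + gamma * V(s') - V(s)] * grad_theta V(s)."""
        h = self._hidden(s)
        v = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2
        delta = r + gamma * self.value(s_next) - v  # TD error
        step = alpha * delta
        for j, hj in enumerate(h):
            # dV/d(pre-activation of unit j), computed before w2[j] is changed.
            dpre = self.w2[j] * (1.0 - hj * hj)
            for i, x in enumerate(s):
                self.W1[j][i] += step * dpre * x  # dV/dW1[j][i] = dpre * s(i)
            self.b1[j] += step * dpre             # dV/db1[j] = dpre
            self.w2[j] += step * hj               # dV/dw2[j] = h_j
        self.b2 += step                           # dV/db2 = 1
        return delta


if __name__ == "__main__":
    net = ValueNet(n_inputs=2)
    s, s_next = [1.0, 0.0], [0.0, 1.0]
    for _ in range(300):
        net.td_update(s, 1.0, s_next, gamma=0.0)  # with gamma = 0 the target is just r
    print(net.value(s))
```

Setting `gamma=0.0` in the usage example makes the target equal to the immediate reward, which gives an easy sanity check that the weight updates move V(s) toward the target.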

12

Experimental result

 

13

Future work

Predictability

– Rule-based approach/other learning models

Policy optimization

– optimal profit ratio

– optimal stop loss (risk management)

– optimal holding period

Asset allocation

– Other investment opportunities

• Foreign exchange

• Futures/options

 

14