DETERMINANTS OF THE GOLD PRICE - Ghent University€¦ · UNIVERSITEIT GENT FACULTEIT ECONOMIE EN...

UNIVERSITEIT GENT

FACULTEIT ECONOMIE EN BEDRIJFSKUNDE

ACADEMIEJAAR 2011 – 2012

DETERMINANTS OF THE GOLD PRICE

Masterproef voorgedragen tot het bekomen van de graad van

Master in de Toegepaste Economische Wetenschappen: Handelsingenieur

Bernard Dierinck

onder leiding van:

Prof. dr. Michael Frömmel

Prof. dr. ir. Benjamin Schrauwen

Begeleiders: ir. Michiel Hermans ir. Francis wyffels

Permission Undersigned declares that the content of this master thesis may be consulted and

reproduced when referencing to it.

Bernard Dierinck

I

I Foreword I always have had a sincere interest in the financial markets and saw it as an opportunity to be able

to write my thesis about something which is located on the border between economics and artificial

intelligence.

I would like to take this opportunity to thank Prof. dr. Frömmel and Prof. dr. ir. Schrauwen since they

offered me the possibility to do this thesis. I would also like to thank ir. Francis wyffels and ir. Michiel

Hermans who invested their time and effort to teach me techniques, which are seldom used by

students with an economic background. I also thank Prof. dr. Gerdie Everaert who helped me with

some econometrical questions.

Besides these people, I would also like to thank my family and friends since they provide me with an

ideal environment to work and live.

II

II Table of contents I Foreword ................................................................................................................................... I

II Table of contents .................................................................................................................... II

III Abbreviations ....................................................................................................................... III

III.I General abbreviations .................................................................................................... III

III.II Mathematical abbreviations ......................................................................................... IV

IV List of tables .......................................................................................................................... V

V List of figures ........................................................................................................................ VII

VI List of graphs ..................................................................................................................... VIII

VII Summary (Dutch) ................................................................................................................ IX

1.Introduction ............................................................................................................................. 1

2.Gold and its history ................................................................................................................. 2

2.1.The history of gold............................................................................................................ 2

2.2.Reasons for central banks to invest in gold ..................................................................... 3

2.3.The evolution of the gold price ........................................................................................ 4

2.4.London gold market fixing ............................................................................................... 5

2.5.Supply and demand .......................................................................................................... 6

3.Literature Review .................................................................................................................... 9

3.1.Macroeconomic approach ............................................................................................... 9

3.2.Speculation approach ....................................................................................................... 9

3.3.Inflation hedge approach ............................................................................................... 10

3.4.The artificial intelligence approach ................................................................................ 11

3.5.Conclusion ...................................................................................................................... 12

4.Determinants of the gold price ............................................................................................. 13

4.1.Short run supply and demand ........................................................................................ 13

4.2.Factors affecting the gold price in the long run ............................................................. 17

5.Data and research questions ................................................................................................. 18

5.1.Description of the variables ........................................................................................... 20

6.Regression ............................................................................................................................. 27

6.1.Technique ....................................................................................................................... 27

6.2.Results ............................................................................................................................ 27

7.Recurrent neural networks ................................................................................................... 40

7.1.Reservoir computing ...................................................................................................... 40

7.2.Backpropagation through time ...................................................................................... 48

8.Gaussian processes for machine learning ............................................................................. 59

8.1.Technique ....................................................................................................................... 59

8.2.Results ............................................................................................................................ 60

9.Discussion .............................................................................................................................. 63

10.Conclusion ........................................................................................................................... 70

VIII References .......................................................................................................................... XI

III

III Abbreviations

III.I General abbreviations ADF Augmented Dickey Fuller

ADL Autoregressive Distributed Lag

BPTT Backpropagation Through Time

CBGA Central Bank Gold Agreement

CLI Composite Leading Indicator

CPI Consumer Price Index

CRDP Credit Risk Default Premium

DF Dickey Fuller

ECM Error Correction Model

ER Exchange Rate

GP Gaussian Processes

MAE Mean Absolute Error

MSE Mean Squared Error

NMSE Normalised Mean Square Error

OECD Organisation for Economic Co-operation and Development

OLS Ordinary Least Squares

RC Reservoir Computing

USGS United States Geological Survey

WGC World Gold Council

IV

III.II Mathematical abbreviations Gold Price (end of month)

Gold Lease Rate

Trade Weighted Exchange Rate Index

Beta

CPI US

Inflation US

Volatility US Inflation

CPI World

Inflation World

Volatility World Inflation

CRDP

CLI

V

IV List of tables

TABLE 1: GOLD PRODUCING COUNTRIES (MINING) (2010) ....................................................... 7

TABLE 2: JEWELLERY CONSUMPTION (2010) ............................................................................. 8

TABLE 3: INFLUENCE OF AN INCREASE OF THE DETERMINANTS ON SUPPLY AND PRICE ....... 17

TABLE 4: INFLUENCE OF AN INCREASE OF THE DETERMINANTS ON DEMAND AND PRICE .... 17

TABLE 5: VARIABLES ................................................................................................................. 18

TABLE 6: THEORETICAL IMPACT ON THE GOLD PRICE OF AN INCREASE IN THE VARIABLES... 19

TABLE 7: UNIT ROOT TESTING, WITH X BEING THE ORDER OF INTEGRATION ........................ 27

TABLE 8: REGRESSION RESULTS OF THE LONG RUN RELATIONSHIP (P-VALUE BETWEEN BRACKETS) ................................................................................................................................ 29

TABLE 9: ADF TEST RESULTS ON THE RESIDUALS (P-VALUE BETWEEN BRACKETS) ................. 29

TABLE 10: TABLE OF MACKINNON ........................................................................................... 29

TABLE 11: GRANGER REPRESENTATION THEOREM REGRESSION RESULTS (P-VALUE BETWEEN BRACKETS) ................................................................................................................................ 30

TABLE 12: GOLD AS A HEDGE (1971-1995) .............................................................................. 32

TABLE 13: GOLD AS A HEDGE (1971-2011) .............................................................................. 32

TABLE 14: ADL REGRESSION RESULTS WITHOUT DUMMY VARIABLES (P-VALUE BETWEEN BRACKETS) ................................................................................................................................ 34

TABLE 15: ADL REGRESSION RESULTS WITH DUMMY VARIABLES (P-VALUE BETWEEN BRACKETS) ................................................................................................................................ 34

TABLE 16: REGRESSION RESULTS ADL MODEL ......................................................................... 36

TABLE 17: TEST RESULTS FOR THE ADL MODEL (2010-2011) .................................................. 38

TABLE 18: ESTIMATION RESULTS FOR THE ADL MODEL (1990-2009) ..................................... 38

TABLE 19: VALIDATION ERROR RESERVOIR COMPUTING ........................................................ 44

TABLE 20: OPTIMAL COMBINATION OF META-PARAMETERS FOR RC .................................... 45

TABLE 21: TRAINING ERROR FOR RC ........................................................................................ 45

TABLE 22: TEST ERROR FOR RC ................................................................................................ 45

TABLE 23: TEST ERROR OF BEST MODEL FOR RC ..................................................................... 45

TABLE 24: VALIDATION ERROR FOR RC (4 FOLDS) ................................................................... 46

TABLE 25: TEST ERROR FOR RC (4 FOLDS) ................................................................................ 47

TABLE 26: ERROR MEASURES FOR BPTT-MODEL 1 .................................................................. 50

TABLE 27: STANDARD DEVIATION ON ERROR MEASURES FOR BPTT-MODEL 1 ...................... 50

TABLE 28: ERROR OF THE BEST BPTT-MODEL 1 ....................................................................... 50

TABLE 29: IMPACT OF THE DIFFERENT CATEGORIES ON THE PREDICTED GOLD PRICE FOR BPTT-MODEL 1 ......................................................................................................................... 51

TABLE 30: INDEPENDENT VARIABLES ....................................................................................... 52








VI


TABLE 39: RESULTS FOR GAUSSIAN PROCESSES ...................................................................... 60

TABLE 40: DIFFERENT ERROR MEASURES FOR THE DIFFERENT MODELS (BEST CELL PER LINE IN GREY) .................................................................................................................................... 63

TABLE 41: COMPUTATIONAL POWER RANKING ...................................................................... 65

TABLE 42: OVERVIEW OF THE IMPORTANCE OF THE DIFFERENT VARIABLES (IMPORTANT VARIABLE: CELL IS GREY) .......................................................................................................... 67

VII

V List of figures

FIGURE 1: GOLD DEMAND 2011................................................................................................. 6 FIGURE 2: GOLD IN CIRCULATION (TOTAL=165600 TONNES) .................................................... 8 FIGURE 3: SHORT RUN SUPPLY (THE EFFECT OF AN INCREASE IN THE STATED VARIABLES HAS EITHER A POSITIVE OR A NEGATIVE IMPACT ON THE SUPPLY) ................................................ 14 FIGURE 4: SHORT RUN DEMAND (THE EFFECT OF AN INCREASE IN THE STATED VARIABLES HAS EITHER A POSITIVE OR A NEGATIVE IMPACT ON THE DEMAND) ..................................... 16 FIGURE 5: EXAMPLE OF A RESERVOIR WITH IN- AND OUTPUT ............................................... 41 FIGURE 6: EXAMPLE OF A NETWORK WITH IN- AND OUTPUT (BPTT) ..................................... 48

VIII

VI List of graphs

GRAPH 1: GOLD PRICE (1971-2011) ........................................................................................... 4

GRAPH 2: GOLD PRICE (1990-2011) ........................................................................................... 5

GRAPH 3: EVOLUTION OF THE GOLD DEMAND ......................................................................... 6

GRAPH 4: EVOLUTION OF THE GOLD SUPPLY ............................................................................ 7

GRAPH 5: PRICE OF GOLD AND CONSUMER PRICE INDEX (CPI) US ......................................... 20

GRAPH 6: PRICE OF GOLD AND CPI WORLD ............................................................................. 21

GRAPH 7: COMPARE CPI'S (11/1990 = 100) ............................................................................. 21

GRAPH 8: PRICE OF GOLD AND INFLATION US ........................................................................ 22

GRAPH 9: PRICE OF GOLD AND INFLATION WORLD ................................................................ 22

GRAPH 10: PRICE OF GOLD AND VOLATILITY OF INFLATION US .............................................. 23

GRAPH 11: PRICE OF GOLD AND VOLATILITY OF INFLATION WORLD ..................................... 23

GRAPH 12: PRICE OF GOLD AND CLI ........................................................................................ 24

GRAPH 13: PRICE OF GOLD AND US-WORLD EXCHANGE RATE INDEX .................................... 24

GRAPH 14: PRICE OF GOLD AND GOLD LEASE RATE ................................................................ 25

GRAPH 15: PRICE OF GOLD AND GOLD’S BETA ........................................................................ 25

GRAPH 16: PRICE OF GOLD AND CRDP .................................................................................... 26

GRAPH 17: PREDICTED VS ACTUAL GOLD PRICE (ADL-MODEL) ............................................... 38

GRAPH 18: PREDICTED VS ACTUAL GOLD PRICE FOR 200 NEURONS (RC) .............................. 46

GRAPH 19: PREDICTED VS ACTUAL GOLD PRICE FOR 100 NEURONS (BPTT-MODEL 1) .......... 51

GRAPH 20: IMPORTANCE OF DIFFERENT INDEPENDENT VARIABLES VIA THE DIRECT WAY U (BPTT-MODEL 1) ....................................................................................................................... 52

GRAPH 21: IMPORTANCE OF DIFFERENT INDEPENDENT VARIABLES VIA THE INDIRECT WAY V (BPTT-MODEL 1) ....................................................................................................................... 52







GRAPH 28: GAUSSIAN PROCESSES (DOTTED LINE: INDIVIDUAL GAUSSIANS; FULL LINE: LINEAR COMBINATION OF GAUSSIANS) ............................................................................................... 59

GRAPH 29: PREDICTED VS ACTUAL GOLD PRICE (GP) .............................................................. 61

GRAPH 30: IMPORTANCE OF DIFFERENT INDEPENDENT VARIABLES (GP) .............................. 62

GRAPH 31: PREDICTED VS ACTUAL GOLD PRICE (BEST PREDICTION) ...................................... 65

IX

VII Summary (Dutch) De mens heeft altijd al een zekere fascinatie voor goud gehad. De Grieken, de Romeinen, de

Egyptenaren, allen hebben ze goud gebruikt als handelsmunt, als teken van rijkdom of als

vluchthaven in tijden van crisis. Door de recente economische problemen, is goud terug prominent

aanwezig in de wereld van de beleggers. Dit leidde de laatste jaren tot enorme prijsstijgingen,

waarbij de goudprijs in nominale waarde nieuwe hoogtes verkent. Als we echter zouden kijken naar

de goudprijs in constante 2011 waarde, dan zien we dat de maandelijkse goudprijs zijn hoogste

niveau haalde in januari 1980 met een goudprijs van $1958.85.

De goudprijs modeleren is nooit eenvoudig geweest. In tegenstelling tot andere activa, zoals

aandelen of obligaties, biedt goud geen dividenden of interest coupons. Daardoor kan de goudprijs

niet berekend worden door toekomstige kasstromen te gaan verdisconteren. Een andere manier om

naar de goudprijs te kijken, is de drijvers van vraag en aanbod te gaan analyseren. Eenmaal men

weet wat de goudprijs drijft, kan men daaruit de impact van een verandering in een drijver op de

goudprijs gaan bekijken. Deze thesis zal dan ook de verschillende modellerings- en

voorspellingstechnieken voor de goudprijs onderzoeken, zoals lineair regressie, reservoir computing,

backpropagation through time en Gaussian processes. Daarnaast zal ook ingegaan worden op de

determinanten die de goudprijs drijven. Deze thesis wordt geschreven vanuit de ideeën van de

inflatoire stroming waarbij goud bescherming moet bieden tegen inflatie.

De beste techniek is backpropagation through time. Binnen deze modellen presteert het model

waarbij de metaparameters worden geoptimaliseerd en de recurrente gewichten worden getraind,

het beste. Reservoir computing is ook zeer belovend, maar enkel wanneer het correct uitgevoerd

wordt. Dit wil zeggen, datapunten na de validatie set weglaten. Lineaire regressie is een van de

betere technieken omdat het goede voorspellingen geeft en vrij eenvoudig is om te interpreteren.

Twee belangrijke opmerkingen kunnen wel over lineaire regressie gemaakt worden. Ten eerste, voor

de periode november 1990 tot december 2011, waren we niet in staat om een cointegratie relatie te

vinden tussen de goudprijs en het algemeen prijsniveau in de VS. Dit wil hoegenaamd niet zeggen dat

goud daarom geen goede bescherming tegen inflatie kan zijn. Aangezien er geen cointegratie

aanwezig was, hebben we dan ook een ADL-model gebruikt voor de goudprijs. De tweede opmerking

heeft te maken met het feit dat wanneer het lineair karakter weg is, lineaire modellen zwakker

presteren dan niet-lineaire modellen.

Betreffende het gebruiken van technieken voor tijdreeks analyse, kunne twee suggesties gegeven

worden. Indien het qua tijd en financieel budget mogelijk is, gebruikt men best zoveel mogelijk

X

technieken. Wanneer de verschillende technieken elkaar bevestigen, dan zal de waarschijnlijkheid

dat de goudprijs zo zal bewegen, hoger zijn in vergelijk met de situatie waar ze elkaar tegenspreken.

De tweede suggestie heeft te maken met de keuze van techniek. Als men door omstandigheden niet

meerdere technieken kan toepassen, dan moet men ervoor zorgen dat men de juiste kiest. Het is

namelijk zo dat elke techniek voor- en nadelen heeft, maar het komt er op neer de juiste techniek te

vinden die het beste aansluit bij bepaalde specifieke karakteristieken van een probleem.

Om een antwoord te vinden op de tweede onderzoeksvraag, werd het belang van de verschillende

variabelen in de verschillende modellen nagegaan. Hieruit bleek dat acht van de twaalf variabelen

belangrijk zijn bij de goudprijsmodellering. Van deze acht, hebben vier een meer directe, lineaire

impact: de goudprijs in de vorige periode, de inflatie in de VS, de wisselkoers en het prijsniveau in de

wereld. De andere vier variabelen hebben een meer indirecte, niet-lineaire impact: de bèta van goud,

de leasing interest van goud, de volatiliteit van de wereld inflatie en het prijsniveau in de VS. Dit alles

kan vrij eenvoudig samengevat worden: de goudprijs in de huidige periode is gelijk aan de goudprijs

in de vorige periode plus wat extra’s.

Gebaseerd op de resultaten van deze studie, zal de goudprijs minstens nog drie jaar toenemen,

waarna een daling kan plaatsvinden. Zes van de acht variabelen die een impact hebben op de

goudprijs, wijzen allen op een stijging van de goudprijs. Voor de andere twee variabelen is het

moeilijker om een uitspraak te doen. Een potentiële maximum goudprijs voor de komende twee à

drie jaar is $2465.57 per ounce, aangezien dit de goudprijs is in constante 2011 termen die bereikt

werd op 21 januari 1980.

1

1.Introduction In mankind’s history different nations, like the Romans, the Greeks, the Egyptians and others wanted

to posses as much gold as possible. They used gold as a way to pay for resources, as a token of

wealth, as a safe-haven in times of war or political turmoil... Each nation had its reasons to own as

much gold as possible. The last decade the interest in gold has significantly increased, gold is popular

anno 2012. The demand for gold is high while the supply is decreasing year by year which leads to

quotations of more than $1500 per ounce.

Before investing in gold, one needs to get an idea of the fair value of gold. Modelling the value of this

asset is considered to be difficult. When one invests in assets that offer recurrent yields like bonds

and shares, it is less difficult to get an idea of the value. The many determinants of the gold price, the

absence of recurrent cashflows, the psychological aspect of gold... all contribute to the fact that gold

is hard to value. The price of gold is now modelled via linear regression techniques like cointegration

that use as input variables the US inflation, the beta of gold... However these techniques are not

perfect. In case of a breach in the model during training, for example going from a linear to a non-

linear model, machine learning techniques can add value to the modelling task. The purpose of this

thesis is not only to get a clear view on the most important determinants of the gold price but also to

know which modelling and forecasting technique is the most promising one.

This thesis will begin with a short explanation about gold and its history. After this explanation an

overview of the literature is presented, where there is a short elaboration on the different views of

the drivers of the gold price. In the chapter following this overview, the different determinants of the

gold price are presented. The influence of every determinant is first explained in a more qualitative

way after which there will be an empirical argumentation in the following chapter. In that chapter,

we will also present the data that is used and its sources. The thesis will then continue with three

more quantitative chapters. The first of these three chapters will try to model the gold price with

classic linear regression techniques, while the second one will use artificial intelligence, more specific

reservoir computing and backpropagation through time to model the gold price. The third of these

three chapters will make use of Gaussian processes to get an idea of the gold price model. After

these three chapters, there will be a chapter that summarizes and compares the results of the three

preceding chapters. The final chapter of this thesis consists of a thorough conclusion.

2

2.Gold and its history

2.1.The history of gold Already early on in history, mankind used gold to trade, as a token of wealth… The history of gold

started around 5000 years ago in Egypt and Nubia (which is an area in the northern part of Sudan). In

1500 BC, merchants in the Middle-East recognized gold as the standard medium of exchange. In 1091

BC golden leaflets were legalized as money in China. The Romans started to use gold in a frequent

manner in the first century BC. They used the Aureus, which was a golden coin with a weight of

around 8 grams. During the period where Egypt and Rome were flourishing, annual gold production

was around one tonne annually.

In 1377 AD England developed a monetary system which was based on gold and silver. During these

Dark- and Middle-Ages gold production dropped to an amount of less than one tonne annually. From

the 15th century on, Africa started to excavate gold which increased annual gold production to

around five to eight tonnes annually. Annual gold production arrived at a new height with the

discovery of North- and South-America. In 1851, thanks to the huge gold output of California, annual

gold production was around 77 tonnes. With the discovery of gold in Australia, gold production even

soared to 280 tonnes in 1852.

In 1844 the Bank of England was the first central bank to adopt the gold standard. The gold standard

is a monetary system where the issued amount of paper is tightly or loosely tied to the central bank

gold reserve. In this system countries all have fixed exchange rates with each other. To maintain

these fixed exchange rates, central banks were obliged to adjust their interest rates. By the year

1890, most of the world had adopted the gold standard. The increasing popularity of the gold

standard can be explained by its network effects. First of all, countries that refused in the beginning

to adapt to the gold standard were facing difficulties in attracting money. Secondly, companies that

were doing business in a country with the gold standard, were able to do better investment planning

because of lower inflation volatility. A major drawback of the system was the potential

destabilization of countries with a trade deficit. If there was a trade deficit between two countries,

central banks were obliged to solve it by exchanging the deficit in gold. When inhabitants of a

country with a trade deficit, knew that there was a trade deficit, they often did a bank run to

exchange their paper money in gold before it was transferred to the other country. This could

destabilize countries that already suffered from lower economic activity (Eichengreen & Flandreau,

1997).

3

In 1914, with the outbreak of World War I, the United Kingdom decided to leave the gold standard.

By doing so, they were able to finance the war by printing more paper money. After the first World

War the US and UK did some serious attempts to get back to the gold standard. However none of the

attempts was successful.

By the end of the second World War, a new monetary system was established between 44 countries:

the Bretton Woods System. In the Bretton Woods System, all currencies are tied to the dollar and the

dollar is tied to gold. The first established parity was 35 dollars per ounce. The Bretton Woods

System came to an end by 1971 because the US kept on printing dollars to finance its war in Vietnam.

This caused other countries to lose their trust in the maintainability of the parity.

Since the seventies, the price of gold is free to move, it increases or decreases as a result of changes

in supply or demand. Nevertheless central banks intervened once more on 26/9/1999. With the

Central Bank Agreement On Gold (CBGA) they stressed the importance of gold as a reserve asset.

This agreement limited the sale of gold by central banks by 400 tonnes a year and 2000 tonnes in the

next five years. It also limited the amount of gold leased to the level of previous years. This

agreement was renewed after five years: on 8/3/2004 the central banks decided to renew it, but

changed the amounts to 500 tonnes annually and 2500 tonnes in the next five years.

2.2.Reasons for central banks to invest in gold According to the World Gold Council there are multiple different reasons why central banks invest in

gold:

Gold acts as a hedge against inflation.

Central banks get income from leasing gold.

Gold offers worldwide confidence.

Gold offers a diversification benefit.

Changes in the world monetary system do not affect gold.

Gold provides physical safety in case other assets are blocked on accounts.

The price of gold is unaffected by bad government decisions, unlike currencies that are prone

to bad decisions made by governments.

4

2.3.The evolution of the gold price The nominal gold price (GRAPH 1) moved from $37.87 in January 1971 to $1652.31 in December

2011. There was a serious increase in the gold price around 1980, that was caused by the oil crisis,

floating exchange rates and the silver cornering. The silver cornering was an attempt by the Hunt

Brothers to manipulate the value of silver. There is the possibility to corner the market if you control

the price of an asset, like for example silver. The day when the Brothers were unable to meet the

margin calls on their futures, was named Silver Thursday. Due to the serious decline in the silver

price, other assets that previously also benefitted from the surge, now collapsed. The last decade the

price of gold reached new heights (in nominal terms). This was caused by financial turmoil, increasing

demand coming from developing countries, the CBGA …

GRAPH 1: GOLD PRICE (1971-2011)

Gold is often perceived as being a hedge against inflation. The inflation hedge gold price (GRAPH 1)

shows us that if an investor bought gold when the Bretton Woods system stopped (1971), the

investor would have been able to maintain his wealth. This is because the nominal gold price is

always higher than the inflation hedge gold price. The gold price in constant 2011 terms shows that

an investor investing in gold at any time period and holding his position, would never have lost

wealth except for the period around 1980. The gold price in January 1980 in constant 2011 terms is

$1958.85, meaning that even the gold price peak of the last year does not yet reach the peak from

1980.

$-

$500,00

$1.000,00

$1.500,00

$2.000,00

$2.500,00

jan

/71

mrt

/74

mei

/77

jul/

80

sep

/83

no

v/8

6

jan

/90

mrt

/93

mei

/96

jul/

99

sep

/02

no

v/0

5

jan

/09

Nominal gold price

Inflation hedge gold price 1971

Gold price in constant 2011 terms

5

If an investor bought gold in December 1990 (GRAPH 2), he had to wait to sell the asset until April

2006. Selling before April 2006 would mean that the investor lost wealth because the price of gold

was insufficient to meet inflation, the nominal gold price is lower than the inflation hedge gold price.

Buying gold at any period and holding the position until the end of 2011 offered an increase in value

that was more than sufficient to meet inflation demands. The gold price in constant 2011 terms

(GRAPH 2) is almost never higher than the nominal gold price at the end of 2011.

GRAPH 2: GOLD PRICE (1990-2011)

2.4.London gold market fixing Since 12 September 1919, London is the centre in the world for fixing the gold price. On the 12th of

September 1919 the five member banks, N.M. Rotschild & Sons, Mocatta & Goldsmid, Samuel

Montagu & Co, Pixley & Abell and Sharps & Wilkings, fixed for the first time the gold price at £4.94

per ounce. Until 1968, the fix was in Sterling’s but from then on, it was quoted in dollars. Around that

same period, an afternoon fix was also introduced to better serve the American market.

Anno 2012 the London fix is established two times per day by five members, once at 10.30 and once

at 15.00. The five current members are Scotiabank, HSBC, Barclays Capital, Deutsche Bank and

Société Générale Corporate & Investment Banking.

At the start of every fixing the chairman announces a price. After this, the five members relay this

price back to their customers. Then the members present themselves as sellers or buyers. If there are

sellers and buyers, the banks are asked how many bars they would like to trade at the quoted price.

If there are no sellers or buyers, or the amount of bars at which there can be a trade is not within 50

bars, a new price is quoted. This procedure will be repeated as long as there is no balance. The fix,

the price where there is a balance, will serve as benchmark for valuing gold derivatives, like futures.

6

2.5.Supply and demand This section will explain in detail the supply and demand for gold (World Gold Council). It will also

give an overview of the top gold consuming and producing countries. It finishes with an overview of

all the gold that is ever excavated.

GRAPH 3: EVOLUTION OF THE GOLD DEMAND

In 2011, the total demand for gold was 3993.3 tonnes, which is almost 4% less than a year before

when demand was 4151.4 tonnes. Despite this, the price of gold increased from $1224.5 in 2010 to

$1571.5 in 2011 (annual averages). This increase of the gold price can partly be explained by the

decrease in the total supply of gold. The rising price of gold is the main driver for the lower gold

demand for jewellery (GRAPH 3). Since 2003 demand for gold for investment reasons kept on rising

notwithstanding the higher prices and reached a new height of 1640.7 tonnes in 2011. Gold appears

to act as a kind of safe-haven for investors in times of economical and financial turmoil. The gold

demand for industry reasons remained stable over the last decade, around 500 tonnes annually. The

category “other” is a correction factor to make sure that supply equals demand. In FIGURE 1 it is

clear that the gold demand for jewellery is still higher than the demand for investment reasons, but

as can be seen in GRAPH 3 the gap is closing.

FIGURE 1: GOLD DEMAND 2011

-500

0

500

1000

1500

2000

2500

3000 2

00

2

20

03

20

04

20

05

20

06

20

07

20

08

20

09

20

10

20

11

ton

ne

Industry

Investment

Jewellery

Other

Industry: 464 tonnes

Investment: 1641 tonnes

Jewellery: 1963 tonnes

7

Gold that is being offered on the market, comes from three sources: mining, recycling and official

sector sales. Mining was able to supply 2821.7 tonnes of gold in 2011 (GRAPH 4). The increase in gold

coming from mines only started three years ago, which is two to three years after the start of the

boom in the gold price. The reason for this slow reaction is the required time to open a new mine for

gold digging.

GRAPH 4: EVOLUTION OF THE GOLD SUPPLY

In GRAPH 4, the gold coming from recycling shows a minor drop of around 2.5% in 2011. There is

however a clear difference per continent. In India and China, gold recycling dropped while in the EU

and US, where gold recycling is only a recent phenomena, it increased (WGC). Since the financial

crisis, governments and central banks (official sector) stopped liquidating their gold reserves and

even started to increase their gold reserves. Central banks try to diversify their assets as much as

possible and try to reduce their assets in Euros or Dollars, because they fear a fierce depreciation in

the future. Where central banks by the end of the nineties still needed a CBGA to limit their sales of

gold, they now are net buyers of gold.

Rank Country Gold production

2010 (tonnes)

1 China 345

2 Australia 255

3 United States 230

4 Russia 190

5 South Africa 190

6 Peru 170

7 Indonesia 120

8 Ghana 100

9 Canada 90

10 Uzbekistan 90 TABLE 1: GOLD PRODUCING COUNTRIES (MINING) (2010)

-500

0

500

1000

1500

2000

2500

3000

20

02

20

03

20

04

20

05

20

06

20

07

20

08

20

09

20

10

20

11

ton

ne Mining

Recycling

Official sector

8

The top ten of gold producing countries (according to the United States Geological Survey, USGS) is

presented in TABLE 1 (for 2010). An interesting number one is China because often the conception is

that South Africa is the number one in gold mining.

Rank Country Jewellery consumption

2010 (tonnes)

1 India 657.4

2 China 451.8

3 USA 128.6

4 Turkey 70.6

5 Saudi Arabia 67.6

6 Russia 66

7 UAE 61.6

8 Egypt 53.4

9 Italy 34.9

10 Indonesia 32.8 TABLE 2: JEWELLERY CONSUMPTION (2010)

An interesting fact that arises when comparing the top gold producing countries with jewellery

consumption is the high demand coming from India (WGC). India uses for jewellery 657.4 tonnes of

gold per year but has almost no gold production at all. Importing all this gold will affect the trade

account of India.

Mankind in its entire history, has excavated 165600 tonnes of gold, of which more than half is used in

jewels (FIGURE 2). Another 29600 tonnes is kept for investment reasons by investors, while central

banks own another 28900 tonnes. According to USGS there is still 50000 tonnes of gold beneath the

earth crust that can be excavated in an economical viable way. This amount changes with changing

gold prices and changing production costs and techniques. If the price of gold stays the same and

mankind keeps on excavating at a paste of 2500 tonnes per year, gold will be depleted in less than 20

years.

FIGURE 2: GOLD IN CIRCULATION (TOTAL=165600 TONNES)

Industry: 19800 tonnes

Investment: 29600 tonnes

Juwellery: 83700 tonnes

Official sector: 28900 tonnes

Other: 3600 tonnes

9

3.Literature Review In the literature, different studies have been conducted that try to explain and model the gold price

and its determinants. These studies can be categorized into four main groups. The first approach

tries to model the variation in the gold price in terms of variation in the main macroeconomic

variables, such as exchange rates, world income, political shocks… The second approach will focus

more on gold as a way to speculate. This approach will study the rationality or irrationality of gold

price movements. The third approach considers investing in gold more as a way to hedge a portfolio

against inflation. They often distinguish short term and long term movements. The last and most

recent group of studies are these studies that use artificial intelligence to model the gold price.

3.1.Macroeconomic approach Dooley, Isard and Taylor (1995) conducted a variety of empirical tests to prove that the gold price has

explanatory power in predicting exchange rate movements in addition to the movements in

monetary fundamentals and other variables that enter standard exchange rate models. The core idea

of their research is that they consider gold to be an asset without a country. Any shock that reduces

the attractiveness of net claims on A will increase the demand for other assets like gold or net claims

on B. This will lead to changes in the equilibrium prices.

Sjaastad and Scacciallani (1996) investigated the relationship between the gold price and the foreign

exchange market for the period 1982-1990. The main result of their study is the significant influence

of an appreciation or depreciation of a European currency on the gold price. They also found that the

US dollar only has a minor impact on the gold price. According to the authors fluctuations in the real

exchange rates among the major currencies explain almost half of the variation in the gold price.

3.2.Speculation approach Diba and Grossman (1984) conducted a research to find out if there were rational bubbles in the

relative gold price. They conclude with three findings. First of all they found that de relative price of

gold, meaning relative to a general index of prices of goods and service, corresponds with

fundamental market variables, like real interest rates. Their second conclusion is the fact that the

process generating the first differences of market fundamentals is stationary. Their last conclusion is

that actual gold price movements do not involve rational bubbles.

Pindyck (1993) tried to develop a present value model for the gold price based on futures. For

commodities like copper, lumber and oil, the model performed well, but for gold the results of the

present value model were rather poor. According to Pindyck, this is due to the fact that gold has not

the same convenience yield like the other commodities.

10

3.3.Inflation hedge approach Kolluri (1981) did some empirical research to find out the role of gold for inflation hedging. To do so,

Kolluri followed two approaches. The first approach modelled the relationship between the return on

gold investments and the anticipated inflation or variants of it estimated via the iterative procedure

of Cochrane-Orcutt. For this, Kolluri used monthly gold prices from 1968-1980. In the second

approach Kolluri first modeled the return of shares and bonds between 1926-1978, which he then

used as minimum required return for gold investments. Kolluri concluded that gold was a good hedge

against inflation in the period 1968-1980.

Sherman (1983) used multiple regression to identify the important determinants of the gold price.

With an R² of 76.03% he found that the tension index, the interest rate, the US trade weighted

exchange rate, the GDP, the excess liquidity and the unanticipated inflation were important drivers

for the gold price. However when Sherman tried to correct the serial correlation via an adjustment

procedure, certain variables became insignificant, like the tension index while others became more

significant, like unanticipated inflation.

Via simulation, Moore (1990) was able to earn an annual return on gold investments of 18-20% by

tracking the buy and sell signals of the inflation index of Columbia University (during 1970-1988). If

an investor would have followed a buy and hold strategy during the same period, the return would

be 13.9% while if he had held stocks and bonds, he would have earned 11.2% or 8.7% per year.

Mahdavi and Zhou (1997) performed research to find out if gold or other commodities were

indicators for inflation. They also investigated the possibility to predict the inflation via error-

correction models. They concluded that commodities are good indicators for inflation since they are

co-integrated with the consumer price index. According to the authors, the weak results concerning

the prediction of inflation are caused by the strong volatility of the gold price in the short run.

Ghosh, Levin, Macmillan and Wright (2004) studied the long term relation between the gold price

and the average price level in the US. In their research they also looked for an explanation for the

short run deviations from this long run relationship. For their research they used the monthly gold

prices from 1976 until 1999. The long run relationship was found via cointegration regression

techniques. This long run relationship had an elasticity of one. They also found via error-correction

models that the deviations in the short run are caused by changes in the interest rates to lease gold,

the beta of gold and the US/World exchange rate.

Ranson and Wainwright (2005) concluded that resources and more specifically gold, were the best

hedge against inflation. Their research proved that an increase in the gold price is on average two to

three times larger than an increase in inflation. According to the authors, the optimal amount to

11

hedge your portfolio against inflation, was to invest 18% of your portfolio in gold. The only drawback

could be a decrease in inflation or a deflation, which could affect your return on the entire portfolio.

Levin and Wright (2006) proved that there was a long run relationship between the price of gold and

the average price level in the US for the period 1975-2006. They showed that an increase of 1% in the

average US level caused an increase of 1% in the gold price. They modelled the long run relationship

via cointegration and the short run dynamics via error correction models. The main drivers of the

gold price in the short run were changes in US inflation, inflation volatility, credit risk, the interest

rate to lease gold and the US trade weighted exchange rate. They also proved that 66% of a deviation

of the long run relationship will disappear within five years after the shock that caused the deviation.

This research added value to the research of Ghosh and others (2000) by including a variable

“political risk in oil producing countries”.

In his research, Lampinen (2007) tried to confirm the results from the study of Levin and Wright

(2006) but he extends the research period with one year. Lampinen found similar results in his

research. The biggest difference was the number of included time dummy variables. Levin and

Wright (2006) only needed 10 time dummy variables while Lampinen needed 19 time dummy

variables. These time dummy variables were added in order to prevent autocorrelation in the

residuals.

3.4.The artificial intelligence approach

This approach is the youngest of the four approaches. However it is maybe not a new approach in

the sense that it defines a new scope of determinants. The newness of this approach lies in the new

techniques that help in modelling the gold price. The determinants that are used in this artificial

modelling are not limited to one of the above approaches.

There are only a limited number of papers available concerning the subject of artificial intelligence in

combination with the gold price. It is however remarkable to see that these papers are often written

by engineers or mathematicians with no or limited knowledge about the drivers of the gold price.

This can be seen in the kind of drivers they use in their models. They often use general determinants

like the S&P 500 index and less, the actual drivers of the gold price.

McCann and Kalman (1994) developed a forecasting mechanism for the gold price based on a simple

recurrent neural network. The simple recurrent neural network was trained to discover turning-

points in the gold price. Based on these turning points, an investor decided to either go short or long

in the gold price. The data that were used to train the recurrent neural network were they daily gold

prices over a period of five years. The authors tested their network on a test period of nine months

after these five years and during these nine months they were able to realize a return of 21%.

12

Mirmirani (2004) found inspiration in the fact that economic theories were not able to offer a

satisfying explanation for the dynamic gold price movements. Previous studies used linear

techniques to model the gold price, which is of course arbitrary. Neural networks in combination

with genetic algorithms offer the advantage that these techniques can be used without having a clear

view on the underlying structure of the problem. The author used back-propagation in combination

with genetic algorithms to predict the gold price. The results showed that the past gold price, 36 days

before, exercised a strong influence on the gold price that needed to be predicted.

Parisi, Parisi and Diaz (2008) used rolling and recursive neural networks to predict sign variation.

Their study showed that rolling networks outperformed the recursive and feed forward networks in

predicting gold price sign variation. The block bootstrap method realized an average sign prediction

of 60.68% with a standard deviation of 2.82%.

3.5.Conclusion Most research is performed in the area of the inflation hedge approach. This thesis is also based on

the inflation hedge approach although some techniques that are used in this thesis, come from the

artificial intelligence approach.

13

4.Determinants of the gold price

4.1.Short run supply and demand Since the value of gold cannot be modelled by discounting future expected cashflows, another

method to value the price of gold needs to be determined. The short run price of gold is determined

by the supply and the demand for gold, while the long run price will move more in line with the

general price level. This section combines the insights from the inflation hedge approach into one

understandable gold price model.

Short run supply

There are different variables that alter the market supply of gold ( in the short run. Buyers get

gold coming from two sources, producers (mining or recycling companies) and central banks

( . Central banks started leasing gold in the early 1980’s. Mining companies can choose to supply

their clients by leasing gold from central banks or by extracting it from their mines. Recycling

companies can choose to supply their customers by leasing gold from central banks or by recycling

gold. So the total supply of gold (FIGURE 3) in the short run equals the gold supplied by producers

plus the gold supplied by central banks.

The quantity that producers supply from their mines or recycling installations to their customers is

determined by the price in an earlier period, because producers react with a lag to an increase in

price. There is a time lag because in general it takes a lot of time to start up a new mine or to build an

new recycling installation. Producers will also supply more to the market coming from their inventory

if prices in the current period are high. The quantity of gold supplied to the market coming from

producers is negatively related to the amount of gold produced to repay for leased gold from a

previous period plus the physical interest rate if this needs be repaid in gold. To conclude, the total

supply to the market coming from production is positively related to the current and lagged gold

prices, negatively related the amount of gold produced to repay leased gold from the previous period

and negatively related to the gold lease rate in the previous period.

Central banks will be leasing gold if they receive a physical interest rate which is at least the

convenience yield plus a compensation for the default risk (Levin and Wright, 2006). The

convenience yield is the benefit a bank has from physically owning the gold. When the market is in

equilibrium, the lease rate is equal to the convenience yield plus a default risk. An increase in the

physical interest rate (lease rate), a decrease in the default risk or a decrease in the convenience

14

yield will increase the gold supplied via leasing from central banks. When prices in the current period

are high, central banks will also be more inclined to sell their gold, which positively affects supply.

FIGURE 3: SHORT RUN SUPPLY (THE EFFECT OF AN INCREASE IN THE STATED VARIABLES HAS EITHER A POSITIVE OR A NEGATIVE IMPACT ON THE SUPPLY)

There is however one important remark before there can be a general conclusion about the

determinants that drive the overall short run supply of gold. Although the current lease rate may

positively affect the supply by central banks, the lagged lease rate causes a negative effect. So the

factors driving the lagged lease rate, like the lagged default risk and lagged convenience yield will

have a positive relationship with current supply, because if these variables were high in previous

periods, central banks were less inclined to lease gold in the previous period. This results in the fact

that less produced gold needs to be used to repay central banks.

Therefore the short run supply is positively affected by the current and lagged values of the gold

price, positively affected by the current gold lease rate, lagged default risk and lagged convenience

yield while it will be negatively affected by the current default risk, current convenience yield and

lagged lease rate.

15

Short run demand

There are different drivers that alter the short run demand for gold ( ). The short run demand for

gold consists out of two categories. First of all there is the so called use demand for gold ( . In this

category, gold is used for jewellery, medals, electrical components… The second category is the asset

demand for gold ( , where one will buy gold for investment reasons. So the total demand for gold

(FIGURE 4) equals the use demand for gold plus the asset demand for gold.

The use demand for gold is affected by two variables. If the price of gold increases in the current

period, use demand for gold will decrease. Gold price volatility in the current period has a negative

impact on the current use demand for gold. If the price of gold is volatile, users will delay their

purchase of gold (The effects of gold price volatility are too short run to be included in this research.).

The asset demand for gold is based on a number of factors that have an impact on the demand side,

like exchange rate, inflation, fear, return on other assets and the correlation with other assets. If the

dollar becomes stronger, foreign investors will be less inclined to buy gold. For these foreign

investors it becomes more expensive to buy gold which is quoted in dollars. If investors expect

inflation to rise, investors will try to counter this by investing a part of their portfolio in gold. Another

variable that has a serious impact on the gold price is fear, during times of economic and financial

turmoil. The more fear is present in the market, the higher the price of gold will rise. Investors will

also buy gold to diversify their portfolio in order to reduce the overall portfolio risk. In such a case,

gold can be an interesting asset because of its negative or even zero beta. In the used data, the

average beta was -0.0765. In times of underperforming stock markets, beta will be more negative

than in more general calm markets, where there will be almost no correlation. If the beta of gold

increases, asset demand will lower but rises again when the beta returns to its average. So the asset

demand is negatively related to the current beta, but positively to the lagged betas.

If an investor buys gold, he precludes interest that he could earn by investing his money in other

assets. The real interest rate is the opportunity cost of holding gold. In general a higher interest rate

will decrease the asset demand for gold, which in turn lowers the price of gold. However in some

cases the gold price and interest rate may move together if the underlying reason is the same, for

example in case of an expected higher inflation.

16

FIGURE 4: SHORT RUN DEMAND (THE EFFECT OF AN INCREASE IN THE STATED VARIABLES HAS EITHER A POSITIVE OR A

NEGATIVE IMPACT ON THE DEMAND)

The short run demand for gold is negatively affected by the gold price, gold price volatility, exchange

rate, real interest rate, current beta, but positively affected by inflation, fear and lagged values of

beta.

Equilibrium

The market price of gold is determined when supply equals demand. As explained above, different

factors drive supply and demand and therefore the price of gold. There exists however an arbitrage

relationship between two of these factors, the gold lease rate and the real interest rate. According to

Ghosh and others (2004) “mines are indifferent between extracting gold now and selling the mined

gold or leasing gold now, selling the leased gold now, investing the proceeds of the sale in a bond,

selling the bond in one year and using the proceeds including interest to pay for extracting the gold

plus the physical interest rate.” So if the cost of extraction rises at the rate of inflation, the real

interest rate equals the gold lease rate.

TABLE 3 and TABLE 4 give not only a general conclusion about the impact of an increase of the

variables on the supply or demand, but also the impact on the gold price. If an increasing factor

Gold price (t): -

Gold price volatility (t): -

(t)

Exchange rate (t): -

Inflation (t): +

Fear (t): +

Real interest rate (t): -

Beta (t): -

Beta (t-1): +

17

causes the supply to increase, it will decrease the price or vice versa. For example an increase in the

current gold lease rate, will increase the supply, which will cause prices to drop. If an increasing

factor causes the demand to increase, it will increase the price or vice versa. For example an increase

in the real interest rate will lower demand and as a consequence, lower demand will cause prices to

drop.

Influence on supply Influence on gold price

Current gold lease rate + -

Lagged gold lease rate - +

Current default risk - +

Lagged default risk + -

Current convenience yield - +

Lagged convenience yield + - TABLE 3: INFLUENCE OF AN INCREASE OF THE DETERMINANTS ON SUPPLY AND PRICE

Influence on demand Influence on gold price

Gold price volatility - -

Exchange rate - -

Real interest rate - -

Current beta - -

Lagged beta + +

Inflation + +

Fear + + TABLE 4: INFLUENCE OF AN INCREASE OF THE DETERMINANTS ON DEMAND AND PRICE

4.2.Factors affecting the gold price in the long run In the long run, the gold price is expected to rise in line with the general price level and as such act as

an inflation hedge. According to Ghosh and others (2004) the main reason behind this moving

together, is because the long run price of gold is related to the marginal cost of extraction. So if the

cost of production increases with inflation, the price of gold will also have to increase with inflation.

In the long run, the gold price will be equal to the marginal cost of extraction if the market is

competitive or the price will be proportional with the marginal cost of extraction if the producers

have market power. In both cases the price will rise at the general rate of inflation.

18

5.Data and research questions

This thesis will cover two main research questions:

1. Compare the modelling and forecasting power of classic linear regression techniques against

more advanced non-linear machine learning techniques.

2. What are the most important determinants of the gold price according to the different

methods that were used for modelling the gold price?

The training and validation period for the different methods will be from November 1990 until

December 2009. The forecasting period will be from January 2010 until December 2011. The reason

for starting on November 1990 is the lack of data for the gold lease rate before this date. The

variables that are used in this thesis are presented in TABLE 5. In TABLE 5 the mnemonics, the source

and the way in which they are calculated, are also listed. TABLE 6 covers the impact on the gold price

of an increase in the different variables from TABLE 5. Three important remarks need to be made

about the different variables. Firstly, from TABLE 3 and TABLE 4 only 1 variable is not included in the

research, the gold price volatility because its impact is expected to be too short run for this research.

The second remark is the fact that there are variables in TABLE 5 that are additional to the ones in

TABLE 3 and 4. However all these variables are in a way connected to the ones in TABLE 3 and 4. The

last remark is the relationship between the gold lease rate, the convenience yield and the default

risk. In this thesis we only use the gold lease rate since as explained, the lease rate is in equilibrium

equal to the default risk plus the convenience yield.

Mnemonic Source Calculation

Gold price (end of month)

World Gold Council /

Gold lease rate Datastream subtract UK GOFO 3months from US

interbank rate 3months

Trade weighted exchange rate index

Datastream /

Beta of gold Datastream regress the monthly gold return on the monthly S&P500 return using a sample

period of 36 months

CPI US Datastream / Inflation US Datastream (CPI(t)/CPI(t-12))-1

Volatility inflation US Datastream GARCH(1,1) volatility modelling CPI World Datastream / Inflation World Datastream (CPI(t)/CPI(t-12))-1 Volatility inflation World

Datastream GARCH(1,1) volatility modelling

CRDP Datastream (Moody's BAA - Moody's AAA)/Moody's

AAA Composite leading indicator

OECD /

TABLE 5: VARIABLES

19

Variables Expected influence on

gold price at time t

+

+

+

+

+

+

+

-

-

+

-

+

+ TABLE 6: THEORETICAL IMPACT ON THE GOLD PRICE OF AN INCREASE IN THE VARIABLES

20

5.1.Description of the variables

In this section I will describe the different variables that have an impact on the gold price. It is

however important to notify that this descriptive analysis may reveal spurious trends or correlations.

The graphs will start on November 1990, because that was the first month that the gold lease rate

was available.

The general price level in the US (US consumer price index) is included as an explanatory variable to

test whether the gold price moves together with the general price level in the long run so that gold

can be considered as a long run hedge against inflation. It is clear from GRAPH 5 that both variables

trend upwards, suggesting that there is a long run relationship and as such gold might be a long run

hedge against inflation. It is however important to see that there are significant deviations between

short run movement in both variables. Gold is therefore probably not a good hedge against inflation

in the short run.

GRAPH 5: PRICE OF GOLD AND CONSUMER PRICE INDEX (CPI) US

The consumer price index of the world (CPI world) is also included in the described long run

relationship to control for the fact that foreign investors also can have an impact on the gold price

movements. In GRAPH 6 we can observe that both variables trend upwards and as consequence gold

might be a long run hedge against inflation. If we compare both CPI’s, we see that the CPI world is

much steeper than the CPI US (GRAPH 7).

4,8

4,9

5

5,1

5,2

5,3

5,4

5,5

5

5,5

6

6,5

7

7,5

8

no

v/9

0

jun

/92

jan

/94

aug/

95

mrt

/97

okt

/98

mei

/00

dec

/01

jul/

03

feb

/05

sep

/06

apr/

08

no

v/0

9

jun

/11

ln C

PI U

S ($

)

ln g

old

pri

ce (

$)

ln nominal gold price

ln CPI US

21

GRAPH 6: PRICE OF GOLD AND CPI WORLD

GRAPH 7: COMPARE CPI'S (11/1990 = 100)

3

3,2

3,4

3,6

3,8

4

4,2

4,4

4,6

4,8

5

5

5,5

6

6,5

7

7,5

8

no

v/9

0

apr/

92

sep

/93

feb

/95

jul/

96

dec

/97

mei

/99

okt

/00

mrt

/02

aug/

03

jan

/05

jun

/06

no

v/0

7

apr/

09

sep

/10

ln C

PI w

orl

d (

$)

ln g

old

pri

ce (

$)


ln CPI world

100

105

110

115

120

125

130

135

140

145

150

no

v/9

0

dec

/91

jan

/93

feb

/94

mrt

/95

apr/

96

mei

/97

jun

/98

jul/

99

aug/

00

sep

/01

okt

/02

no

v/0

3

dec

/04

jan

/06

feb

/07

mrt

/08

apr/

09

mei

/10

jun

/11

Pri

ce le

vel

CPI US

CPI world

22

If inflation is high, the gold price should also be high. In some periods this relationship holds true, but

in other periods there is no evidence of such a relationship (GRAPH 8 and GRAPH 9). The last decade

inflation has almost always been less than 5% (which is historically speaking low), nevertheless the

price of gold has exploded.

GRAPH 8: PRICE OF GOLD AND INFLATION US

GRAPH 9: PRICE OF GOLD AND INFLATION WORLD

-3%

-2%

-1%

0%

1%

2%

3%

4%

5%

6%

7%

4,5

5

5,5

6

6,5

7

7,5

8

no

v/9

0

apr/

92

sep

/93

feb

/95

jul/

96

dec

/97

mei

/99

okt

/00

mrt

/02

aug/

03

jan

/05

jun

/06

no

v/0

7

apr/

09

sep

/10

Infl

atio

n U

S

ln g

old

pri

ce (

$)


Inflation US

0%

5%

10%

15%

20%

25%

30%

35%

5

5,5

6

6,5

7

7,5

8

no

v/9

0

apr/

92

sep

/93

feb

/95

jul/

96

dec

/97

mei

/99

okt

/00

mrt

/02

aug/

03

jan

/05

jun

/06

no

v/0

7

apr/

09

sep

/10

Infl

atio

n w

orl

d

ln g

old

pri

ce (

$)


Inflation world

23

Rising US or world inflation volatility should increase the gold price, but on GRAPH 10 and 11 there

are also periods that show a negative relationship.

GRAPH 10: PRICE OF GOLD AND VOLATILITY OF INFLATION US

GRAPH 11: PRICE OF GOLD AND VOLATILITY OF INFLATION WORLD

0

0,01

0,02

0,03

0,04

0,05

0,06

5

5,5

6

6,5

7

7,5

8 n

ov/

90

jun

/92

jan

/94

aug/

95

mrt

/97

okt

/98

mei

/00

dec

/01

jul/

03

feb

/05

sep

/06

apr/

08

no

v/0

9

jun

/11

Vo

lati

lity

US

infl

atio

n

ln g

old

pri

ce (

$)


Volatility US inflation

0

0,05

0,1

0,15

0,2

0,25

5

5,5

6

6,5

7

7,5

8

no

v/9

0

jul/

92

mrt

/94

no

v/9

5

jul/

97

mrt

/99

no

v/0

0

jul/

02

mrt

/04

no

v/0

5

jul/

07

mrt

/09

no

v/1

0

Vo

lati

lity

wo

rld

infl

atio

n

ln g

old

pri

ce (

$)


Volatility world inflation

24

The composite leading indicator (CLI) of the OECD provides early signals of turning points between

expansions and slowdowns of economic activity. In general the relationship between the composite

leading indicator and the gold price is expected to be positive (GRAPH 12). During good economic

times, demand for jewellery and investments should have to increase which leads to higher prices for

gold.

GRAPH 12: PRICE OF GOLD AND CLI

The US-World exchange rate (ER) is included as a variable to control for gold price movements that

are not caused by changes in the price level in the US or the world. If the dollar becomes stronger,

investment demand by foreign investors will lower which causes prices to drop (GRAPH 13).

GRAPH 13: PRICE OF GOLD AND US-WORLD EXCHANGE RATE INDEX

4,44

4,49

4,54

4,59

4,64

5

5,5

6

6,5

7

7,5

8

no

v/9

0

jun

/92

jan

/94

aug/

95

mrt

/97

okt

/98

mei

/00

dec

/01

jul/

03

feb

/05

sep

/06

apr/

08

no

v/0

9

jun

/11

ln C

LI

ln g

old

pri

ce (

$)


ln CLI

4,2

4,3

4,4

4,5

4,6

4,7

4,8

5

5,5

6

6,5

7

7,5

8

no

v/9

0

apr/

92

sep

/93

feb

/95

jul/

96

dec

/97

mei

/99

okt

/00

mrt

/02

aug/

03

jan

/05

jun

/06

no

v/0

7

apr/

09

sep

/10

ln E

R in

dex

ln g

old

pri

ce (

$)


ln ER

25

The theory suggests that in case of a decreasing lease rate, gold prices should increase. This seems to

be the case from 2000 onwards, however in other periods, it is more difficult to find evidence for this

relationship. The spikes that can be seen in GRAPH 14 have been caused by sudden increases in the

real interest rate, convenience yield or default risk. As explained above, the real interest rate equals

the gold lease rate. This means for the last couple of years, that as long as real interest rates are

negative, prices of gold will continue to rise because investors are not able to maintain their wealth

by parking money at a bank.

GRAPH 14: PRICE OF GOLD AND GOLD LEASE RATE

The relationship between the gold price and the beta of gold is expected to be negative for the

current values of beta and positive for the lagged values of beta. In some periods evidence is found

for this negative relationship while in others, the relationship is more positive (GRAPH 15).

GRAPH 15: PRICE OF GOLD AND GOLD’S BETA

-1%

0%

1%

2%

3%

4%

5%

6%

7%

8%

5

5,5

6

6,5

7

7,5

8

no

v/9

0

apr/

92

sep

/93

feb

/95

jul/

96

dec

/97

mei

/99

okt

/00

mrt

/02

aug/

03

jan

/05

jun

/06

no

v/0

7

apr/

09

sep

/10

Go

ld le

ase

rate

ln g

old

pri

ce (

$)


gold lease rate

-0,6

-0,4

-0,2

0

0,2

0,4

0,6

5

5,5

6

6,5

7

7,5

8

no

v/9

0

apr/

92

sep

/93

feb

/95

jul/

96

dec

/97

mei

/99

okt

/00

mrt

/02

aug/

03

jan

/05

jun

/06

no

v/0

7

apr/

09

sep

/10

Go

ld's

bet

a

ln g

old

pri

ce (

$)


Gold's beta

26

The credit risk default premium (CRDP) can be found in GRAPH 16. It is expected to show a positive

relationship with the price of gold. In periods of lower credit risk, central banks will be willing to lease

more gold which causes gold prices to decrease because of increased supply. During the recent crisis

the CRDP has increased which as the theory suggested, caused a rise in the gold price.

GRAPH 16: PRICE OF GOLD AND CRDP

The flight to safety, meaning investing in gold during periods of turmoil, that took place in recent

years, will be modelled by the CRDP and the volatility in inflation in the US and the world, since these

variables are a kind of proxy for economic and financial risk. During periods of financial crises, the

exchange rate volatility will be higher (Coudert, Couharde, Mignon, 2010) which will lead to a higher

inflation volatility. Especially in emerging markets where there is the fear of floating (Calvo, Reinhart,

2000), a crisis will have a larger impact on inflation volatility because of liability dollarization and

depreciations of the local currency. It is therefore that we expect that a crisis will have more impact

via the volatility of the world inflation on the gold price, than via the volatility of the US inflation.

In order to even better catch the flight to safety aspect of gold, additional variables can be added like

for example a political risk factor. In contrast to the study of Levin and Wright (2006) this thesis does

not include a political risk factor because it was not available for free. However this variable can be

bought from the PRS group.

0

0,1

0,2

0,3

0,4

0,5

0,6

0,7

0,8

5

5,5

6

6,5

7

7,5

8

no

v/9

0

apr/

92

sep

/93

feb

/95

jul/

96

dec

/97

mei

/99

okt

/00

mrt

/02

aug/

03

jan

/05

jun

/06

no

v/0

7

apr/

09

sep

/10

CD

RP

ln g

old

pri

ce (

$)


CRDP

27

6.Regression

6.1.Technique The technique that will be used to model the gold price, is called cointegration regression.

Cointegration regression is used in economics to model financial series that cannot be modelled by

more basic forms of regression. The cointegration regression technique was developed by Clive W.J.

Granger by the end of the eighties. Granger (1987) proved that applying statistical methods that

were developed for stationary series on non-stationary series resulted in spurious conclusions. A

time series is called stationary if its moments, like average, variance… are independent of the

moment of observation. Granger’s research resulted in the remarkable observation that if individual

time series are non-stationary, certain linear combinations of these non-stationary time series are

stationary. These time series can then be called cointegrated.

When two time series are cointegrated, these two series are related in the long run. So if an event

causes a deviation of this long run relationship, there will be a return to the long run relation. As such

the existence of a long run equilibrium has its implications for the short run behaviour of the

variables. The relationship between two series can be described via a so called error correction

model or Granger representation theorem.

6.2.Results Unit root testing

The dynamic testing procedure of Enders (2004) is first applied to get a clear view if there are any

unit roots. If there is no unit root (the order of integration is zero = I(0)), then the variable is

stationary (TABLE 7).

Variables I(x)

0

0

2

2

1

0

0

1

0

0

1

0 TABLE 7: UNIT ROOT TESTING, WITH X BEING THE ORDER OF INTEGRATION

28

In what follows, we will continue assuming that all variables are I(1) and as consequence are

stationary on first differencing. Ghosh and others (2004) and Levin and Wright (2006) also found

variables with different orders and they also continued under the assumption that all variables are

I(1). According to Ghosh and others (2004) there is nothing wrong with following such a strategy for

four reasons. Unit root tests have low power and practitioners often rely on theoretical arguments to

construct models based on cointegration (Engle and Granger, 1987). The second reason is the most

obvious one. Since this study is following the inflation hedge approach from the literature review

(supra), the possibility of a long run relationship between the price of gold and the general price level

needs to be investigated. An important condition to test for this cointegration is that both variables

need to be non-stationary or I(1). The third is that models that mix variables in levels with variables in

first differences often fail residual tests. The final reason is that such models that mix, are difficult to

interpret in terms of the hypotheses.

Tests for cointegration

Engle-Granger two step approach

Assume the following equation with two time series and

We do not include

since it is more than 95% correlated with

So the long run relationship between the

gold price and the general price level in the US will be researched.

Since both variables are I(1), a cointegration analysis can be executed. This thesis will estimate this

model in levels using Ordinary Least Squares (OLS) and if is I(0), OLS is super consistent. To test

whether is I(0) and as such and

are cointegrated, an Augmented Dickey-Fuller

(ADF) test is performed on the residuals. The Augmented Dickey-Fuller test tests for a unit root in the

estimated residuals using the standard Dickey-Fuller (DF) specification:

As explained above, TABLE 8 shows the estimation of the model in levels using OLS. The following

step consists of an Augmented Dickey-Fuller test on the residuals, as previously explained. The

critical values produced by Eviews cannot be used, instead the appropriate critical values are

calculated via the formula of MacKinnon.

29

Where T is the sample size and the other variables can be found in the table of MacKinnon (1991)

(TABLE 10). The critical value at the 5% significance level equals -3.4. Since -0.599759 (TABLE 9) is

larger than the critical value, the null hypothesis of no cointegration cannot be rejected. According to

the Augmented Dickey-Fuller test, the price of gold and the price level in the US are not cointegrated.

Because there is no cointegration observed, the second step of the Engle-Granger two step

approach, estimate an Error Correction Model (ECM) to analyze the short run dynamics, is not

executed. Some important remarks about the Engle-Granger two step approach:

OLS is biased in finite samples.

OLS is not normally distributed, not even asymptotically.

The ADF test has low power.

The Engle-Granger approach is only capable of handling a single cointegrating vector.

The ECM model only allows for unidirectional causality.

Since the result is not in line with the expectations, a second cointegration test is executed.

Dependent variable

Coefficient of constant -2.9330 (0.00)

Coefficient of 1.7379 (0.00)

Adjusted R² 44% TABLE 8: REGRESSION RESULTS OF THE LONG RUN RELATIONSHIP (P-VALUE BETWEEN BRACKETS)

Dependent variable

Coefficient of constant 0.0009 (0.7527)

Coefficient of -0.0061(0.5493)

Adjusted R² 0.16%

ADF-test statistic -0.5998 TABLE 9: ADF TEST RESULTS ON THE RESIDUALS (P-VALUE BETWEEN BRACKETS)

TABLE 10: TABLE OF MACKINNON

30

Granger representation theorem

The Granger representation theorem is the second cointegration test. It states that if there exists an

error correction model for a set of I(1) variables, then this set of variables is cointegrated. Thus if

and

are I(1) and there exists an error correction model of the general form:

=

+ … +

+

then and

are cointegrated with cointegrating vector (1, ). We do not

include since it is more than 95% correlated with

The direct test for cointegration:

Since the t-test statistic of 1.8 (TABLE 11) is larger than -3.4, which is the critical value calculated via

the table and formula of MacKinnon (TABLE 10), the null hypothesis of no cointegration cannot be

rejected. It is even surprising to see that the coefficient of the correction term is positive where it

should be negative. This signals a model that is more explosive than one that returns to its long term

equilibrium as expected. This cointegration test shows again that there is no cointegration.

Dependent variable

Coefficient of constant -0.0868 (0.0913)

Coefficient of

-0.2593 (0.0004)

Coefficient of

0.0591 (0.1510)



Coefficient of -0.0870 (0.8876)

Coefficient of

0.0022 (0.6785)





Coefficient of

0.01246(0.0722)

Adjusted R² 12.36%

Prob(F-statistic) 0

Durbin-Watson stat 1.9222

T-test statistic of 1.8065 TABLE 11: GRANGER REPRESENTATION THEOREM REGRESSION RESULTS (P-VALUE BETWEEN BRACKETS)

Also for this method some important remarks need to be taken into consideration:

Under this test has a non-normal distribution, meaning that the MacKinnon critical values

need to be used although they are only a proxy.

This method is only capable of handling a single cointegrating vector.

It also assumes unidirectional causality.

31

Even though the Granger representation theorem still has some disadvantages, it is found to have

more power compared to the residual based tests. It avoids common factor restrictions and the

dynamics are not ignored when the cointegrating vector is estimated within the ECM.

The two preceding tests prove that there is no long run relationship between the price of gold and

the general price level in the US. Ghosh and others (2004) also found no conclusive proof of

cointegration and assumed for their research that the two variables were cointegrated. Other

authors like Levin and Wright (2006) did find proof for cointegration however their research was only

until August 2005, meaning that the boom in the gold price was not included. The assumed long run

relationship between the price of gold and the general price level in the US is no longer there. It

appears that since 2002 the gold price is increasing at a faster pace than the general price level in the

US, hence the positive coefficient of the error correction term. Since the gold price is apparently

more determined by short run dynamics, comparing linear regression with artificial intelligence

techniques that are good in capturing such dynamics, might be very interesting.

If there is cointegration between the two variables, the gold price and the price level, then gold is a

long run hedge against inflation. If there is no cointegration, this does not mean that gold is no long

run hedge against inflation. In GRAPH 1 it is clear that if an investor bought gold and kept it in his

portfolio until January 2012 he would almost always be hedged against inflation, except for two

periods. The first period is the boom around the beginning of the eighties while the second period is

the real short term, meaning the last six months. This finding supports the idea that gold is a long run

hedge against inflation while it is more difficult to hedge against inflation in the short run.

If an investor was offered the possibility from January 1971 (when the gold price was free to move)

onwards to invest on a monthly basis in gold for 1, 2, 4, 8 or 16 years (with a sliding window), the

picture looks different (TABLE 12). For example: an investor with a time horizon of 16 years could buy

gold from January 1971 until December 1995 and keep it in portfolio for 16 years or an investor with

a time horizon of 1 year could also buy gold from January 1971 until December 1995 and keep it in

portfolio for 1 year (the 1 year investor can only buy until December 1995 because the results for the

investors with different time horizons need to be comparable). It seems that the investor with the

short run horizon is in 48.33% of the cases protected against inflation, while for the investor with the

16 years horizon this is only 43%. Nevertheless in both cases it is less than 50%, meaning that in less

than 1 out of 2 times an investor was protected against inflation.

32

1y 2y 4y 8y 16y

Sum 145 133 125 101 129

Total 300 300 300 300 300

Hedge? 48.33% 44.33% 41.67% 33.67% 43.00% TABLE 12: GOLD AS A HEDGE (1971-1995)

In TABLE 13 all the months are included, even the ones where the time horizon can no longer be

respected. So an investor with a time horizon of 16 years who buys gold in January 2000 and keeps it

until December 2011 (which is only 12 years) will positively contribute to the hedging result. By

including these numbers the results are more in favour of a long run hedge even though investors

with a time horizon of maximum 1 year do not underperform investors with a team horizon of 8

years.

1y 2y 4y 8y 16y

Sum 276 271 278 276 317

Total 492 492 492 492 492

Hedge? 56.10% 55.08% 56.50% 56.10% 64.43% TABLE 13: GOLD AS A HEDGE (1971-2011)

A suggestion for future work is to apply a Johansen test which is a procedure for testing cointegration

of several variables. This test permits the existence of more than one cointegration relationship

which makes it more general than the two previous tests.

33

Autoregressive distributed lag model (ADL)

ADL models are used to explain one variable from its own past and current and/or lagged values of

other variables. As such the dynamic impact of variables on the gold price can be modelled. It is

however important that the variables are stationary. Since our variables are all I(1), the first

difference is taken to make the variables stationary. There are different reasons why it might be

interesting to include dynamics:

Psychological reasons: investors do not change their habits rapidly

Technological reasons: sometimes technology inhibits investors to change

Institutional reasons: investors might be bound to regulations

A simple example of an ADL model is presented:

Where and are assumed to be stationary and a white noise process. The presented model is

an ADL(1,1), because one lagged value of the variables is included. For this thesis the variables are

the first differences in order to be stationary, so equals , equals

…

In order to know the number of lags, different models are tested. It is however important to know

that all the variables will have the same number of lags, so there will only be ADL(0…0), ADL(1…1)…

models and no mix in the number of lags. The reason for this simplification is the high number of

variables which would complicate everything unnecessarily.

To know the number of lags, the Schwarz criterion for the different models is calculated. The most

negative value of Schwarz shows the ideal number of lags. Once the Schwarz value increases, it keeps

on increasing.

The first model ADL(0…0) gives a Schwarz criterion of -3.37.

The second model ADL(1…1) gives a Schwarz criterion of -3.27.

Since the Schwarz criterion of an ADL(0…0) model is more negative than the Schwarz criterion of an

ADL(1…1) model and the criterion will keep on increasing, the ideal model is one with no lags

included. The model that will be estimated is similar to the ADL(0…0) model except for one variable,

the delta of the lagged gold price that will be included (TABLE 14).

34

Dependent variable

Coefficient of constant 0.0055(0.0441)

Coefficient of

-0.2320 (0.0014)

Coefficient of

0.0590 (0.1539)




Coefficient of

0.0026 (0.6253)





Adjusted R² 11.45%

Prob(F-statistic) 0

Durbin-Watson stat 1.9122 TABLE 14: ADL REGRESSION RESULTS WITHOUT DUMMY VARIABLES (P-VALUE BETWEEN BRACKETS)

To increase performance and to make sure that the model performs well on residual tests, two

dummy variables are included. The first one is September 1999 because then the Central Bank Gold

Agreement was signed, which caused an enormous increase in the gold price which cannot be

explained by any of the above variables. The second dummy variable is October 2008, the start of the

global financial crisis with the failure of Lehman.

The final model including the dummy variables is presented in TABLE 15.

Dependent variable

Coefficient of constant 0.0055 (0.0354)

Coefficient of

-0.1994 (0.0049)

Coefficient of

0.0492 (0.2145)




Coefficient of

-0.0043 (0.4595)





Coefficient of Dummy 1999/9 0.1552 (0.0005)

Coefficient of Dummy 2008/10 -0.1408 (0.0016)

Adjusted R² 19.7%

Prob(F-statistic) 0

Durbin-Watson stat 1.8674 TABLE 15: ADL REGRESSION RESULTS WITH DUMMY VARIABLES (P-VALUE BETWEEN BRACKETS)

35

The OLS estimator for the parameters in the ADL model:

Is biased as the error term and the explanatory variables are not independent

Is consistent as long as the error term is uncorrelated with and

Necessary conditions for this to hold true are:

o The error term is white noise, that is not correlated with

o is exogenous such that are not correlated with the error term

So to conclude, the main limitations are that the explanatory variables need to be exogenous and all

variables need to be stationary. In this thesis, the explanatory variables are assumed to be exogenous

while all variables are stationary due to the first differencing.

Two residual tests were executed. The Jarque-Bera test for normality showed that the residuals are

normally distributed at the 5% significance level. The correlogram in levels of the residuals also

revealed that there is no autocorrelation in the residuals.

36

Results and interpretation

TABLE 16 gives an overview of the null hypotheses and the alternative hypotheses based on TABLE 6.

It also shows the observed signs corresponding to the variables, that can be found in TABLE 15. The

p-value for a one sided t-test can also be found in TABLE 16, it is just half of the p-value that can be

found in TABLE 15 where the included p-value is for a two sided t-test. The adjusted R² for the model

is 19.7%, meaning that 19.7% of the variation in the change of the gold price can be explained by

changes in the explanatory variables. The R² is comparable to the one found by Lampinen (2007), he

found an R² of 19.5% for the period September 1989 until August 2006 (however he included 19 time

dummy variables).

In the following section, the five variables that are significant at the 5% level and for which the null

hypothesis can be rejected, will be explained in more detail (TABLE 16). This does not mean that the

other variables are not relevant. Therefore it is preferable to keep these variables included even

though they are not significant. The variables in TABLE 16, that have a different sign than expected,

are the change in the US inflation volatility and the change in the beta of gold. However none of

these variables are significant.

The significant negative coefficient of

causes a kind of mean reversion. It is not because

there is no longer a long run relationship between the price of gold and the general price level in the

US that the gold price can now explode. If the gold price increased in a previous period, it is likely to

decrease in the subsequent period. If there has been an increase of 1% from t-2 to t-1 in the gold

price, it will be likely that there will be a decrease in the gold price with almost 0.2% in the following

period (keeping all other variables constant).

Coefficient of Observed sign of coefficient

p-value

= 0 <0 - 0.0025

= 0 <0 + 0.1073

= 0 >0 + 0.1556

= 0 >0 + 0.0356

= 0 >0 + 0.3383

= 0 <0 - 0.2298

= 0 >0 + 0.4889

= 0 <0 - 0.0003

= 0 >0 - 0.4060

= 0 >0 + 0.3001

Dummy1999Sept = 0 >0 + 0.0003

Dummy2008Okt = 0 <0 - 0.0008 TABLE 16: REGRESSION RESULTS ADL MODEL

37

If the inflation in the US rises with 100 basis points, then the gold price will rise with 1.3% (keeping all

other variables constant). This means that although some people who claim that gold is not a hedge

against inflation in the short run, might be wrong. In TABLE 12 and TABLE 13 it was also clear that in

general, gold is not such a bad hedge against inflation in the short run. However protecting your

wealth in the short run against inflation by going long in gold is difficult because it is important to

pick the right moment. If the inflation in the US rises with 1%, then the gold price will rise with only

0.08% (keeping all other variables constant).

If the dollar index shows a stronger dollar by increasing 1%, the price of gold will decrease with 0.5%

(keeping all other variables constant). If we compare the three previous variables, we see that the

exchange rate has the biggest impact, an increase by 1% causes a decrease of 0.5%.

Both dummy variables show the impact as was expected. The CBGA in September 1999 caused the

gold price to increase with more than 16%. The start of the financial crisis in October 2008 caused a

decrease in the gold price with more than 13%. This negative sign may not seem logical in the first

place, but after the failure of Lehman a lot of hedge funds needed to deleverage or liquidate in order

to meet margin calls on positions. One of the ways to do so, was selling their gold which caused

prices to fall, even though the world was at the beginning of a new crisis.

Error measures

To compare the different techniques we use as an error measure the NMSE.

With being the desired value and the outputted value. Two other error measures that are also

calculated, are the MAE and the hitrate. However these two error measures are only given for the

linear regression part. Because the NMSE is the most often used error measure in time series

modelling, I will only compare the techniques via the NMSE.

With being the desired value and the outputted value. The hitrate is a percentage describing how

often to model could correctly predict an increase or decrease in the gold price.

Predicting

Once the gold price model was established for the period November 1990 until December 2009, it

was used to predict the gold price from January 2010 until December 2011 with the actual values of

the input variables from that period (GRAPH 17).

38

GRAPH 17: PREDICTED VS ACTUAL GOLD PRICE (ADL-MODEL)

As can be seen on GRAPH 17, the predicted gold price is close to the actual gold price. The error

measures corresponding to GRAPH 17 can be found in TABLE 17 in the column with the title “Gold

Price”. The error measures in the other columns will be used to compare these results with the

results of the other methods.

ln gold price Gold price

Normalised ln gold price

NMSE test 0.1287 0.1500 0.1287

HITRATE test 54.17% 54.17% 54.17%

MAE test 0.0448 64.5922 0.0887 TABLE 17: TEST RESULTS FOR THE ADL MODEL (2010-2011)

When comparing the results from estimating (TABLE 18) with these from testing (TABLE 17), it is as

expected that the results from estimating are better than the results from testing. The only exception

is the hitrate where the result from the test period is higher than the one from estimating. It is also

interesting to see that normalizing data does not affect the NMSE measure, the NMSE measure in the

first column equals the NMSE measure in the third column. Surprising to see is the fact that the

hitrate during estimating is less than 50%, this means that the model was unable to correctly specify

(less than 50%) in which direction the gold price would go in the next month.

ln gold price Gold price Normalised ln gold price

NMSE estimation 0.0101 0.0136 0.0101

HITRATE estimation 47.78% 47.78% 47.78%

MAE estimation 0.0322 16.6499 0.0636 TABLE 18: ESTIMATION RESULTS FOR THE ADL MODEL (1990-2009)

The NMSE validation based on 4 folds (parts) during the period 51-180 equals to 0.0119 (for the

normalised ln gold price). An ADL-model has a low validation error if during the period where it was

validated, the gold price moved in a more linear way. However if the gold price moves more

$ 1.000,00

$ 1.100,00

$ 1.200,00

$ 1.300,00

$ 1.400,00

$ 1.500,00

$ 1.600,00

$ 1.700,00

$ 1.800,00

$ 1.900,00

jan

/10

mrt

/10

mei

/10

jul/

10

sep

/10

no

v/1

0

jan

/11

mrt

/11

mei

/11

jul/

11

sep

/11

no

v/1

1

Predicted Gold Price

Actual Gold Price

39

exponential or non-linear, the ADL-model faces more difficulties in getting good results. So an ADL-

model will give poor results if it is estimated on a linear part and tested on a non-linear part.

Robustness check

To check the robustness of the model, the quadratic and cubic forms of the variables were included

in the ADL-model. This resulted in a lower estimation error, namely a NMSE of 0.007, compared to

the situation where they were not included. However the test error increased, going from 0.1287 to

0.18. This can be caused by the fact that there is an in-sample overfit while the prediction out-sample

is worse.

When estimating an ADL-model on the period 1998-2009, the volatility in the world inflation became

also a significant variable, as expected and explained in section 5.1. It shows that the recent crisis is a

reason for the gold price increase.

40

7.Recurrent neural networks Some problems are difficult to solve with classical linear regression techniques. In such cases, it

might be interesting to use more advanced, non-linear machine learning techniques. One subpart of

these non-linear machine learning techniques makes use of neural networks. These are abstract

mathematical models of how the human brain functions. It consists of a number of connected, non-

linear neurons that receive input and pass output to other neurons. In the neurons, the sum of the

incoming values are processed using a non-linear function, in this thesis this will be a hyperbolic

tangent.

In general, neural networks have two categories: feed-forward neural networks and recurrent neural

networks (Jaeger, 2005). Feed-forward neural networks do not have feedback between the neurons

and the layers: there is a direct mapping from the input to the output. In contrast, recurrent neural

networks have feedback, which brings in the aspect of memory.

There are different ways to train a recurrent neural network. Two of these techniques will be

explained in more detail: reservoir computing and backpropagation through time. This chapter is

based on the tutorial of Herbert Jaeger (2005): “A tutorial on training recurrent neural networks

covering BPTT, RTRL, EKF and the echo state network approach”.

7.1.Reservoir computing Reservoir computing is a recent technique to train recurrent neural networks. It uses less

computational power than backpropagation through time. Reservoir computing is a collection of

similar techniques: echo state neural networks, liquid state machines… In this thesis the echo state

neural network will be used.

Technique

This thesis will first explain an easy setup after which certain features that add complexity but also

yield better results will be added. It is important to know that for the recurrent neural networks, all

data was normalized. Normalizing data makes the influence of every variable equal, it also decreases

the likelihood that the neurons will be saturated due to high input values. A second important factor

in reservoir computing is the fact that all reservoirs are initiated randomly and only the output-

weights are trained (FIGURE 5). The fact that only the output-weights are trained can be seen in

FIGURE 5 by the dotted lines. In contrast to the output-weights, the other weights remain fixed and

untrained.

41

Input Reservoir Output

FIGURE 5: EXAMPLE OF A RESERVOIR WITH IN- AND OUTPUT

Before the model is explained, some notations are given:

Input at time k

Reservoir states at time k

Actual output at time k

Predicted output at time k

: Weight matrix from input to reservoirrandom and fixed

Weight matrix between neurons in the reservoirrandom and fixed

Weight matrix from bias to reservoirrandom and fixed

Weight matrix from input to outputtrained

Weight matrix from reservoir to outputtrained

Weight matrix from bias to outputtrained

The matrices , and are first generated randomly. As soon as these matrices are generated

the reservoir states can be determined.

Two remarks can be made for this formula. First of all, the states of the neurons at time k+1 will be

determined by the input at time k+1. Secondly, the function that will be used in this thesis will be the

hyperbolic tangent. After the states are determined, the output-weight matrices are calculated such

that the quadratic error is minimized.

42

Where A is a matrix where each column contains the input, neuron states and bias for the different

observations. As soon as the output weights are determined, the output can be forecasted.

Meta-parameters/features that can be added

Regularization meta-parameter

In order to avoid that the output weights become too big, a regularization meta-parameter can be

added. This is equivalent to adding noise to the reservoir states, it makes a more general model

instead of a model that overfits on the training data. When using a regularization meta-parameter,

the output weights need to be calculated differently.

In order to determine the ideal regularization meta-parameter, cross-validation has to be executed.

When doing cross-validation the training data is divided into N different parts. Then for each

regularization meta-parameter we train on N-1 parts to test on the part that was not included for

training. For every regularization meta-parameter the error measure on the test part is stored. For

every regularization meta-parameter this process is repeated N times, after which an average error is

calculated per regularization meta-parameter. The regularization meta-parameter yielding the

minimum average error measure is the optimal one (in this thesis the error measure is the NMSE).

Input scaling, spectral radius and leak rate

The input scaling determines the degree of non-linearity. When the input scaling is low, the

reservoir is almost linear, while when the input scaling is high, the neurons are saturated and as such

switch between two extreme values which is non-linear.

After initiating , it is rescaled by dividing it by its largest, absolute eigenvalue. After this it will be

multiplied by the spectral radius. The spectral radius is the largest absolute eigenvalue of the matrix

. In general the spectral radius is smaller than or close to one, meaning that the neuron system is

on the edge of stability.

The dynamics of the reservoir can be slowed down by adding a leak rate . The closer the leak rate is

to zero, the slower the system will adapt. If the leak rate equals one, then there is no change

compared to the normal situation.

43

In order to determine the ideal combination of input scaling, spectral radius and leak rate, cross-

validation has to be executed. However if the ideal combination of these three meta-parameters has

to be found, it will be a double cross-validation that needs to be executed, since for every

combination the ideal regularization meta-parameter per part also has to be found.

Warm-up drop

To avoid that transient effects of the states impact the training of the neural network, a warm-up

drop is performed. During the warm-up drop, the first states of the neurons are omitted. To

determine the number of omitted states, the graph describing the activity of the neurons has to be

observed. The optimal number of omitted states is the number where the transient effects

disappear. It is important to know that, the more states are dropped, the less data is available to

train. This can be a crucial factor when only a limited amount of data is available.

Number of neurons

The number of neurons can have an impact on the results. If there is no regularization, an optimal

number of neurons can be found: there is a minimum error for a certain number of neurons because

increasing the neurons further, will increase the error (overfitting). However if there is regularization,

more neurons will yield better results. Nevertheless the advantage of adding more neurons will

decrease. A disadvantage of a large number of neurons is the increasing computational power that is

needed.

44

Results

The reservoir uses 12 input variables to determine the :

,

,

,

,

and . The data ranges from

11/1990 until 12/2011 and these 254 months will be divided into three subparts:

1-50: warm-up drop

51-230: training with double cross-validation

231-254: testing

For the double cross-validation there will be 30 parts (folds) of each six months. The double cross-

validation will be repeated 10 times per number of neurons in order to avoid that accidental results

would make the system spurious. The number of neurons that will be used, will vary from 50 neurons

up to 200 neurons with steps of 50 neurons.

When after the double cross-validation the ideal combination of input scaling, leak rate and spectral

radius is determined, another cross-validation is performed on the entire training set to get the

optimal regularization meta-parameter for testing. During the double cross-validation the

regularization meta-parameter was only determined on 29 sets because there still needed to be one

set available for testing. In contrast, in this case the regularization meta-parameter is determined on

30 sets while the test will be performed on the period ranging from month 231 until 254. Testing will

be performed at least 20 times so that there is an average test and training error plus standard

deviation on the test and training error. Other important remarks have to be made before looking at

the results:

There is no recursive prediction, meaning that the actual gold price from the previous period

will be an input variable instead of the forecasted gold price from the previous period.

The input variables for modelling the gold price at time t, are also observed at time t. In

order to use this model to forecast gold prices, investors will need predictions of these input

variables. In this thesis during testing, the actual observations of the input variables are used

as proxies for the predicted input variables.

As can be seen in TABLE 19 the model is well validated and as expected the validation error

decreases with an increasing number of neurons.

Neurons Average NMSE validation Std NMSE validation

50 0.0133 0.0011

100 0.0131 0.0008

150 0.0128 0.0009

200 0.0125 0.0010 TABLE 19: VALIDATION ERROR RESERVOIR COMPUTING

45

The optimal combinations of meta-parameters (input scaling, leak rate and spectral radius) for the

different number of neurons are presented in TABLE 20.

Leak Rate Input scaling Spectral radius

50 0.0631 0.0156 1.25

100 0.0631 0.0221 1.1

150 0.0316 0.0221 1

200 0.0316 0.0156 1 TABLE 20: OPTIMAL COMBINATION OF META-PARAMETERS FOR RC

It is clear in the gold price model that the leak rate is close to zero meaning that the model will slowly

adapt. The spectral radius is close to one which is common in reservoir computing. The input scaling

for the gold price model is low. Since the input scaling is a degree for non-linearity, a low input

scaling shows that the reservoir is more linear and as such the neurons less saturated.

The average training error is a bit smaller than the average validation error (TABLE 21).

Neurons Average NNSE train Std NMSE train

50 0.0077 0.0014

100 0.0057 0.0017

150 0.0052 0.0015

200 0.0053 0.0015 TABLE 21: TRAINING ERROR FOR RC

When looking at the average test error of 20 tests and its standard deviation (TABLE 22), it appears

that there is too much variation in testing. However the 20 tests show that these results fall apart

into two groups, one group with on average a NMSE of 0.1 and another group where the NMSE is

above two. Before looking into ways to solve this problem, the model that yielded the best results is

presented in TABLE 23 and GRAPH 18.

Neurons Average NMSE test Std NMSE test

50 1.8946 6.9254

100 3.7645 5.5540

150 7.1580 11.4967

200 4.1852 5.9896 TABLE 22: TEST ERROR FOR RC

Neurons NMSE test HIT test MAE test Regularization parameter

50 0.0997 65.2174% 0.0715 0.0032

100 0.2507 65.2174% 0.1263 0.0100

150 0.0988 60.8696% 0.0683 0.0032

200 0.1029 60.8696% 0.0778 0.0032 TABLE 23: TEST ERROR OF BEST MODEL FOR RC

46

GRAPH 18: PREDICTED VS ACTUAL GOLD PRICE FOR 200 NEURONS (RC)

A possible reason for the variation in the test results is the fact that during cross-validation the model

could already recognize the relations between variables that were coming in the future, but were

unseen in the past. If for example the sixth set of thirty was used for validation, the model could

learn from relationships behind number six but that were unseen until set five. This explains why the

model has a low validation error and a non-stable test error.

There are two different ways to get better and more stable test results. The first suggested

improvement is to leave out a certain amount of data after each validation set in order that the

model cannot learn from future patters that are closely located in time to the validation set. A

second way to improve the model is by decreasing the number of folds. If the number of folds are

decreased, the validation set will become larger. This makes it more difficult for the model to learn

from relationships in the short run that are located right behind a validation set. If this statement

holds true, the validation error should increase. To test this hypothesis in the thesis, the number of

folds was decreased from 30 to 4. The results are presented in TABLE 24. As can be seen from TABLE

24, the average validation error is four times larger than the average validation error when 30 folds

were used. This might be an indication that the hypothesis was correct. The test results with four

folds in TABLE 25 show a smaller average NMSE error with also a smaller standard deviation in

comparison when 30 folds were used.

Neurons Average NMSE validation

50 0.0506

100 0.0517

150 0.0460

200 0.0475 TABLE 24: VALIDATION ERROR FOR RC (4 FOLDS)

$ 1.000,00

$ 1.100,00

$ 1.200,00

$ 1.300,00

$ 1.400,00

$ 1.500,00

$ 1.600,00

$ 1.700,00

$ 1.800,00

$ 1.900,00

jan

/10

mrt

/10

mei

/10

jul/

10

sep

/10

no

v/1

0

jan

/11

mrt

/11

mei

/11

jul/

11

sep

/11

no

v/1

1

Predicted gold price

Actual gold price

47

Cross-testing is a technique comparable to cross-validation that can be used to test if the suggested

reason for the variation in the test results, is true. Cross-testing is often suggested in theory but

seldom applied in practice. However for financial time series analysis, cross-testing might have its

use. This technique can help to determine if models learn too much from information located just

behind the validation set.

Neurons Average NMSE test std NMSE test

50 0.3067 0.2750

100 0.3080 0.2727

150 0.4279 0.3784

200 0.4409 0.3727 TABLE 25: TEST ERROR FOR RC (4 FOLDS)

A way to improve the model from reservoir computing is to not only train the output weights but

also training the input and recurrent weights. One of the ways to do this, is called backpropagation

through time. Another advantage of applying backpropagation through time is the fact that it is

possible to determine the most important variables of a model. This feature can help in answering

the second research question, since reservoir computing is not capable of determining the most

important variables in a model.

It would be interesting to research if reservoir computing can add value to classical linear regression

by trying to model the residuals coming from linear regression, using the same input variables that

were used in developing the linear model. If the reservoir computing model can model the residuals,

then the added value of reservoir computing becomes clear, since it is able to help explain things that

were not explained by linear regression.

48

7.2.Backpropagation through time

Backpropagation through time is an often used classical gradient method to train recurrent neural

networks. In comparison to reservoir computing it requires more computational power. However in

general the results it yields are more accurate and stable.

Technique

Before the technique is explained in more detail, some important notations are given (FIGURE 6):

: Weight matrix from neural network and input to outputzero initialised and trained

: Weight matrix from input to neural networkrandomly initialised and trained

: Weight matrix between neurons in the neural networkrandomly initialized and trained (optional)

Input V Recurrent neural network U Output FIGURE 6: EXAMPLE OF A NETWORK WITH IN- AND OUTPUT (BPTT)

Initially a maximum number of iterations is determined. For every iteration until the maximum

number of iterations is reached, the network is trained and the states of the neurons are

determined. Each iteration results in an output and as such in an error which is the deviation from

the desired output. Per iteration, a gradient is calculated of the mean squared error with respect to

the different weight matrices. After calculating the gradient, the different weight matrices are

updated by subtracting the multiplication of the gradient with a learning rate .

In this thesis, backpropagation through time is applied in combination with cross-validation. The

training data is, as explained in the reservoir computing part, divided in different parts, where one

will serve as a validation set. For every validation fold, , and are reinitialised. Since there is a

gold price prediction for every iteration, there is also a validation error available. The optimal

49

number of iterations is the number where the validation error is minimal. This is because there can

be overfitting, which results in an increasing error when increasing the number of iterations.

The optimal combination of input scaling, learning rate, spectral radius and leak rate will be

determined by choosing the combination where the validation error for the optimal number of

iterations is minimal. The described process with the optimal meta-parameter combination is then

repeated on the entire training set, so there is no distinction between training and validation, until

the optimal number of iterations is reached. For the optimal number of iterations the gold price will

be predicted on the test set.

Results

The neural network uses 12 input variables to determine the :

,

,

,

,

and . The data ranges

from 11/1990 until 12/2011 and these 254 months will be divided into three subparts:

1-50: warm-up drop

51-230: training with double cross-validation

231-254: testing

For the cross-validation there will be four parts of each 45 months. The number of neurons that will

be used in this thesis is 100. The ideal combination of input scaling, leak rate, learning rate and

spectral radius is found where the validation error for the optimal number of iterations is minimal.

Testing will be performed at least five times so that there is an average test and training error plus

standard deviation on the test and training error. The two remarks that were made for reservoir

computing, are also important for backpropagation through time.

Backpropagation through time will be executed in three different ways. The first model will take the

optimal input scaling, spectral radius and leak rate from the best model in reservoir computing. This

model will only optimize and . The recurrent weights are not yet trained since they are also not

trained in reservoir computing. The second model will perform cross-validation for different meta-

parameters to get the optimal combination of input scaling, spectral radius, learning rate and leak

rate. This model will also only optimize and . The third model will perform cross-validation to get

the optimal combination of input scaling, spectral radius, learning rate and leak rate. However this

model will not only optimize and but also .

50

Optimal meta-parameters coming from RC without training (BPTT-model 1)

The optimal meta-parameters that were used in this model, were the optimal meta-parameters

coming from reservoir computing with 200 neurons and 30 folds:

Learning rate = 0.0005

Spectral radius = 1

Leak rate = 10^-1.5 = 0.03162

Input scaling = 2^-6 = 0.01563

The average performance of the model with the input meta-parameters coming from reservoir

computing, is presented in TABLE 26. The standard deviation on the different NMSE’s is presented in

TABLE 27. TABLE 26 and TABLE 27 show that the average result is in line with what was found in

reservoir computing when four folds were used. However in this specific case the best result of

backpropagation through time with four folds (TABLE 28) does not reach the NMSE test measure of

0.1 as was reached in the best case with reservoir computing on 30 folds (TABLE 23).

Number of neurons 100

Average NMSE test 0.5816

Average NMSE train 0.0066

Average NMSE validation 0.0468 TABLE 26: ERROR MEASURES FOR BPTT-MODEL 1


Std NMSE test 0.1141

Std NMSE train 0.0004

Std NMSE validation 0.0083 TABLE 27: STANDARD DEVIATION ON ERROR MEASURES FOR BPTT-MODEL 1


NMSE test 0.4559

NMSE train 0.0061 TABLE 28: ERROR OF THE BEST BPTT-MODEL 1

In GRAPH 19 the actual en predicted gold price are presented. The predicted gold price comes from

the best model (TABLE 28).

51

GRAPH 19: PREDICTED VS ACTUAL GOLD PRICE FOR 100 NEURONS (BPTT-MODEL 1)

As described earlier in this thesis, a major advantage of backpropagation through time is its ability to

demonstrate the importance of different variables. Backpropagation through time is one of the

techniques a researcher can use to indicate if an explanatory variable has a more linear relationship

with the dependent variable or a more non-linear relationship.

To look how important the presence of the neural network, previous gold price and other input

variables is, the standard deviation on the part of the gold price that was predicted by one of these

three categories, is calculated. TABLE 29 shows that the impact of the neural network on the

predicted gold price is even higher than the impact of the previous gold price on the predicted gold

price. This proofs that there is a substantial amount of gold price prediction that comes from the

neural network which contains non-linear neurons.

Std in predicted gold price

Impact of neural network 0.4419

Impact of previous gold price 0.3702

Impact of other input variables 0.3077 TABLE 29: IMPACT OF THE DIFFERENT CATEGORIES ON THE PREDICTED GOLD PRICE FOR BPTT-MODEL 1

It is also interesting to see which variables impact the predicted gold price via which way. There are

two options for a variable to have an impact on the predicted gold price. The first way is via the

neural network, . The impact that is caused by the neural network is substantial as showed by

TABLE 29. Variables can however also have a direct impact on the predicted gold price without

passing through the neural network (a part of , which is a linear relationship since there are no

neurons involved. In GRAPH 20 the importance of the 12 independent variables via the direct way is

presented. We assume that the importance of a variable is approximated by the norm of the vector

$ 1.000,00

$ 1.100,00

$ 1.200,00

$ 1.300,00

$ 1.400,00

$ 1.500,00

$ 1.600,00

$ 1.700,00

$ 1.800,00

$ 1.900,00

jan

/10

mrt

/10

mei

/10

jul/

10

sep

/10

no

v/1

0

jan

/11

mrt

/11

mei

/11

jul/

11

sep

/11

no

v/1

1

Actual gold price


52

representing the variable in the model. In TABLE 30 the variables belonging to the different numbers

are shown.

1

2

3

4

5

6

7

8

9

10

11

12

TABLE 30: INDEPENDENT VARIABLES

GRAPH 20: IMPORTANCE OF DIFFERENT INDEPENDENT VARIABLES VIA THE DIRECT WAY U (BPTT-MODEL 1)

The three variables having an important direct impact are the gold price in the previous period, the

dollar-world exchange rate and the CPI level of the world (GRAPH 20).

The four variables having an important impact via the neural network are the beta of gold, the

inflation in the US, the dollar-world exchange rate and the gold lease rate (GRAPH 21).

GRAPH 21: IMPORTANCE OF DIFFERENT INDEPENDENT VARIABLES VIA THE INDIRECT WAY V (BPTT-MODEL 1)

0,001

0,01

0,1

1

10

100

1 2 3 4 5 6 7 8 9 10 11 12

Importance (via U)

0

0,5

1

1,5

2

2,5

3

3,5

1 2 3 4 5 6 7 8 9 10 11 12

Importance (via V)

53

Optimizing the different meta-parameters via cross-validation without training (BPTT-model 2)

Cross-validation for different meta-parameters, resulted in an optimal combination of meta-

parameters that yielded the lowest validation error. The optimal combination of meta-parameters:


Spectral radius = 1.2

Leak rate = 1/125 = 0.008

Input scaling = 0.2

The average performance during training, validation and testing is presented in TABLE 31. It is

interesting to see that the average NMSE during the test period is as low as the best model of

reservoir computing. Another positive aspect of this model is its low standard deviation for the

different NMSE’s (TABLE 32). This model is good at predicting the gold price but on top is capable of

doing this in a consistent way (low standard deviation).









The outcome of the best model is presented in TABLE 33 and GRAPH 22.


NMSE test 0.0908


54


The average optimal number of iterations where the validation error is minimal equals 12027 with a

standard deviation of 1014.4. As can be seen in TABLE 34, the impact of the previous gold price

increased in comparison with the previous model. The standard deviation went up from 0.3702 to

0.4356. At the same time the impact of the eleven other input variables decreased to 0.2287.




Impact of other input variables 0.2287 TABLE 34: IMPACT OF THE DIFFERENT CATEGORIES ON THE PREDICTED GOLD PRICE FOR BPTT-MODEL 2

The variables with the highest importance to impact the gold price in the direct way are the same as

in the previous model. So the three variables having an important direct impact are the gold price in

the previous period, the dollar-world exchange rate and the CPI level of the world (GRAPH 23). TABLE

30 gives an overview of the variables corresponding with the different numbers.


$ 1.000,00

$ 1.100,00

$ 1.200,00

$ 1.300,00

$ 1.400,00

$ 1.500,00

$ 1.600,00

$ 1.700,00

$ 1.800,00

$ 1.900,00

jan

/10

mrt

/10

mei

/10

jul/

10

sep

/10

no

v/1

0

jan

/11

mrt

/11

mei

/11

jul/

11

sep

/11

no

v/1

1

Actual gold price


0,0001

0,001

0,01

0,1

1

10

100

1 2 3 4 5 6 7 8 9 10 11 12

Importance (via U)

55

The four most important variables to influence the gold price via the neural network, meaning the

non-linear system, are somewhat different compared to the previous model. The first variable that

has an important impact, is the gold price in the previous period. Apparently this variable does not

only have a significant direct impact but also an important indirect one. The second variable is the

price level in the US. The third and fourth variable are the gold lease rate, which is a proxy for the

real interest rate and the volatility of the world inflation (GRAPH 24).


0

5

10

15

20

1 2 3 4 5 6 7 8 9 10 11 12

Importance (via V)

56

Optimizing the different meta-parameters via cross-validation in combination with training

(BPTT-model 3)

Cross-validation for different meta-parameters, resulted in an optimal combination of meta-

parameters that yielded the lowest validation error. The optimal combination of meta-parameters:


Spectral radius = 1.2

Leak rate = 1/125 = 0.008

Input scaling = 0.04

The average performance during training, validation and testing is presented in TABLE 35. Since the

weights inside the neural network are also trained, it is expected that this model should have to

return a lower validation error than the previous one. In TABLE 35 the average validation error equals

0.0356 which is lower than the average validation error of TABLE 31, 0.0533.





The results in TABLE 35 also have a low standard deviation which proves that this model is stable,

just like the previous one (TABLE 36).





The NMSE on the train and test set of the best model can be seen in TABLE 37. The corresponding

predicted gold price for this model is presented in GRAPH 25.


NMSE test 0.0947


57


The average optimal number of iterations where the validation error is minimal equals 19111 with a

standard deviation of 811.1. Because in this model, also the weights in the neural network need to

be trained, it takes more iterations to reach to optimal point in comparison with the previous two

models. TABLE 38 shows that in this model the impact of the eleven other input variables becomes

larger compared to the previous model. There is an increase of almost 15%. The impact of the neural

network remains almost stable even though the recurrent weights are trained. There is only a minor

increase of only 2.7% compared to the situation where is not trained. The impact of the previous

gold price decreases, although not substantially.




impact of other input variables 0.26 TABLE 38: IMPACT OF THE DIFFERENT CATEGORIES ON THE PREDICTED GOLD PRICE FOR BPTT-MODEL 3

The variables with the highest importance to impact the gold price in a direct way, are the same as in

the previous model. So the three variables having an important direct impact are the gold price in the

previous period, the dollar-world exchange rate and the CPI level of the world (GRAPH 26). An

overview of the corresponding variables with the different numbers can be found in TABLE 30.

1000

1100

1200

1300

1400

1500

1600

1700

1800

1900

jan

/10

mrt

/10

mei

/10

jul/

10

sep

/10

no

v/1

0

jan

/11

mrt

/11

mei

/11

jul/

11

sep

/11

no

v/1

1

Actual gold price


58


Three of the four variables that have an important indirect impact also had an import impact in the

previous model even though then the recurrent weights were not yet trained. The previous model is

already capable of determining the more important variables. The three variables are the price level

in the US, the gold lease rate and the volatility of the world inflation. The fourth variable to have an

important impact, is the inflation in the US (GRAPH 27).


0,0001

0,001

0,01

0,1

1

10

100

1 2 3 4 5 6 7 8 9 10 11 12

Importance (via U)

0

2

4

6

8

10

12

14

1 2 3 4 5 6 7 8 9 10 11 12

Importance (via V)

59

8.Gaussian processes for machine learning

8.1.Technique Gaussian processes is another machine learning technique to model the gold price. Via Gaussian

processes, certain non-linear relationships between variables can be modelled, where linear

regression would face difficulties in modelling. This chapter will first explain the concept of Gaussian

processes in a qualitative way, after which it will be presented in a quantitative, generalising manner.

This chapter is based on the book “Gaussian Processes for machine learning” by Carl Edward

Rasmussen and Christopher K.I. Williams.

To explain the concept of Gaussian processes, I will only model the relationship between two

variables, y and x. For each data point x, we generate a Gaussian, centred on x. The Gaussian process

then approximates y by taking a linear combination of said Gaussians. The objective of Gaussian

processes is to minimize the mean squared error between every outputted value and the desired

output. It is in fact similar to more classical linear regression. The Gaussian process can then be used

to predict output values for new values of x. In GRAPH 28 the grey line shows how to get the

predicted value and compares it with the actual value of y.

GRAPH 28: GAUSSIAN PROCESSES (DOTTED LINE: INDIVIDUAL GAUSSIANS; FULL LINE: LINEAR COMBINATION OF GAUSSIANS)

The process when having multiple independent variables is similar to the one described above. The

only thing that changes, is the number of dimensions.

60

If there are N data points available for training, the relation between y and x: y = can be

determined by using the formula’s below. Assuming that the number of independent variables is ,

the formula’s are:

With being point i in the training set and being a new point. The different values of can be

found by minimizing the MSE. A Gaussian process is specified by a mean function and a

covariance function . The covariance function that will be used in this thesis, is the

multivariate Gaussian.

With being the component j of . To optimize the hyper-parameter of the covariance function,

automatic relevance determination will be used. The objective of using this technique, is to get a

clear view on the importance of a dimension (Rasmussen & Williams, 2006).

8.2.Results The model uses 12 input variables to determine the

:

,

,

,

,

and . The data ranges from

11/1990 until 12/2011 and these 254 months will be divided into two subparts:

1-230: training with cross-validation

231-254: testing

For the cross-validation there will be 5 parts of each 46 months. The optimal initial starting value of

the hyper-parameter for the covariance function will be determined via cross-validation. Cross-

validation is executed to avoid local optima.

The training results for the Gaussian processes (TABLE 39) are better than the training results coming

from linear regression, TABLE 18. This is as expected, since Gaussian processes include the training

data in their solution.

NMSE test 0.6156

NMSE train 0.0081

NMSE validation 0.0263 TABLE 39: RESULTS FOR GAUSSIAN PROCESSES

61

The test NMSE is however somewhat disappointing since it is underperforming most of the other

gold pricing models. This is due to the fact that the dynamics that can be observed during the last

two years were not present in the training set.

In GRAPH 29 one can compare the actual gold price against the predicted gold price for the test

period. The model underestimated the gold price during testing. This points to the fact that the

recent surge in the gold price of the last two years is extraordinary and unexpected.

GRAPH 29: PREDICTED VS ACTUAL GOLD PRICE (GP)

The importance of the different variables is depicted in GRAPH 30. The higher a bar is, the higher the

importance of the variable. The importance of a variable is equal to the maximum average of the

logarithm of the length scale that was found minus the average logarithm of the length scale for a

variable. An overview of the different variables can be found in TABLE 30. According to this model,

the most important variables are the gold price in the previous period, the beta of gold, the US-world

exchange rate, the gold lease rate and the price level in the world. It is interesting to see that the all

the models from this thesis have a similar idea about the importance of the different variables.

$ 1.000,00

$ 1.100,00

$ 1.200,00

$ 1.300,00

$ 1.400,00

$ 1.500,00

$ 1.600,00

$ 1.700,00

$ 1.800,00

$ 1.900,00

jan

/10

mrt

/10

mei

/10

jul/

10

sep

/10

no

v/1

0

jan

/11

mrt

/11

mei

/11

jul/

11

sep

/11

no

v/1

1

Actual gold price


62

GRAPH 30: IMPORTANCE OF DIFFERENT INDEPENDENT VARIABLES (GP)

0

1

2

3

4

5

6

7

8

9

10

1 2 3 4 5 6 7 8 9 10 11 12

Importance

63

9.Discussion

This part of the thesis will respond to the two research questions by relying on the findings from the

previous chapters.

1. Compare the modelling and forecasting power of classic regression techniques against more

advanced non-linear machine learning techniques.

To compare the modelling and forecasting power of different techniques, one should, in theory look

at the average test errors since this is the most objective way. However since no cross-testing was

performed, a better option is to compare the average validation error even though there might be

some overfitting on the meta-parameters. The validation error is validated on the entire training set,

as such there is less chance involved when having a low error on the test period. If only the results of

the test period would be taken into account, it would be possible to achieve an excellent test result,

being just a lucky shot. In this thesis, also the result on the test period is taken into account to check

for the robustness of a certain model, in order to analyse whether the technique also performs well

on unknown data. In TABLE 40 an overview of the different error measures including their standard

deviation is presented. A naive model is included, which relies only on the gold price from the

previous time step to predict for the current time step.

Naive

ADL-4 folds

RC-30 folds

RC-4 folds

BPTT-model1

BPTT-model2

BPTT-model3

Gaussian processes

Average NMSE test 0.1479 0.1287 4.1852 0.4409 0.5816 0.0927 0.1059 0.6156

Best NMSE test NA NA 0.1029 0.3727 0.4559 0.0908 0.0947 NA

Std NMSE test NA NA 5.9896 0.3727 0.1141 0.0013 0.0071 NA

Average NMSE validation 0.0120 0.0119 0.0125 0.0475 0.0468 0.0533 0.0356 0.0263

Std NMSE validation NA NA 0.001 0.0045 0.0083 0.0014 0.0007 NA TABLE 40: DIFFERENT ERROR MEASURES FOR THE DIFFERENT MODELS (BEST CELL PER LINE IN GREY)

As can be seen in TABLE 40, the ADL-model, which represents the classic regression technique,

performs adequately, having a good test error compared to the other techniques and the best

validation error. The ADL-model performs well on the validation error measure if, during the period

where it is validated, the gold price moves in the same way as when estimated. However if the gold

price moves more exponential or non-linear during the testing/validating period but was estimated

on a linear part, the ADL-model faces more difficulties in achieving good results. The ADL-model is

based upon the underlying assumption of a linear system, this is not the case when using recurrent

neural networks, explaining the good result of the RC and BPTT-models on the test period. Since

during the test period (2010-2011) the gold price exploded, RC and BPTT perform well and face less

difficulties in such circumstances. There are three important advantages that make the ADL-model so

popular. The first reason is that this technique is easy to understand and to implement. A second

64

advantage is the low computational power it demands. For a researcher it is possible to obtain a

result within a few minutes without requiring serious demands regarding computational power. The

third and most important reason, is the easy interpretability of the model. If for example the dollar

appreciates by 1%, the gold price decreases with 0.5%. This kind of conclusions are difficult to make

using the other techniques. Nevertheless, next to these advantages there are some disadvantages. A

first one is that multiple conditions need to be satisfied before running the regression. Another

disadvantage is the inability of the technique to deal with non-linear effects. To overcome these

disadvantages, this thesis uses more advanced, non-linear machine learning techniques to model the

gold price.

The first applied machine learning technique is reservoir computing. A validation error of 0.0125 is

obtained when using 30 folds. The test results are not robust, given the high standard deviation. A

possible reason for this is that during cross-validation it is easy to recognize relations situated around

the validation set and achieve a good validation error. Though, on the test set, such information is

missing, since there are no data available after the test period, which makes it harder to obtain a

good result for the test period. This thesis suggests two ways to solve the problem. The first and best

way to do so, is by omitting a certain number of data points situated right behind the validation set,

making it more difficult to just interpolate for the validation set. A second option is to decrease the

number of folds. If the number of folds is lower, the number of data points per fold increases, making

it difficult to just extract certain relations. As can be seen in TABLE 40 the validation error is higher

when using less folds. The latter is to be expected since it is now more difficult to learn from unseen

relationships. The average test result also decreases, however it is still larger than the test error

resulting from linear regression.

To get an idea of the importance of the different variables and to train not only the output weights,

backpropagation through time is applied. The first model of backpropagation through time takes the

optimal meta-parameters coming from reservoir computing with 30 folds. This model only trains the

input and output weights, not the recurrent weights. The validation and test errors are somewhat

comparable to the result in reservoir computing with 4 folds. The high error confirms once again that

the cross-validation in reservoir computing is not performed well. Acknowledging this, it seems more

interesting to optimize the meta-parameters within the model itself instead of using the reservoir

computing meta-parameters. The second BPTT-model optimizes the meta-parameters without

training the recurrent weights. The results for the validation and test errors are good compared to

the other techniques. The third BPTT-model optimizes the meta-parameters and trains the recurrent

weights. This model has the lowest validation error of the three BPTT-models in combination with a

65

good test result. A disadvantage of training the recurrent weights as well, is that there is a higher

number of optimal iterations needed, implying that more computational power is required.

The validation error for Gaussian processes (GP) is good and outperforming backpropagation through

time. However the test result is somewhat disappointing. The predicted gold prices are too smooth

compared to the actual ones. This is caused by the cross-validation, as explained before. The

predicted gold prices for the last two years, according to GP, are lower than the actual gold prices.

During these two years an increase in the gold price of more than 50% took place. Based upon the

historical relationship between the input parameters and the gold price in the training period and the

input parameters in the test period, GP predicts lower gold prices. This confirms the believes that

certain relationships between the gold price and some input variables were, in the lost two years, no

longer present or that the gold price surged, induced by a huge amount of fear. Relying on this

insight, one can pose the question whether the current gold prices are overvalued?

Based upon the test and validation error, it can be concluded that the best technique is

backpropagation through time with optimizing the meta-parameters and training the recurrent

weights. Reservoir computing is promising but only when it is properly executed. Furthermore, linear

regression is a good technique, since it gives good predictions and is easy to understand and

interpret. An important remark concerning linear regression is that during periods without any

linearity, the technique performs worse than non-linear techniques.

Two important suggestions can be made when applying these techniques in practice. A first

recommendation is to use as many techniques as possible when allowed by the circumstances

(timing). This way different techniques can be used as a kind of confirmation on how the gold price

will behave, which is interesting when there is a lot of money at stake. When conditions do not allow

for the previously described approach, a second suggestion is to consider and analyse all the

different techniques in order to pick the most appropriate one. Each technique has its advantages

and disadvantages, implying that choosing the right technique for a specific problem can improve the

results. To put it metaphorically: “You do not use a butchers knife to put butter on your sandwich!”.

GRAPH 31 presents the actual and predicted gold prices.

To conclude this section, the different techniques are ranked according to their use of computational

power (TABLE 41). The computational power used by a certain technique is approximated by the

time a technique requires to generate a solution.

1 ADL-Model

2 Gaussian Processes

3 Reservoir Computing

4 BPTT TABLE 41: COMPUTATIONAL POWER RANKING

65

GRAPH 31: PREDICTED VS ACTUAL GOLD PRICE (BEST PREDICTION)

$ 1.000,00

$ 1.100,00

$ 1.200,00

$ 1.300,00

$ 1.400,00

$ 1.500,00

$ 1.600,00

$ 1.700,00

$ 1.800,00

$ 1.900,00

jan

/10

feb

/10

mrt

/10

apr/

10

mei

/10

jun

/10

jul/

10

aug/

10

sep

/10

okt

/10

no

v/1

0

dec

/10

jan

/11

feb

/11

mrt

/11

apr/

11

mei

/11

jun

/11

jul/

11

aug/

11

sep

/11

okt

/11

no

v/1

1

dec

/11

Actual Gold Price

Predicted Gold Price (ADL model)

Predicted Gold Price (RC)

Predicted Gold Price (BPPT-model 1)



Predicted Gold Price (Gaussian Processes)

66

2. What are the most important determinants of the gold price according to the different

methods that were used for modelling the gold price?

An overview of the importance of the different variables can be found in TABLE 42. It is interesting to

see that four variables were not appointed as being important for any model. The first one is the

composite leading indicator, providing early signals of turning points between expansions and

slowdowns of economic activity. It is somewhat surprising that there was no model that indicated

this variable as important. A possible explanation could be that these turning points are already

captured by the other variables in the system, like for example inflation. The second and third

variable are the world inflation and the volatility of the US inflation. The reason for their

unimportance could be once again that their information is already captured by the other variables,

like the volatility of the world inflation and the US inflation. The fourth and final variable is the CRDP,

a measure for economic and financial risk. Though most people think of this variable as a crucial

factor, determining the gold price, no evidence for its importance was found.

A common factor in every model is the presence of a variable which is connected with the price level

in the US or the world. So it is important to have at least one of the three following variables included

in the model: the price level in the US, the price level in the world or the inflation in the US. In every

machine learning model the general price level in the world is important, sometimes in combination

with the US price level or inflation.

Interesting to notice is the consistency of the important variables that have a direct impact in the

BPTT- models. Each of the three models considers the previous gold price, the exchange rate and the

price level in the world to be important in a direct, linear way. Of these three variables two were

important in the linear ADL-model, the previous gold price and the exchange rate.

Surprisingly the gold lease rate, being not important according to the ADL-model, is important

according to the machine learning models. Assignable reason for this is that the impact of the gold

lease rate is somewhat more indirect, suggesting that the impact is more non-linear, through which it

cannot be captured by the ADL-model.

Two final remarks can be made about the beta of gold and the volatility of the world inflation. Firstly,

the beta of gold is found to be important by some machine learning models but not by the ADL

model. Nevertheless the beta in the ADL-model is almost significant on the 10% level, which, in the

end, makes it the fourth most important variable in the ADL-model. The importance of the

characteristics of the beta of gold in practice cannot be underestimated since it is one of the most

important reasons for investors to buy gold, in order to diversify their portfolio. The final remark

handles about the importance of the volatility of the world inflation via the indirect way in the two

67

BPTT-models. This volatility, which is higher during crisis, has an important impact which is not

captured by the ADL-model trained on the entire training set. However when the ADL-model is

estimated on a period from 1998 until 2009, the volatility in the world inflation becomes significant.

This shows that artificial intelligence models outperform the linear techniques in modelling when the

importance of the underlying variables of a system changes.

Variables ADL-model

BPTT-model 1 BPTT-model 2 BPTT-model 3 Gaussian Processes Direct Indirect Direct Indirect Direct Indirect

TABLE 42: OVERVIEW OF THE IMPORTANCE OF THE DIFFERENT VARIABLES (IMPORTANT VARIABLE: CELL IS GREY)

In conclusion, eight out of twelve variables are considered to be important in the gold price

modelling. Four of them have a more direct, linear impact: the gold price in the previous period, the

inflation in the US, the exchange rate and the price level in the world. The other four variables will

impact the gold price in a more indirect, non-linear way: the beta of the gold price, the gold lease

rate, the volatility of the world inflation and the price level in the US.

In the research of Ghosh and others (2004) and Levin and Wright (2006) similar determinants are

found to be important, which adds credibility to the results found in this thesis. Ghosh and others

(2004) found that lagged gold prices are important as well as current and lagged gold lease rates.

Furthermore they found three other variables to be important in their study, namely the US-World

exchange rate, the current and lagged price levels in the US and the current and lagged betas of gold.

Levin and Wright (2006) found that lagged gold prices, US inflation, volatility of the US inflation, the

US-World exchange rate, the lagged gold lease rate and the lagged CRDP are important in explaining

gold prices. Both studies used cointegretion regression as technique. This research was not able to

find papers that used any of the other mentioned techniques in combination with relevant input

parameters.

68

Future evolution

Based on the results of this study, the gold price will keep on increasing for at least 3 years, after

which there might be a decrease in the gold price. Six of the eight variables, that determine the

direction of how the gold price will evolve, are described in this section: the inflation in the US, the

US price level, the world price level, the volatility of the world inflation, the exchange rate and the

real interest rate (which is equal to the gold lease rate). The evolution of the other two variables that

have an impact on the gold price, is difficult to determine. Therefore, these variables will not be

described in this section.

Based on the low to moderate inflation expectations in the US, the gold price should remain stable or

increase slowly. However in the media, it is often stated that because of money creation by central

banks, hyper-inflation could occur. There are four reasons why this will probably not be the case.

First of all, it is true that the reserves of the banks increased by more than 4000% in the last 5 years.

However the explosion of reserves resulted in an increase of only 25% in money supply. Reason for

this is that the Federal Reserve pays interests on funds that were placed with them. Banks find it

more interesting to give their excess funds to the Fed instead of lending it to households and

corporations. Secondly the high unemployment and the negative output-gap observed in the US,

puts pressure on wages and prices of products, which will favour a low inflation. Third, the thesis

finds proof supporting the idea that more reserves do not cause higher inflation. Japan tried to use

quantitative easing to fight deflation around 2001, but they never succeeded. The last reason why

excessive reserves will not lead to a high level of inflation, is the power of the Federal Reserve. When

the Federal Reserve observes an increasing trend in inflation, they find a way to manipulate the

excessive reserves in various ways so that inflation can be kept at acceptable levels (Feldstein, 2012).

Based on the expected US-world exchange rate, the gold price is expected to increase. There are four

reasons why the dollar is expected to depreciate, causing an increase in the gold price. First of all, the

bad US trade balance will put downward pressure on the value of the dollar, inducing it to

depreciate. Secondly, according to the purchasing-power parity, cfr. the Big Mac index (The

Economist) on the 12th of January 2012, the dollar is overvalued when comparing with for example

China, Turkey, Russia, India... A the third reason is the fact that the International Monetary Fund and

the UN are looking for a new global currency. This new currency could temper global demand for

dollars, forcing the US dollar into a depreciation. Lastly, increasingly countries are using their

domestic currency to trade with each other, which has once again a negative effect on the demand

side of the US currency. For example China and Japan no longer trade with each other using the

dollar, but now use their own, domestic currency.

69

Since real interest rates are expected to remain low, even negative, investors find it attractive to buy

gold, implying an increase in the demand for gold. This rise in demand will translate itself into higher

gold prices. The real interest rate equals the nominal interest rate minus inflation. Since the Federal

Reserve announced to keep nominal interest rates historically low at least until the end of 2014 and

because inflation is also low, real interest rates are expected to stay close to zero for another three

years.

Price level in the US and the world are expected to increase at slow to moderate paces, leading to

higher gold prices. The volatility of the world inflation is expected to remain high as long as there is a

crisis (supra). Higher volatility of the world inflation will cause a higher demand for gold, which will

result in a higher gold price.

Combining these six elements, suggests a future increase in the gold price. A potential, maximum

price that will be reached in the coming 2 to 3 years, is $2465.57 per ounce, since this is the gold

price in constant 2011 terms that was reached on January 21 in 1980. As soon as the gold price

reaches this psychological border and one of the underlying drivers of the gold price changes, the

gold price will rapidly decrease since gold is nowadays a liquid asset due to the existence of ETF’s. For

example, an underlying driver that can change, is the real interest rate. If the economy picks up over

3 years time and interest rates start to rise, investors will realize that gold offers no annual return like

shares with dividends. In the words of Warren Buffett: “If you offer me the choice of looking at some

67 foot cube of gold all day or having all the farm land in the US plus 7 Exxon Mobil’s plus a trillion

dollars, I would choose the second option! You can maybe fondle the cube but it won’t produce

anything.” (De Tijd, 2012)

70

10.Conclusion

Throughout the history of mankind, gold has always been an intriguing asset. The Greeks, the

Romans, the Egyptians… all used gold as a trade currency, a token of wealth or a safe-haven in times

of political or financial turmoil. Due to the recent economic problems, gold is prominently back in the

world of investing, leading to new heights in the nominal gold price. In constant 2011 terms the gold

price is however not yet reaching its top, which was reached in January 1980 with a monthly gold

price of $1958.85.

Modelling the price of gold is a difficult job. Unlike assets like shares or bonds, gold does not offer

any dividends or interest payments. As such, the price cannot be calculated by discounting future

cashflows. Another way to look at the problem is by investigating the drivers of supply and demand

of gold and their impact on the expected gold price. This thesis researched different modelling and

forecasting techniques for the gold price, but also the diverse determinants constituting the gold

price. The different techniques that were investigated, were linear regression, reservoir computing,

backpropagation through time and Gaussian processes. The approach that is followed to answer

these questions is known as the inflation-hedge approach.

The best technique is backpropagation through time with optimizing the meta-parameters and

training the recurrent weights. Another promising technique is reservoir computing, but only when it

is properly executed, meaning leaving data points out after each validation set. Linear regression

provides good results and is easy to understand and interpret. Two important remarks need to be

made concerning linear regression. First, for the period November 1990 until December 2011 no long

run relationship between the gold price and the general price level in the US was found. However

this does not mean that gold cannot be used to hedge against inflation on the long run. Since there

was no long run relationship, this thesis used an ADL-model instead of a cointegration regression.

The second remark concerning linear regression is the fact that during periods where the linearity is

absent, linear regression performs worse than non-linear techniques.

Two suggestions can be made when applying these techniques for financial time series. If time allows

so to do, use as many techniques as possible. If they confirm each other then the likelihood that the

gold price will indeed move in the suggested direction, will be higher as compared to a situation

where all techniques tend to disagree on gold price movements. The second suggestion is to think

before acting. When solving a problem, start with analysing the different characteristics of the

problem. Based upon this investigation, choose the technique which fits best. Every technique has its

advantages and disadvantages which makes it more suitable for solving a specific problem.

71

To answer the second research question, the importance of the different variables in the different

techniques was investigated. Eight of the twelve variables are considered to be important in the gold

price modelling. Of these eight, four have a direct, linear impact: the gold price in the previous

period, the inflation in the US, the exchange rate and the price level in the world. The other four

variables will impact the gold price in an indirect, non-linear way. These four variables are: the beta

of the gold price, the gold lease rate, the volatility of the world inflation and the price level in the US.

To put it in a simplistic way, the gold price at time t equals to the gold price at time t-1 plus some

extra’s.

Based on the results of this study, the gold price is expected to increase for at least 3 years, after

which there might be a decrease in the gold price. Six of the eight important variables, that

determine the direction of how the gold price will evolve, show an evolution that induces an increase

of the gold price. The evolution of the other two variables that have an impact on the gold price, is

difficult to determine. Therefore, these variables were not described. A potential, maximum price

that will be reached in the coming 2 to 3 years, is $2465.57 per ounce, since this is the gold price in

constant 2011 terms that was reached on January 21 in 1980.

Lastly some suggestions for future research can be made. An important step would be to include

cross-testing for all the different models, so that the different techniques can be compared in an

objective way. It would also be interesting to research if non-linear machine learning techniques can

add value to classical linear regression by trying to model the residuals coming from linear

regression, using the same input variables that were used in developing the linear model.

Furthermore, linear regression can be improved by using the Johansen test, which allows more

variables to be cointegrated and as such is able to capture more dynamics. A last point of further

improvement for future research is the possibility to include a political risk variable.

XI

VIII References Bollen, J., Mao, H., Zeng, X., 2010, Twitter mood predicts the stock market. Journal of

Computational Science, vol. 2, pp. 1-8

Cai, C.X., Cheung, Y.-L., Wong, M.C.S., 2001, What moves the gold market? Journal of

Futures Markets 21, pp. 257–278

Calvo, G.C., Reinhart, C.M., 2002, Fear of floating, The Quarterly Journal of Economics, vol.

117, pp. 379-408

Coudert, V., Couharde, C., Mignon, V., 2010, Exchange Rate Flexibility across Financial

Crises, CEPII

Diba, B. and Grossman, H., 1984, Rational Bubbles in the Price of Gold, NBER Working Paper:

1300. Cambridge, MA, National Bureau of Economic Research

Dooley, M.P., Isard, P. and Taylor, M.P., 1995, Exchange Rates, Country-specific Shocks and

Gold, Applied Financial Economics, vol. 5, pp. 121-129

Eichengreen, B., Flandreau, M., 1997, Gold Standard In Theory & History, Routledge, Taylor

& Francis Group

Enders, W., 2004, Applied Econometric Time Series, Wiley, pp. 213

Engle, R.F. and Granger, C.W.J., 1987, Cointegration and Error-Correction: Representation,

Estimation and Testing, Econometrica, vol. 55, pp. 251-276

Ghosh, D.P., E.J. Levin, P. Macmillan and R.E. Wright and, 2004, Gold as an Inflation Hedge?,

Studies in Economics and Finance, vol. 22, no. 1, pp. 1-25

Ismail, Z., Yahya, A., Shabri, A., 2009, Forecasting gold prices using multiple linear regression

method, American Journal of Applied Sciences 6, no. 8, pp. 1509-1514

Jaeger, H., 2005, A tutorial on training recurrent neural networks, covering BPTT, RTRL, EKF

and the “echo state network” approach

Kolluri, B.R., 1981, Gold as a hedge against inflation: an empirical investigation. Quarterly

Review of Economics and Business 21, pp. 13–24

Lampinen, A., 2007, Gold Investments and short- and long-run price determinants of the

price of gold, Working paper, LUT, School of Business

Levin, E.J., Wright R.E., 2006, Short run and long run determinants of the gold price. World

Gold Council, London

Mahdavi, S., Zhou, S., 1997, Gold and commodity prices as leading indicators of inflation:

tests of long run relationships and predictive performance. Journal of Economics and

Business 49, pp. 475–489

XII

McCann, P., Kalman, B., 1994, A neural network model for the gold market, Washington

University

Mirmirani, S., Li, H., 2004, Gold Price, neural networks and genetic algorithm,

Computational Economics 23, no. 4, pp. 193-200

Moore, G., 1990, Gold Prices and a Leading Index of Inflation, Challenge, vol. 33, pp. 52-56

Parisi, A., Parisi, F., Diaz, D., 2008, Forecasting gold price changes : rolling and recursive

neural network models, Journal of multinational financial management 18, no. 5, pp. 477-

487

Pindyck, R.S., 1993, The Present Value Model of Rational Commodity Pricing, Economic

Journal, vol. 103, pp. 511-530

Ranson, D., 2005b, Inflation Protection: Why Gold Works Better Than “Linkers”, London,

World Gold Council

Rasmussen, C., Williams, C., 2006, Gaussian Processes for Machine Learning, the MIT Press

Shafiee, S., Topal, E., 2010, An overview of global gold market and gold price forecasting.

Resources Policy, vol. 35, no. 3, pp. 178-189

Sherman, E.J, 1983, A Gold Pricing Model, Journal of Portfolio Management, vol. 9, pp. 68-70

Sjaastad, L.A. and F. Scacciallani, 1996, The Price of Gold and the Exchange Rate, Journal of

Money and Finance, vol. 15, pp. 879-897

Websites:

World Gold Council, 2002, Gold Demand Trends, Retrieved October 15, 2011,

<http://www.gold.org/investment/research/regular_reports/gold_demand_trends/>













http://www.gold.org/investment/research/regular_reports/gold_demand_trends/







XIII







US Geological Survey, 2011, Mineral Commodity Summaries, Retrieved October 15, 2011,

< http://minerals.usgs.gov/minerals/pubs/commodity/gold/mcs-2011-gold.pdf>

Newspaper articles:

Feldstein, M., 2012, Fed Policy and Inflation Risk, De Tijd

Waarom wantrouwen? (2012, March 8). De Tijd




DETERMINANTS OF THE GOLD PRICE - Ghent University€¦ · UNIVERSITEIT GENT FACULTEIT ECONOMIE EN...

Documents

Transcript of DETERMINANTS OF THE GOLD PRICE - Ghent University€¦ · UNIVERSITEIT GENT FACULTEIT ECONOMIE EN...