STOCK MARKET PREDICTION

Pawan Kumbhare; Rohit Makhija; Hitesh Raichandani

SANTA CLARA UNIVERSITY

PREFACE

This report was prepared in fulfillment of the requirements for the course Pattern Recognition and Data Mining in March 2016, under the supervision of Dr. Ming-Hwa Wang.

For this project we studied various concepts related to the stock market and how they can be used. We also studied various machine learning algorithms and tools that can be applied to the problem.

The project aims to apply two machine learning algorithms, Decision Trees and Support Vector Machines, and to analyze how well each performs at predicting the stock market.

ACKNOWLEDGEMENT

Apart from our own efforts, the success of any project depends largely on the encouragement and guidance of many others. We take this opportunity to express our gratitude to the people who have been instrumental in the successful completion of this project. We would like to express our deepest appreciation to Dr. Ming-Hwa Wang and thank him for his tremendous support and help. The guidance and support received from everyone who contributed, and continues to contribute, to this project were vital to its success.

TABLE OF CONTENTS

PREFACE
ACKNOWLEDGEMENT
TABLE OF CONTENTS
LIST OF FIGURES

1 INTRODUCTION
  1.1 OBJECTIVE
  1.2 WHAT IS THE PROBLEM?
  1.3 WHY THIS IS A PROJECT RELATED TO THIS CLASS?
  1.4 WHY OTHER APPROACH IS NO GOOD?
  1.5 WHY YOU THINK YOUR APPROACH IS BETTER?
  1.6 STATEMENT OF THE PROBLEM
  1.7 AREA OR SCOPE OF INVESTIGATION

2 THEORETICAL BASES AND LITERATURE REVIEW
  2.1 DEFINITION OF THE PROBLEM
  2.2 THEORETICAL BACKGROUND OF THE PROBLEM
  2.3 RELATED RESEARCH TO SOLVE THE PROBLEM
  2.4 ADVANTAGE / DISADVANTAGE OF THOSE RESEARCH
  2.5 OUR SOLUTION TO SOLVE THIS PROBLEM
  2.6 WHY OUR SOLUTION IS DIFFERENT FROM OTHERS?
  2.7 WHY OUR SOLUTION IS BETTER?

3 HYPOTHESIS
  3.1 POSITIVE / NEGATIVE HYPOTHESIS

4 METHODOLOGY
  4.1 HOW TO COLLECT INPUT DATA?
  4.2 HOW TO SOLVE THE PROBLEM?
    4.2.1 ALGORITHM DESIGN
      USING DECISION TREES
      USING SUPPORT VECTOR MACHINES
    4.2.2 LANGUAGE USED
      R (PROGRAMMING LANGUAGE) [4]
    4.2.3 TOOLS USED
      RSTUDIO DESKTOP [5]
  4.3 HOW TO GENERATE OUTPUT?
  4.4 HOW TO PROVE CORRECTNESS?

5 METHODOLOGY
  5.1 CODE
    5.1.1 DECISION TREE IMPLEMENTATION CODE
    5.1.2 SVM IMPLEMENTATION CODE
  5.2 DESIGN DOCUMENT AND FLOWCHART
    5.2.1 DESIGN DOCUMENT
      METHODS USED FOR INDICATORS
      METHOD USED FOR DECISION TREE
      METHOD USED FOR PRUNING DECISION TREE
      METHOD USED FOR PREDICTING THE OUTPUT
      METHODS USED FOR EVALUATING THE MODEL
      METHOD USED FOR SVM
      METHOD USED FOR PREDICTING THE OUTPUT
      METHODS USED FOR EVALUATING THE MODEL
    5.2.2 FLOWCHART

6 DATA ANALYSIS AND DISCUSSION
  6.1 OUTPUT GENERATION
  6.2 OUTPUT ANALYSIS
  6.3 COMPARE OUTPUT AGAINST HYPOTHESIS
  6.4 STATISTIC REGRESSION
  6.5 DISCUSSION

7 CONCLUSION AND RECOMMENDATIONS
  7.1 SUMMARY AND CONCLUSION
  7.2 RECOMMENDATIONS FOR FUTURE STUDIES

8 BIBLIOGRAPHY

9 APPENDICES
  9.1 PROGRAM FLOWCHART
  9.2 PROGRAM SOURCE CODE AND DOCUMENTATION
    9.2.1 DECISION TREE IMPLEMENTATION CODE
    9.2.2 SVM IMPLEMENTATION CODE
  9.3 INPUT / OUTPUT LISTING
    INPUT
    OUTPUT

LIST OF FIGURES

Figure 1: Steps to collect input data
Figure 2: Steps to generate output

Table 1: Confusion matrix
Table 2: Effect of indicators on prediction accuracy

ABSTRACT

Predicting the direction of the stock market may serve as an early recommendation system for short-term investors and as an early financial distress warning system for long-term shareholders. Forecasting accuracy is the most important factor in selecting a forecasting method, and research efforts to improve the accuracy of forecasting models have been increasing over the last decade. Selecting stocks that are suitable for investment is a very difficult task. The key objective for each investor is to earn the maximum profit on their investments.

In this paper, the Support Vector Machine (SVM) algorithm is used. SVM is a specific type of learning algorithm characterized by the capacity control of the decision function, the use of kernel functions, and the sparsity of the solution. We investigate the predictability of financial movement with SVM. To evaluate the forecasting ability of SVM, we compare its performance with that of Decision Trees.

These methods are applied to two years of data retrieved from Yahoo Finance. The results will be used in future research efforts to analyze stock prices and their prediction in greater depth.
