Choosing parameters/model for stock data prediction.

How can we predict stock prices using modern computing power and technologies such as neural networks and data filtering? Here are the results of my research into which models and parameters can be chosen to get a good enough prediction.
For the calculations below I used the NPREDICT program from [1] and real stock data for 2006.
1. Number of lags in input data.
Here is the table with values of cumulative error for 30 points based on the number of lags. The error is calculated as:
SUM((actual_value - predicted_value)^2)

Number of lags:  0    1    4     5     6     7    8    9    10   11
Error:           2.8  3.0  3.15  3.17  3.19  4.9  8.6  7.7  7.5  9.4
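
To make the meaning of "number of lags" concrete, here is a minimal Python sketch of how lagged inputs could be built for one-step-ahead prediction. It assumes that "k lags" means the current value plus the k values before it; how NPREDICT defines lags internally is not stated here, and the function name make_lagged_samples and the sample prices are hypothetical.

```python
def make_lagged_samples(series, n_lags):
    """Build (inputs, targets) pairs for one-step-ahead prediction.

    Assumption: n_lags counts the values *before* the current one, so
    n_lags=0 uses the current value only as input.
    """
    inputs, targets = [], []
    for t in range(n_lags, len(series) - 1):
        inputs.append(series[t - n_lags:t + 1])  # oldest value first
        targets.append(series[t + 1])            # the value to be predicted
    return inputs, targets


# Made-up prices for illustration, not the 2006 stock data.
prices = [35.1, 35.3, 35.2, 35.6, 35.5, 35.9]
inputs, targets = make_lagged_samples(prices, n_lags=2)
for x, y in zip(inputs, targets):
    print(x, "->", y)
```

Under this reading, every extra lag adds one input to the net and removes one training sample from the same series.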

2. Number of data points used for prediction.
I fed different numbers of daily data points into the neural net. The data show that more points are not always better.

Number of points:                          1980  1230  730   480   355
Cumulative error on 5 predicted points:    0.31  0.23  0.20  0.17  0.16
Cumulative error on 10 predicted points*:  4.15  3.44  3.14  3.13  1.27
* The 10 points include the previous 5 points.
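
A comparison like this can be sketched in Python as follows. This is only an illustration of the procedure: mean_predictor is a hypothetical stand-in and not the neural net, the series is made up, and holding out the last points of the series for scoring is an assumption about how the experiment was run.

```python
def cumulative_squared_error(actual, predicted):
    """Sum of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def compare_window_sizes(series, window_sizes, horizons, predictor):
    """Score a predictor trained on the last n points, for several n.

    The last max(horizons) points are held out as the "future". The error
    for a longer horizon includes the points of the shorter ones.
    """
    holdout = max(horizons)
    history, future = series[:-holdout], series[-holdout:]
    results = {}
    for n in window_sizes:
        forecast = predictor(history[-n:], holdout)
        results[n] = {h: cumulative_squared_error(future[:h], forecast[:h])
                      for h in horizons}
    return results


def mean_predictor(window, steps):
    """Hypothetical stand-in: predicts the window mean. NOT the neural net."""
    mean = sum(window) / len(window)
    return [mean] * steps


# Made-up series with a level shift: old points hurt the prediction.
series = [1, 1, 1, 1, 5, 5, 5, 5, 5, 5]
print(compare_window_sizes(series, [8, 4], [1, 2], mean_predictor))
```

In this toy case the shorter window wins because the older points come from a regime that no longer applies, which is one possible reason why more points are not always better.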

3. Data filtering before input to neural net


In this experiment the original data were separated by a filter into a high-frequency and a low-frequency signal, and each signal was then input to the PNN. The resulting series were then added together. The above figure shows the data flow.
Number of points                         959   707     707
Filter frequency (width = 0.05)                F=0.25  F=0.015
Cumulative error on 5 predicted points   0.18  0.16    0.18
Cumulative error on 10 predicted points  3.42  3.40    1.60
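
The split, predict, and add flow described above can be sketched in Python as follows. This is a rough illustration under several assumptions: the filter shape (flat up to the cutoff frequency, then a Gaussian roll-off of the given width) and frequencies measured in cycles per sample are guesses at what F and W mean, the high signal is simply taken as the original minus the low signal, and repeat_last is a hypothetical stand-in for the PNN. All function names are made up for this example.

```python
import cmath
import math


def dft(values):
    """Naive discrete Fourier transform (fine for short series)."""
    n = len(values)
    return [sum(values[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def inverse_dft_real(spectrum):
    """Inverse DFT, keeping only the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def lowpass_gain(freq, cutoff, width):
    """Assumed shape: pass everything up to cutoff, then a Gaussian roll-off."""
    if freq <= cutoff:
        return 1.0
    return math.exp(-((freq - cutoff) / width) ** 2)


def split_low_high(series, cutoff, width):
    """Split a series into low and high signals that sum back to the original."""
    n = len(series)
    spectrum = dft(series)
    # min(k, n - k) / n is the frequency of bin k in cycles per sample (0 to 0.5).
    filtered = [spectrum[k] * lowpass_gain(min(k, n - k) / n, cutoff, width)
                for k in range(n)]
    low = inverse_dft_real(filtered)
    high = [s - l for s, l in zip(series, low)]
    return low, high


def predict_with_split(series, predictor, cutoff, width, steps):
    """Predict the low and high signals separately, then add the forecasts."""
    low, high = split_low_high(series, cutoff, width)
    return [l + h for l, h in zip(predictor(low, steps), predictor(high, steps))]


def repeat_last(component, steps):
    """Hypothetical stand-in predictor: repeats the last value. NOT the PNN."""
    return [component[-1]] * steps


# Made-up series: constant level 10 plus a fast alternating component.
series = [10 + 0.5 * (-1) ** t for t in range(32)]
low, high = split_low_high(series, cutoff=0.25, width=0.05)
print(round(low[0], 3), round(high[0], 3))
print(predict_with_split(series, repeat_last, cutoff=0.25, width=0.05, steps=3))
```

Because the high signal is defined as the remainder, the two signals always add back to the original series, so any loss of accuracy comes from the predictor and not from the split itself.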

Looking good? Unfortunately, the picture is different when stock prices make big jumps within one or a few days. The system cannot predict those big spikes. Here is an example. The last actual value is 35.5, the filter width is W=0.05, and the number of points is 705.
Actual data             F=0.35  F=0.25
36.21                   35.52   35.58
37.36                   35.64   35.54
38.00                   35.50   35.61
Cum. error on 3 points  9.7     9.4
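
As a check, the cumulative error in this table can be recomputed with a few lines of Python, using the sum of squared differences defined in section 1. The function name is my own; the numbers are the ones from the table.

```python
def cumulative_squared_error(actual, predicted):
    """Sum of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


actual = [36.21, 37.36, 38.00]
predicted_f035 = [35.52, 35.64, 35.50]  # column F=0.35
predicted_f025 = [35.58, 35.54, 35.61]  # column F=0.25

print(round(cumulative_squared_error(actual, predicted_f035), 1))  # 9.7
print(round(cumulative_squared_error(actual, predicted_f025), 1))  # 9.4
```

Most of the error comes from the second and third points, where the actual price has already moved far from the last known value of 35.5 while the predictions stay close to it.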

4. Differencing
To address the above problem I decided to try differencing, that is, using the difference between the current value and the previous one instead of the actual value. I used the same example as before, with a last actual value of 35.5 and 705 points, this time with filter width W=0.05 and F=0.1. The results I got are much better.
Actual data             Actual diff.  Predicted diff.
36.21                   0.71          0.81
37.36                   1.15          1.28
38.00                   0.64          0.76
Cum. error on 3 points                0.04
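
The differencing step and the error in this table can be reproduced with a short Python sketch. The function names are my own. Rebuilding prices by cumulatively adding the predicted differences to the last actual value is an assumption, since the comparison above is made on the differences only.

```python
def difference(series):
    """Replace each value by its change from the previous value."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


def undifference(last_value, diffs):
    """Rebuild values by cumulatively adding diffs to the last known value."""
    values = []
    current = last_value
    for d in diffs:
        current += d
        values.append(current)
    return values


last_actual = 35.5
actual = [36.21, 37.36, 38.00]
predicted_diffs = [0.81, 1.28, 0.76]

actual_diffs = difference([last_actual] + actual)
print([round(d, 2) for d in actual_diffs])  # [0.71, 1.15, 0.64]

# Cumulative squared error measured on the differences, as in the table.
error = sum((a - p) ** 2 for a, p in zip(actual_diffs, predicted_diffs))
print(round(error, 2))  # 0.04

# Prices implied by the predicted differences (cumulative reconstruction).
print([round(v, 2) for v in undifference(last_actual, predicted_diffs)])  # [36.31, 37.59, 38.35]
```

Note that with cumulative reconstruction the errors in the individual differences add up, so the error on the rebuilt prices would be larger than the 0.04 measured on the differences.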

The last example is a big improvement, but it is just one prediction. Testing predictions on some other days does not give good results. I will investigate this in my next project.

References
1. Masters, Timothy (1995) Neural, Novel & Hybrid Algorithms for Time Series Prediction. John Wiley & Sons, Inc.