Open access peer-reviewed chapter - ONLINE FIRST

Aspects Regarding a Deep Understanding of the Prediction for Stock Market Movements

Written By

Hu Xuemei

Reviewed: 08 May 2024 Published: 28 June 2024

DOI: 10.5772/intechopen.115081


From the Edited Volume

Investment Strategies - New Advances and Challenges [Working Title]

Dr. Gabriela Prelipcean and Dr. Mircea Boscoianu


Abstract

Predicting the direction of stock return movements is an important puzzle in financial markets. In this chapter, we not only propose (group) penalized logistic regressions with multiple indicators to predict up- or downtrends, but also propose group penalized trinomial logit regression with multiple indicator groups to predict stock return movement direction: uptrends, sideways trends, and downtrends. For the former, we construct the corresponding coordinate descent (CD) algorithm to complete variable selection and obtain parameter estimators, and introduce the two-class confusion matrix, the Receiver Operating Characteristic (ROC) curve, and the area under a ROC curve (AUC) to assess two-class prediction performance. For the latter, we develop a rapidly convergent group coordinate descent (GCD) algorithm to simultaneously complete group selection and group estimation, introduce the relatively optimal Bayes classifier to identify class indexes, and finally adopt the three-class confusion matrix, Kappa, the polytomous discrimination index (PDI), the ROC surface, and the hypervolume under the ROC manifold (HUM) to assess three-class prediction performance.

Keywords

  • technical indicators
  • stock return movement direction
  • coordinate descent algorithm
  • group coordinate descent algorithm
  • prediction accuracy
  • G-LASSO/G-SCAD/G-MCP estimators

1. Introduction

Forecasting the financial market is a major challenge in both academia and business. In recent decades, several families of methods have been proposed to forecast future stock returns or trend directions. For example, time series analysis develops statistical models such as ARIMA, ARCH, and GARCH to study past stock behavior and predict future stock returns. Technical analysis, developed by Murphy [1] and Edwards et al. [2] for financial markets and stock trends, applies statistical charts, technical indicators, and historical data to forecast stock behavior. Fundamental analysis studies the economic factors that may influence market movements and applies financial methodologies with fundamental variables to predict the stock market. For instance, Cavalcante et al. [3] sought economic factors influencing market trends. Nti et al. [4] summarized fundamental and technical analysis for stock market prediction. Machine learning techniques model nonstationary and nonlinear data to predict market behavior. Henrique et al. [5] surveyed machine learning techniques for financial market prediction through bibliometric analysis. Jiang [6] summarized stock market prediction using deep learning methods such as Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), Deep Neural Network (DNN), and Recurrent Neural Network (RNN) [7, 8, 9, 10].

The selection of input features plays a key role in predicting uptrends and downtrends for stock returns. Three types of input features have been studied: A. indicators extracted through fundamental and technical analysis [4]; B. textual features extracted from financial news or tweets with text mining techniques [11]; C. image features extracted from generated candlestick charts [12]. Recently, Jiang et al. [13] and Hu and Yang [14] combined technical indicators with (group) penalized logistic regressions to predict up- and downtrends for stock prices. Lin et al. [15] applied financial news to construct text mining-based stock prediction models. Li et al. [16] found that models including both prices and news sentiments outperformed models including only technical indicators or only news sentiments. This research mainly predicted up- or downtrends. In this chapter, we not only summarize (group) penalized logistic regressions with multiple technical indicators to predict up- or downtrends, but also provide group penalized trinomial logit dynamic models with multiple technical indicator groups to predict uptrends, sideways trends, and downtrends; for more details, see Ref. [17].

Firstly, we propose Ridge/LASSO/ENet/SCAD/MCP penalized logistic regressions with multiple indicators to further improve two-class prediction performance: we apply the training set and the CD algorithm to obtain parameter estimators and probability estimators, and then employ the testing set and the learned estimators to build the two-class confusion matrix and draw ROC curves to assess two-class prediction performance. Secondly, we propose G-LASSO/G-SCAD/G-MCP penalized logistic regressions to predict up- or downtrends: we apply the training set to learn parameter estimators and probability estimators, and then adopt the testing set and the learned estimators to build the confusion matrix and draw ROC curves. Finally, we propose G-LASSO/G-SCAD/G-MCP penalized trinomial logit dynamic models to predict uptrends, sideways trends, and downtrends: we develop the GCD algorithm to complete group selection and group estimation, introduce the Bayes classifier to identify class labels, and adopt the three-class confusion matrix, HUM, Kappa, and PDI to assess three-class prediction performance for stock return movement direction.

The rest of this chapter is organized as follows. Section 2 predicts uptrends and downtrends. Section 3 predicts uptrends, sideways trends, and downtrends. Section 4 gives conclusions and prospects. Section 5 provides discussions.


2. Predicting uptrends and downtrends

In this section, we not only propose five penalized logistic regressions with technical indicators to predict up- or downtrends, but also propose three group penalized logistic regressions to further improve two-class prediction performance.

2.1 Five penalized logistic regression prediction methods

Hu and Jiang [18] proposed logistic regression to predict up- or downtrends. Here, we propose Ridge/LASSO/ENet/SCAD/MCP penalized logistic regressions to predict up- or downtrends.

2.1.1 Five penalized logistic regressions

Introduce the direction indicator function

$$Y_t=\begin{cases}1, & \text{if } C_{t+1}>C_t,\\ 0, & \text{if } C_{t+1}\le C_t,\end{cases}\tag{1}$$

where $C_t$ represents the closing price at the end of the $t$-th trading day. Then, $Y_t=1$ represents uptrends, and $Y_t=0$ represents downtrends. Now, we generalize [18] and introduce logistic regression with $p$ technical indicators:

$$P(X_t;\beta_0,\beta)=P(Y_t=1\mid X_t;\beta_0,\beta)=\frac{\exp(\beta_0+X_t\beta)}{1+\exp(\beta_0+X_t\beta)},\tag{2}$$
$$1-P(X_t;\beta_0,\beta)=P(Y_t=0\mid X_t;\beta_0,\beta)=\frac{1}{1+\exp(\beta_0+X_t\beta)},\tag{3}$$

where $\beta_0$ is an unknown intercept term, $\beta=(\beta_1,\beta_2,\ldots,\beta_p)^\top$ is an unknown parameter vector, and $X_t=(X_{t,1},X_{t,2},\ldots,X_{t,p})$ is a predictor vector whose distribution is usually unknown. To improve two-class prediction performance, we propose the five penalized logistic regressions to predict up- or downtrends.

Let $x_t=(x_{t,1},x_{t,2},\ldots,x_{t,p})$ and $y_t$ be the observed samples of $X_t$ and $Y_t$, respectively. Given the training set $\{(x_t,y_t)\}_{t=1}^{n_1}$ with sample size $n_1$, we obtain the negative log-likelihood function

$$\ell(\beta)=-L(\beta)=-\sum_{t=1}^{n_1}\Big\{y_t(\beta_0+x_t\beta)-\log\big(1+\exp(\beta_0+x_t\beta)\big)\Big\},\tag{4}$$

and its penalized version

$$Q(\beta;\lambda,\gamma)=\ell(\beta)+p_{\lambda,\gamma}(\beta),\tag{5}$$

where $p_{\lambda,\gamma}(\beta)$ is a penalty function with a tuning parameter $\lambda$ and a regularization parameter $\gamma$. Table 1 lists the five penalties.

Penalties | Formulae
--- | ---
Ridge | $p_\lambda(\beta)=\lambda\|\beta\|_2^2$
LASSO | $p_\lambda(\beta)=\lambda\|\beta\|_1$
ENet | $p_{\lambda,\gamma}(\beta)=\lambda\big\{\tfrac{1}{2}(1-\gamma)\|\beta\|_2^2+\gamma\|\beta\|_1\big\}$, $\lambda\ge 0$, $\gamma\in(0,1)$
MCP | $p_{\lambda,\gamma}(\beta)=\begin{cases}\lambda|\beta|-\dfrac{\beta^2}{2\gamma}, & \text{if } |\beta|\le\gamma\lambda,\\ \tfrac{1}{2}\gamma\lambda^2, & \text{if } |\beta|>\gamma\lambda,\end{cases}$ $\lambda\ge 0$, $\gamma>1$
SCAD | $p_{\lambda,\gamma}(\beta)=\begin{cases}\lambda|\beta|, & \text{if } |\beta|\le\lambda,\\ \dfrac{\lambda\gamma|\beta|-0.5(\beta^2+\lambda^2)}{\gamma-1}, & \text{if } \lambda<|\beta|\le\lambda\gamma,\\ \dfrac{\lambda^2(\gamma+1)}{2}, & \text{if } |\beta|>\lambda\gamma,\end{cases}$ $\lambda\ge 0$, $\gamma>2$

Table 1.

Penalized functions.
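To make Table 1 concrete, the following minimal R sketch (not part of the original chapter) codes the five penalties as plain functions of a single coefficient; `lambda` and `gamma` are the tuning and regularization parameters defined above.

```r
# Scalar penalty functions from Table 1, written for a single coefficient beta.
ridge_pen <- function(beta, lambda) lambda * beta^2
lasso_pen <- function(beta, lambda) lambda * abs(beta)
enet_pen  <- function(beta, lambda, gamma) {
  lambda * (0.5 * (1 - gamma) * beta^2 + gamma * abs(beta))  # gamma in (0,1)
}
mcp_pen <- function(beta, lambda, gamma) {           # gamma > 1
  ifelse(abs(beta) <= gamma * lambda,
         lambda * abs(beta) - beta^2 / (2 * gamma),  # quadratic taper
         0.5 * gamma * lambda^2)                     # flat beyond gamma*lambda
}
scad_pen <- function(beta, lambda, gamma) {          # gamma > 2
  a <- abs(beta)
  ifelse(a <= lambda,
         lambda * a,
         ifelse(a <= gamma * lambda,
                (gamma * lambda * a - 0.5 * (a^2 + lambda^2)) / (gamma - 1),
                0.5 * lambda^2 * (gamma + 1)))
}
```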

2.1.2 Parameter estimators and probability estimators

Suppose that the current parameter estimators are $(\hat\beta_0,\hat\beta^{(m)})$. We approximate the negative log-likelihood function (4) by the following weighted least-squares function:

$$\ell_Q(\beta_0,\beta)=\frac{1}{2n_1}\sum_{t=1}^{n_1}W_t\big(\tilde Y_t-\beta_0-x_t\beta\big)^2+C\big(\hat\beta_0,\hat\beta^{(m)}\big),\tag{6}$$

where

$$\tilde Y_t=\hat\beta_0+x_t\hat\beta^{(m)}+\frac{y_t-\tilde P_t}{\tilde P_t(1-\tilde P_t)},\qquad W_t=\tilde P_t(1-\tilde P_t),\qquad \tilde P_t=\frac{\exp\big(\hat\beta_0+x_t\hat\beta^{(m)}\big)}{1+\exp\big(\hat\beta_0+x_t\hat\beta^{(m)}\big)},\tag{7}$$

and $C(\hat\beta_0,\hat\beta^{(m)})$ is a constant. We then replace $\ell(\beta)$ in (5) by the weighted least-squares function $\ell_Q(\beta_0,\beta)$ and apply the CD algorithm to obtain the parameter estimator

$$\hat\beta(\lambda,\gamma)=\arg\min_{\beta}\big\{\ell_Q(\beta_0,\beta)+p_{\lambda,\gamma}(\beta)\big\}.\tag{8}$$

For more details, refer to [18]. Table 2 lists the LASSO/SCAD/MCP estimators.

Penalties | Estimators
--- | ---
LASSO | $\hat\beta_j^{\mathrm{LASSO}}(Z_j,\lambda)=\dfrac{S(Z_j,\lambda)}{\nu_j}$
MCP | $\hat\beta_j^{\mathrm{MCP}}(Z_j,\lambda,\gamma)=\begin{cases}\dfrac{S(Z_j,\lambda)}{\nu_j-1/\gamma}, & |Z_j|\le\nu_j\lambda\gamma,\\ \dfrac{Z_j}{\nu_j}, & |Z_j|>\nu_j\lambda\gamma,\end{cases}$ $\gamma>1/\nu_j$
SCAD | $\hat\beta_j^{\mathrm{SCAD}}(Z_j,\lambda,\gamma)=\begin{cases}\dfrac{S(Z_j,\lambda)}{\nu_j}, & |Z_j|\le\lambda(\nu_j+1),\\ \dfrac{S\big(Z_j,\gamma\lambda/(\gamma-1)\big)}{\nu_j-1/(\gamma-1)}, & \lambda(\nu_j+1)<|Z_j|\le\nu_j\lambda\gamma,\\ \dfrac{Z_j}{\nu_j}, & |Z_j|>\nu_j\lambda\gamma,\end{cases}$ $\gamma>1+1/\nu_j$
Symbols | $\hat P_t=\exp\big(x_t\hat\beta(\lambda,\gamma)^{(m)}\big)/\big[1+\exp\big(x_t\hat\beta(\lambda,\gamma)^{(m)}\big)\big]$, $W_t=\hat P_t(1-\hat P_t)$, $t=1,\ldots,n_1$; $W=\mathrm{diag}(W_1,W_2,\ldots,W_{n_1})$; $\tilde Y=x\hat\beta(\lambda,\gamma)^{(m)}+W^{-1}(Y-\hat P)$, $\hat P=(\hat P_1,\ldots,\hat P_{n_1})^\top$; $x_j=(x_{1j},\ldots,x_{n_1 j})^\top$, $\nu_j=n_1^{-1}x_j^\top W x_j$, $j=1,\ldots,p$; $Z_j=n_1^{-1}x_j^\top W(\tilde Y-x_{-j}\beta_{-j})=n_1^{-1}x_j^\top W r+\nu_j\hat\beta_j(\lambda,\gamma)^{(m)}$; $x_{-j}=(x_1,\ldots,x_{j-1},0,x_{j+1},\ldots,x_p)$, $\beta_{-j}=(\beta_1,\ldots,\beta_{j-1},0,\beta_{j+1},\ldots,\beta_p)^\top$

Table 2.

Penalized functions and parameter estimators for penalized logistic regressions.
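The closed-form updates in Table 2 are easy to code directly. The following R sketch (an illustration, not the chapter's code) assumes standardized predictors so that each $\nu_j$ is a scalar, and writes the soft-thresholding operator $S(z,\lambda)$ and the three coordinate-wise updates as plain functions.

```r
# Soft-thresholding operator S(z, lam) = sign(z) * (|z| - lam)_+.
soft <- function(z, lam) sign(z) * pmax(abs(z) - lam, 0)

# Coordinate-wise updates from Table 2; z = Z_j, nu = nu_j.
update_lasso <- function(z, nu, lambda) soft(z, lambda) / nu

update_mcp <- function(z, nu, lambda, gamma) {       # requires gamma > 1/nu
  if (abs(z) <= nu * lambda * gamma) soft(z, lambda) / (nu - 1 / gamma)
  else z / nu
}

update_scad <- function(z, nu, lambda, gamma) {      # requires gamma > 1 + 1/nu
  if (abs(z) <= lambda * (nu + 1)) soft(z, lambda) / nu
  else if (abs(z) <= nu * lambda * gamma)
    soft(z, gamma * lambda / (gamma - 1)) / (nu - 1 / (gamma - 1))
  else z / nu
}
```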

For $j\in\{1,2,\ldots,p\}$, the CD algorithm partially optimizes the target function $Q(\beta;\lambda,\gamma)$ with respect to a single parameter $\beta_j$ with the remaining parameters $\beta_l$, $l\neq j$, fixed at the updated values $\hat\beta_1(\lambda,\gamma)^{(m+1)},\ldots,\hat\beta_{j-1}(\lambda,\gamma)^{(m+1)},\hat\beta_{j+1}(\lambda,\gamma)^{(m)},\ldots,\hat\beta_p(\lambda,\gamma)^{(m)}$, then cycles through all the parameters until convergence or a maximum iteration number $M$ is reached; this process is repeated over a grid of values for $\lambda$ to produce a solution path. For the Ridge/LASSO/ENet penalties, variable selection is determined by the tuning parameter $\lambda$. To select an appropriate $\lambda$, we apply cross-validation: we calculate the full solution path of the model parameters, select a specific solution from the full path, and take the binomial deviance as the risk measure. We then obtain the mean cross-validation error curve and the one-standard-deviation band. The parameter estimators for the MCP/SCAD penalized logistic regressions depend on $\lambda$ and the regularization parameter $\gamma$; for $\gamma$, we generally take $\gamma=3.7$. Algorithm 1 shows how to apply the CD algorithm to calculate the MCP estimator; the other four cases are similar.

Algorithm 1: The CD algorithm for MCP logistic regression.

Require: the training set $\{(x_t,y_t)\}_{t=1}^{n_1}$ with $x_t=(x_{t,1},x_{t,2},\ldots,x_{t,p})$, a grid of increasing $\lambda$ values $\Lambda=\{\lambda_1,\ldots,\lambda_L\}$, $\gamma$, a given tolerance limit $\varepsilon$, and a maximum iteration number $M$.

1: Initialization: $\hat\beta^{(0)}=\hat\beta(\lambda_{\max}=\lambda_L,\gamma=5)$.
2: for each $m=0,1,\ldots$ and each $l\in\{L,L-1,\ldots,1\}$ do
3:  repeat
4:   $\hat\eta_t\leftarrow\beta_0+x_t\hat\beta(\lambda,\gamma)^{(m)}$
5:   $\hat P_t\leftarrow e^{\hat\eta_t}/\big(1+e^{\hat\eta_t}\big)$
6:   $W\leftarrow\mathrm{diag}\big\{\hat P_1(1-\hat P_1),\ldots,\hat P_{n_1}(1-\hat P_{n_1})\big\}$
7:   $r\leftarrow W^{-1}(Y-\hat P)$
8:   $\tilde Y\leftarrow\hat\eta+r$
9:   while not convergent do
10:    for each $j\in\{1,2,\ldots,p\}$ do
11:     $\nu_j\leftarrow n_1^{-1}x_j^\top W x_j$
12:     $Z_j\leftarrow n_1^{-1}x_j^\top W(\tilde Y-x_{-j}\beta_{-j})=n_1^{-1}x_j^\top W r+\nu_j\hat\beta_j(\lambda,\gamma)^{(m)}$
13:     if $|Z_j|\le\nu_j\gamma\lambda$ then
14:      $\hat\beta_j(\lambda,\gamma)^{(m+1)}\leftarrow S(Z_j,\lambda)/(\nu_j-1/\gamma)$
15:     else
16:      $\hat\beta_j(\lambda,\gamma)^{(m+1)}\leftarrow Z_j/\nu_j$
17:     end if
18:     $r\leftarrow r-x_j\big(\hat\beta_j(\lambda,\gamma)^{(m+1)}-\hat\beta_j(\lambda,\gamma)^{(m)}\big)$
19:    end for
20:   end while
21:  until $\big\|\hat\beta(\lambda,\gamma)^{(m+1)}-\hat\beta(\lambda,\gamma)^{(m)}\big\|_2^2\le\varepsilon$ or the maximum iteration number $M$ is reached
22: end for

Ensure: $\hat\beta(\lambda,\gamma)$.
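For illustration, the following R sketch implements Algorithm 1 for a single $\lambda$; it is a minimal reading of the pseudocode above, not the author's own implementation. It assumes `x` is a standardized $n\times p$ matrix and `y` a 0/1 vector; warm starts over a $\lambda$ grid would be handled by the caller.

```r
# CD/IRLS for MCP-penalized logistic regression at one lambda.
# Note gamma must satisfy gamma > 1/nu_j (Table 2); logistic weights give
# nu_j <= 0.25 for standardized x, so gamma = 5 is a safe default here.
cd_mcp_logistic <- function(x, y, lambda, gamma = 5,
                            beta0 = 0, beta = rep(0, ncol(x)),
                            eps = 1e-6, M = 100) {
  n <- nrow(x)
  for (m in seq_len(M)) {
    beta_old <- beta
    eta  <- beta0 + drop(x %*% beta)             # step 4: linear predictor
    phat <- 1 / (1 + exp(-eta))                  # step 5: probabilities
    w    <- pmax(phat * (1 - phat), 1e-5)        # step 6: IRLS weights
    r    <- (y - phat) / w                       # step 7: working residuals
    d0   <- sum(w * r) / sum(w)                  # intercept update
    beta0 <- beta0 + d0
    r     <- r - d0
    for (j in seq_len(ncol(x))) {                # steps 10-19: cycle over j
      nu <- sum(w * x[, j]^2) / n                # step 11: nu_j
      z  <- sum(w * x[, j] * r) / n + nu * beta[j]   # step 12: Z_j
      bj <- if (abs(z) <= nu * gamma * lambda)   # steps 13-17: MCP update
        sign(z) * max(abs(z) - lambda, 0) / (nu - 1 / gamma)
      else z / nu
      r <- r - x[, j] * (bj - beta[j])           # step 18: refresh residuals
      beta[j] <- bj
    }
    if (sum((beta - beta_old)^2) <= eps) break   # step 21: convergence check
  }
  list(beta0 = beta0, beta = beta)
}
```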

Now we apply the CD algorithm to the five penalized logistic regressions to obtain the final parameter estimators $\hat\beta_0(\lambda,\gamma)$ and $\hat\beta(\lambda,\gamma)$, then combine them with the testing set $\{(x_t,y_t)\}_{t=n_1+1}^{n_1+n_2}$ to compute the probability estimators

$$\hat P\big(Y_t=1\mid X_t;\hat\beta_0(\lambda,\gamma),\hat\beta(\lambda,\gamma)\big)=\frac{\exp\big(\hat\beta_0(\lambda,\gamma)+X_t\hat\beta(\lambda,\gamma)\big)}{1+\exp\big(\hat\beta_0(\lambda,\gamma)+X_t\hat\beta(\lambda,\gamma)\big)},\tag{9}$$
$$\hat P\big(Y_t=0\mid X_t;\hat\beta_0(\lambda,\gamma),\hat\beta(\lambda,\gamma)\big)=\frac{1}{1+\exp\big(\hat\beta_0(\lambda,\gamma)+X_t\hat\beta(\lambda,\gamma)\big)}.\tag{10}$$
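In practice, the five fits and the probability estimators (9)-(10) can also be obtained with existing packages. The sketch below is an assumption of this rewrite, not the chapter's code: glmnet covers Ridge/LASSO/ENet via `alpha`, and ncvreg covers SCAD/MCP; `xtrain`, `ytrain`, and `xtest` are assumed to hold the training and testing data.

```r
library(glmnet)   # Ridge (alpha = 0), LASSO (alpha = 1), ENet (0 < alpha < 1)
library(ncvreg)   # SCAD and MCP

cv_ridge <- cv.glmnet(xtrain, ytrain, family = "binomial", alpha = 0)
cv_lasso <- cv.glmnet(xtrain, ytrain, family = "binomial", alpha = 1)
cv_enet  <- cv.glmnet(xtrain, ytrain, family = "binomial", alpha = 0.5)
cv_scad  <- cv.ncvreg(xtrain, ytrain, family = "binomial",
                      penalty = "SCAD", gamma = 3.7)
cv_mcp   <- cv.ncvreg(xtrain, ytrain, family = "binomial",
                      penalty = "MCP", gamma = 3)

# Probability estimators (Eqs. 9-10) on the testing set:
p_lasso <- predict(cv_lasso, newx = xtest, s = "lambda.min",
                   type = "response")
b_scad  <- coef(cv_scad)                       # coefficients at lambda.min
p_scad  <- plogis(cbind(1, xtest) %*% b_scad)  # exp(eta) / (1 + exp(eta))
```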

2.2 Three-group penalized logistic regression prediction methods

In this subsection, we propose G-LASSO/G-SCAD/G-MCP penalized logistic regressions with multiple technical indicators to predict up- or downtrends.

2.2.1 Three-group penalized logistic regressions

Suppose that the binary response $Y_t\in\{0,1\}$ represents the stock price movement direction, and the $p$-dimensional predictor vector $X_t=(X_{t,1},\ldots,X_{t,p})$ represents $p$ technical indicators influencing the stock price movement direction. The training set $\{(X_t,Y_t)\}_{t=1}^{n_1}$ is made up of the predictor vector $X_t\in\mathbb{R}^p$, allowing both categorical and continuous predictors, and the binary response $Y_t\in\{0,1\}$. We divide the $p$-dimensional predictor vector $X_t$ into $G$ different groups $x_t=(x_{t,1},\ldots,x_{t,g},\ldots,x_{t,G})$ with $x_{t,g}=(X_{t,df_{g-1}+1},\ldots,X_{t,df_g})$ and the $g$-th group length $df_g$, $g=1,\ldots,G$. Then, the group logistic regression is

$$p_\beta(x_t)=P_\beta(Y_t=1\mid x_t)=\frac{\exp\big(\beta_0+\sum_{g=1}^G x_{t,g}\beta_g\big)}{1+\exp\big(\beta_0+\sum_{g=1}^G x_{t,g}\beta_g\big)},\tag{11}$$
$$1-p_\beta(x_t)=P_\beta(Y_t=0\mid x_t)=\frac{1}{1+\exp\big(\beta_0+\sum_{g=1}^G x_{t,g}\beta_g\big)},\tag{12}$$

where $\beta_0$ is the intercept, $\beta_g\in\mathbb{R}^{df_g}$ is the $g$-th group parameter vector, and $\beta=(\beta_0,\beta_1^\top,\ldots,\beta_G^\top)^\top\in\mathbb{R}^{p+1}$ is the whole parameter vector.

2.2.2 Parameter estimators and probability estimators

Based on the training set $\{(x_t,Y_t)\}_{t=1}^{n_1}$, we obtain the negative log-likelihood function

$$\ell(\beta)=-\sum_{t=1}^{n_1}\Big\{Y_t\eta_\beta(x_t)-\log\big(1+\exp(\eta_\beta(x_t))\big)\Big\}\tag{13}$$

with $\eta_\beta(x_t)=\log\frac{p_\beta(x_t)}{1-p_\beta(x_t)}=\beta_0+\sum_{g=1}^G x_{t,g}\beta_g$. The penalized log-likelihood function is

$$S_{\lambda,\gamma}(\beta)=\ell(\beta)+\sum_{l=1}^p P(\beta_l;\lambda,\gamma),\tag{14}$$

and the group penalized log-likelihood is

$$S_{\lambda,\gamma}(\beta)=\ell(\beta)+\sum_{g=1}^G P\big(\|\beta_g\|_2;\lambda\sqrt{df_g},\gamma\big),\tag{15}$$

where $P(\|\beta_g\|_2;\lambda\sqrt{df_g},\gamma)$, with a tuning parameter $\lambda\ge 0$ and a regularization parameter $\gamma$, defines a family of penalty functions concave in $\|\beta_g\|_2$. The estimator $\hat\beta$ is the minimizer of $S_{\lambda,\gamma}(\beta)$. Here, we take $P(\beta_l;\lambda,\gamma)$ as $P_{\mathrm{SCAD}}(\beta_l;\lambda,\gamma)$ or $P_{\mathrm{MCP}}(\beta_l;\lambda,\gamma)$, and take $P(\|\beta_g\|_2;\lambda\sqrt{df_g},\gamma)$ as $P_{\mathrm{gSCAD}}(\|\beta_g\|_2;\lambda\sqrt{df_g},\gamma)$ or $P_{\mathrm{gMCP}}(\|\beta_g\|_2;\lambda\sqrt{df_g},\gamma)$. The three group penalized functions are listed in Table 3. According to Breheny and Huang [19] and Fan and Li [20], the Bayes risks are not sensitive to the choice of $\gamma$; in general, we take $\gamma=3.7$ for SCAD/gSCAD (group SCAD) and $\gamma=3$ for MCP/gMCP (group MCP). Following Hu and Liu [21], we adopt the GCD algorithm to complete the group selection and group estimation listed in Table 4.

Symbols | Three-group penalized functions
--- | ---
gLASSO | $P_{\mathrm{gLASSO}}(\|\beta_g\|_2;\lambda\sqrt{df_g})=\lambda\sqrt{df_g}\,\|\beta_g\|_2$
gSCAD | $P_{\mathrm{gSCAD}}(\|\beta_g\|_2;\lambda\sqrt{df_g},\gamma)=\begin{cases}\lambda\sqrt{df_g}\,\|\beta_g\|_2, & \|\beta_g\|_2\le\lambda\sqrt{df_g},\\ \dfrac{2\lambda\sqrt{df_g}\,\gamma\|\beta_g\|_2-\|\beta_g\|_2^2-\lambda^2 df_g}{2(\gamma-1)}, & \lambda\sqrt{df_g}<\|\beta_g\|_2\le\lambda\sqrt{df_g}\,\gamma,\\ \dfrac{\lambda^2 df_g(\gamma^2-1)}{2(\gamma-1)}, & \|\beta_g\|_2>\lambda\sqrt{df_g}\,\gamma,\end{cases}$ $\lambda\ge 0$, $\gamma>2$
gMCP | $P_{\mathrm{gMCP}}(\|\beta_g\|_2;\lambda\sqrt{df_g},\gamma)=\begin{cases}\lambda\sqrt{df_g}\,\|\beta_g\|_2-\|\beta_g\|_2^2/(2\gamma), & \|\beta_g\|_2\le\lambda\sqrt{df_g}\,\gamma,\\ \lambda^2 df_g\,\gamma/2, & \|\beta_g\|_2>\lambda\sqrt{df_g}\,\gamma,\end{cases}$ $\lambda\ge 0$, $\gamma>1$

Table 3.

gLASSO, gSCAD and gMCP functions.

Note: gLASSO stands for group LASSO.

Symbols | Three parameter estimators
--- | ---
gLASSO estimator | $\hat\beta_g^{\mathrm{gLASSO}}=\dfrac{1}{v}S\big(vZ_g,\sqrt{d_g}\lambda\big)$
gSCAD estimator | $\hat\beta_g^{\mathrm{gSCAD}}=\begin{cases}\dfrac{1}{v}S\big(vZ_g,\sqrt{d_g}\lambda\big), & \|Z_g\|\le 2\sqrt{d_g}\lambda,\\ \dfrac{\gamma-1}{\gamma-2}\cdot\dfrac{1}{v}S\Big(vZ_g,\dfrac{\sqrt{d_g}\lambda\gamma}{\gamma-1}\Big), & 2\sqrt{d_g}\lambda<\|Z_g\|\le\sqrt{d_g}\lambda\gamma,\\ Z_g, & \|Z_g\|>\sqrt{d_g}\lambda\gamma,\end{cases}$ $\gamma>2$
gMCP estimator | $\hat\beta_g^{\mathrm{gMCP}}=\begin{cases}\dfrac{\gamma}{\gamma-1}\cdot\dfrac{1}{v}S\big(vZ_g,\sqrt{d_g}\lambda\big), & \|Z_g\|\le\sqrt{d_g}\lambda\gamma,\\ Z_g, & \|Z_g\|>\sqrt{d_g}\lambda\gamma,\end{cases}$ $\gamma>1$
Labels | $v_j=n_1^{-1}X_j^\top W X_j$, $j=1,\ldots,p$; $Z_j=n_1^{-1}X_j^\top W(\tilde Y-X_{-j}\beta_{-j})$; $\tilde Y=X\beta^{(m)}+W^{-1}(Y-P)$; $X_{-j}=(X_1,\ldots,X_{j-1},0,X_{j+1},\ldots,X_p)$; $\beta_{-j}=(\beta_1,\ldots,\beta_{j-1},0,\beta_{j+1},\ldots,\beta_p)^\top$; $v=\max_t\sup_\eta\nabla^2 L_t(\eta)$, $L_t(\eta)=Y_t\eta_t-\log(1+e^{\eta_t})$, $\eta_t=\beta_0+\sum_{g=1}^G X_{t,g}\beta_g$, $t=1,\ldots,n_1$; $Z_g=X_g^\top(\bar Y-x_{-g}\beta_{-g})$, $g=1,\ldots,G$, $\bar Y=x\beta^{(m)}+W^{-1}(Y-P)$; $x_{-g}=(X_1,\ldots,X_{g-1},0,X_{g+1},\ldots,X_G)$; $\beta_{-g}=(\beta_1^\top,\ldots,\beta_{g-1}^\top,0,\beta_{g+1}^\top,\ldots,\beta_G^\top)^\top$

Table 4.

The iterative parameter estimators for gLASSO, gSCAD and gMCP.

Note: $X=(X_1,\ldots,X_{n_1})^\top$ is an $n_1\times p$ matrix, $x=(x_1,\ldots,x_{n_1})^\top$ is an $n_1\times p$ matrix, $Y=(Y_1,\ldots,Y_{n_1})^\top$ is an $n_1\times 1$ vector, $W$ is an $n_1\times n_1$ weighted diagonal matrix, $P$ is the estimated probability vector at the $m$-th iteration, and $\beta^{(m)}$ is the parameter estimator at the $m$-th iteration.

After applying the training set $\{(x_t,Y_t)\}_{t=1}^{n_1}$ and the GCD algorithm to obtain the group estimators $\hat\beta_g$, we apply the testing set $\{(x_t,Y_t)\}_{t=n_1+1}^{n}$ and $\hat\beta_g$ to obtain the predicted probabilities for up- or downtrends:

$$\hat p_\beta(x_t)=\hat P_\beta(Y_t=1\mid x_t)=\frac{\exp\big(\hat\beta_0+\sum_{g=1}^G x_{t,g}\hat\beta_g\big)}{1+\exp\big(\hat\beta_0+\sum_{g=1}^G x_{t,g}\hat\beta_g\big)},\tag{16}$$
$$1-\hat p_\beta(x_t)=\hat P_\beta(Y_t=0\mid x_t)=\frac{1}{1+\exp\big(\hat\beta_0+\sum_{g=1}^G x_{t,g}\hat\beta_g\big)},\tag{17}$$

where $n$ represents the sample size of the entire dataset. G-LASSO estimators can be computed with the R package grplasso. Now we obtain the predicted value $\hat Y_t$ according to the following rule:

$$\text{If } \hat P_t>c,\ \text{then } \hat Y_t=1;\ \text{else } \hat Y_t=0,\tag{18}$$

where $c$ is a given threshold. Raghavan et al. [22] discuss how to select the optimal threshold.
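As a hedged illustration, the grpreg package (an alternative to grplasso that also implements the gSCAD and gMCP penalties) can produce the group estimators, the predicted probabilities (16), and the decision rule (18); here `xtrain`, `ytrain`, `xtest`, the group index vector `grp` (one entry per column of `xtrain`), and the threshold `c0` are all assumed.

```r
library(grpreg)

cvfit <- cv.grpreg(xtrain, ytrain, group = grp, family = "binomial",
                   penalty = "grSCAD", gamma = 3.7)  # or "grLasso", "grMCP"
b     <- coef(cvfit)                           # estimates at lambda.min
phat  <- plogis(cbind(1, xtest) %*% b)         # Eq. (16) on the testing set
yhat  <- as.integer(phat > c0)                 # decision rule (18)
```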

2.3 Two-class prediction performance

We introduce the two-class confusion matrix listed in Table 5 to assess two-class prediction performance.

True \ Predicted class | $\hat Y_t=1$ | $\hat Y_t=2$ | Total
--- | --- | --- | ---
$Y_t=1$ | $V_{11}$ | $V_{12}$ | $V_{1\cdot}$
$Y_t=2$ | $V_{21}$ | $V_{22}$ | $V_{2\cdot}$
Total | $V_{\cdot 1}$ | $V_{\cdot 2}$ | $V$

Table 5.

Two-class confusion matrix.

$V_{k\bar k}$ is the number of items that truly belong to class $k$ and are predicted as class $\bar k$.

According to Table 5, one can compute

$$\mathrm{Accuracy}=\frac{TP+TN}{TP+TN+FP+FN},\qquad \mathrm{Precision}=\frac{TP}{TP+FP}.$$

In fact, accuracy cannot reflect the losses from the two types of errors, whereas a ROC curve can clearly assess two-class prediction performance. For a given threshold $c$, $TPR(c)=P(\hat P_t>c\mid Y_t=1)$ is the true-positive rate and $FPR(c)=P(\hat P_t>c\mid Y_t=0)$ is the false-positive rate. Varying the threshold $c$, we can calculate $(TPR(c),FPR(c))$, or (Sensitivity, 1-Specificity), and draw a ROC curve, where

$$\mathrm{Sensitivity}=\text{True positive rate } (TPR)=\frac{TP}{TP+FN}\tag{19}$$

measures the ability of classifiers to correctly identify positive samples, and

$$\mathrm{Specificity}=1-\text{False positive rate } (1-FPR)=\frac{TN}{TN+FP}\tag{20}$$

measures the ability to correctly identify negative samples. The ROC curve is a graphical method for evaluating the performance of a binary classifier. Here, we choose ggplot2 to draw the ROC curve and compute the corresponding AUC (the area under the ROC curve).
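A short R sketch of this assessment, assuming `ytest` and the predicted probabilities `phat` from the previous sketch, is given below; it builds the two-class confusion matrix, Accuracy, Precision, and the ROC/AUC via the pROC package (used here instead of ggplot2 for brevity).

```r
library(pROC)

yhat <- as.integer(phat > 0.5)                  # threshold c = 0.5
conf <- table(True = ytest, Predicted = yhat)   # two-class confusion matrix
accuracy  <- sum(diag(conf)) / sum(conf)        # (TP + TN) / total
precision <- conf["1", "1"] / sum(conf[, "1"])  # TP / (TP + FP)

roc_obj <- roc(ytest, as.vector(phat))          # ROC curve over thresholds
auc(roc_obj)                                    # area under the ROC curve
```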


3. Predicting uptrends, sideways trends and downtrends

3.1 Three-group penalized trinomial logit prediction methods

Let $C_t$ be the closing price at the end of the $t$-th trading day, $Z_t=C_t-C_{t-k}$ be the $k$-period stock excess return, and

$$Y_t(c)=\begin{cases}1, & \text{if } Z_t>c,\\ 2, & \text{if } -c\le Z_t\le c,\\ 3, & \text{if } Z_t<-c,\end{cases}\tag{21}$$

be an unordered three-category response variable. Then, $Y_t(c)=1$ represents uptrends, $Y_t(c)=2$ represents sideways trends, and $Y_t(c)=3$ represents downtrends. When $c=0$, $Y_t(c)$ indicates the sign of returns. When $c=0.1C_{t-1}$, $Y_t(c)=1$ indicates "large" positive returns and $Y_t(c)=3$ indicates "large" negative returns. For more details on how to choose $c$, refer to [23]. Suppose that the predictor vector $X_t=(X_{t,1},\ldots,X_{t,p})$, allowing both categorical and continuous predictors, represents the factors influencing stock return movement direction. One main goal is to predict the stock return movement direction $Y_{t+1}(c)$ using the available information set $\mathcal{I}_t=\sigma(X_t,X_{t-1},\ldots)$ at time $t$. If $Y_{t+1}(c)$ depends only on $X_t$, then we can predict $Y_{t+1}(c)$ based on $X_t$. The trinomial logit dynamic model for stock return movement direction is

$$P\big(Y_{t+1}(c)=k\mid x_t;\beta\big)=\frac{\exp\big(\beta_{0k}+\sum_{g=1}^G x_{tg}\beta_{gk}\big)}{\sum_{\bar k=1}^3\exp\big(\beta_{0\bar k}+\sum_{g=1}^G x_{tg}\beta_{g\bar k}\big)},\qquad k=1,2,3,\tag{22}$$

where the whole parameter vector is $\beta=(\beta_{01},\beta_1^\top,\beta_{02},\beta_2^\top,\beta_{03},\beta_3^\top)^\top$ with the $k$-th class intercept $\beta_{0k}\in\mathbb{R}$ and the $g$-th group parameter vector of the $k$-th class $\beta_{gk}=(\beta_{df_{g-1}+1,k},\ldots,\beta_{df_g,k})^\top$, $g=1,\ldots,G$. The $k$-th class parameter vector is $\beta_k=(\beta_{1k}^\top,\ldots,\beta_{Gk}^\top)^\top$, $k=1,2,3$. According to the training set $\{(X_t,Y_{t+1}(c))\}_{t=1}^{n_1}$ and $y_{t+1,k}=I(Y_{t+1}(c)=k)$, we obtain the group trinomial logit log-likelihood function

$$\ell(\beta)=\sum_{t=1}^{n_1}\Big\{\sum_{k=1}^3 y_{t+1,k}\big(\beta_{0k}+x_t\beta_k\big)-\log\sum_{k=1}^3\exp\big(\beta_{0k}+x_t\beta_k\big)\Big\}\tag{23}$$

and the corresponding group penalized trinomial logit log-likelihood function

$$Q_{\lambda,\gamma}(\beta)=R_{n_1}(\beta)+\sum_{k=1}^3\sum_{g=1}^G P\big(\|\beta_{gk}\|;\lambda\sqrt{p_g},\gamma\big),\tag{24}$$

where $R_{n_1}(\beta)=\frac{1}{n_1}\sum_{t=1}^{n_1}\gamma_\beta(x_t,y_{t+1})=-\frac{1}{n_1}\sum_{t=1}^{n_1}\Big\{\sum_{k=1}^3 y_{t+1,k}\big(\beta_{0k}+x_t\beta_k\big)-\log\sum_{k=1}^3\exp\big(\beta_{0k}+x_t\beta_k\big)\Big\}$ is the empirical risk corresponding to $R(\beta)=E\,\gamma_\beta(x,y)$ with $y=(y_1,y_2,y_3)$ and the loss function

$$\gamma_\beta(x,y)=-\Big\{\sum_{k=1}^3 y_k\big(\beta_{0k}+x\beta_k\big)-\log\sum_{k=1}^3\exp\big(\beta_{0k}+x\beta_k\big)\Big\},\tag{25}$$

and $P(\|\beta_{gk}\|;\lambda\sqrt{p_g},\gamma)$ defines a family of penalty functions. Next, we introduce the three group penalized criteria:

  1. G-LASSO penalized function

    $$Q_{\mathrm{GLASSO}}^{\lambda,\gamma}(\beta)=R_{n_1}(\beta)+\sum_{k=1}^3\sum_{g=1}^G\lambda\sqrt{p_g}\,\|\beta_{gk}\|;\tag{26}$$

  2. G-SCAD penalized function

    $$Q_{\mathrm{GSCAD}}^{\lambda,\gamma}(\beta)=R_{n_1}(\beta)+\sum_{k=1}^3\sum_{g=1}^G P_{\mathrm{GSCAD}}\big(\|\beta_{gk}\|;\lambda\sqrt{p_g},\gamma\big)\tag{27}$$

    for $\lambda\ge 0$ and $\gamma>2$, where

    $$P_{\mathrm{GSCAD}}\big(\|\beta_{gk}\|;\lambda\sqrt{p_g},\gamma\big)=\lambda\sqrt{p_g}\int_0^{\|\beta_{gk}\|}\min\Big\{1,\frac{\big(\gamma\lambda\sqrt{p_g}-x\big)_+}{(\gamma-1)\lambda\sqrt{p_g}}\Big\}\,dx=\begin{cases}\lambda\sqrt{p_g}\,\|\beta_{gk}\|, & \text{if }\|\beta_{gk}\|\le\lambda\sqrt{p_g},\\ \dfrac{2\gamma\lambda\sqrt{p_g}\,\|\beta_{gk}\|-\|\beta_{gk}\|^2-\lambda^2 p_g}{2(\gamma-1)}, & \text{if }\lambda\sqrt{p_g}<\|\beta_{gk}\|\le\lambda\sqrt{p_g}\,\gamma,\\ \dfrac{\lambda^2 p_g(\gamma^2-1)}{2(\gamma-1)}, & \text{if }\|\beta_{gk}\|>\lambda\sqrt{p_g}\,\gamma,\end{cases}$$

    and $x_+=x\,I(x\ge 0)$ represents the nonnegative part of $x$.

  3. G-MCP penalized function

    $$Q_{\mathrm{GMCP}}^{\lambda,\gamma}(\beta)=R_{n_1}(\beta)+\sum_{k=1}^3\sum_{g=1}^G P_{\mathrm{GMCP}}\big(\|\beta_{gk}\|;\lambda\sqrt{p_g},\gamma\big)\tag{28}$$

    for $\lambda\ge 0$ and $\gamma>1$, where

    $$P_{\mathrm{GMCP}}\big(\|\beta_{gk}\|;\lambda\sqrt{p_g},\gamma\big)=\lambda\sqrt{p_g}\int_0^{\|\beta_{gk}\|}\Big(1-\frac{x}{\lambda\sqrt{p_g}\,\gamma}\Big)_+\,dx=\begin{cases}\lambda\sqrt{p_g}\,\|\beta_{gk}\|-\dfrac{\|\beta_{gk}\|^2}{2\gamma}, & \text{if }\|\beta_{gk}\|\le\lambda\sqrt{p_g}\,\gamma,\\ \dfrac{\lambda^2 p_g\,\gamma}{2}, & \text{if }\|\beta_{gk}\|>\lambda\sqrt{p_g}\,\gamma.\end{cases}$$
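Before turning to the GCD algorithm, the following R sketch codes the class probabilities of Eq. (22) and the empirical risk built from the loss (25); the intercept vector `B0` (length 3) and the $p\times 3$ coefficient matrix `B` are illustrative assumptions, not the chapter's own code.

```r
# Class probabilities of Eq. (22) via a numerically stabilized softmax.
softmax_probs <- function(x, B0, B) {
  eta <- sweep(x %*% B, 2, B0, "+")        # eta_{tk} = beta_{0k} + x_t beta_k
  e   <- exp(eta - apply(eta, 1, max))     # subtract row max for stability
  e / rowSums(e)
}

# Empirical risk R_{n1}(beta): the average of the loss (25) over the sample.
empirical_risk <- function(x, Y, B0, B) {
  # Y is the n x 3 indicator matrix with y_{t+1,k} = I(Y_{t+1}(c) = k)
  P <- softmax_probs(x, B0, B)
  -mean(rowSums(Y * log(P)))
}
```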

3.2 GCD algorithm completes group selection and group estimations

For $g\in\{1,2,\ldots,G\}$, GCD partially optimizes the target function

$$Q_{\lambda,\gamma}(\beta)=-\frac{1}{n_1}\sum_{t=1}^{n_1}\Big\{\sum_{k=1}^3 y_{t+1,k}\big(\beta_{0k}+x_t\beta_k\big)-\log\sum_{k=1}^3\exp\big(\beta_{0k}+x_t\beta_k\big)\Big\}+\sum_{k=1}^3\sum_{g=1}^G P\big(\|\beta_{gk}\|;\lambda\sqrt{p_g},\gamma\big)$$

with respect to a single group $\beta_g$, with the other groups $\beta_{\bar g}$, $\bar g\neq g$, fixed at the updated values $\tilde\beta_{1k}^{(m+1)},\ldots,\tilde\beta_{g-1,k}^{(m+1)},\tilde\beta_{g+1,k}^{(m)},\ldots,\tilde\beta_{Gk}^{(m)}$, then iteratively cycles through all the groups until convergence or a maximum iteration number is reached. Now, we develop the GCD algorithm for the G-LASSO/G-SCAD/G-MCP penalized trinomial logit dynamic models. We perform partial Newton steps by forming the partial quadratic approximation

$$R_Q^k(\beta_{0k},\beta_k)=\frac{1}{2n_1}\sum_{t=1}^{n_1}\omega_{tk}\big(\zeta_{tk}-\beta_{0k}-x_t\beta_k\big)^2+C\big(\{\tilde\beta_{0\bar k},\tilde\beta_{\bar k}\}_{\bar k=1}^3\big)$$

to the empirical risk $R_{n_1}(\beta)$, that is, a Taylor expansion about the current estimates $\{\tilde\beta_{0\bar k},\tilde\beta_{\bar k}\}_{\bar k=1}^3$, allowing only $(\beta_{0k},\beta_k)$ to vary for a single class at a time, where $\omega_{tk}=\hat P_{tk}(1-\hat P_{tk})$, $\hat P_{tk}=\exp(\tilde\beta_{0k}+x_t\tilde\beta_k)/\sum_{\bar k=1}^3\exp(\tilde\beta_{0\bar k}+x_t\tilde\beta_{\bar k})$, and $\zeta_{tk}=\tilde\beta_{0k}+x_t\tilde\beta_k+(y_{t+1,k}-\hat P_{tk})/\omega_{tk}$. For each given $(\lambda,\gamma)$, we create an outer loop which cycles over $k$ and computes $R_Q^k(\beta_{0k},\beta_k)$ about the current estimates $\{\tilde\beta_{0\bar k},\tilde\beta_{\bar k}\}_{\bar k=1}^3$. Then, we apply the GCD algorithm to solve the group penalized weighted least-squares problem

$$\min_{(\beta_{0k},\beta_k)\in\mathbb{R}^{\sum_{g=1}^G p_g+1}}\ R_Q^k(\beta_{0k},\beta_k)+\sum_{g=1}^G P\big(\|\beta_{gk}\|;\lambda\sqrt{p_g},\gamma\big).$$

The four main steps are as follows:

Step 1. Set $m=0$ and let $\tilde\beta_k^{(0)}=\big(\tilde\beta_{0k}^{(0)},\tilde\beta_{1k}^{(0)\top},\ldots,\tilde\beta_{Gk}^{(0)\top}\big)^\top$ be the initial value. Calculate $\hat P_k^{(0)}=\big(\hat P_{1k}^{(0)},\hat P_{2k}^{(0)},\ldots,\hat P_{n_1k}^{(0)}\big)^\top$ with

$$\hat P_{tk}^{(0)}=\frac{\exp\big(\tilde\beta_{0k}^{(0)}+\sum_{g=1}^G x_{tg}\tilde\beta_{gk}^{(0)}\big)}{\sum_{\bar k=1}^3\exp\big(\tilde\beta_{0\bar k}^{(0)}+\sum_{g=1}^G x_{tg}\tilde\beta_{g\bar k}^{(0)}\big)},\qquad W_k^{(0)}=\mathrm{diag}\big(\omega_{1k}^{(0)},\omega_{2k}^{(0)},\ldots,\omega_{n_1k}^{(0)}\big)$$

with $\omega_{tk}^{(0)}=\hat P_{tk}^{(0)}\big(1-\hat P_{tk}^{(0)}\big)$, and obtain the initial residual vector $r_k^{(0)}=\big(r_{1k}^{(0)},r_{2k}^{(0)},\ldots,r_{n_1k}^{(0)}\big)^\top$ with $r_{tk}^{(0)}=\big(y_{t+1,k}-\hat P_{tk}^{(0)}\big)/\omega_{tk}^{(0)}$.

Step 2. At the $g$-th step of the $(m+1)$-th iteration, $g=1,\ldots,G$, carry out (A)-(C):

(A) Calculate $\omega_{tk}^{(m)}=\hat P_{tk}^{(m)}\big(1-\hat P_{tk}^{(m)}\big)$, $\hat P_k^{(m)}=\big(\hat P_{1k}^{(m)},\ldots,\hat P_{n_1k}^{(m)}\big)^\top$, and $W_k^{(m)}=\mathrm{diag}\big(\omega_{1k}^{(m)},\ldots,\omega_{n_1k}^{(m)}\big)$ with

$$\hat P_{tk}^{(m)}=\frac{\exp\big(\tilde\beta_{0k}^{(m)}+\sum_{\bar g=1}^{g-1}x_{t\bar g}\tilde\beta_{\bar gk}^{(m+1)}+\sum_{\bar g=g}^{G}x_{t\bar g}\tilde\beta_{\bar gk}^{(m)}\big)}{\sum_{\bar k=1}^3\exp\big(\tilde\beta_{0\bar k}^{(m)}+\sum_{\bar g=1}^{g-1}x_{t\bar g}\tilde\beta_{\bar g\bar k}^{(m+1)}+\sum_{\bar g=g}^{G}x_{t\bar g}\tilde\beta_{\bar g\bar k}^{(m)}\big)},$$

and obtain the $m$-th iteration's residual vector $r_k^{(m)}=\big(r_{1k}^{(m)},\ldots,r_{n_1k}^{(m)}\big)^\top$ with

$$r_{tk}^{(m)}=\big(y_{t+1,k}-\hat P_{tk}^{(m)}\big)/\omega_{tk}^{(m)}.$$

Then, compute the $p_g\times p_g$ matrix $\nu_{gk}^{(m)}=n_1^{-1}x_g^\top W_k^{(m)}x_g$ and the $p_g\times 1$ vector $z_{gk}^{(m)}=n_1^{-1}x_g^\top W_k^{(m)}r_k^{(m)}+\nu_{gk}^{(m)}\tilde\beta_{gk}^{(m)}$, $g=1,2,\ldots,G$.

(B) Update the $g$-th group estimator $\tilde\beta_{gk}^{(m+1)}$:

  1. G-LASSO estimator

    $$\tilde\beta_{gk}^{(m+1)}\leftarrow\frac{S\big(z_{gk}^{(m)},\lambda\sqrt{p_g}\big)}{\nu_{gk}^{(m)}},\tag{29}$$

    where the soft-thresholding operator is

    $$S\big(z_{gk}^{(m)},\lambda\sqrt{p_g}\big)=S\big(\|z_{gk}^{(m)}\|,\lambda\sqrt{p_g}\big)\frac{z_{gk}^{(m)}}{\|z_{gk}^{(m)}\|}=\Big(1-\lambda\sqrt{p_g}\big/\|z_{gk}^{(m)}\|\Big)_+\, z_{gk}^{(m)}.$$

  2. G-SCAD estimator

    $$\tilde\beta_{gk}^{(m+1)}\leftarrow\begin{cases}\dfrac{S\big(z_{gk}^{(m)},\lambda\sqrt{p_g}\big)}{\nu_{gk}^{(m)}}, & \text{if }\|z_{gk}^{(m)}\|\le\lambda\sqrt{p_g}\big(\nu_{gk}^{(m)}+1\big),\\ \dfrac{S\big(z_{gk}^{(m)},\lambda\sqrt{p_g}\,\gamma/(\gamma-1)\big)}{\nu_{gk}^{(m)}-1/(\gamma-1)}, & \text{if }\lambda\sqrt{p_g}\big(\nu_{gk}^{(m)}+1\big)<\|z_{gk}^{(m)}\|\le\nu_{gk}^{(m)}\lambda\sqrt{p_g}\,\gamma,\\ \dfrac{z_{gk}^{(m)}}{\nu_{gk}^{(m)}}, & \text{if }\|z_{gk}^{(m)}\|>\nu_{gk}^{(m)}\lambda\sqrt{p_g}\,\gamma,\end{cases}\qquad\gamma>1+1/c_{\beta_{gk}},\tag{30}$$

    where $c_{\beta_{gk}}$ denotes the minimum eigenvalue of $n_1^{-1}x_g^\top W_k^{(m)}x_g$.

  3. G-MCP estimator

    $$\tilde\beta_{gk}^{(m+1)}\leftarrow\begin{cases}\dfrac{S\big(z_{gk}^{(m)},\lambda\sqrt{p_g}\big)}{\nu_{gk}^{(m)}-1/\gamma}, & \text{if }\|z_{gk}^{(m)}\|\le\nu_{gk}^{(m)}\lambda\sqrt{p_g}\,\gamma,\\ \dfrac{z_{gk}^{(m)}}{\nu_{gk}^{(m)}}, & \text{if }\|z_{gk}^{(m)}\|>\nu_{gk}^{(m)}\lambda\sqrt{p_g}\,\gamma,\end{cases}\qquad\gamma>1/c_{\beta_{gk}}.\tag{31}$$

(C) Update $r_k^{(m+1)}\leftarrow r_k^{(m)}-x_g\big(\tilde\beta_{gk}^{(m+1)}-\tilde\beta_{gk}^{(m)}\big)$.

Step 3. Update $m\leftarrow m+1$.

Step 4. Repeat Steps 2-3 until convergence or until the maximum iteration number is reached; the iterations then stop and the final estimators are obtained.
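The group updates (29)-(31) at the heart of Step 2(B) can be sketched in R as follows, treating $\nu_{gk}^{(m)}$ as the scalar `nu` obtained after group-wise standardization; `z` stands for the vector $z_{gk}^{(m)}$ and `lam` for $\lambda\sqrt{p_g}$. This is an illustrative reading of the update formulas, not the chapter's implementation.

```r
# Multivariate soft-thresholding: S(z, lam) = (1 - lam/||z||)_+ * z.
group_soft <- function(z, lam) {
  nz <- sqrt(sum(z^2))
  if (nz == 0) return(z)
  max(1 - lam / nz, 0) * z
}

update_glasso <- function(z, nu, lam) group_soft(z, lam) / nu       # Eq. (29)

update_gscad <- function(z, nu, lam, gamma) {                       # Eq. (30)
  nz <- sqrt(sum(z^2))
  if (nz <= lam * (nu + 1)) group_soft(z, lam) / nu
  else if (nz <= nu * lam * gamma)
    group_soft(z, gamma * lam / (gamma - 1)) / (nu - 1 / (gamma - 1))
  else z / nu
}

update_gmcp <- function(z, nu, lam, gamma) {                        # Eq. (31)
  if (sqrt(sum(z^2)) <= nu * lam * gamma)
    group_soft(z, lam) / (nu - 1 / gamma)
  else z / nu
}
```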

The tuning parameter $\lambda$ depends on the sample size, the group number, and the degrees of freedom within each group, and we can apply the grid search method to $\lambda$ on a grid $[\lambda_0,\lambda_{K+1}]$. Let $\lambda_{\max}=\lambda_0>\lambda_1>\cdots>\lambda_K>\lambda_{K+1}=\lambda_{\min}$, where $\lambda_{\max}=\max_{1\le g\le G}\|z_{gk}^{(m)}\|$ with $z_{gk}^{(m)}=n_1^{-1}x_g^\top W_k^{(m)}r_k^{(m)}+\nu_{gk}^{(m)}\tilde\beta_{gk}^{(m)}$ and $W_k^{(m)}$, $r_k^{(m)}$, $\nu_{gk}^{(m)}$, $\tilde\beta_{gk}^{(m)}$ as defined above. For any given $\gamma$, the algorithm starts at $\lambda_{\max}$ and proceeds toward $\lambda_{\min}$. Each time, the initial value of $\beta_{gk}$ is set to the $\tilde\beta_{gk}^{(m)}$ estimated at the previous grid point, which ensures that the initial values are never far from the solution. To find the optimal $\lambda$ on the grid, we divide the stock dataset into $S$ subsets $\{S_k\}_{k=1}^S$ with the same sample size and apply the $S$-fold cross-validation procedure to choose the optimal $\lambda$.
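A generic S-fold cross-validation skeleton for the $\lambda$ grid is sketched below; the fitting routine `fit_fun` (assumed to return the intercepts `B0` and the coefficient matrix `B`) and the `empirical_risk()` helper from the earlier sketch are placeholders standing in for the GCD fit.

```r
# S-fold CV over a lambda grid; returns the lambda minimizing held-out risk.
cv_lambda <- function(x, Y, lambda_grid, fit_fun, S = 5) {
  n     <- nrow(x)
  folds <- sample(rep(seq_len(S), length.out = n))   # equal-sized subsets
  cve   <- sapply(lambda_grid, function(lam) {
    mean(sapply(seq_len(S), function(s) {
      fit <- fit_fun(x[folds != s, , drop = FALSE], Y[folds != s, ], lam)
      empirical_risk(x[folds == s, , drop = FALSE], Y[folds == s, ],
                     fit$B0, fit$B)                  # held-out deviance
    }))
  })
  lambda_grid[which.min(cve)]                        # optimal lambda
}
```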

3.3 Three-class prediction performance

We first combine the GCD algorithm with the training set $\{(x_t,y_{t+1,k})\}_{t=1}^{n_1}$ to learn the G-LASSO/G-SCAD/G-MCP penalized trinomial logit dynamic models, then combine the group estimators

$$\tilde\beta=\big(\tilde\beta_{01},\tilde\beta_1^\top,\tilde\beta_{02},\tilde\beta_2^\top,\tilde\beta_{03},\tilde\beta_3^\top\big)^\top$$

with the testing set $\{(x_t,y_{t+1,k})\}_{t=n_1+1}^{n_1+n_2}$ to compute the class probability estimates

$$\tilde P\big(Y_{t+1}(c)=k\mid X_t;\tilde\beta\big)=\frac{\exp\big(\tilde\beta_{0k}+x_t\tilde\beta_k\big)}{\sum_{\bar k=1}^3\exp\big(\tilde\beta_{0\bar k}+x_t\tilde\beta_{\bar k}\big)},\qquad k=1,2,3,\tag{32}$$

and finally introduce the Bayes classifier to predict the class indexes:

$$\hat Y_{t+1}(c)=k\quad\text{if}\quad \tilde P\big(Y_{t+1}(c)=k\mid X_t;\tilde\beta\big)=\max_{\bar k\in\{1,2,3\}}\tilde P\big(Y_{t+1}(c)=\bar k\mid X_t;\tilde\beta\big).\tag{33}$$

Table 6 lists the three-class confusion matrix for assessing the prediction performance.

True \ Predicted class | $\hat Y_t=1$ | $\hat Y_t=2$ | $\hat Y_t=3$ | Total
--- | --- | --- | --- | ---
$Y_t=1$ | $V_{11}$ | $V_{12}$ | $V_{13}$ | $V_{1\cdot}$
$Y_t=2$ | $V_{21}$ | $V_{22}$ | $V_{23}$ | $V_{2\cdot}$
$Y_t=3$ | $V_{31}$ | $V_{32}$ | $V_{33}$ | $V_{3\cdot}$
Total | $V_{\cdot 1}$ | $V_{\cdot 2}$ | $V_{\cdot 3}$ | $V$

Table 6.

Three-class confusion matrix.

$V_{k\bar k}$ is the number of items that truly belong to class $k$ and are predicted as class $\bar k$. $V_{k\cdot}$ is the number of items that truly belong to class $k$. $V_{\cdot\bar k}$ is the number of items that are predicted as class $\bar k$.

From Table 6, one can compute Accuracy. For unordered multi-category response variables, some traditional accuracy measures such as AUC are not applicable, so a few important statistical concepts for assessing multi-category classification accuracy have been proposed. For example, the Kappa coefficient

$$\mathrm{Kappa}=\frac{\sum_{k=1}^K p_{kk}-\sum_{k=1}^K p_{k+}p_{+k}}{1-\sum_{k=1}^K p_{k+}p_{+k}},$$

with $p_{kk}=P(Y_t=k,\hat Y_t=k)$, $p_{k+}=P(Y_t=k)$, and $p_{+k}=P(\hat Y_t=k)$, measures multi-category classification accuracy, and $\mathrm{Kappa}\in(0,0.2]$, $(0.2,0.4]$, $(0.4,0.6]$, $(0.6,0.8]$, $(0.8,1.0]$ represents slight, mild, reasonable, strong, and almost identical consistency, respectively. The polytomous discrimination index (PDI) evaluates the probability of an event related to simultaneously classifying $n$ subjects from $K$ categories. For a fixed category $k$, define the class-specific PDI as $\mathrm{PDI}_k=P\big(p_{kk}>p_{jk},\ j\neq k\mid Y_t=k\big)$, with $p_{ij}$ the probability of placing a subject from category $i$ into category $j$. The overall PDI is $\mathrm{PDI}=\sum_{k=1}^K \mathrm{PDI}_k/K$.
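The Bayes classifier (33), the three-class confusion matrix of Table 6, and Kappa can be sketched in a few lines of R, assuming `Ptilde` is the $n_2\times 3$ matrix of class probabilities (32) and `ytest` holds the true labels in {1, 2, 3}; HUM and PDI can then be computed with the mcca package mentioned below.

```r
# Bayes classifier (33): assign each row to the class with maximal probability.
yhat3 <- factor(max.col(Ptilde), levels = 1:3)
conf3 <- table(True = factor(ytest, levels = 1:3), Predicted = yhat3)

# Kappa from the three-class confusion matrix.
p  <- conf3 / sum(conf3)             # joint probabilities p_{k, kbar}
po <- sum(diag(p))                   # observed agreement: sum of p_kk
pe <- sum(rowSums(p) * colSums(p))   # chance agreement: sum of p_k+ * p_+k
kappa <- (po - pe) / (1 - pe)
```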

According to the three-class confusion matrices and the three testing sets, we compute the sensitivity and specificity for classes 1-3. Plotting the true classification rates in three dimensions (3D) yields the 3D ROC surface in the unit cube, and HUM reflects the intrinsic accuracy of a three-class classifier. Here, we choose the R package mcca to draw the ROC surfaces; the higher the HUM value, the better the prediction performance. For more details, see Refs. [24, 25].


4. Conclusions and prospects

In this chapter, we combine different technical indicators with five penalized logistic regressions, three group penalized logistic regressions, and three group penalized trinomial logit regressions to predict stock return movement direction. Future research will focus on adding more variables, such as fundamental factors, specific news, national policies, and market sentiment, to the proposed regressions, and on further improving the prediction performance for stock return movement direction.


5. Discussions

In the aforementioned methods, it is very important to select important (group) variables by tuning the two parameters $\lambda$ and $\gamma$. Different tuning parameters bring different prediction performance. However, once we choose relatively optimal tuning parameters $\lambda$ and $\gamma$ by some model selection criterion such as cross-validation or BIC, the proposed methods perform very robustly in terms of Accuracy, AUC or HUM, Kappa, and PDI. In addition, the proposed CD and GCD algorithms converge very rapidly. They can be generalized to high-dimensional or ultra-high-dimensional cases by including more variables influencing stock movement directions. From Refs. [13, 14, 15], we find that the proposed methods outperform some classical machine learning and deep learning methods in terms of robustness, effectiveness, and computation. Therefore, these prediction methods are worthy of further promotion.

References

  1. Murphy JJ. Technical Analysis of the Financial Markets. New York: Prentice Hall Press; 1999
  2. Edwards RD, Magee J, Bassetti WC. Technical Analysis of Stock Trends. Florida: CRC Press; 2018
  3. Cavalcante RC, Brasileiro RC, Souza VLF, Nobrega JP, Oliveira ALI. Computational intelligence and financial markets: A survey and future directions. Expert Systems with Applications. 2016;55(15):194-211. DOI: 10.1016/j.eswa.2016.02.006
  4. Nti IK, Adekoya AF, Weyori BA. A systematic review of fundamental and technical analysis of stock market predictions. Artificial Intelligence Review. 2020;53(4):3007-3057. DOI: 10.1007/s10462-019-09754-z
  5. Henrique BM, Sobreiro VA, Kimura H. Literature review: Machine learning techniques applied to financial market prediction. Expert Systems with Applications. 2019;124:226-251. DOI: 10.1016/j.eswa.2019.01.012
  6. Jiang WW. Applications of deep learning in stock market prediction: Recent progress. Expert Systems with Applications. 2021;184:115537. DOI: 10.1016/j.eswa.2021.115537
  7. Nelson DMQ, Pereira ACM, De Oliveira RA. Stock market’s price movement prediction with LSTM neural networks. In: 2017 International Joint Conference on Neural Networks (IJCNN). IEEE; 2017. pp. 1419-1426. DOI: 10.1109/IJCNN.2017.7966019
  8. Chen W, Jiang MR, Zhang WG, Chen ZS. A novel graph convolutional feature based convolutional neural network for stock trend prediction. Information Sciences. 2021;556:67-94. DOI: 10.1016/j.ins.2020.12.068
  9. Long J, Chen Z, He W, Wu T, Ren J. An integrated framework of deep learning and knowledge graph for prediction of stock price trend: An application in Chinese stock exchange market. Applied Soft Computing. 2020;91:106205. DOI: 10.1016/j.asoc.2020.106205
  10. Zhao J, Zeng D, Liang S, Kang H, Liu Q. Prediction model for stock price trend based on recurrent neural network. Journal of Ambient Intelligence and Humanized Computing. 2021;12:745-753. DOI: 10.1007/s12652-020-02057-0
  11. Xing FZ, Cambria E, Welsch RE. Natural language based financial forecasting: A survey. Artificial Intelligence Review. 2018;50(1):49-73. DOI: 10.1007/s10462-017-9588-9
  12. Lee J, Kim R, Koh Y, Kang J. Global stock market prediction based on stock chart images using deep Q-network. IEEE Access. 2019;7:167260-167277. DOI: 10.1109/ACCESS.2019.2953542
  13. Jiang HF, Hu XM, Jia H. Penalized logistic regressions with technical indicators predict up and down trends. Soft Computing. 2022;27:1-12. DOI: 10.1007/s00500-022-07404-1
  14. Hu XM, Yang JW. G-LASSO/G-SCAD/G-MCP penalized trinomial logit dynamic models predict up trends, sideways trends and down trends for stock returns. Expert Systems with Applications. 2024;249:123476. DOI: 10.1016/j.eswa.2024.123476
  15. Lin WC, Tsai CF, Chen H. Factors affecting text mining based stock prediction: Text feature representations, machine learning models, and news platforms. Applied Soft Computing. 2022;130:109673. DOI: 10.1016/j.asoc.2022.109673
  16. Li XD, Wu PJ, Wang WP. Incorporating stock prices and news sentiments for stock market prediction: A case of Hong Kong. Information Processing and Management. 2022;57(5):102212. DOI: 10.1016/j.ipm.2020.102212
  17. Yang YL, Hu XM, Jiang HF. Group penalized logistic regressions predict up and down trends for stock prices. The North American Journal of Economics and Finance. 2022;59:101564. DOI: 10.1016/j.najef.2021.101564
  18. Hu XM, Jiang HF. Logistic regression model with technical indicators predicts ups and downs for Google stock prices. Journal of Systems Science and Mathematical Sciences (in Chinese). 2021;41(3):1-22
  19. Breheny P, Huang J. Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection. Annals of Applied Statistics. 2011;5(1):232. DOI: 10.1214/10-AOAS388
  20. Fan JQ, Li RZ. Variable selection via non-concave penalized likelihood and its oracle properties. Journal of American Statistical Association. 2001;96(456):1348-1360. DOI: 10.1198/016214501753382273
  21. Hu XM, Liu F. Estimation Theory and Model Recognition for High-Dimensional Statistical Models. Beijing: Higher Education Press; 2020
  22. Raghavan R, Ashour FS, Bailey B. A review of cutoffs for nutritional biomarkers. Advances in Nutrition. 2016;7(1):112-120
  23. Hong Y, Chung J. Are the directions of stock price changes predictable? In: Statistical Theory and Evidence. Cornell University; 2006
  24. Li JL, Gao M, D’Agostino R. Evaluating classification accuracy for modern learning approaches. Statistics in Medicine. 2019;38(13):2477-2503. DOI: 10.1002/sim.8103
  25. Li JL, Fine JP. ROC analysis with multiple classes and multiple tests: Methodology and its application in microarray studies. Biostatistics. 2008;9(3):566-576. DOI: 10.1093/biostatistics/kxm050
