## Yuping Gu*, Longsheng Cheng** and Zhipeng Chang***

| | Prediction positive | Prediction negative |
|---|---|---|
| Actual positive | True positive (TP) | False negative (FN) |
| Actual negative | False positive (FP) | True negative (TN) |

The evaluation metrics G-means and F-measure are calculated as follows:

$$\text{G-means}=\sqrt{TPR \times TNR}$$

$$\text{F-measure}=\frac{2 \times Precision \times Recall}{Precision+Recall}$$

where Precision = TP/(TP+FP); TPR = TP/(TP+FN) is the classification accuracy on positive instances, also called Sensitivity or Recall; and TNR = TN/(TN+FP) is the classification accuracy on negative instances, also called Specificity.

G-means and F-measure are both composite metrics. G-means is the square root of the product of the correct classification rates of the two classes, so it is large only when both classes are classified accurately. G-means can therefore reasonably evaluate the overall classification performance on imbalanced datasets. F-measure is the harmonic mean of Precision and Recall: the larger the Precision and the Recall, the larger the F-measure. Both metrics take values between 0 and 1, and a larger value indicates better classification of imbalanced data.
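The two metrics can be computed directly from the four confusion-matrix counts. The following is a minimal Python sketch; the function names and the example counts are illustrative assumptions, not values from this study.

```python
import math


def g_means(tp, fn, fp, tn):
    """Square root of TPR * TNR (sensitivity times specificity)."""
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(tpr * tnr)


def f_measure(tp, fn, fp):
    """Harmonic mean of Precision and Recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 100 positive and 300 negative samples.
print(round(g_means(tp=90, fn=10, fp=20, tn=280), 3))  # 0.917
print(round(f_measure(tp=90, fn=10, fp=20), 3))        # 0.857
```

Because G-means multiplies the two per-class rates, a classifier that ignores the minority class scores 0 regardless of its overall accuracy.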

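In general, MTS has more normal samples than abnormal samples, so the normal samples are defined as negative instances and the abnormal samples as positive instances. The entries of the confusion matrix then follow from the MTS threshold T: a sample is predicted positive (abnormal) when its Mahalanobis distance exceeds T and negative (normal) otherwise, and comparing these predictions with the actual classes gives TP, FN, FP and TN.

G-means and F-measure can then be calculated from these counts:

$$\text{G-means}=\sqrt{\frac{TP}{TP+FN} \times \frac{TN}{TN+FP}}$$

$$\text{F-measure}=\frac{2 TP}{2 TP+FP+FN}$$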

G-means reasonably evaluates the overall classification performance on imbalanced datasets, and F-measure properly evaluates the classification performance on the minority class. Therefore, these two metrics can be used as larger-the-better optimization goals.
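How the threshold T turns Mahalanobis distances into confusion-matrix counts can be sketched as follows. This is an illustration under stated assumptions: a sample is taken to be predicted abnormal when its distance is strictly greater than T, and the function name and example distances are hypothetical.

```python
def confusion_counts(distances, is_abnormal, threshold):
    """Count TP, FN, FP, TN with abnormal samples as the positive class.

    Assumption: a sample is predicted abnormal (positive) when its
    Mahalanobis distance is strictly greater than the threshold.
    """
    tp = fn = fp = tn = 0
    for distance, abnormal in zip(distances, is_abnormal):
        predicted_abnormal = distance > threshold
        if abnormal and predicted_abnormal:
            tp += 1
        elif abnormal:
            fn += 1
        elif predicted_abnormal:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


# Hypothetical distances for four normal and two abnormal samples.
distances = [0.8, 1.2, 3.5, 0.9, 4.1, 2.6]
is_abnormal = [False, False, True, False, True, False]
print(confusion_counts(distances, is_abnormal, threshold=2.0))  # (2, 0, 1, 3)
```

Raising T trades false positives for false negatives, which is why T is treated as a decision variable alongside the variable selection.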

Another goal of MTS is variable selection and optimization. It is important to identify the variables that are most effective in distinguishing abnormal samples, especially for high-dimensional data. Dimensionality reduction can improve classification efficiency and reduce classification time and cost. Let p_{select} be the number of variables after dimensionality reduction and p the original number of variables; then p_{select}/p represents the reduction effect, which is another optimization goal of MTS.

According to these optimization goals, the optimization model minimizes three objectives, f_{1}, f_{2} and f_{3}, subject to (s.t.) constraints (16) to (20). Here $$f_{1}^{0}$$ equals 1 minus the G-means of the traditional MTS, and $$f_{2}^{0}$$ equals 1 minus the F-measure of the traditional MTS. Constraints (16) and (17) require the optimized G-means and F-measure to be larger than those of the traditional MTS method. Constraints (18), (19), and (20) specify whether each variable is involved, and require the optimized dimensionality reduction effect to be superior to that of the traditional MTS method.
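Read together with the definitions above, the model can be sketched in LaTeX as follows. This is a reconstruction under assumptions rather than the original display: the objectives are taken to be 1 minus G-means, 1 minus F-measure, and the ratio p_{select}/p; the binary indicators $$x_{j}$$, the symbol $$p_{select}^{0}$$ for the number of variables kept by the traditional MTS, the use of non-strict inequalities, and the exact form and order of constraints (18) to (20) are all assumed.

```latex
\begin{aligned}
\min\quad & f_{1} = 1 - \text{G-means}, \qquad
            f_{2} = 1 - \text{F-measure}, \qquad
            f_{3} = \frac{p_{select}}{p} \\
\text{s.t.}\quad
  & f_{1} \le f_{1}^{0}, \\
  & f_{2} \le f_{2}^{0}, \\
  & x_{j} \in \{0, 1\}, \quad j = 1, \ldots, p, \\
  & p_{select} = \sum_{j=1}^{p} x_{j}, \\
  & p_{select} \le p_{select}^{0}
\end{aligned}
```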

For MTS, each variable is either involved in the construction of the reference space or not, while the threshold T can take a range of values. This is therefore a hybrid constrained nonlinear optimization problem, and binary particle swarm optimization (BPSO) can be used as the optimization algorithm [12,13]. In the binary search space, the particle swarm consists of N particles in a d-dimensional space, and the position of each particle is a string of bits. The position of particle i is expressed as a vector $$X_{i}=\left(x_{i1}, x_{i2}, \ldots, x_{id}\right)$$, where each $$x_{ij}$$ is a binary bit: 1 indicates that the corresponding variable is selected and 0 that it is not. The velocity of particle i is expressed as a vector $$V_{i}=\left(v_{i1}, v_{i2}, \ldots, v_{id}\right)$$. The initial velocity is a random fraction within [0, 1], and the velocity is limited to $$\left[V_{\min}, V_{\max}\right]$$. The vector $$P_{best_{i}}=\left(p_{i1}, p_{i2}, \ldots, p_{id}\right)$$ represents the best position that particle i has found so far. The best position found by all particles in the population is called the global optimal position and is denoted by $$G_{best}=\left(g_{1}, g_{2}, \ldots, g_{d}\right)$$. In the BPSO algorithm, the velocity and position of each particle are updated according to Eqs. (21) to (24), after which the individual optimal position $$P_{best_{i}}$$ and the global optimal position $$G_{best}$$ are re-evaluated.

$$v_{ij}^{t+1}=w v_{ij}^{t}+c_{1} r_{1}\left(p_{ij}-x_{ij}^{t}\right)+c_{2} r_{2}\left(g_{j}-x_{ij}^{t}\right) \quad (21)$$

$$\text{if } v_{ij}^{t+1} \notin\left[V_{\min}, V_{\max}\right], \text{ then } v_{ij}^{t+1}=\max \left(\min \left(V_{\max}, v_{ij}^{t+1}\right), V_{\min}\right) \quad (22)$$

$$s\left(v_{ij}^{t+1}\right)=\frac{1}{1+e^{-v_{ij}^{t+1}}} \quad (23)$$

$$x_{ij}^{t+1}=\begin{cases}1, & \text{if } rand<s\left(v_{ij}^{t+1}\right) \\ 0, & \text{otherwise}\end{cases} \quad (24)$$

In the above formulas, w is the inertia weight, which captures the effect of the previous velocity on the updated one; c_{1} and c_{2} are acceleration coefficients attached to the cognitive and social components of a particle's velocity; r_{1} and r_{2} are random numbers in the range [0, 1]; $$v_{ij}^{t}$$ and $$v_{ij}^{t+1}$$ are the velocities of particle i before and after the update, respectively; and $$x_{ij}^{t}$$ and $$x_{ij}^{t+1}$$ are the positions of particle i before and after the update, respectively. Eq. (22) keeps the velocity of each particle i within $$\left[V_{\min}, V_{\max}\right]$$. So that the velocity can be interpreted as the probability that the binary bit takes the value 1, it is mapped to the interval [0, 1]. The sigmoid function is usually used for this mapping, as shown in Eq. (23), in which $$s\left(v_{ij}^{t+1}\right)$$ is the probability that position $$x_{ij}^{t+1}$$ takes the value 1. Eq. (24) determines the updated position of each particle: if $$s\left(v_{ij}^{t+1}\right)$$ is greater than a number randomly generated within (0,1), then $$x_{ij}^{t+1}$$ is set to 1; otherwise $$x_{ij}^{t+1}$$ is set to 0.
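One update of a single particle, following the four steps described for Eqs. (21) to (24), can be sketched in Python as below. The velocity update uses the standard BPSO form, which is an assumption here; the function name `bpso_step` and its default arguments are hypothetical.

```python
import math
import random


def bpso_step(x, v, p_best, g_best, w, c1=2.0, c2=2.0,
              v_min=-2.0, v_max=2.0, rng=random):
    """Return the updated (position, velocity) of one particle.

    x, p_best and g_best are lists of 0/1 bits; v is a list of floats.
    """
    new_x, new_v = [], []
    for j in range(len(x)):
        r1, r2 = rng.random(), rng.random()
        # Velocity update: inertia + cognitive + social terms.
        vel = (w * v[j]
               + c1 * r1 * (p_best[j] - x[j])
               + c2 * r2 * (g_best[j] - x[j]))
        # Clamp the velocity to [v_min, v_max].
        vel = max(v_min, min(v_max, vel))
        # Sigmoid maps the velocity to the probability that the bit is 1.
        prob_one = 1.0 / (1.0 + math.exp(-vel))
        new_v.append(vel)
        # Draw the new bit by comparing with a uniform random number.
        new_x.append(1 if rng.random() < prob_one else 0)
    return new_x, new_v


rng = random.Random(0)  # fixed seed so the sketch is repeatable
x, v = bpso_step([0, 1, 0, 1], [0.1, -0.3, 0.5, 0.0],
                 p_best=[1, 1, 0, 0], g_best=[1, 0, 0, 1], w=0.7, rng=rng)
print(x, v)
```

With the velocity limited to [-2, 2], the probability of a bit being 1 stays roughly between 0.12 and 0.88, so no variable is ever locked in or out permanently.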

In the BPSO algorithm, the inertia weight controls the global and local search capabilities of the particle swarm. A large inertia weight favors global search, while a small inertia weight favors local search. The inertia weight is a key factor influencing convergence; it greatly affects the BPSO search process and thus the accuracy of prediction. Because the BPSO algorithm easily falls into local optima, which leads to premature convergence, chaotic mapping is introduced into the BPSO algorithm to form a chaotic BPSO (CBPSO) algorithm [14], which can overcome this shortcoming.

Chaos is a seemingly random, non-deterministic state that is produced by a deterministic equation. It has the characteristics of randomness, ergodicity and regularity. Therefore, in each iteration of the BPSO algorithm, a chaotic map is used to determine the inertia weight. The inertia weight values are usually calculated using the logistic map:

$$w(t+1)=u \times w(t) \times(1-w(t))$$

where w(t) is a chaotic sequence whose values lie within (0,1), and u is the control parameter. When $$3.571448 \leq u \leq 4$$ the logistic map is in a chaotic state, and when u=4 it is in a completely chaotic state. When the inertia weight is close to 1, the global search ability is enhanced; when it is close to 0, the local search ability is enhanced.
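The chaotic inertia weight sequence is cheap to generate. The sketch below iterates the standard logistic map with the starting value w(0)=0.48 and u=4 used later in the experiments; the function name is a hypothetical choice.

```python
def chaotic_inertia_weights(w0=0.48, u=4.0, n_iter=5):
    """Return w(1), ..., w(n_iter) from the logistic map w <- u * w * (1 - w)."""
    weights = []
    w = w0
    for _ in range(n_iter):
        w = u * w * (1.0 - w)
        weights.append(w)
    return weights


print(chaotic_inertia_weights())
```

The first two values are about 0.9984 and 0.0064, which shows how the sequence jumps between weights that favor global search and weights that favor local search.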

According to the optimization model, the multi-objective optimization problem can be transformed into a single-objective problem by integrating all optimization goals. Let the fitness function of CBPSO be min f_{1} f_{2} f_{3}. The basic idea of the MTS-CBPSO method is as follows. In each iteration of the CBPSO algorithm, N particles are produced. The position of each particle selects the corresponding combination of variables, the training samples are classified by the MTS method using that combination, and the fitness values of all particles are compared to obtain the current optimal particle. Iterating the CBPSO algorithm yields the optimal fitness value and its corresponding variable combination. This variable combination is the output of CBPSO and is then used to classify the test samples according to the traditional MTS method.
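The single-objective fitness can be sketched as a small Python function. Two points are assumptions: "f_{1} f_{2} f_{3}" is read as the product of the three objectives, and f_{1} and f_{2} are taken to be 1 minus G-means and 1 minus F-measure, by analogy with the definitions of $$f_{1}^{0}$$ and $$f_{2}^{0}$$. The function name and the example values are hypothetical.

```python
def fitness(g_means, f_measure, n_selected, n_total):
    """Smaller is better: product of the three objectives to be minimized."""
    f1 = 1.0 - g_means           # assumed form of the G-means objective
    f2 = 1.0 - f_measure         # assumed form of the F-measure objective
    f3 = n_selected / n_total    # reduction ratio p_select / p
    return f1 * f2 * f3


# Hypothetical particle: 20 of 38 variables kept, G-means 0.9, F-measure 0.8.
print(fitness(0.9, 0.8, n_selected=20, n_total=38))
```

Under this reading, improving any one objective lowers the fitness, so a particle is rewarded both for better classification and for using fewer variables.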

Similar to the traditional MTS, the MTS-CBPSO method can be divided into four stages, as shown in Fig. 1.

In a globalized economy with intense market competition, enterprises face various risks during their development, especially financial risk, and financial distress reflects increased financial risk. Effective financial distress prediction therefore matters not only to the development of the enterprise itself, but also to the interests of investors, creditors and other stakeholders. For this reason, the financial distress of listed companies is a hot topic in corporate governance, risk control and securities investment research. At present, financial distress prediction for listed companies is generally treated as a classification problem: according to their financial situation, enterprises are divided into two categories, normal or abnormal. However, existing studies usually balance the normal and abnormal samples artificially and ignore the inherent imbalance of the data [15,16]. The following describes how the MTS-CBPSO algorithm is used to study and analyze the financial distress of listed companies.

Chinese listed companies are taken as the research objects. Companies that are specially treated (ST) by the China Securities Supervision and Management Committee (CSSMC) are considered to be in financial distress, and those that have never been ST are regarded as healthy. From the data between 2010 and 2015, 150 listed companies that were ST due to an abnormal financial situation were selected as abnormal samples, and 350 listed companies that were never ST were selected as normal samples with healthy financial status. The data used in this study were obtained from the RESSET database (www.resset.cn). To eliminate outliers, companies with financial ratios deviating from the mean by three standard deviations or more were excluded, leaving 425 sample companies, of which 115 are ST companies and 310 are normal ones.
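The three-standard-deviation screening can be sketched as follows. The text does not say whether the statistics are computed per ratio over all companies or per class, so this sketch assumes per-ratio statistics over the whole sample and uses the sample standard deviation; the function name and data are hypothetical.

```python
import statistics


def drop_three_sigma_outliers(rows):
    """Keep rows whose every ratio lies within 3 standard deviations of its mean.

    rows is a list of equal-length lists, one list of ratios per company.
    """
    n_cols = len(rows[0])
    columns = [[row[j] for row in rows] for j in range(n_cols)]
    means = [statistics.mean(col) for col in columns]
    stds = [statistics.stdev(col) for col in columns]
    kept = []
    for row in rows:
        # A constant column (std == 0) cannot flag any outlier.
        if all(stds[j] == 0 or abs(row[j] - means[j]) < 3 * stds[j]
               for j in range(n_cols)):
            kept.append(row)
    return kept


# Twenty ordinary companies plus one with an extreme first ratio.
rows = [[1.0 + 0.01 * i, 5.0] for i in range(20)] + [[100.0, 5.0]]
print(len(drop_three_sigma_outliers(rows)))  # 20
```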

Because different companies are treated as ST for different reasons, it is difficult to describe a company’s financial situation with a few simple financial ratios, and different researchers choose different financial ratio metrics. To reflect the financial situation of enterprises faithfully, 52 primary financial ratio metrics were selected, and the metrics with a correlation coefficient greater than 0.95 were then removed. The remaining 38 indicators fall into six dimensions: profitability, solvency, business development capacity, operational capacity, cash flow and capital structure, as shown in Table 2.

Table 2.

| Dimensions | Indicator names |
|---|---|
| Profitability | profit margin on net assets X_{1}; return on assets ratio X_{2}; net profit to total assets X_{3}; net profit to total operation income X_{4}; total operation cost to total operation income X_{5}; asset impairment loss to total operation income X_{6}; operating profit ratio X_{7}; total profit cost ratio X_{8} |
| Solvency | current ratio X_{9}; super quick ratio X_{10}; debt to equity ratio X_{11}; earnings before interest to total liability X_{12}; net operating cash flow to total liability X_{13}; net operating cash flow to total current liability X_{14}; cash flow to liability X_{15} |
| Business development capacity | earnings per share growth rate X_{16}; operating profit growth rate X_{17}; total profit growth rate X_{18}; net profit growth rate X_{19}; net operation cash flow growth rate X_{20}; net asset growth rate X_{21}; total assets growth rate X_{22} |
| Operational capacity | inventory turnover X_{23}; receivables turnover ratio X_{24}; account payable turnover rate X_{25}; current assets turnover rate X_{26}; fixed asset turnover rate X_{27}; total asset turnover rate X_{28} |
| Cash flow | sales and service cash to operating income X_{29}; capital expenditure to depreciation and amortization X_{30}; operating income cash coverage X_{31}; operation cash into asset rate X_{32} |
| Capital structure | debt to asset ratio X_{33}; current asset to total asset X_{34}; fixed asset ratio X_{35}; equity to total capital X_{36}; current liability to total liability X_{37}; long asset fit asset X_{38} |
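The correlation screening that reduces the 52 primary ratios to the 38 indicators of Table 2 can be sketched as a greedy filter. The text does not state which metric of a highly correlated pair is removed or whether the sign of the correlation matters, so this sketch assumes that metrics are visited in order, that the absolute Pearson correlation is used, and that a metric is dropped when it correlates above the limit with any metric already kept. The function names and data are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def drop_correlated(columns, limit=0.95):
    """Return the names of the metrics kept by the greedy correlation filter."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= limit for k in kept):
            kept.append(name)
    return kept


columns = {
    "X1": [1, 2, 3, 4, 5],
    "X2": [2, 4, 6, 8, 10.1],   # almost a multiple of X1
    "X3": [5, 1, 4, 2, 3],
}
print(drop_correlated(columns))  # ['X1', 'X3']
```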

The positive instances are the minority class samples, which are the ST companies in this case and the abnormal samples of MTS. The negative instances are the non-ST companies, which are the normal samples of MTS. The experiment uses 5-fold cross-validation: the dataset is randomly divided into five parts, and in each experiment four parts are used as training samples and the remaining one as the validation sample. All experiments were performed using the rminer package in R. The classification ability is evaluated by the mean of the results of the five cross-validation experiments. The evaluation criteria are: sensitivity, which is the classification accuracy rate of abnormal samples; specificity, which is the classification accuracy rate of normal samples; accuracy, which is the total classification accuracy rate; G-means; and F-measure.
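The 5-fold protocol can be sketched in a few lines. The experiments themselves were run with rminer in R; the Python generator below only illustrates the splitting logic, and its name, the fixed seed and the plain (non-stratified) shuffle are assumptions.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, validation) index lists for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, validation


# 425 sample companies: each fold validates on 85 and trains on 340.
for train, validation in k_fold_indices(425):
    print(len(train), len(validation))
```

With only 115 abnormal samples, a stratified split that keeps the class ratio in every fold may be preferable in practice.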

The parameters of the CBPSO algorithm are set to their classic values: c_{1}=c_{2}=2; the number of particles is N=30; the lower and upper limits of the particle velocity are -2 and 2, respectively; and the maximum number of iterations is 150. The chaotic sequence is initialized with w(0)=0.48 and u=4, which ensures that the chaotic system is in a completely chaotic state. The convergence of CBPSO is demonstrated in Fig. 2.

The 5-fold cross-validation experiments were carried out using the traditional MTS and MTS-CBPSO, respectively. The results are shown in Table 3.

In terms of variable optimization, the average number of variables deleted by MTS is 8.6, while the average number deleted by MTS-CBPSO is 14.8. This indicates that MTS-CBPSO achieves better dimensionality reduction. The reason is that the evaluation metrics are set appropriately for imbalanced data, and the time cost, represented by the number of variables, is also taken into account as the method iterates toward the optimum. Thus, for the same classification accuracy, the method also favors stronger dimensionality reduction.

Table 3.

| Method | Sensitivity | Specificity | Accuracy | G-means | F-measure |
|---|---|---|---|---|---|
| MTS-CBPSO | 0.896 | 0.915 | 0.909 | 0.902 | 0.908 |
| MTS | 0.824 | 0.845 | 0.832 | 0.838 | 0.831 |

From the above analysis, it is clear that the MTS-CBPSO method is superior to the traditional MTS in both dimensionality reduction and classification accuracy. Moreover, MTS-CBPSO is more robust. Compared with the other three algorithms, its classification results on the normal samples do not differ much. On the abnormal samples, however, MTS-CBPSO is the best, followed by SVM and C4.5, and k-NN is the worst. The k-NN method classifies the majority class acceptably but performs worst on the minority class, owing to the relatively small number of abnormal samples and the relatively large noise, whereas recognizing the minority class is the more important task in financial distress prediction. Therefore, the MTS-CBPSO algorithm has the best effect on the financial distress prediction of listed companies, while MTS, SVM, C4.5, and k-NN are relatively weak.

Compared with previous studies, the contributions of this paper are as follows. (1) Based on the traditional MTS, the BPSO algorithm replaces the orthogonal array and SNR method for optimizing the variables, and chaotic mapping is further introduced into the BPSO algorithm. Owing to the fast convergence of the BPSO algorithm and the global ergodicity of the chaotic map, the algorithm can effectively escape local optima, so the optimization accuracy is improved and more stable. (2) According to the characteristics of imbalanced data, G-means, F-measure, and dimensionality reduction are used as classification metrics instead of the overall classification accuracy. (3) The MTS-CBPSO method is applied to predict the financial distress of Chinese listed companies based on 38 financial metrics. The results show that MTS-CBPSO performs better than the traditional MTS and other commonly used classification methods, and is more suitable for imbalanced data.

However, the financial data used here are derived from annual reports, while data from quarterly and semi-annual reports are not used. Such data may be more timely for financial distress prediction, and using them is a direction for follow-up research.

She received M.S. and Ph.D. degrees from Nanjing University of Science and Technology, Nanjing, China, in 2004 and 2019, respectively. She is currently a teacher in the School of Management Science and Engineering, Anhui University of Finance and Economics. Her current research interests include pattern recognition and data mining.

He received M.S. and Ph.D. degrees from Nanjing University of Science and Technology, Nanjing, China, in 1989 and 1998, respectively. He is currently a professor in the School of Economics and Management, Nanjing University of Science and Technology. His research interests include data mining and management decision-making.

- 1 S. Maldonado, J. Lopez, "Imbalanced data classification using second-order cone programming support vector machines," *Pattern Recognition*, vol. 47, no. 5, pp. 2070-2079, 2014. doi:10.1016/j.patcog.2013.11.021
- 2 G. Taguchi, R. Jugulum, *The Mahalanobis-Taguchi Strategy: A Pattern Technology System*, New York, NY: John Wiley & Sons, 2002.
- 3 P. Shakya, M. S. Kulkarni, A. K. Darpe, "Bearing diagnosis based on Mahalanobis–Taguchi–Gram–Schmidt method," *Journal of Sound and Vibration*, vol. 337, pp. 342-362, 2015. doi:10.1016/j.jsv.2014.10.034
- 4 B. John, "Application of Mahalanobis-Taguchi system and design of experiments to reduce the field failures of splined shafts," *International Journal of Quality & Reliability Management*, vol. 31, no. 6, pp. 681-697, 2014. doi:10.1108/ijqrm-10-2012-0134
- 5 X. Jin, T. W. Chow, "Anomaly detection of cooling fan and fault classification of induction motor using Mahalanobis–Taguchi system," *Expert Systems with Applications*, vol. 40, no. 15, pp. 5787-5795, 2013. doi:10.1016/j.eswa.2013.04.024
- 6 B. Valarmathi, V. Palanisamy, "Opinion mining of customer reviews using Mahalanobis-Taguchi system," *European Journal of Scientific Research*, vol. 62, no. 1, pp. 95-100, 2011.
- 7 S. E. Abbasi, A. Aaghaie, M. Fazlali, "Applying Mahalanobis–Tagouchi system in detection of high risk customers: a case-based study in an insurance company," *Journal of Industrial Engineering*, vol. 45, no. 2, pp. 1-12, 2011.
- 8 C. L. Huang, Y. H. Chen, T. L. J. Wan, "The Mahalanobis Taguchi system: adaptive resonance theory neural network algorithm for dynamic product designs," *Journal of Information and Optimization Sciences*, vol. 33, no. 6, pp. 623-635, 2012.
- 9 W. H. Woodall, R. Koudelik, K. L. Tsui, S. B. Kim, Z. G. Stoumbos, C. P. Carvounis, "A review and analysis of the Mahalanobis-Taguchi system," *Technometrics*, vol. 45, no. 1, pp. 1-15, 2003.
- 10 R. Jugulum, G. Taguchi, S. Taguchi, J. O. Wilkins, "Discussion: a review and analysis of the Mahalanobis-Taguchi system," *Technometrics*, vol. 45, no. 1, pp. 16-21, 2003.
- 11 M. Sokolova, G. Lapalme, "A systematic analysis of performance measures for classification tasks," *Information Processing & Management*, vol. 45, no. 4, pp. 427-437, 2009. doi:10.1016/j.ipm.2009.03.002
- 12 R. Eberhart, J. Kennedy, "A new optimizer using particle swarm theory," in *Proceedings of the 6th International Symposium on Micro Machine and Human Science*, Nagoya, Japan, 1995, pp. 39-43.
- 13 J. Kennedy, R. C. Eberhart, "A discrete binary version of the particle swarm algorithm," in *Proceedings of 1997 IEEE International Conference on Systems, Man, and Cybernetics: Computational Cybernetics and Simulation*, Orlando, FL, 1997, pp. 4104-4108.
- 14 L. Y. Chuang, C. H. Yang, J. C. Li, "Chaotic maps based on binary particle swarm optimization for feature selection," *Applied Soft Computing*, vol. 11, no. 1, pp. 239-248, 2011. doi:10.1016/j.asoc.2009.11.014
- 15 X. Xu, Z. Xiao, "Soft set theory oriented forecast combination method for business failure prediction," *Journal of Information Processing Systems*, vol. 12, no. 1, pp. 109-128, 2016. doi:10.3745/JIPS.04.0016
- 16 R. Geng, I. Bose, X. Chen, "Prediction of financial distress: an empirical study of listed Chinese companies using data mining," *European Journal of Operational Research*, vol. 241, no. 1, pp. 236-247, 2015. doi:10.1016/j.ejor.2014.08.016