Article Information
Corresponding Author: Batnasan Namsrai, batnasan@num.edu.mn
Enkhtuul Bukhsuren, Dept. of Information and Computer Sciences, National University of Mongolia, enkhtuul.b@num.edu.mn
Uyanga Sambuu, Dept. of Information and Computer Sciences, National University of Mongolia, uyanga@num.edu.mn
Oyun-Erdene Namsrai, Dept. of Information and Computer Sciences, National University of Mongolia, oyun-erdene@num.edu.mn
Batnasan Namsrai, Dept. of Marketing and Trade, National University of Mongolia, batnasan@num.edu.mn
Keun Ho Ryu, Database and Bioinformatics Laboratory, School of Electrical and Computer Engineering, Chungbuk National University, Cheongju, Korea, khryu@chungbuk.ac.kr
Received: March 8 2022
Revision received: April 1 2022
Revision received: April 29 2022
Accepted: May 13 2022
Published (Print): October 31 2022
Published (Electronic): October 31 2022
1. Introduction
The volume, velocity, variety, and complexity of data are increasing rapidly owing to the dynamic development of information technology and IoT devices and to the day-to-day operational growth of all sectors, including research institutions, businesses, trade, and industry. It is becoming increasingly challenging to use traditional data analysis techniques to accurately and efficiently process the large amounts of data generated in each sector today. Traditional data analysis methods that rely heavily on manual work can no longer effectively process and analyze extensive data or support decision-making [1,2]. Data mining, artificial intelligence, and cloud computing technologies [3-5] are gradually becoming the key techniques for analyzing large amounts of data.
Financial markets have long played a crucial role in the economic and social activities of organizations [6,7]. Financial activities play a central role in the current economic development of many countries worldwide, and they contribute to the growth of the world economy. Financial markets depend on many factors [8]. Researchers working in areas such as data mining, finance, and mathematics have focused on investment in financial markets. Accordingly, several data mining algorithms have been proposed to support investors in different financial markets [9]. Financial markets are risky and sensitive, so it is critical to base long-term investment decisions on research. It is becoming increasingly challenging to deal with complex financial data using statistical methods alone. For this reason, researchers use data mining and machine learning methods to process complex financial data [10].
The banking sector accounts for 95% of the financial market in Mongolia. Although companies can obtain long-term financing from the stock market at a lower rate than bank lending, companies in Mongolia derive the majority of their investments from high-interest bank loans given the current state of development of this market. Owing to the weak development of the stock market, long-term, low-cost financing channels to support real economic growth are limited, economic growth is inaccessible, and asset valuation is underdeveloped. As a result, there is insufficient capital circulation in the economy, foreign investment in non-mining sectors is relatively low, and foreign exchange inflows are small. The World Bank publishes an annual Competitiveness Index report for 137 countries. In this report, as of 2017–2018, Mongolia was ranked 82nd in public decision-making, 123rd in corporate ethics, 128th in audit and reporting standards, 132nd in board efficiency, 131st in protection of small shareholders' rights, 114th in consumer awareness and information, and 131st in securities trading regulation, and it scored 6.8 out of a maximum of 10 for investor protection. These indicators point to a dearth of information on corporate governance, corporate ethics, and board efficiency in Mongolia, which investors need. Moreover, the majority of the stocks of listed companies are held by a few shareholders. As a result, corporate governance is underdeveloped, and investors are reluctant to invest in these companies.
We do not have enough systematic software or decision support systems to process information on listed and stock companies, provide investors with valuable and accurate information through qualitative and technical analysis, and assist them in their investment decisions. Therefore, it is imperative to develop an integrated approach to diversify stock market products, process large stock market data, and augment decision-making systems for stock portfolio selection methodology.
Markowitz's model offers a solution based on the correlation of stocks. Our proposed method aims to be more accurate than existing methodologies by using data mining techniques to support decision-makers. Investors can select a portfolio based on ranked stocks. We evaluated financial, corporate governance, and bankruptcy indicators to rank the stocks by applying the TOPSIS multicriteria decision-making method.
2. Related Work
Global stock exchange development trends have changed owing to the global economic crisis and other negative factors. Until the 1980s, stock trading was conducted traditionally; thereafter, electronic commerce made strong headway with the progressive improvement of computer technology. With the advent of computer systems, data were stored in databases, and large amounts of data were generated. The concept of "big data" emerged in the 1990s, and research has since focused on how such large amounts of data can be processed efficiently to meet human needs and help people make informed decisions.
Hsu [11] implemented a hybrid of self-organizing map (SOM) and genetic programming to predict stock prices on the Taiwan Stock Exchange, and artificial neural networks have likewise been applied to stock market prediction [12-14]. Nanda et al. [15] used k-means, SOM, and fuzzy C-means to cluster stocks listed on the Bombay Stock Exchange (BSE). Using a genetic algorithm (GA), Oh et al. [16] in 2015 proposed a portfolio optimization scheme for index fund management. Topaloglou et al. [17] worked on a dynamic stochastic programming model for global portfolio management. Ghosh and Mahanti [18] reviewed the mathematical and econometric models used in 63 research papers and studies on securities portfolio management conducted between 2009 and 2014. More than 30 algorithms were used in these works, with mathematical modelling algorithms accounting for almost 24% of the research. The Sharpe ratio was the most widely used measure of portfolio performance, appearing in 39.68% of the works.
Recent financial theories have also shown that it is important to consider investor behavior in portfolio selection. Researchers have proposed a financial risk indicator system based on similarity measurement and item clustering. They constructed a financial risk model [19] based on a clustering algorithm to classify and optimize financial risks [20,21]. In the past decade, data mining and machine learning algorithms have been used to analyze financial activities [22-25].
3. Proposed Methodology
Based on the comparative analysis of optimal risky portfolio selection research papers, and by taking into consideration Mongolia's political, economic, legal, and corporate governance indicators, we have developed the following six-step methodology.
The overall structure of the Decision Support System (DSS) developed for stock portfolio selection in the Mongolian Stock Exchange is shown in Fig. 1.
The steps and stages in the decision support system are as follows:
STEP-1. Data collection, cleaning, and portfolio creation
STEP-2. Stock clustering
STEP-3. Stock ranking
STEP-4. Stock weighting
STEP-5. Portfolio ranking
STEP-6. Selection of optimal portfolio
The main steps of our proposed methodology are described in detail as follows.
3.1 STEP-1. Data Collection, Cleaning, and Portfolio Creation
We selected records from the stock exchange information, financial statements, and activity reports of the top 20 most highly capitalized stocks traded on the Mongolian Stock Exchange from 2013 to 2017. We then refined the data for further analysis; during preprocessing, some records were duplicated and regenerated. In total, 16,018 records were used in our experiment. Furthermore, we manually extracted the required information from the companies' financial statements and corporate governance reports, using around 100 data records with 15 indicators.
3.2 STEP-2. Stock Clustering
Numerous studies have shown that k-means is appropriate for supporting investment decisions based on stock returns and risks. Proceeding from this premise, we clustered the stocks according to their return and risk using k-means clustering analysis [26].
Algorithm: Basic k-means
1. Select k points as initial centroids.
2. repeat
3. Form k clusters by assigning each point to its closest centroid.
4. Recompute the centroid of each cluster
5. until centroids do not change.
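The pseudocode above can be sketched in a few lines of Python. This is a minimal illustration rather than the system's implementation: the function name `kmeans`, the random initialization with a fixed seed, the Euclidean distance, and the (expected return, risk) pairs are all assumptions made for the example.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Basic k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # 1. Select k points as initial centroids.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # 3. Assign each point to its closest centroid (Euclidean distance).
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # 4. Recompute each centroid as the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[c])
        # 5. Stop when the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical (expected return, risk) pairs for six stocks.
stocks = [(0.010, 0.020), (0.011, 0.021), (0.012, 0.019),
          (0.050, 0.100), (0.052, 0.110), (0.049, 0.090)]
labels, centroids = kmeans(stocks, k=2)
print(labels)
```

Because the initial centroids are chosen at random, the cluster numbering can differ between seeds even when the grouping itself is the same.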
3.3 STEP-3. Stock Ranking
We calculated financial, corporate governance, and bankruptcy indicators and used them to rank the stocks with the TOPSIS multicriteria decision-making method.
Company Financial Indicators: One of the most critical factors that investors must consider before investing in a company's stock is the company's financial performance. A company's financial statements show its financial position over time and its future profits and dividends, and they contain the information needed to assess the company's prospects. As a result of our research, the following financial ratios are considered in determining the ranking of stocks: return on equity (ROE), return on assets (ROA), earnings per share (EPS), price-earnings ratio (P/E ratio), and debt ratio.
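As an illustration of the five ratios listed above, the following sketch uses their common textbook definitions. The paper does not state the exact formulas it used, so the definitions, the function name `financial_ratios`, and the input figures are assumptions.

```python
def financial_ratios(net_income, equity, total_assets, total_liabilities,
                     shares_outstanding, share_price):
    """Return the five ratios using common textbook definitions (assumed)."""
    eps = net_income / shares_outstanding
    return {
        "ROE": net_income / equity,               # return on equity
        "ROA": net_income / total_assets,         # return on assets
        "EPS": eps,                               # earnings per share
        "P/E": share_price / eps,                 # price-earnings ratio
        "Debt ratio": total_liabilities / total_assets,
    }


# Hypothetical company figures, for illustration only.
ratios = financial_ratios(net_income=200.0, equity=1000.0,
                          total_assets=2500.0, total_liabilities=1500.0,
                          shares_outstanding=100.0, share_price=30.0)
for name, value in ratios.items():
    print(f"{name}: {value:.2f}")
```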
Financial ratios that affect a company's bankruptcy: Private and corporate investors make investment decisions by forecasting a company's financial difficulties and bankruptcy. In other words, it is essential to trade before the stock price falls and to predict changes in the value of current and additional stocks in the portfolio. Tsolmon [27] developed a bankruptcy prediction model and identified four independent variables that significantly affect a company's bankruptcy: profit before interest and tax/total assets (EBITDA), equity/total assets (ETA), liabilities/equity (LE), and the logarithm of total assets (LOGTA).
Corporate governance and its indicators: In Mongolia, there is no clear understanding of what corporate governance means. In our research, we selected the following essential corporate governance factors: the size of the board, independent board of directors, institutional shareholder(s), audit committee, percentage of shareholders (more than 5%), and firm size.
The weight of the criteria influencing decision-making is highly dependent on the subjective judgment of the decision-maker. The subjective weight of an indicator reflects the decision-maker's experience, knowledge, and understanding of the issue. For this reason, a variety of subjective weighting methods have been developed. The results of these methods are not reliable, however. The CRITIC (criteria importance through intercriteria correlation) method, which is objective rather than subjective, was proposed by Diakoulaki et al. [28]. The technique uses the standard deviation of each criterion and the correlation between the indicators.
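The objective weighting of Diakoulaki et al. [28], widely known as CRITIC, can be sketched as follows. Each criterion is min-max normalized, its information content is its standard deviation multiplied by the sum of (1 - correlation) with every criterion, and the weights are these values scaled to sum to one. The function names, the `benefit` flags, and the sample data are assumptions for illustration; the sketch also assumes that no criterion is constant across the alternatives.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def critic_weights(matrix, benefit):
    """matrix: rows are alternatives, columns are criteria.
    benefit[j] is True if larger values of criterion j are better."""
    normalized = []
    for col, is_benefit in zip(zip(*matrix), benefit):
        lo, hi = min(col), max(col)
        if is_benefit:
            normalized.append([(v - lo) / (hi - lo) for v in col])
        else:
            normalized.append([(hi - v) / (hi - lo) for v in col])
    info = []
    for col_j in normalized:
        sigma = statistics.pstdev(col_j)  # contrast intensity
        conflict = sum(1 - pearson(col_j, col_k) for col_k in normalized)
        info.append(sigma * conflict)
    total = sum(info)
    return [c / total for c in info]


# Hypothetical data: four stocks, three criteria (ROE, P/E, debt ratio).
data = [[0.12, 15.0, 0.60],
        [0.08, 22.0, 0.45],
        [0.15, 9.0, 0.70],
        [0.05, 30.0, 0.30]]
weights = critic_weights(data, benefit=[True, True, False])
print([round(w, 3) for w in weights])
```

Using the population or the sample standard deviation gives the same weights, because the common scaling factor cancels when the values are normalized to sum to one.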
Hwang and Yoon [29] developed the technique for order preference by similarity to ideal solution (hereafter TOPSIS), which chooses the decision option that is closest to the positive ideal solution and farthest from the negative ideal solution.
The stocks were prioritized using the TOPSIS multicriteria decision-making method. The TOPSIS steps are as follows:
Step 1. Develop a decision matrix and the normalized decision matrix (R);
Step 2. Calculate the weighted (R) matrix;
Step 3. Determine the best and the worst solutions;
Step 4. Compute the distance of each alternative from the positive ideal solution and the distance of each alternative from the negative ideal solution; and,
Step 5. Calculate the similarity to the worst condition.
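The five steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: vector normalization is used in Step 1, the `benefit` flags distinguish criteria to maximize from criteria to minimize, and the function name `topsis` and the sample data are hypothetical.

```python
import math


def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] for each alternative (row)."""
    n_crit = len(weights)
    # Step 1: vector-normalize each criterion column.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix))
             for j in range(n_crit)]
    # Step 2: weighted normalized matrix.
    v = [tuple(weights[j] * row[j] / norms[j] for j in range(n_crit))
         for row in matrix]
    # Step 3: best and worst value of every criterion.
    columns = list(zip(*v))
    best = tuple(max(c) if b else min(c) for c, b in zip(columns, benefit))
    worst = tuple(min(c) if b else max(c) for c, b in zip(columns, benefit))
    scores = []
    for row in v:
        # Step 4: distances to the positive and negative ideal solutions.
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        # Step 5: similarity to the worst condition (higher is better).
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical data: three stocks, two criteria (ROE: benefit, debt: cost).
scores = topsis([[3.0, 1.0], [2.0, 2.0], [1.0, 3.0]],
                weights=[0.5, 0.5], benefit=[True, False])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

Alternatives are ranked in descending order of the score; the sketch does not handle the degenerate case in which all alternatives are identical.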
3.4 STEP-4. Stock Weighting
Markowitz [30,31] developed portfolio selection theory under uncertainty, expressed mathematically the difference between the risk of individual assets and that of a portfolio, and proved that the risk of a portfolio depends on the covariance of the assets in it. The Markowitz model seeks to maximize the portfolio's expected return and minimize its risk.
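The key formula for maximization of the expected rate of return of the portfolio is:
$$E\left(R_p\right)=\sum_{i=1}^n \omega_i E\left(R_i\right)$$
where $$\omega_i$$ must satisfy the following constraint:
$$\sum_{i=1}^n \omega_i=1$$
The risk of the portfolio is measured by its variance:
$$\sigma_p^2=\sum_{i=1}^n \sum_{j=1}^n \omega_i \omega_j \operatorname{cov}\left(R_i, R_j\right)$$
where $$\omega_i$$ is the weight of asset i in the portfolio, $$E\left(R_i\right)$$ is the expected return of asset i in the portfolio, and $$\operatorname{cov}\left(R_i, R_j\right)$$ is the covariance between assets i and j.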
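As a small worked example of these quantities, the sketch below computes the expected return of a portfolio as the weighted sum of the asset returns and its risk as the variance obtained from the covariance matrix. The function names and the two-asset figures are hypothetical.

```python
def portfolio_return(weights, expected_returns):
    """Weighted sum of the expected asset returns."""
    return sum(w * r for w, r in zip(weights, expected_returns))


def portfolio_variance(weights, cov):
    """Double sum of w_i * w_j * cov(R_i, R_j) over all asset pairs."""
    n = len(weights)
    return sum(weights[i] * weights[j] * cov[i][j]
               for i in range(n) for j in range(n))


# Hypothetical two-asset example with equal weights that sum to 1.
w = [0.5, 0.5]
mu = [0.10, 0.20]
cov = [[0.04, 0.01],
       [0.01, 0.09]]
exp_ret = portfolio_return(w, mu)
variance = portfolio_variance(w, cov)
print(round(exp_ret, 4), round(variance, 4))
```

Here the variance is 0.25 x 0.04 + 2 x 0.25 x 0.01 + 0.25 x 0.09 = 0.0375, which is lower than the average of the two individual variances because the covariance between the assets is small.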
3.5 STEP-5. Portfolio Ranking
The portfolio's ranking considers the weight of the stock in the portfolio, its return and risk, the portfolio beta, the Sharpe ratio, and the Treynor ratio.
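The ranking measures named above can be illustrated with their standard definitions: beta is the covariance of the portfolio with the market divided by the market variance, the Sharpe ratio is the excess return per unit of total risk, and the Treynor ratio is the excess return per unit of beta. The paper does not give its exact formulas, so these definitions, the function names, and the figures below are assumptions.

```python
import statistics


def beta(asset_returns, market_returns):
    """cov(asset, market) / var(market), using sample statistics."""
    mean_a = statistics.fmean(asset_returns)
    mean_m = statistics.fmean(market_returns)
    n = len(market_returns)
    cov = sum((a - mean_a) * (m - mean_m)
              for a, m in zip(asset_returns, market_returns)) / (n - 1)
    return cov / statistics.variance(market_returns)


def sharpe_ratio(portfolio_return, risk_free_rate, portfolio_std):
    """Excess return per unit of total risk (standard deviation)."""
    return (portfolio_return - risk_free_rate) / portfolio_std


def treynor_ratio(portfolio_return, risk_free_rate, portfolio_beta):
    """Excess return per unit of systematic risk (beta)."""
    return (portfolio_return - risk_free_rate) / portfolio_beta


# Hypothetical return series: the portfolio moves twice as much as the market.
market = [0.01, -0.02, 0.03, 0.00]
portfolio = [2 * m for m in market]
b = beta(portfolio, market)
sharpe = sharpe_ratio(0.12, 0.02, 0.20)
treynor = treynor_ratio(0.12, 0.02, 1.25)
print(round(b, 2), round(sharpe, 2), round(treynor, 2))
```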
The web-based decision support system automates the creation of an optimal risky portfolio. The software module shows the ranked portfolios to the user.
3.6 STEP-6. Selection of Optimal Portfolio
We solved the risk minimization problem at the desired return level and established a threshold for the stock return curve. An investor can select any of the appropriate portfolios in accordance with their sentiment, decide how to handle portfolio management, and react to market conditions.
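To make the risk minimization at a desired return level concrete, the sketch below searches a grid of long-only weights and keeps the portfolio with the lowest variance among those that reach the target return. The brute-force grid search is a stand-in chosen for brevity, not the solver used in the system; the function name `min_risk_portfolio`, the step size, the no-short-selling assumption, and the data are all hypothetical.

```python
from itertools import product


def min_risk_portfolio(expected_returns, cov, target_return, step=0.05):
    """Grid search over long-only weights that sum to 1.
    Returns (weights, variance, expected_return), or None if no grid
    point reaches the target return."""
    n = len(expected_returns)
    ticks = round(1 / step)
    best = None
    for combo in product(range(ticks + 1), repeat=n - 1):
        if sum(combo) > ticks:
            continue  # the last weight would be negative
        w = [c / ticks for c in combo]
        w.append((ticks - sum(combo)) / ticks)
        ret = sum(wi * ri for wi, ri in zip(w, expected_returns))
        if ret < target_return - 1e-12:
            continue  # below the desired return level
        var = sum(w[i] * w[j] * cov[i][j]
                  for i in range(n) for j in range(n))
        if best is None or var < best[1]:
            best = (w, var, ret)
    return best


# Hypothetical two-asset example with uncorrelated returns.
mu = [0.05, 0.15]
cov = [[0.01, 0.00],
       [0.00, 0.09]]
result = min_risk_portfolio(mu, cov, target_return=0.10)
print(result)
```

Repeating the search for a range of target returns traces an approximation of the efficient frontier; a quadratic programming solver would be the usual choice for more than a handful of stocks.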
4. Experimental Results
We developed a web-based decision support system called "Decision support system for stock investment" using PHP. The system comprises nine modules totaling 650 lines of code. As a database management system, we used MySQL. Our database contains nine tables with 16,018 records.
According to industry classification, we classified the companies listed on the Mongolian Stock Exchange into five sectors (production; construction and transportation; mining; agriculture; and trade and services) and 14 sub-sectors.
Fig. 2 shows the system settings window with the following tabs: sector, sub-sectors, companies, financial indicators, and market index. Each tab supports update, delete, insert, and select operations. We calculated the financial ratios from the companies' financial statements and took the governance indicators from the companies' operational reports. The system allows users to enter data manually or import it into the database.
Results of stock ranking using the TOPSIS multicriteria decision-making method
The stability of a joint-stock company and its sound financial performance are indicators of a suitable governance mechanism. The weights of the factors influencing the stock ranking were calculated using the CRITIC method [28].
The weights of the decision-making indicators for each sector are shown in Table 1. The most significant ratio for decision-making is the P/E ratio. Financial and governance indicators for 2013–2017 are considered for each sector. The stocks were ranked by the TOPSIS method. The following company rankings were obtained based on each sector's 2013–2017 financial and governance indicators.
Table 2 shows the following information: based on the 2017 financial ratios and governance indicators, Arig Gal JSC was ranked highest among production companies. Hermes JSC was ranked highest among construction companies, Baganuur JSC was highest among mining companies, and BDSec JSC was highest among the trade and service companies.
Clustering by the k-means method
Creating a portfolio from the highest-ranked stocks of each cluster supports the most accurate decision-making. Based on the historical stock trading data of the top 20 listed companies from January 1, 2013, to December 31, 2017, we created six clusters using the k-means method, considering each stock's expected return and expected average risk.
Fig. 3 shows a screenshot of the k-means clustering. Each cluster includes stocks of companies with similar financial characteristics. The number of clusters chosen by the investor should be less than the number of companies registered in the web system. Within each cluster, the stocks are listed in descending order of rank. The system allows investors to create an optimal portfolio by selecting one stock from each cluster according to their sentiment.
Table 1. Weight of decision-making indicators for each sector in 2016–2017
Table 2. Stock ranking calculated by the TOPSIS method
Fig. 3. Clustering by the k-means method.
Markowitz's basic model calculates a portfolio's expected average return and risk when the investor assigns equal weights to the stocks included in the portfolio. The highest-ranked stocks appear at the beginning of each cluster, for example, BDSec and Baganuur from Cluster 2, Makhimpex from Cluster 4, Darkhan Nekhii from Cluster 5, and APU from Cluster 6.
The following results were obtained using the Markowitz model to calculate the expected average return and risk of a portfolio in which each of the five stocks has the same weight of 20%:
Expected return: -0.00015
Expected risk: 0.00043
In the next step, we solved the return maximization problem for these five stocks at the risk level of 0.00043 to estimate the optimal weight of each stock, as shown in Fig. 4.
Fig. 4. Weights of stocks in the optimal portfolio with k-means clustering.
Table 3 shows the expected average return, risk, and stock weights of the portfolios on the efficient frontier.
An investor can select the appropriate portfolio in accordance with his/her sentiment, decide on portfolio management behavior and react to market conditions.
Table 3. Portfolio options offered by the system
5. Conclusion
In our study, we proposed a six-step decision support system for selecting an optimal portfolio of stocks suited to the conditions of Mongolia. This model is based on data mining clustering techniques that reflect the impact of political, economic, legal, and corporate governance factors in Mongolia. We have developed a web-based decision support information system for financial intermediaries and investors, which offers optimal risky portfolio options.
As a dataset, we selected the stock exchange information, financial statements, and activity reports of the top 20 most highly capitalized stocks traded on the Mongolian Stock Exchange from 2013 to 2017. Based on the studies conducted, we identified the 15 financial and non-financial indicators that have the most significant impact on the ranking of Mongolian joint-stock companies. The weights of the decision-making factors are calculated using the CRITIC method [28], which is advantageous because it avoids subjective judgments. The k-means clustering method is used to cluster stocks by return and risk. The stock weights within the optimal portfolio and the portfolio return and risk were estimated using the Markowitz method. In addition to the portfolios offered by the system, investors can choose a portfolio that suits their behavior. The system is valuable because it allows investors to build the most profitable portfolio according to their economic situation, behavior, and risk tolerance. With the support of the system that we have developed, investors can make decisions based on factual information and calculations rather than intuition.
As a result of this research, we conclude that creating a portfolio by clustering stocks is an efficient method.
In the future, we plan to conduct research into the following: (1) Discover association rules on stock sales by applying the association analysis method; (2) Predict stock selection based on investor's behavior; (3) Identify the association between investors' decisions and joint-stock company information posted on social networks; and (4) Predict future stock prices from historical trading data.
Acknowledgement
The research has received funding from the National University of Mongolia (Grant No. P2019-3739).