## Chanchan Zhao*, Feng Liu* and Xiaowei Hai***

| Passenger node | Cluster | Passenger node | Cluster |
| --- | --- | --- | --- |
| S1 | 1 | S26 | 3 |
| S2 | 1 | S27 | 3 |
| S3 | 1 | S28 | 3 |
| S4 | 1 | S29 | 3 |
| S5 | 2 | S30 | 3 |
| S6 | 2 | S31 | 3 |
| S7 | 2 | S32 | 3 |
| S8 | 1 | S33 | 3 |
| S9 | 1 | S34 | 4 |
| S10 | 2 | S35 | 4 |
| S11 | 1 | S36 | 4 |
| S12 | 1 | S37 | 4 |
| S13 | 2 | S38 | 4 |
| S14 | 2 | S39 | 4 |
| S15 | 2 | S40 | 4 |
| S16 | 2 | S41 | 4 |
| S17 | 2 | S42 | 4 |
| S18 | 2 | S43 | 4 |
| S19 | 2 | S44 | 4 |
| S20 | 3 | S45 | 4 |
| S21 | 3 | S46 | 4 |
| S22 | 3 | S47 | 4 |
| S23 | 3 | S48 | 4 |
| S24 | 3 | S49 | 4 |
| S25 | 3 | S50 | 4 |

Table 2.

| Cluster | Passenger node | Sample size | Proportion (%) | Cumulative proportion (%) |
| --- | --- | --- | --- | --- |
| 1 | S1, S2, S3, S4, S8, S9, S11, S12 | 8 | 16 | 16 |
| 2 | S5, S6, S7, S10, S13, S14, S15, S16, S17, S18, S19 | 11 | 22 | 38 |
| 3 | S20, S21, S22, S23, S24, S25, S26, S27, S28, S29, S30, S31, S32, S33 | 14 | 28 | 66 |
| 4 | S34, S35, S36, S37, S38, S39, S40, S41, S42, S43, S44, S45, S46, S47, S48, S49, S50 | 17 | 34 | 100 |
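The sample sizes and proportions in this table follow directly from the node-to-cluster assignment. As a minimal sketch (the function name `cluster_summary` is an illustrative assumption, not something from the paper), the columns can be recomputed as follows:

```python
from collections import Counter
from itertools import accumulate

# Cluster labels for S1..S50, transcribed from the node-to-cluster table.
labels = (
    [1, 1, 1, 1, 2, 2, 2, 1, 1, 2, 1, 1]  # S1-S12
    + [2] * 7                              # S13-S19
    + [3] * 14                             # S20-S33
    + [4] * 17                             # S34-S50
)

def cluster_summary(labels):
    """Return (cluster, size, proportion %, cumulative %) rows sorted by cluster."""
    counts = Counter(labels)
    total = len(labels)
    clusters = sorted(counts)
    sizes = [counts[c] for c in clusters]
    props = [100 * s / total for s in sizes]
    cums = list(accumulate(props))
    return list(zip(clusters, sizes, props, cums))

summary = cluster_summary(labels)
for row in summary:
    print(row)
```

Each printed row corresponds to one row of the table: cluster, sample size, proportion, and cumulative proportion.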

Table 3.

| Cluster | | | | |
| --- | --- | --- | --- | --- |
| 1 | 91,894 | 121 | 1,156 | 8,207 |
| 2 | 49,329 | 43 | 999 | 3,984 |
| 3 | 26,133 | 25 | 531 | 2,221 |
| 4 | 14,894 | 12 | 575 | 1,742 |

To compare the proposed clustering approach with other clustering methods, we apply the same data to two alternatives: system clustering and the traditional k-means algorithm. The comparative performance analysis demonstrates the effectiveness and rationality of the proposed approach.

### 5.3.1 Compared with system clustering

We use the system clustering method to cluster the 50 passenger nodes; the result is shown in Fig. 9. Fig. 9 shows that the sample data can be divided into three or four categories. With three categories, the first contains only S1, the second contains S2 and S3, and the third contains all remaining nodes. With four categories, the first contains only S1, the second contains S2 and S3, the third contains S4, S5, S6, S7, and S8, and the fourth contains all remaining nodes.
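The behaviour described here, where the largest node is isolated first and the number of categories depends on where the tree is cut, can be illustrated with a small sketch. It assumes that "system clustering" means agglomerative hierarchical clustering with average linkage (the text does not state the linkage), and the one-dimensional `values` are hypothetical, not the paper's data.

```python
def agglomerate(values, k):
    """Merge the two closest clusters (average linkage) until k clusters remain."""
    clusters = [[v] for v in values]

    def linkage(a, b):
        # Average pairwise distance between members of two clusters.
        return sum(abs(x - y) for x in a for y in b) / (len(a) * len(b))

    while len(clusters) > k:
        pairs = [
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical, strongly skewed indicator values for seven nodes.
values = [100.0, 60.0, 56.0, 15.0, 11.0, 9.0, 8.0]

three = agglomerate(values, 3)
four = agglomerate(values, 4)
print(three)
print(four)
```

In this toy case the dominant value forms a singleton cluster for both choices of `k`, and nothing in the procedure itself says whether three or four categories is the better cut.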

With system clustering, the final number of clusters must be chosen manually, so the analyst's personal experience can significantly influence the results. In contrast, the DBI used in the clustering method proposed in this paper directly yields the optimal number of clusters, which is 4. Furthermore, the system clustering algorithm is based on regression analysis, which is essentially linear correlation analysis, so some errors are inevitable. The clustering method of this paper is based on classifying the whole sample data, which reflects the essence of the problem more faithfully. In conclusion, our proposed approach is superior to system clustering.
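To make the role of the DBI concrete, the sketch below computes the Davies-Bouldin index for several candidate partitions and picks the number of clusters with the lowest value. The data and the candidate partitions are hypothetical, and the partitions are written by hand rather than produced by SOM and k-means, so this only illustrates the selection criterion, not the paper's full pipeline.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters (each a list of points).

    Lower values indicate compact, well-separated clusters.
    """
    cents = [centroid(c) for c in clusters]
    # S_i: mean Euclidean distance from members of cluster i to its centroid.
    scatter = [
        sum(math.dist(p, cents[i]) for p in c) / len(c)
        for i, c in enumerate(clusters)
    ]
    k = len(clusters)
    worst = []
    for i in range(k):
        # R_ij = (S_i + S_j) / d(c_i, c_j); keep the largest ratio for cluster i.
        worst.append(max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in range(k) if j != i
        ))
    return sum(worst) / k

# Hypothetical 1-D data forming four tight groups.
groups = [
    [(0.0,), (2.0,)],
    [(10.0,), (12.0,)],
    [(20.0,), (22.0,)],
    [(30.0,), (32.0,)],
]
# Hand-made candidate partitions for k = 2, 3, 4.
candidates = {
    2: [groups[0] + groups[1], groups[2] + groups[3]],
    3: [groups[0], groups[1], groups[2] + groups[3]],
    4: groups,
}
dbi = {k: davies_bouldin(parts) for k, parts in candidates.items()}
best_k = min(dbi, key=dbi.get)
print(dbi, best_k)
```

Because the index is computed from the data alone, the choice of `best_k` does not depend on the analyst's judgment, which is the advantage the text attributes to the DBI.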

### 5.3.2 Compared with traditional k-means algorithm

We use the traditional k-means algorithm to cluster the 50 passenger nodes; the results are shown in Tables 4 and 5.

Table 4.

| Cluster | Number of passenger nodes |
| --- | --- |
| Cluster 1 | 1.000 |
| Cluster 2 | 2.000 |
| Cluster 3 | 31.000 |
| Cluster 4 | 16.000 |
| Effective | 50.000 |
| Defective | 0.000 |

The clustering results in Tables 4 and 5 are not satisfactory. The traditional k-means algorithm places almost all of the 50 passenger nodes into two clusters, the third and the fourth, which together account for 94% of the total. In addition, the number of clusters in traditional k-means must be entered manually, whereas in this paper it is determined by the DBI. This avoids the situation in which the k-means algorithm, by only minimizing its objective function, ends in a local optimum. In conclusion, our proposed approach is superior to the traditional k-means algorithm.
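The local-optimum issue mentioned above can be reproduced on a toy example. The following is a minimal sketch of Lloyd's algorithm on hypothetical one-dimensional data (not the passenger node data); the function name `kmeans_1d` and both sets of initial centroids are assumptions chosen for illustration.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Lloyd's algorithm on 1-D data from the given initial centroids.

    Returns (final centroids, sum of squared errors).
    """
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        groups = [[] for _ in centroids]
        for x in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))
            groups[nearest].append(x)
        # Update step: move each centroid to the mean of its group
        # (an empty group keeps its previous centroid).
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
        if updated == centroids:
            break
        centroids = updated
    sse = sum(min((x - c) ** 2 for c in centroids) for x in points)
    return centroids, sse

points = [0.0, 1.0, 10.0, 11.0, 20.0, 21.0]

# Well-spread initial centroids recover the three natural pairs.
good_centroids, good_sse = kmeans_1d(points, [0.0, 10.0, 20.0])
# Poorly chosen initial centroids converge to a much worse solution.
bad_centroids, bad_sse = kmeans_1d(points, [0.0, 1.0, 15.0])

print(good_centroids, good_sse)
print(bad_centroids, bad_sse)
```

Both runs converge, but the second one stops at a partition with a far larger sum of squared errors. It also produces two single-point clusters and one large cluster, an imbalance loosely similar to the one seen in Table 4.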

Table 5.

| | Passenger node | Cluster | Distance | | Passenger node | Cluster | Distance |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | S1 | 1 | 0.000 | 26 | S26 | 3 | 5579.675 |
| 2 | S2 | 2 | 3206.297 | 27 | S27 | 3 | 5680.704 |
| 3 | S3 | 2 | 3206.297 | 28 | S28 | 3 | 5110.310 |
| 4 | S4 | 4 | 33440.595 | 29 | S29 | 3 | 4133.863 |
| 5 | S5 | 4 | 18448.326 | 30 | S30 | 3 | 3321.081 |
| 6 | S6 | 4 | 14069.658 | 31 | S31 | 3 | 2011.108 |
| 7 | S7 | 4 | 13404.683 | 32 | S32 | 3 | 1092.909 |
| 8 | S8 | 4 | 10096.357 | 33 | S33 | 3 | 1947.974 |
| 9 | S9 | 4 | 3416.392 | 34 | S34 | 3 | 2372.474 |
| 10 | S10 | 4 | 1567.185 | 35 | S35 | 3 | 2540.183 |
| 11 | S11 | 4 | 6102.071 | 36 | S36 | 3 | 3064.050 |
| 12 | S12 | 4 | 6461.678 | 37 | S37 | 3 | 3659.759 |
| 13 | S13 | 4 | 6187.772 | 38 | S38 | 3 | 3845.323 |
| 14 | S14 | 4 | 6206.689 | 39 | S39 | 3 | 3827.404 |
| 15 | S15 | 4 | 11113.626 | 40 | S40 | 3 | 4971.550 |
| 16 | S16 | 4 | 12785.420 | 41 | S41 | 3 | 5220.099 |
| 17 | S17 | 4 | 14271.046 | 42 | S42 | 3 | 5355.394 |
| 18 | S18 | 4 | 15410.351 | 43 | S43 | 3 | 5683.571 |
| 19 | S19 | 4 | 15518.065 | 44 | S44 | 3 | 6209.003 |
| 20 | S20 | 3 | 15367.264 | 45 | S45 | 3 | 6360.139 |
| 21 | S21 | 3 | 11645.563 | 46 | S46 | 3 | 6566.281 |
| 22 | S22 | 3 | 10591.104 | 47 | S47 | 3 | 6440.974 |
| 23 | S23 | 3 | 9172.625 | 48 | S48 | 3 | 6525.897 |
| 24 | S24 | 3 | 8192.050 | 49 | S49 | 3 | 7467.720 |
| 25 | S25 | 3 | 8117.674 | 50 | S50 | 3 | 7100.199 |

A train operation plan based on a reasonable hierarchical division of passenger nodes can better satisfy passenger flow and strengthen the competitive edge of passenger dedicated lines in the market. In this work, we present a new approach for the hierarchical division of passenger nodes based on SOM and the k-means algorithm. It eliminates the individual influence and the local optima, mentioned above, that exist in the traditional hierarchical division process, helps railway authorities better grasp the significance of passenger nodes, and guides them, to some extent, in compiling train operation plans and allocating transportation resources. Nevertheless, since the main purpose of this paper is to verify the rationality and effectiveness of the proposed approach, the index parameters selected for the hierarchical division of passenger nodes still need to be adjusted to the situations of specific cities. Furthermore, the clustering results should be appropriately readjusted on the basis of qualitative analyses. These problems need further study.

This work is supported by the Natural Science Foundation of Inner Mongolia (No. 2016MS0706 and 2017MS0702), the Institution of Higher Learning Science Research Project of Inner Mongolia (No. NJZY078), and the Science Research Project of Inner Mongolia University of Technology (No. ZD201522).

She received the M.E. in Computer Science from Taiyuan University of Technology in 2007. She is pursuing the Ph.D. degree in the School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China. Her current research interests include optimization theory and methods, wireless sensor networks, information fusion, and interoperability technology.

He received the Ph.D. degree from the School of Information, Renmin University of China, in 2010. He is now a professor in the School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China. His research interests include computer software, communications software, and network management.

He received the Ph.D. degree from the School of Economics and Management, Beijing Jiaotong University, in 2014. He is now an associate professor in the Management College, Inner Mongolia University of Technology, Hohhot, China. His current research interests include information management, software engineering, and high-speed railway technology.
