## Miao-Miao Liu* , Qing-Cui Hu* , Jing-Feng Guo** and Jing Chen## |

Dataset | |V| | |E| | Percentage of edges (%) | |
---|---|---|---|---|

Positive | Negative | |||

Epinions | 131828 | 840799 | 85.00 | 15.00 |

Slashdot | 82144 | 549202 | 77.40 | 22.60 |

Wikipedia | 138592 | 740106 | 78.70 | 21.30 |

FEC | 28 | 42 | 71.40 | 28.60 |

GGS | 16 | 58 | 50.00 | 50.00 |

CRA | 36 | 74 | 93.20 | 6.800 |

The classical index AUC [2] is suitable for the evaluation of link prediction accuracy in traditional social networks with only positive links. However, the total similarity score between two nodes in the TLG algorithm may be positive or negative. Its absolute value represents the probability of the two nodes to establish a link, and it is positive or negative represents the sign type of the predicted link. Therefore, in our experiments, the classical AUC index is adjusted to obtain a new index, written as AUC’, which can correctly evaluate the link prediction results in signed networks. Its definition is shown as follows. In the experiment, the total similarity of each node pair corresponding to the edge selected randomly from the test set and the edge selected from the non-existent edge-set is calculated. And the comparison is done only when the signs of the two similarity values are the same. If the absolute value of the total similarity between the two nodes corresponding to the edge selected from the test set is greater than that of the edge selected from the nonexistent edge-set, [TeX:] $$\bar{n}^{\prime}$$' will plus 1. If the absolute values of the two are equal, [TeX:] $\bar{n}^{\prime \prime}$$$ will plus 1. If the signs of the two are different, the calculation will be canceled, and we will choose edges again. In each group of experiments, it should be ensured that there are 20,000 times that the signs of the total similarity value corresponding to the two edges selected are the same. That is to say, the value of [TeX:] $$\bar{n}$$ is 20,000.

For the sign prediction in signed networks, a series of evaluation indexes can be obtained, such as TP, FP, TN, FN, Recall, Precision, Accuracy, F1-score, etc. The sign prediction in signed networks needs to evaluate the comprehensive prediction accuracy of positive and negative signs. Relevant researches have shown that in most real signed networks the ratio of the number of positive links to the number of negative links is greater than 3:1. That is to say, the probability that positive links are selected in the experiment is much higher than that of negative links. Therefore, we improved the index of Accuracy, and the index Accuracy' is obtained by adjusting the above indexes to give a weight of 1 for sign prediction results of positive edges and a weight of 0.5 for sign prediction results of negative edges, which is used as the comprehensive evaluation index, as follows.

(1) Analysis of link prediction accuracy: The experimental results of the link prediction accuracy of the TLG algorithm based on the AUC′ evaluation index are shown in Fig. 4. Prediction accuracy and the average value of ten independent experiments which is abbreviated as AVG are obtained. The number of links that are effectively predicted and compared in each independent experiment is 20,000, namely, [TeX:] $$\bar{n}$$=20,000 . As can be seen from Fig. 4, the TLG algorithm shows high link prediction accuracy for the three large real signed networks and the small simulation network CRA (as shown in Figs. 5 and 6). These four datasets are signed networks with conventional topology in which the number of positive links is much larger than that of negative links and the degree distribution of nodes has no particularity.

For the GGS network, the accuracy of the proposed algorithm is lower than that of the first four datasets. This network describes the real political alliance and enmity among 16 sub-tribes as shown in Fig. 7. The proportion of positive links and negative links is 1:1, and the degree distribution of 16 nodes is shown in Fig. 8. For the dataset with the same number of positive and negative links, the prediction accuracy of the TLG algorithm can still reach 67%, which shows its good robustness.

For the FEC network, the accuracy of the TLG algorithm remains 0.5. The topology and node degree distribution of the dataset are shown in Figs. 9 and 10. It can be seen that 28 nodes are divided into two types with the same degree distribution. When calculating AUC′, the topological structure of the links obtained from the test set and the nonexistent edge set is almost the same in most cases. In other words, the probability that the degree distribution of the node pairs corresponding to these two links is not the same is [TeX:] $$\mathrm{C}_{24}^{2} \mathrm{C}_{4}^{2} / \mathrm{C}_{28}^{2} \mathrm{C}_{26}^{2}$$, which equals 0.0135. Namely, [TeX:] $$\bar{n}^{\prime} \approx 0$$ and [TeX:] $$\bar{n}^{\prime \prime} \approx \bar{n}$$. Therefore, the value of AUC′ should be 0.5. The experimental results also verify the correctness of the TLG algorithm.

(2) Analysis of sign prediction accuracy: Similarly, the sign prediction accuracy of the proposed algorithm is verified on six datasets. The experimental results are shown in Table 2 and Fig. 11.

Table 2.

Epinions | Slashdot | Wikipedia | FEC | GGS | CRA | |
---|---|---|---|---|---|---|

TP(+/+) | 0.8218 | 0.6867 | 0.8659 | 0.6000 | 0.5000 | 0.8750 |

FP(+/–) | 0.0623 | 0.0989 | 0.0617 | 0.2000 | 0.1667 | 0.0000 |

TN(–/v) | 0.1048 | 0.1511 | 0.0593 | 0.0000 | 0.3333 | 0.0000 |

FN(–/+) | 0.0111 | 0.0633 | 0.0131 | 0.2000 | 0.0000 | 0.1250 |

Recall | 0.6533 | 0.9156 | 0.9851 | 0.7500 | 1.0000 | 0.8750 |

Precision | 0.9295 | 0.8741 | 0.9335 | 0.7500 | 0.7500 | 1.0000 |

F1-score | 0.9573 | 0.8944 | 0.9586 | 0.7500 | 0.8571 | 0.9333 |

Accuracy | 0.9266 | 0.8378 | 0.9252 | 0.6000 | 0.8888 | 0.8750 |

(3) Comparison with other algorithms: Experimental comparisons are carried out with the CN-Predict, ICN-Predict and the PSNBS algorithm. The AUC′, Accuracy′, and AUC [1] are used as evaluation indexes. Results are shown in Fig. 12, Fig. 13, and Fig. 14, respectively. Note that the results of the PSNBS algorithm are the highest accuracy when the step size influence factor [TeX:] $$\lambda$$ is assigned the optimal value. Results show that the TLG algorithm does not need to preset the influence factor, and it can achieve a good prediction effect for both conventional large-scale signed networks and small-scale signed networks with the special topology. Moreover, its accuracy is higher than other algorithms.

(4) The Top-k recommended links: According to the TLG algorithm, the Top-100 recommended links are given for the first three large-scale networks, as shown in Figs. 15–17. Meanwhile, the top-50 recommended links are also given for the two small-scale datasets, as shown in Fig. 18. In these four figures, the serial number of the node pairs and their similarities are displayed. And the prediction of positive links and negative links are respectively represented by different colors.

Besides, the recommendation results of links based on the TLG and PSNBS algorithm are compared on the two small datasets, as shown in Fig. 19. Results of the proposed algorithm are consistent with that of the PSNBS algorithm, which further verifies the correctness and effectiveness of the TLG algorithm.

A link prediction algorithm for signed networks is proposed. It effectively obtains the local and global structural features that affect the similarity of nodes and can achieve the dual goals of link prediction and sign prediction. Experimental results show its better prediction effect than other methods. However, for large-scale signed networks, how to reduce the computational complexity is a key problem. Besides, the proposed algorithm takes the static snapshot of the signed network as the research object while the real social network is dynamic. Therefore, in the next research work, perhaps we should integrate deep learning methods to mine more structural properties to complete the link prediction of signed networks.

This work was supported by the Natural Science Foundation of China (No. 42002138 and 61871465), Natural Science Foundation of Heilongjiang Province (No. LH2019F042 and LH2020F003), Youth Science Foundation of Northeast Petroleum University (No. 2018QNQ-01), and Science & Technology Program of Hebei (No. 20310301D).

She was born in 1982 and she is currently a full professor and the Master’s Supervisor of Northeast Petroleum University in China. She got her doctorate in computer science and technology at Yanshan University in 2017. Her main research interests include community discovery and link prediction in social networks.

- 1 M. M. Liu, J. F. Guo, J. Chen, "Link prediction in signed networks based on the similarity and structural balance theory,"
*Journal of Information Hiding and multimedia Signal Processing*, vol. 8, no. 4, pp. 831-846, 2017.custom:[[[-]]] - 2 M. M. Liu, Q. C. Hu, J. F. Guo, J. Chen, "Survey of link prediction algorithms in signed networks,"
*Computer Science*, vol. 47, no. 2, pp. 21-30, 2020.custom:[[[-]]] - 3 D. Li, D. Shen, Y. Kou, Y. Shao, T. Nie, R. Mao, "Exploiting unlabeled ties for link prediction in incomplete signed networks," in
*Proceedings of 2019 3rd IEEE International Conference on Robotic Computing (IRC)*, Naples, Italy, 2019;pp. 538-543. custom:[[[-]]] - 4 X. Su, Y. Song, "Local labeling features and a prediction method for a signed network,"
*CAAI Transactions on Intelligent Systems*, vol. 13, no. 3, pp. 437-444, 2018.custom:[[[-]]] - 5 P. Shen, S. Liu, Y. Wang, L. Han, "Unsupervised negative link prediction in signed social networks,"
*Mathematical Problems in Engineering*, vol. 2019, no. 7348301, 2019.doi:[[[10.1155//7348301]]] - 6 N. Girdhar, S. Minz, K. K. Bharadwaj, "Link prediction in signed social networks based on fuzzy computational model of trust and distrust,"
*Soft Computing*, vol. 23, no. 22, pp. 12123-12138, 2019.custom:[[[-]]] - 7 X. Chen, J. F. Guo, X. Pan, C. Zhang, "Link prediction in signed networks based on connection degree,"
*Journal of Ambient Intelligence and Humanized Computing*, vol. 10, no. 5, pp. 1747-1757, 2019.custom:[[[-]]] - 8 X. Zhu, Y. Ma, "Sign prediction on social networks based nodal features,"
*Complexity*, vol. 2020, no. 4353567, 2020.doi:[[[10.1155//4353567]]] - 9 G. Beigi, S. Ranganath, H. Liu, "Signed link prediction with sparse data: the role of personality information," in
*Companion Proceedings of the 2019 W orld Wide W eb Conference*, San Francisco, CA, 2019;pp. 1270-1278. https://doi.org/10.1145/3308560.3316469. custom:[[[-]]] - 10 T. Derr, Z. Wang, J. Dacon, J. Tang, "Link and interaction polarity predictions in signed networks,"
*Social Network Analysis and Mining*, vol. 10, no. 18, 2020.doi:[[[10.1007/s13278-020-0630-6]]]