## Minji Seo* and Ki Yong Lee*

Group | Description | Weight range
---|---|---
1 | Graphs similar in shape to Graph A | 0.0–30.0
2 | Graphs similar in shape to Graph A | 50.0–185.0
3 | Graphs similar in shape to Graph B | 0.0–30.0
4 | Graphs similar in shape to Graph B | 50.0–185.0
5 | Graphs similar in shape to Graph C | 0.0–30.0
6 | Graphs similar in shape to Graph C | 50.0–185.0

For the real dataset, we used chemical compound data provided by PubChem [19]. PubChem is a large database provided and managed by the National Center for Biotechnology Information (NCBI), which contains structures, descriptions, and experimental results for more than 100 million chemicals. From PubChem, we selected five compounds (i.e., Benzene, Hydroxylamine, Lorazepam, Beta-lactam, and Macrolide), which are shown in Fig. 7. Then, for each compound, we selected 20 similar compounds with the help of experts. Thus, in this case, a total of five graph groups were created.

In the experiments, we evaluated whether the proposed graph embedding technique generates similar embedding vectors for similar weighted graphs and different embedding vectors for different weighted graphs. For this purpose, we conducted the experiments as follows: given groups of weighted graphs, we first generated the embedding vectors of all the graphs in the groups using a graph embedding technique to be evaluated. After that, we randomly selected a fixed number of graphs from each group. Then, for each selected graph, we extracted the k graphs with embedding vectors closest to the graph’s embedding vector among all the graphs in the dataset. We then measured the precision at k, which is the proportion of graphs belonging to the same group as the selected graph among the extracted k graphs. Therefore, the higher the value of precision at k, the better the embedding technique generates similar vectors for similar weighted graphs and different vectors for different weighted graphs. We used the cosine distance to measure the distance between two embedding vectors.
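The evaluation protocol above can be sketched in a few lines. This is a minimal illustration, not the authors' released code: for one selected graph, it ranks all other graphs by cosine distance between embedding vectors and computes the fraction of the k nearest that share the query's group.

```python
import numpy as np

def precision_at_k(query_vec, query_group, all_vecs, all_groups, k):
    """Precision at k for one query: the fraction of the k nearest
    embedding vectors (by cosine distance) that belong to the same
    group as the query. The query itself is assumed to be excluded
    from all_vecs."""
    all_vecs = np.asarray(all_vecs, dtype=float)
    # Cosine distance = 1 - cosine similarity
    norms = np.linalg.norm(all_vecs, axis=1) * np.linalg.norm(query_vec)
    sims = all_vecs @ query_vec / norms
    nearest = np.argsort(1.0 - sims)[:k]  # indices of the k smallest distances
    return np.mean(np.asarray(all_groups)[nearest] == query_group)

# Toy example: two groups of 2-D embedding vectors
vecs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
groups = np.array([0, 0, 1, 1])
p = precision_at_k(np.array([0.95, 0.05]), 0, vecs, groups, k=2)
print(p)  # both nearest vectors are from group 0 -> 1.0
```

In the experiments this computation is repeated for every selected graph and the results are averaged.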

In the experiments, we compared the proposed method with the methods proposed in [9] and [11], denoted by RNN autoencoder and Graph2vec, respectively. For the proposed method, we evaluated both the ordinal and one-hot encodings for node labels, trained the LSTM autoencoder with each of the loss functions MSE, MSE+KLD, and MSE+CCE, and used each of the mean, trimmed mean, and mode vectors to generate final embedding vectors. In Section 4.2, we present the performance of the proposed method for each of these cases. For RNN autoencoder and Graph2vec, we set all their parameters to the values used in [9] and [11], respectively.

We implemented all the methods in Python 3.7.4 using the Jupyter Notebook 6.0.1, and built an LSTM autoencoder using Keras 2.2.5 and TensorFlow GPU 1.14.0. All the experiments were performed on a PC running Ubuntu 16.04.6 LTS equipped with Intel Core i7-9700K 3.60 GHz CPU, 16 GB RAM, 1 TB HDD, and NVIDIA Titan V GPU. We released our code on GitHub at https://github.com/minkky/Graph-Embedding.

In this section, we present the performance evaluation results on the synthetic dataset. In this experiment, we first generated the embedding vectors of all the graphs in the synthetic dataset using the proposed graph embedding technique. It took about 3 hours and 40 minutes to extract node-weight sequences, train an LSTM autoencoder, and generate the final embedding vectors of all the graphs. The total number of node-weight sequences extracted from all the graphs was 4,200. We then randomly selected 10 graphs from each of the six groups in the synthetic dataset. For each selected graph, we extracted the k graphs with the closest embedding vectors, which were generated by the proposed method, and then measured precision at k. We measured precision at k by increasing k from 10 to 50. Fig. 8 shows the performance of the proposed method on the synthetic dataset. Fig. 8(a), (b), and (c) show precision at k when node labels are encoded using the ordinal encoding, while Fig. 8(d) and (e) show precision at k when node labels are encoded using the one-hot encoding. Also, Fig. 8(a) and (d), Fig. 8(b) and (e), and Fig. 8(c) show precision at k when the loss function is MSE, MSE+KLD, and MSE+CCE, respectively (Note that we do not present precision at k when node labels are encoded using the one-hot encoding and the loss function is MSE+CCE because the LSTM autoencoder failed to be trained in that case, i.e., the loss function diverged). In each subfigure of Fig. 8, we compared precision at k when using the mean, trimmed mean, and mode vectors to generate final embedding vectors.

In Fig. 8, we can see that embedding vectors generated by the proposed method always show more than 95% precision at k. This means that the proposed method correctly generates similar embedding vectors for similar weighted graphs and dissimilar embedding vectors for graphs with different structures or different weights. Therefore, we can confirm that the proposed method is very effective for embedding weighted graphs.

When comparing the ordinal and one-hot encoding, the one-hot encoding shows slightly better performance than the ordinal encoding, especially when the mean or trimmed mean vectors are used to generate final embedding vectors. This is because, when training the LSTM autoencoder with node labels encoded by the ordinal encoding, nodes with similar encoding values may be misinterpreted as similar nodes. For example, if node labels “A”, “B”, and “C” are encoded as 1, 2, and 3, respectively, by the ordinal encoding, “A” and “B” may be misinterpreted as being more similar than “A” and “C”. Thus, by being trained in this way, the LSTM autoencoder may learn wrong features from node-weight sequences. On the other hand, this problem is alleviated when we use the one-hot encoding.
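The "A"/"B"/"C" example above can be made concrete. The snippet below (the label set is illustrative) shows why ordinal encoding imposes a spurious distance ordering while one-hot encoding keeps all labels equidistant:

```python
import numpy as np

# Illustrative label set; the paper's actual node labels may differ.
labels = ["A", "B", "C"]

# Ordinal encoding: labels map to integers, which implies an unintended
# ordering -- "A" (1) looks closer to "B" (2) than to "C" (3).
ordinal = {lab: i + 1 for i, lab in enumerate(labels)}

# One-hot encoding: labels map to unit vectors, so every pair of
# distinct labels is at the same distance.
one_hot = {lab: np.eye(len(labels))[i] for i, lab in enumerate(labels)}

print(abs(ordinal["A"] - ordinal["B"]), abs(ordinal["A"] - ordinal["C"]))  # 1 2
print(np.linalg.norm(one_hot["A"] - one_hot["B"]))  # sqrt(2) for any pair
print(np.linalg.norm(one_hot["A"] - one_hot["C"]))  # same distance
```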

When comparing the three loss functions, MSE+KLD and MSE+CCE generally show better performance than MSE. This means that when training the LSTM autoencoder, it is more effective to consider not only the difference between each input and output, but also the difference between the distributions of inputs and outputs.
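The idea of the combined loss can be sketched as follows. Note that the softmax normalization used here to turn input and output sequences into distributions is our assumption for illustration; the paper's exact formulation may differ.

```python
import numpy as np

def softmax(x):
    # Turn a sequence of values into a probability distribution.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def mse(y_true, y_pred):
    # Element-wise reconstruction error.
    return np.mean((y_true - y_pred) ** 2)

def kld(p, q, eps=1e-8):
    # Kullback-Leibler divergence between two discrete distributions.
    return np.sum(p * np.log((p + eps) / (q + eps)))

def mse_plus_kld(y_true, y_pred):
    # Combined loss: reconstruction error plus the divergence between
    # the (softmax-normalized) input and output distributions.
    return mse(y_true, y_pred) + kld(softmax(y_true), softmax(y_pred))

y = np.array([0.2, 0.5, 0.3])
print(mse_plus_kld(y, y))  # 0.0 for a perfect reconstruction
```

The KLD term is zero only when the output distribution matches the input distribution, so minimizing the combined loss pushes the autoencoder to match distributions as well as individual values.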

Finally, when comparing the three final embedding vector generation methods (i.e., the mean, trimmed mean, and mode vectors), we can see that the mean and trimmed mean vectors show similarly good performance, while the mode vector shows relatively poor performance. This is because the mode vector reflects only some of the values in the encoding vectors and ignores the rest. On the other hand, the mean and trimmed mean vectors can represent the information contained in all the encoding vectors well. In particular, the trimmed mean vector sometimes shows better performance than the mean vector because it removes the effect of outlier values in the encoding vectors. Therefore, using the mean or trimmed mean vectors seems to be a more effective way to generate final embedding vectors.
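The three generation methods can be sketched as below. This is an illustrative implementation; the trim ratio and the rounding precision used for the mode are assumptions, not the paper's exact settings.

```python
import numpy as np

def final_embedding(enc_vecs, method="mean", trim_ratio=0.2):
    """Combine a graph's encoding vectors (one per node-weight
    sequence) into a single final embedding vector."""
    enc_vecs = np.asarray(enc_vecs, dtype=float)
    if method == "mean":
        return enc_vecs.mean(axis=0)
    if method == "trimmed_mean":
        # Drop the k smallest and k largest values in each dimension
        # to remove the effect of outliers, then average the rest.
        k = int(len(enc_vecs) * trim_ratio)
        s = np.sort(enc_vecs, axis=0)
        return s[k:len(enc_vecs) - k].mean(axis=0)
    if method == "mode":
        # Most frequent (rounded) value in each dimension; ignores
        # all values that are not the most frequent one.
        out = []
        for col in np.round(enc_vecs, 1).T:
            vals, counts = np.unique(col, return_counts=True)
            out.append(vals[np.argmax(counts)])
        return np.array(out)
    raise ValueError(method)

vecs = [[1.0], [1.1], [1.2], [1.3], [9.0]]   # 9.0 is an outlier
print(final_embedding(vecs, "mean"))          # pulled toward the outlier
print(final_embedding(vecs, "trimmed_mean"))  # outlier removed
```

The toy example shows the effect described above: the mean is dragged toward the outlier 9.0, while the trimmed mean discards it.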

Fig. 9(a) shows the precision at k of the proposed method, RNN autoencoder [9], and Graph2vec [11] on the synthetic dataset. For the proposed method, we used the one-hot encoding to encode node labels, MSE+KLD as the loss function, and the trimmed mean vectors to generate final embedding vectors. In Fig. 9(a), we can see that the proposed method always shows more than 96% precision at k regardless of k, while RNN autoencoder and Graph2vec show only about 50% precision at k. Since RNN autoencoder and Graph2vec only consider the structure of graphs and not the weights of edges in graphs, they generate almost similar embedding vectors for two different groups where graphs have similar shapes but have very different weights (e.g., Group 1 and Group 2 in Table 1). As a result, the embedding vectors for the two different groups are hardly distinguishable from each other. Thus, for each selected graph, its k most similar graphs contain graphs from the two different groups in almost the same proportion (i.e., 50%:50%). On the other hand, the proposed method can generate different embedding vectors for the two different groups by considering the weights of edges in graphs.

Fig. 9(b) shows the recall and area under the receiver operating characteristic curve (AUC) of the three methods on the synthetic dataset. To measure the recall, for each selected graph, we first extracted the 100 graphs with embedding vectors closest to the selected graph’s embedding vector among all the graphs in the dataset. We then computed the recall as the number of graphs belonging to the same group as the selected graph in the extracted 100 graphs, divided by the number of graphs belonging to the same group as the selected graph in the whole dataset (i.e., 100). On the other hand, the AUC measures how well the ranking of graphs in terms of their distances from the selected graph reflects the true ranking. An AUC of 1 represents a perfect ranking and 0 represents a perfect inverse ranking. From Fig. 9(b), we can see that the proposed method shows better performance than the other methods also in terms of both the recall and AUC. Therefore, we can confirm that the proposed method is effective in generating embedding vectors for weighted graphs.
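The recall and AUC computations described above can be sketched as follows. This is an illustrative version: recall divides the number of same-group graphs among the extracted graphs by the group size, and AUC is computed via the Mann-Whitney U statistic, which equals the probability that a randomly chosen same-group graph is ranked above a randomly chosen other-group graph.

```python
import numpy as np

def recall_at_n(neighbor_groups, query_group, group_size):
    # Fraction of the query's group recovered among the extracted graphs.
    return np.sum(np.asarray(neighbor_groups) == query_group) / group_size

def roc_auc(labels, scores):
    """AUC via the Mann-Whitney U statistic. labels: 1 for graphs in
    the query's group, 0 otherwise; scores: higher = ranked closer
    (e.g., negative cosine distance)."""
    labels, scores = np.asarray(labels), np.asarray(scores)
    pos, neg = scores[labels == 1], scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

labels = [1, 1, 0, 0]          # 1 = same group as the query
scores = [0.9, 0.8, 0.7, 0.1]  # e.g., negative cosine distance
print(roc_auc(labels, scores))  # 1.0: every positive outranks every negative
```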

In this section, we show the performance evaluation results on the real compound dataset. The real dataset consists of five groups of similar graphs, each of which consists of 20 chemical compounds. In this experiment, we first generated the embedding vectors of all the graphs in the real dataset using the proposed method, RNN autoencoder, and Graph2vec, respectively. In the case of the proposed method, it took about 1 hour and 40 minutes to extract node-weight sequences, train an LSTM autoencoder, and generate the final embedding vectors of all the graphs. The total number of node-weight sequences extracted from all the graphs was 650. We then randomly selected five graphs from each group and, for each selected graph, extracted the k graphs with embedding vectors closest to the selected graph’s embedding vector among all the graphs. We measured precision at k by increasing k from 1 to 15. Because there are 10 types of atoms in the real dataset, we encoded each atom using the ordinal encoding as in Table 2.

Fig. 9(c) shows the precision at k of the proposed method, RNN autoencoder, and Graph2vec on the real dataset. For the proposed method, we used MSE as the loss function and the trimmed mean vector as the final embedding vector generation method. In Fig. 9(c), the proposed method always shows more than 94% precision at k for all k values. Note that, in particular, the proposed method shows 100% precision at k for k = 1, which corresponds to finding the most similar graph. On the other hand, we can see that RNN autoencoder and Graph2vec show only about 65%–90% precision at k. We can also see that the performance of RNN autoencoder decreases significantly as k increases. This is because RNN autoencoder does not consider the weights of edges (i.e., the binding force between atoms), and thus it generates similar embedding vectors for compounds with different binding forces but similar shapes. Graph2vec also shows worse performance than the proposed method because Graph2vec only considers the degree of each node but not the weights between nodes. On the other hand, the proposed method generates more effective embedding vectors for chemical compounds by considering not only the structure of compounds but also the binding force between atoms. Fig. 9(d) shows the recall and AUC of the three methods on the real dataset. Also in this case, the proposed method shows better performance than the other methods in terms of both the recall and AUC. Thus, we can conclude that embedding vectors generated by the proposed method capture similarities between weighted graphs more effectively.

In this paper, we propose an effective, deep learning-based graph embedding technique for weighted graphs. Unlike the previous deep learning-based graph embedding techniques, which consider only the structure of graphs and not the weights of edges in graphs, the proposed method considers both the structure and weights of graphs to generate their embedding vectors. To do so, the proposed method extracts node-weight sequences from given graphs and encodes them into fixed-length vectors using an LSTM autoencoder. Finally, the proposed method combines these encoding vectors to generate the final embedding vector for each graph.
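The extraction step summarized above can be illustrated with a small sketch. This is a hypothetical implementation using random walks that interleave node labels with the weights of traversed edges; the paper's exact extraction procedure may differ.

```python
import random

def node_weight_sequences(adj, labels, n_walks=2, walk_len=3, seed=0):
    """Hypothetical node-weight sequence extraction: from each node,
    perform random walks and record the alternating sequence
    [label, weight, label, weight, ...] along the walk."""
    rng = random.Random(seed)
    seqs = []
    for start in adj:
        for _ in range(n_walks):
            node, seq = start, [labels[start]]
            for _ in range(walk_len):
                if not adj[node]:
                    break  # dead end: stop the walk early
                nxt, w = rng.choice(list(adj[node].items()))
                seq += [w, labels[nxt]]
                node = nxt
            seqs.append(seq)
    return seqs

# Toy weighted graph: adjacency dict, node -> {neighbor: edge weight}
adj = {0: {1: 1.5}, 1: {0: 1.5, 2: 2.0}, 2: {1: 2.0}}
labels = {0: "A", 1: "B", 2: "C"}
for s in node_weight_sequences(adj, labels, n_walks=1):
    print(s)  # e.g., ['A', 1.5, 'B', 2.0, 'C', 2.0, 'B']
```

Each such sequence (after encoding the labels numerically) is what the LSTM autoencoder compresses into a fixed-length encoding vector.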

In the proposed method, we used two encoding methods to encode node labels (i.e., the ordinal and one-hot encoding), three loss functions to train an LSTM autoencoder (i.e., mean-squared error, Kullback-Leibler divergence, and categorical cross entropy), and three generation methods to generate final embedding vectors (i.e., the mean, trimmed mean, and mode vectors). Through extensive experiments, we investigated the performance differences of the proposed method when each of these methods is used. The experimental results on the synthetic and real datasets show that the proposed method outperforms the existing methods significantly in generating embedding vectors for weighted graphs. Therefore, we can conclude that the proposed method can be effectively used in measuring the similarity between weighted graphs.

Compared with previous work on graph embedding, this paper makes a contribution by extending the target of graph embedding to weighted graphs for the first time. Although the main idea of the proposed method itself is mainly based on [9], we extend the node sequences used in [9] to node-weight sequences to consider the weights of edges in graphs. Furthermore, the contributions of this paper also come from the following: (1) In order to make the proposed method suitable for weighted graphs, there are many options or alternatives in each step, including the node encoding method, the loss function of the LSTM autoencoder, and the final embedding vector generation method. In this paper, we propose at least two or three strategies for each step, which constitute the originality of this paper. (2) Through extensive experiments, we comprehensively evaluate and analyze the effect of each of these strategies. As a result, we can gain insight into the effect of using each strategy. (3) We not only show the effectiveness and practical applicability of the proposed method by using both synthetic and real datasets, but also publish our code on GitHub. Therefore, this paper makes a practical contribution. In future work, we will study fast and effective embedding methods for large-scale weighted graphs, because the execution time of the proposed method increases with both the number of graphs and the number of nodes in each graph.

She received the B.S. degree from the Division of Computer Science, Sookmyung Women's University, Korea, in 2018 and the M.S. degree from the Department of Computer Science, Sookmyung Women's University, Korea, in 2020. Her research interests include databases, data mining, deep learning, and graph embedding.

He received his B.S., M.S., and Ph.D. degrees in Computer Science from KAIST, Daejeon, Republic of Korea, in 1998, 2000, and 2006, respectively. From 2006 to 2008, he worked for Samsung Electronics, Suwon, Korea, as a senior engineer. From 2008 to 2010, he was a research assistant professor in the Department of Computer Science at KAIST, Daejeon, Korea. He joined the faculty of the Division of Computer Science at Sookmyung Women's University, Seoul, in 2010, where he is currently a professor. His research interests include database systems, data mining, big data, and data streams.

- 1 M. J. J. Ghrabat, G. Ma, I. Y. Maolood, S. S. Alresheedi, Z. A. Abduliabbar, "An effective image retrieval based on optimized genetic algorithm utilized a novel SVM-based convolutional neural network classifier," *Human-centric Computing and Information Sciences*, vol. 9, no. 31, 2019.
- 2 D. Lee, J. H. Park, "Future trends of AI-based smart systems and services: challenges, opportunities, and solutions," *Journal of Information Processing Systems*, vol. 15, no. 4, pp. 717-723, 2019.
- 3 E. Gultepe, M. Makrehchi, "Improving clustering performance using independent component analysis and unsupervised feature learning," *Human-centric Computing and Information Sciences*, vol. 8, no. 25, 2018.
- 4 H. Cai, V. W. Zheng, K. C. C. Chang, "A comprehensive survey of graph embedding: problems, techniques, and applications," *IEEE Transactions on Knowledge and Data Engineering*, vol. 30, no. 9, pp. 1616-1637, 2018. doi:10.1109/TKDE.2018.2807452
- 5 P. Goyal, E. Ferrara, "Graph embedding techniques, applications, and performance: a survey," *Knowledge-Based Systems*, vol. 151, pp. 78-94, 2018. doi:10.1016/j.knosys.2018.03.022
- 6 S. V. N. Vishwanathan, N. N. Schraudolph, R. Kondor, K. M. Borgwardt, "Graph kernels," *Journal of Machine Learning Research*, vol. 11, pp. 1201-1242, 2010.
- 7 F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, G. Monfardini, "The graph neural network model," *IEEE Transactions on Neural Networks*, vol. 20, no. 1, pp. 61-80, 2009. doi:10.1109/TNN.2008.2005605
- 8 T. N. Kipf, M. Welling, "Semi-supervised classification with graph convolutional networks," in *Proceedings of the International Conference on Learning Representations (ICLR)*, Toulon, France, 2017.
- 9 A. Taheri, K. Gimpel, T. Berger-Wolf, "Learning graph representations with recurrent neural network autoencoders," in *Proceedings of the 24th ACM SIGKDD Conference on Knowledge Discovery and Data Mining: Deep Learning Day*, London, UK, 2018.
- 10 P. Yanardag, S. V. N. Vishwanathan, "Deep graph kernels," in *Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining*, Sydney, Australia, 2015, pp. 1364-1374.
- 11 A. Narayanan, M. Chandramohan, R. Venkatesan, L. Chen, Y. Liu, S. Jaiswal, "graph2vec: learning distributed representations of graphs," in *Proceedings of the 13th International Workshop on Mining and Learning with Graphs (MLG)*, Halifax, Canada, 2017.
- 12 Q. Le, T. Mikolov, "Distributed representations of sentences and documents," in *Proceedings of the International Conference on Machine Learning*, Beijing, China, 2014, pp. 1188-1196.
- 13 T. Mikolov, K. Chen, G. Corrado, J. Dean, "Efficient estimation of word representations in vector space," 2013 (Online). Available: https://arxiv.org/abs/1301.3781
- 14 M. Seo, K. Y. Lee, "A weighted graph embedding technique based on LSTM autoencoders," in *Proceedings of the Korea Software Congress (KSC)*, Pyeongchang, South Korea, 2019.
- 15 S. Hochreiter, J. Schmidhuber, "Long short-term memory," *Neural Computation*, vol. 9, no. 8, pp. 1735-1780, 1997. doi:10.1162/neco.1997.9.8.1735
- 16 M. A. Kramer, "Nonlinear principal component analysis using autoassociative neural networks," *AIChE Journal*, vol. 37, no. 2, pp. 233-243, 1991.
- 17 S. Maity, M. Abdel-Mottaleb, S. S. Asfour, "Multimodal biometrics recognition from facial video with missing modalities using deep learning," *Journal of Information Processing Systems*, vol. 16, no. 1, pp. 6-29, 2020.
- 18 A. Grover, J. Leskovec, "node2vec: scalable feature learning for networks," in *Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining*, San Francisco, CA, 2016, pp. 855-864.
- 19 PubChem: an open chemistry database at the National Institutes of Health (Online). Available: https://pubchem.ncbi.nlm.nih.gov/