Research on Chinese Microblog Sentiment Classification Based on TextCNN-BiLSTM Model

Haiqin Tang and Ruirui Zhang

Abstract

Currently, most sentiment classification models on microblogging platforms analyze sentence parts of speech and emoticons without fully comprehending users' emotional inclinations or grasping semantic nuances. This study proposes a hybrid sentiment analysis model. Given the distinct nature of microblog comments, the model employs a combined stop-word list and Word2vec for word vectorization. To mitigate local information loss, the TextCNN model, devoid of pooling layers, is employed for local feature extraction, while BiLSTM is utilized for contextual feature extraction. Subsequently, microblog comment sentiments are categorized using a classification layer. Given the binary classification task at the output layer and the numerous hidden layers within BiLSTM, the Tanh activation function is adopted in this model. Experimental findings demonstrate that the enhanced TextCNN-BiLSTM model attains a precision of 94.75%. This represents a 1.21%, 1.25%, and 1.25% improvement in precision, recall, and F1 values, respectively, over the standalone TextCNN model, and a 0.78%, 0.9%, and 0.9% improvement over BiLSTM.

Keywords: Chinese Microblog Review, Deep Learning, Sentiment Classification, TextCNN-BiLSTM

1. Introduction

In the era of networked media, a growing number of individuals turn to online platforms to gather information, voice their opinions, and express their emotions [1]. Microblogs are widely used social media that disseminate copious real-time information, so analyzing the emotional content of microblog comments holds significant practical importance for platform management and public opinion regulation [1]. Sentiment analysis primarily involves scrutinizing the content of generated text to discern its emotional polarity: positive or negative [2]. There are three principal approaches to sentiment analysis: those based on emotion dictionaries [3], machine learning [4], and deep learning [5].

The emotion dictionary-based approach necessitates the construction of an emotion lexicon. HowNet is the most prevalent Chinese emotion dictionary [6]. This lexicon constitutes a crucial emotional resource for sentiment analysis, applicable to tasks of varying granularity such as words, phrases, and attribute sentences [7]. Leveraging the HowNet and SentiWordNet dictionaries, Zhou et al. [8] deconstructed Chinese words, calculated their emotional polarity, established the SLHS Chinese dictionary, and utilized an SVM classifier to analyze emotions in microblog texts, achieving an accuracy of 83.84%. Addressing issues of lexicon scale and the scarcity of colloquial words, Zhao et al. [9] constructed a 100,000-scale emotion lexicon based on extensive microblog data, yielding a 1.13% improvement in microblog emotion classification performance. Jiang et al. [10] addressed unknown emotional words in text content by selecting HowNet as the seed and constructing a domain-specific emotion dictionary using pointwise mutual information (PMI) and Word2vec algorithms, yielding a substantial improvement in dictionary accuracy over alternatives.

Machine learning models have progressively found application in sentiment classification. Pang et al. [11] pioneered the incorporation of machine learning into emotional analysis; in experimental comparisons on film reviews, support vector machines proved the most effective, achieving an accuracy of 82.92%. Li et al. [12] introduced a multi-label maximum entropy-based machine learning model and applied it to datasets comprising Twitter, microblog, and other comments, achieving an accuracy of 86.06%. Kaur et al. [13] leveraged N-grams for feature extraction and, in conjunction with the k-nearest neighbor classification algorithm, raised accuracy to 82%. Zhang et al. [14] integrated an emotional dictionary with a machine learning model, augmenting emotional characteristics with negative words in the health domain, and applied the model to classify negative topics on microblogs, yielding an accuracy of 74.1%.

Advancements in technology have facilitated the integration of deep learning into natural language processing (NLP)-based emotion analysis. For intricate content, hybrid models prove superior to singular models on certain sentiment classification tasks. Sun et al. [15] harnessed the GloVe model for word vector training, employed the bidirectional gated recurrent unit (BiGRU) for contextual feature extraction, and incorporated an attention mechanism to achieve emotion classification; this model exhibited an accuracy of 91.21% on the IMDB dataset. Zhao et al. [16] introduced a serial hybrid network of bidirectional long short-term memory (BiLSTM) and a convolutional neural network (CNN) to comprehend lexeme information in texts, achieving an accuracy of 91.31% on a dataset encompassing review texts from six fields. Miao et al. [17] argued that the CNN-BiGRU model effectively captured static and sequential features through CNN and bidirectional GRU, thereby simplifying the feature extraction process, reducing dimensionality, and enhancing accuracy and training efficiency. Yang et al. [18] leveraged GloVe for preprocessing and incorporated an attention mechanism into BiGRU to extract critical information from texts, further extracting features through a forward attention mechanism, which improved the effectiveness of sentiment classification.

In response to the challenge that embedding models struggle to incorporate word-level sentiment information, Yan et al. [19] proposed a parallel CNN and BiLSTM-attention model for evaluating JD e-commerce review datasets, substantially enhancing overall effectiveness. Fan and Li [20] generated character and word vectors using FastText and employed a GRU neural network for sentiment classification, raising sentiment classification accuracy for networked short texts to 92%. Yang et al. [21] addressed the neglect of potent sentiment feature words and introduced an enhanced BiLSTM-CNN+Attention model based on a comprehensive emotional dictionary, proficiently extracting semantic features and elevating accuracy. Shen et al. [22] introduced a model integrating knowledge-enhanced semantic representation with a dual attention mechanism, amalgamating contextual and sentiment features to enhance sentiment analysis accuracy. Ali Al-Abyadh et al. [23] conducted a comparative study between single models and various hybrid deep learning models, revealing that the hybrid deep learning-support vector machine model achieved a sentiment analysis accuracy of 91.3%. Hu et al. [24] integrated a multilayer attention mechanism with BiGRU and a multi-granularity convolutional neural network, achieving a sentiment analysis accuracy of 92.75% on a hotel review dataset. Khan et al. [25] applied a combination of machine learning models and hybrid deep learning models to scrutinize the sentiment of Urdu-language reviews; the experiments demonstrated that models utilizing BERT pre-trained word embeddings were exceptionally effective. Wu et al. [26] engineered a sentiment classifier based on word embedding and lexical polarity analysis, implementing a two-level long short-term memory network and thereby addressing such models' dependence on high-quality training sets with high label accuracy.

The method founded on emotion dictionaries only considers the semantics of individual words, neglecting contextual semantic information. Additionally, it demands substantial time investment for constructing an emotion lexicon, imposing inherent limitations. Machine learning methods hinge on human annotation and struggle to discern deeper semantics. Although deep learning models excel in sentiment classification, there remains a need for refinement in efficiently eliminating special symbols and accurately comprehending contextual semantic information, particularly in the face of complex microblog comments, regardless of whether a singular or hybrid model is employed. In this study, a TextCNN-BiLSTM hybrid model is employed to comprehensively extract both local and contextual features, thereby enhancing the efficiency of sentiment classification for microblog comments.

2. Theoretical Basis

2.1 Word2vec Vectorization

Initially, sentiment analysis necessitates the vectorization of the target comment texts. Common vectorization methods include One-hot, Word2vec, GloVe, and BERT. Word2vec constructs word embeddings through the Skip-gram or continuous bag-of-words (CBOW) architecture. CBOW predicts the central word from the several consecutive context words preceding and following it, whereas Skip-gram predicts the context from the central word. This article employs the CBOW model for word embedding.
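
For illustration, the following minimal sketch trains CBOW embeddings (the gensim library is an assumed choice, as this article does not prescribe an implementation; the toy sentences stand in for segmented microblog comments):

```python
# Minimal CBOW sketch with gensim (assumed library choice; the paper only
# states that Word2vec/CBOW is used). Each "sentence" is a token list as
# produced by a segmenter such as Jieba.
from gensim.models import Word2Vec

segmented_comments = [
    ["今天", "天气", "真好"],        # "the weather is great today"
    ["这部", "电影", "太", "失望"],  # "this movie is so disappointing"
]

# sg=0 selects CBOW (sg=1 would select Skip-gram); vector_size=300 matches
# the embedding dimension later reported in Table 2.
model = Word2Vec(segmented_comments, vector_size=300, window=5,
                 min_count=1, sg=0, epochs=10)

print(model.wv["电影"].shape)  # (300,) -- one 300-dimensional word vector
```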

2.2 TextCNN Network Structure

CNN, a feedforward neural network with a convolution structure, comprises convolution and pooling layers, proficient at extracting local features for text classification [27]. TextCNN represents a variant of CNN that can simultaneously employ filters of varying sizes to extract features of different dimensions from the text, thereby obtaining representative features. The TextCNN model consists of convolution, pooling, and classification layers. The schematic representation of the TextCNN structure is illustrated in Fig. 1.

At the heart of TextCNN lies the convolutional layer, responsible for acquiring diverse text features through multiple convolutional kernels. The calculation formula is as follows:

(1)
[TeX:] $$h_i=f\left(w_{i(x, y)} * c_{(x, y)}+b_i\right),$$

where f denotes the activation function, such as Tanh, ReLU, or sigmoid; [TeX:] $$w_{i(x, y)}$$ signifies the weight of the filter input node (x, y) corresponding to the i-th node in the output matrix; [TeX:] $$c_{(x, y)}$$ denotes the value of node (x, y) within the filter window; and [TeX:] $$b_i$$ represents the bias corresponding to the i-th node. Local feature extraction is achieved by employing three filters with convolution kernel sizes of 1, 3, and 5, yielding the convolution-layer output [TeX:] $$h_i$$.

The pooling layer serves to reduce the dimensionality of features post-convolution and guards against overfitting. Given that BiLSTM must be integrated after TextCNN, retaining the spatial information of the text becomes imperative. Since pooling would lead to the loss of this information, it is omitted in this study.

The classification layer concatenates the features from the pooling layer into a composite vector, which the softmax classifier then uses to complete emotion classification; dropout is applied to guard against overfitting.
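
As a concrete illustration, the following PyTorch sketch (an assumed implementation; the paper publishes no code) shows the convolutional stage with kernel sizes 1, 3, and 5. In line with the design described above, pooling is omitted so that positional information survives for the downstream BiLSTM:

```python
# Sketch of the TextCNN convolutional stage without pooling (PyTorch assumed).
# Three kernel sizes extract local features of different granularities.
import torch
import torch.nn as nn

class TextCNNEncoder(nn.Module):
    def __init__(self, embed_dim=300, num_filters=200, kernel_sizes=(1, 3, 5)):
        super().__init__()
        # Conv1d expects (batch, channels, seq_len); padding keeps seq_len
        # unchanged so the three feature maps can be concatenated.
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_filters, k, padding=k // 2)
             for k in kernel_sizes]
        )
        self.act = nn.Tanh()  # Tanh, per the activation-function experiment

    def forward(self, x):              # x: (batch, seq_len, embed_dim)
        x = x.transpose(1, 2)          # -> (batch, embed_dim, seq_len)
        feats = [self.act(conv(x)) for conv in self.convs]
        return torch.cat(feats, dim=1) # (batch, 3*num_filters, seq_len)

h = TextCNNEncoder()(torch.randn(2, 50, 300))
print(h.shape)  # torch.Size([2, 600, 50]) -- spatial dimension preserved
```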

Fig. 1.
Text convolutional neural network.
2.3 BiLSTM Network Structure

The LSTM network, a type of recurrent neural network [28], captures the crucial links between words and sentences, which is essential given the diversity and complexity inherent in Chinese texts. Microblog comments encapsulate users' sentiments and perspectives expressed in varying tenses; hence, bidirectional LSTM is employed to gain a comprehensive contextual understanding.

BiLSTM comprises two LSTM networks operating in reverse, jointly determining the output of the entire network. It remedies the limitation of LSTM’s inability to encode information in a bidirectional manner. The internal structure of BiLSTM is depicted in Fig. 2.

[TeX:] $$f_t$$ represents the forgetting gate, which selects the information from the previous layer to be forgotten. In the formula below, σ denotes the activation function, [TeX:] $$W_f$$ signifies the weight of the forgetting gate, and [TeX:] $$b_f$$ represents the offset of the forgetting gate.

(2)
[TeX:] $$f_t=\sigma\left(W_f *\left[h_{t-1}, x_t\right]+b_f\right).$$

[TeX:] $$i_t$$ represents the input gate. The gated state's input comprises the previous moment's hidden layer [TeX:] $$h_{t-1}$$ and the current input word [TeX:] $$x_t.$$ At this stage, there exists a temporary cell state [TeX:] $$\tilde{c}_t.$$ The calculation formulas for [TeX:] $$i_t \text{ and } \tilde{c}_t$$ are as follows: [TeX:] $$W_i$$ represents the weight of the input gate, [TeX:] $$b_i$$ denotes the offset of the input gate, [TeX:] $$W_c$$ signifies the weight of cell information, and [TeX:] $$b_c$$ represents the offset of cell information.

Fig. 2.
The internal structure of bidirectional long short-term memory model.

(3)
[TeX:] $$i_t=\sigma\left(W_i *\left[h_{t-1}, x_t\right]+b_i\right),$$

(4)
[TeX:] $$\tilde{c}_t=\tanh \left(W_c *\left[h_{t-1}, x_t\right]+b_c\right) .$$

Following the application of the forgetting gate and the input gate, the cell state from the last moment, [TeX:] $$c_{t-1},$$ undergoes an update to yield the new cell state, [TeX:] $$c_t.$$ This can be calculated using the following formula:

(5)
[TeX:] $$c_t=f_t * c_{t-1}+i_t * \tilde{c}_t .$$

[TeX:] $$o_t$$ stands for the output gate, governing the output information of the network structure. The final output is represented by [TeX:] $$h_t$$. The formulas for [TeX:] $$h_t \text{ and } o_t$$ are as follows: [TeX:] $$W_o$$ denotes the weight of the output gate, and [TeX:] $$b_o$$ signifies the offset of the output gate.

(6)
[TeX:] $$o_t=\sigma\left(W_o *\left[h_{t-1}, x_t\right]+b_o\right),$$

(7)
[TeX:] $$h_t=o_t * \tanh \left(c_t\right) .$$
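
To make the gate computations concrete, the following numpy sketch (the weight shapes and random initialization are illustrative assumptions) implements a single LSTM time step exactly as in Eq. (2)-(7); a BiLSTM runs two such chains in opposite directions and concatenates their hidden states at each position:

```python
# One LSTM time step implementing Eq. (2)-(7); shapes are illustrative.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W_f, b_f, W_i, b_i, W_c, b_c, W_o, b_o):
    z = np.concatenate([h_prev, x_t])    # [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ z + b_f)         # Eq. (2): forgetting gate
    i_t = sigmoid(W_i @ z + b_i)         # Eq. (3): input gate
    c_tilde = np.tanh(W_c @ z + b_c)     # Eq. (4): temporary cell state
    c_t = f_t * c_prev + i_t * c_tilde   # Eq. (5): cell state update
    o_t = sigmoid(W_o @ z + b_o)         # Eq. (6): output gate
    h_t = o_t * np.tanh(c_t)             # Eq. (7): hidden state
    return h_t, c_t

hidden, inp = 4, 3                       # assumed toy sizes
rng = np.random.default_rng(0)
W = lambda: rng.standard_normal((hidden, hidden + inp)) * 0.1
b = lambda: np.zeros(hidden)
h_t, c_t = lstm_step(rng.standard_normal(inp),
                     np.zeros(hidden), np.zeros(hidden),
                     W(), b(), W(), b(), W(), b(), W(), b())
print(h_t.shape, c_t.shape)              # (4,) (4,)
```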

3. Network Structure Design

3.1 Research Process

The research process in this article unfolds as follows: first, the comment data is input. The Jieba word segmentation tool processes the input comments, and Word2vec forms the word vectors. Subsequently, the word vectors are fed into the TextCNN model for local feature extraction, while BiLSTM performs context feature extraction. Finally, sentiment polarity classification is accomplished, assigning each comment to one of two polarity classes. The sentiment analysis research process is illustrated in Fig. 3.

3.2 TextCNN-BiLSTM Network Structure

Firstly, the preprocessed microblog comments are trained into word vectors using CBOW in the Word2vec model, and these vectors serve as the input for the hybrid model. Next, TextCNN without a pooling layer extracts local features, while BiLSTM captures global features reflecting contextual information. Ultimately, a softmax layer performs the sentiment classification of microblog comments. The structure of the hybrid model is depicted in Fig. 4.
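
A sketch of this hybrid network in PyTorch follows (an assumed layout consistent with the description above; the hidden size of 128 is an illustrative choice not specified in the paper). The final linear layer produces logits, with the softmax applied by the loss function during training:

```python
# Sketch of the TextCNN-BiLSTM hybrid: Word2vec embeddings -> convolutions
# without pooling -> BiLSTM over the concatenated feature maps -> classifier.
import torch
import torch.nn as nn

class TextCNNBiLSTM(nn.Module):
    def __init__(self, embed_dim=300, num_filters=200, hidden=128,
                 num_classes=2, dropout=0.2):
        super().__init__()
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_filters, k, padding=k // 2)
             for k in (1, 3, 5)]
        )
        self.act = nn.Tanh()
        self.bilstm = nn.LSTM(3 * num_filters, hidden,
                              batch_first=True, bidirectional=True)
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(2 * hidden, num_classes)

    def forward(self, emb):                # emb: (batch, seq_len, embed_dim)
        x = emb.transpose(1, 2)
        x = torch.cat([self.act(c(x)) for c in self.convs], dim=1)
        x = x.transpose(1, 2)              # (batch, seq_len, 3*num_filters)
        out, _ = self.bilstm(x)
        feat = self.dropout(out[:, -1, :]) # last step of both directions
        return self.fc(feat)               # logits; softmax applied in loss

logits = TextCNNBiLSTM()(torch.randn(8, 50, 300))
print(logits.shape)  # torch.Size([8, 2])
```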

Fig. 3.
The research process.
Fig. 4.
The structure of the TextCNN-BiLSTM model.

4. Experimental Results and Analysis

4.1 Experimental Data

The dataset utilized in this article is derived from Weibo_senti_100K, which comprises over 100,000 microblog comments annotated with emotions; it encompasses 50,000 positive and 50,000 negative comments, designated as 1 and 0, respectively. For the experiment, 10,000 samples were selected from each of the positive and negative subsets, yielding 20,000 microblog comments in total for the model experiment. The training, testing, and validation sets were split in the proportion 8:1:1. Sample comments are illustrated in Figs. 5 and 6, and a sketch of the sampling and split follows Fig. 6.

Fig. 5.
Examples of positive comments.
Fig. 6.
Examples of negative comments.
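
The sampling and 8:1:1 split can be sketched as follows (pandas and scikit-learn are assumed tools; the file name weibo_senti_100k.csv and the label/review column names are hypothetical):

```python
# Sketch of dataset preparation: 10,000 comments per polarity, then an
# 8:1:1 train/test/validation split (file and column names are hypothetical).
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("weibo_senti_100k.csv")   # columns: label (1/0), review
sample = pd.concat([
    df[df.label == 1].sample(10000, random_state=42),
    df[df.label == 0].sample(10000, random_state=42),
])

train, rest = train_test_split(sample, test_size=0.2,
                               stratify=sample.label, random_state=42)
test, valid = train_test_split(rest, test_size=0.5,
                               stratify=rest.label, random_state=42)
print(len(train), len(test), len(valid))    # 16000 2000 2000
```
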
4.2 Data Processing

1) Text segmentation: Chinese text lacks explicit word boundaries, so segmentation methods are needed before word vectors can be trained via the pre-training model. Common Chinese word segmentation tools include Jieba, SnowNLP, NLPIR, and THULAC. This article employs Jieba as the word segmentation tool.

2) Using a stop-word list: Chinese texts often contain unnecessary high-frequency words such as conjunctions, prepositions, and modal words, which can impair the efficiency and accuracy of classification models; these words should therefore be removed during data processing. Microblog data also often contains interfering symbols and URLs. After regular expression processing, the stop-word list is applied to the data. Given the specific nature of microblog comments, this study combines the Harbin Institute of Technology (HIT) and Baidu stop-word lists and supplements them with additional specialized stop words to form an enhanced mixed stop-word list, thereby improving data processing effectiveness. The process is as follows: first, the texts are segmented using the combined Baidu and HIT stop-word list; then, based on the segmentation results, high-frequency symbols and English words are added to the combined list, resulting in a new, more comprehensive mixed stop-word list. The added stop words are detailed in Fig. 7, and a sketch of this preprocessing pipeline is given after Fig. 7.

Fig. 7.
Examples of added stop words.
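
Under the assumptions that jieba is the segmenter and that the stop-word lists are stored in the hypothetical files named below, the preprocessing pipeline can be sketched as:

```python
# Preprocessing sketch: strip URLs/@-mentions with a regular expression,
# segment with Jieba, then drop words in the mixed stop-word list.
# The stop-word file names below are hypothetical.
import re
import jieba

def load_stopwords(*paths):
    words = set()
    for path in paths:  # e.g., Baidu list, HIT list, plus added stop words
        with open(path, encoding="utf-8") as f:
            words.update(line.strip() for line in f)
    return words

stopwords = load_stopwords("baidu_stopwords.txt", "hit_stopwords.txt",
                           "added_stopwords.txt")

def preprocess(comment):
    comment = re.sub(r"https?://\S+|@\S+", "", comment)  # URLs, mentions
    return [w.strip() for w in jieba.lcut(comment)
            if w.strip() and w.strip() not in stopwords]

print(preprocess("转发微博 http://t.cn/xxxx 今天天气真好[哈哈]"))
```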

In the experimental data (Section 4.1), 30 positive and negative data samples were processed using the Jieba segmentation tool along with the newly integrated mixed stop-word list. The results of word segmentation for these examples are illustrated in Figs. 8 and 9.

Fig. 8.
The word segmentation results of positive comments in Fig. 5.
Fig. 9.
The word segmentation results of negative comments in Fig. 6.

3) Vectorization processing: after data preprocessing, the processed data undergoes vectorization, transforming the word segmentation results into the model's input vectors. In this paper, the CBOW model in Word2vec is adopted for word vectorization. Partial results after word vectorization are depicted in Fig. 10, and a sketch of this step follows Fig. 10.

Fig. 10.
The result of vectorization by continuous bag-of-word model.
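
A sketch of this vectorization step follows (the maximum length of 50 tokens is an illustrative assumption): each segmented comment is mapped through the trained CBOW model to a fixed-size matrix, with zero-padding for short comments:

```python
# Sketch: segmented tokens -> (max_len, dim) input matrix via trained CBOW
# vectors; w2v_model is a trained gensim Word2Vec model (see Section 2.1).
import numpy as np

MAX_LEN = 50  # assumed maximum comment length in tokens

def comment_to_matrix(tokens, w2v_model, dim=300, max_len=MAX_LEN):
    rows = [w2v_model.wv[t] for t in tokens[:max_len] if t in w2v_model.wv]
    mat = np.zeros((max_len, dim), dtype=np.float32)
    if rows:
        mat[:len(rows)] = np.stack(rows)  # zero-padding for short comments
    return mat                            # shape: (max_len, dim)
```
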
4.3 Experimental Environment and Evaluation Indicators

4.3.1 Experimental environment

The environmental parameters are shown in Table 1.
Table 1.
The experimental environment and environmental parameters

4.3.2 Evaluation indicators

The standard evaluation indicators for sentiment classification include accuracy rate (Acc), precision rate (P), recall rate (R), and comprehensive evaluation value (F1). The calculation formulas are as follows:

(8)
[TeX:] $$\operatorname{Acc}=\frac{T}{N},$$

(9)
[TeX:] $$\mathrm{P}=\frac{T P}{T P+F P},$$

(10)
[TeX:] $$\mathrm{R}=\frac{T P}{T P+F N},$$

(11)
[TeX:] $$\mathrm{F} 1=\frac{2 * P * R}{P+R} .$$

The quantity of correctly predicted results by the classification model is represented by T, while N denotes the total number of samples. TP signifies the number of positive class data correctly forecasted as positive, FP stands for the number of negative class data incorrectly forecasted as positive, and FN represents the number of positive class data incorrectly forecasted as negative. This paper adopts the average values of the positive and negative indicators, specifically AP, AR, and AF1, as the evaluation criteria.
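
The following minimal sketch computes Eq. (8)-(11) from a confusion matrix; AP, AR, and AF1 then average these values over the positive and negative classes (the counts shown are placeholders):

```python
# Eq. (8)-(11) from confusion-matrix counts; placeholder values for demo.
def metrics(tp, fp, fn, tn):
    acc = (tp + tn) / (tp + fp + fn + tn)  # Eq. (8): Acc = T / N
    p = tp / (tp + fp)                     # Eq. (9): precision
    r = tp / (tp + fn)                     # Eq. (10): recall
    f1 = 2 * p * r / (p + r)               # Eq. (11): F1
    return acc, p, r, f1

print(metrics(tp=950, fp=50, fn=55, tn=945))
```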

4.4 Analysis of the Experimental Results
The TextCNN-BiLSTM hybrid sentiment classification model is built on the PyTorch deep learning framework. Extensive parameter tuning tests were conducted to achieve optimal effectiveness (Table 2).
Table 2.
Model hyperparameters
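
A training-loop sketch under the Table 2 hyperparameters (batch size 32, 20 epochs, learning rate 1e-3, dropout 0.2) follows; the TextCNNBiLSTM class is the one sketched in Section 3.2, and the random tensors are placeholders for the real word-vector inputs:

```python
# Training-loop sketch with Table 2 hyperparameters; CrossEntropyLoss
# supplies the softmax stage of the classifier.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

X = torch.randn(16000, 50, 300)      # placeholder word-vector inputs
y = torch.randint(0, 2, (16000,))    # placeholder 0/1 labels
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

model = TextCNNBiLSTM(dropout=0.2)   # class sketched in Section 3.2
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

for epoch in range(20):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
```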

1) Comparison experiment for stop-word list

As a critical step in sentiment analysis, data processing directly impacts the accuracy and efficiency of deep learning model training. In Jieba word segmentation, the stop-word list is applied to eliminate high-frequency interfering vocabulary, ensuring data reliability.

Based on the same parameters of the TextCNN-BiLSTM model, experiments were conducted using different stop-word lists: Baidu stop-word list (stopwords1), HIT stop-word list (stopwords2), Baidu and HIT mixed stop-word list (stopwords3), and the new mixed stop-word list (stopwords4). The experimental results are shown in Fig. 11.

Fig. 11.
The experimental results of TextCNN-BiLSTM model with different stop-word lists.

From Fig. 11, it is evident that the single Baidu and HIT stop-word lists are inadequate for data processing. The Baidu stop-word list covers only a limited number of specific words, while the HIT stop-word list lacks English stop words. Data processing based on the combined Baidu and HIT mixed stop-word list performs significantly better. The new mixed stop-word list, built upon the combined Baidu and HIT list, demonstrates even better data processing effectiveness, particularly for the special emoticons and interfering English words in microblog comments. The accuracy rate with the new mixed stop-word list is 4.13%, 2.56%, and 0.28% higher than with the Baidu stop-word list, the HIT stop-word list, and the Baidu and HIT mixed stop-word list, respectively.

2) Comparison experiment for activation function

Common activation functions include Tanh, ReLU, and sigmoid, with ReLU being the most frequently used in such models. In this paper, while keeping other parameters constant, the activation functions of the TextCNN convolution layers and the softmax layers of the hybrid model were set to Tanh, ReLU, and sigmoid in turn. The experimental results are shown in Fig. 12.

From Fig. 12, it is evident that the model achieves the highest precision when Tanh is used as the activation function, outperforming ReLU by 0.3% and sigmoid by 0.41%. Given the numerous hidden layers in BiLSTM, using sigmoid can lead to vanishing gradients and relatively poor classification results. While ReLU mitigates the vanishing gradient problem, it leaves some neurons inactive, preventing their parameters from updating. The Tanh function handles the large number of hidden layers in the hybrid model effectively and is well suited to binary classification tasks, resulting in superior overall performance.

Fig. 12.
The experimental results of TextCNN-BiLSTM model with different activation functions.

3) Comparison experiment of the classification model

To validate the effectiveness of the proposed improved hybrid model, we conducted tests on the same dataset. Word vectors were again trained with Word2vec and used as model inputs. TextCNN, LSTM, BiLSTM, LSTM-ATT, and the other hybrid variants listed in Table 3 were employed for comparative testing. The experimental results of each model are presented in Table 3.
Table 3.
The experimental results
Upon examination of Table 3, it becomes evident that TextCNN, a variant of CNN, excels at extracting local features; however, owing to its limited ability to capture long-distance dependencies between words, its classification effectiveness is relatively modest. The LSTM and BiLSTM models demonstrate similar classification effects, yet the BiLSTM model, with its focus on context in both directions, exhibits a slightly better classification effect than the LSTM model, which considers only the preceding context. The BiLSTM-ATT model, leveraging bidirectional LSTM for context feature extraction followed by an attention mechanism to weigh the extracted crucial features, achieves an outstanding classification effect. The TextCNN-BiLSTM-ATT model combines a convolutional neural network and bidirectional LSTM to extract features, followed by a weighted feature processing step; however, given that microblog comments are inherently brief, the accuracy of the extracted features diminishes after weighting.

The hybrid TextCNN-BiLSTM model in this paper, by integrating TextCNN for local feature extraction and BiLSTM for context feature extraction, improves precision, recall, and F1 value by 1.21%, 1.25%, and 1.25%, respectively, compared with the standalone TextCNN model, and by 0.78%, 0.9%, and 0.9%, respectively, compared with the BiLSTM model. The comparative results for these models are illustrated in Fig. 13.
Fig. 13.
The experimental results of different sentiment analysis models.
Fig. 14.
Examples of e-commerce reviews.
Fig. 15.
The experimental result of different models in e-commerce reviews.
4) Verification experiment with different datasets

The aforementioned experimental results indicate that the TextCNN-BiLSTM classification model excels in sentiment analysis of microblog comments. To corroborate the validity of this model, we selected another dataset for verification under the same experimental environment and parameters as in experiment (3).

The online_shopping_10_cats dataset encompasses e-commerce reviews for 10 categories of products, ranging from books and mobile phones to computers and hotels. Each category consists of positive and negative comments, with positive comments labeled as 1 and negative comments as 0. This dataset offers broad coverage and is highly representative. We selected specific reviews from online_shopping_10_cats to form an experimental dataset comprising 10,000 positive reviews and 10,000 negative reviews. A portion of the experimental dataset is displayed in Fig. 14.

Comparative experiment results demonstrate that the improved hybrid model continues to outperform other models in sentiment analysis of e-commerce reviews. The experimental results for these classification models are depicted in Fig. 15.

5. Conclusion

This paper introduces an enhanced hybrid deep learning model for classifying emotions in microblog user comments. The methodology involves Jieba word segmentation with a mixed stop-word list for data preprocessing, followed by Word2vec for word vectorization. Subsequently, the TextCNN model performs local feature extraction, while BiLSTM fully captures context features, yielding effective emotion classification. Experimental results demonstrate that the improved hybrid neural network model surpasses single models in sentiment analysis, achieving a precision of 94.75%. By employing the mixed stop-word list and the Tanh activation function, the model demonstrates superiority over the unimproved version: the new stop-word list and the Tanh activation function raise accuracy by 4.1% and 0.4%, respectively. However, this study focuses predominantly on classifying emotions in Chinese texts, whereas microblog comments on major social platforms often include English content. Further research into sentiment classification of mixed Chinese and English texts is therefore warranted.

Biography

Haiqin Tang
https://orcid.org/0000-0002-5485-9596

She received her B.S. degree in Financial Management from Southwest Petroleum University in 2020. She is currently an M.S. candidate at the Business and Tourism School of Sichuan Agricultural University, Chengdu, China. Her current research interests include natural language processing and deep learning.

Biography

Ruirui Zhang
https://orcid.org/0000-0003-1898-1487

She received her B.S., M.S., and Ph.D. degrees from the School of Computer Science, Sichuan University, in 2004, 2007, and 2012, respectively. She is a lecturer at the School of Business and Tourism, Sichuan Agricultural University, China. Her current research interests include network security, wireless sensor networks, intrusion detection, and artificial immune systems.

References

  • 1 R. Zeng and X. Xu, "A study on early warning mechanism and index for network opinion," Journal of Information, vol. 28, no. 11, pp. 52-54, 2009.
  • 2 R. Li, Z. Lin, H. Lin, W. Wang, and D. Meng, "Text emotion analysis: a survey," Journal of Computer Research and Development, vol. 55, no. 1, pp. 30-52, 2018. https://doi.org/10.7544/issn1000-1239.2018.20170055
  • 3 J. Wu, K. Lu, and S. B. Wang, "Sentiment analysis of film review based on multiple sentiment dictionary and SVM," Journal of Fuyang Normal University (Natural Science Edition), vol. 36, no. 2, pp. 68-72, 2019.
  • 4 Z. Cheng and L. Wang, "Sentiment analysis method of online comments based on support vector machine," Electronic Technology & Software Engineering, vol. 2019, no. 16, pp. 3-4, 2019.
  • 5 W. J. Cui, "Deep learning-based text emotion analysis," Ph.D. dissertation, Jilin University, Changchun, China, 2018.
  • 6 L. Dang and L. Zhang, "Method of discriminant for Chinese sentence sentiment orientation based on HowNet," Application Research of Computers, vol. 27, no. 4, pp. 1370-1372, 2010.
  • 7 K. Wang and R. Xia, "A survey on automatical construction methods of sentiment lexicons," Acta Automatica Sinica, vol. 42, no. 4, pp. 495-511, 2016. https://doi.org/10.16383/j.aas.2016.c150585
  • 8 Y. M. Zhou, J. L. Yang, and A. M. Yang, "A method on building Chinese sentiment lexicon for text sentiment analysis," Journal of Shandong University (Engineering Science), vol. 43, no. 6, pp. 27-33, 2013.
  • 9 Y. Zhao, B. Qin, Q. Shi, and T. Liu, "Large-scale sentiment lexicon collection and its application in sentiment classification," Journal of Chinese Information Science, vol. 31, no. 2, pp. 187-193, 2017.
  • 10 C. Jiang, Y. Guo, and Y. Liu, "Constructing a domain sentiment lexicon based on Chinese social media text," Data Analysis and Knowledge Discovery, vol. 3, no. 2, pp. 98-107, 2019. https://doi.org/10.11925/infotech.2096-3467.2018.0578
  • 11 B. Pang, L. Lee, and S. Vaithyanathan, "Thumbs up? Sentiment classification using machine learning techniques," in Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP), Philadelphia, PA, USA, 2002, pp. 78-86.
  • 12 J. Li, Y. Rao, F. Jin, H. Chen, and X. Xiang, "Multi-label maximum entropy model for social emotion classification over short text," Neurocomputing, vol. 210, pp. 247-256, 2016. https://doi.org/10.1016/j.neucom.2016.03.088
  • 13 S. Kaur, G. Sikka, and L. K. Awasthi, "Sentiment analysis approach based on N-gram and KNN classifier," in Proceedings of 2018 1st International Conference on Secure Cyber Computing and Communication (ICSCCC), Jalandhar, India, 2018, pp. 1-4.
  • 14 L. Zhang, Y. Tan, L. Zhu, and W. Dong, "Analyzing the features of negative sentiment microblog," Intelligence Theory and Practice, vol. 42, no. 7, pp. 132-137, 2019. http://www.itapress.cn/CN/Y2019/V42/I7/132
  • 15 M. Sun, Y. Li, Z. Zhuang, and T. Qian, "Sentiment analysis based on BGRU and self-attention mechanism," Journal of Jianghan University (Natural Science Edition), vol. 48, no. 4, pp. 80-89, 2020. https://doi.org/10.16389/j.cnki.cn42-1737/n.2020.04.011
  • 16 H. Zhao, L. Wang, and W. Wang, "Text sentiment analysis based on serial hybrid model of bi-directional long short-term memory and convolutional neural network," Journal of Computer Applications, vol. 40, no. 1, pp. 16-22, 2020. https://doi.org/10.11772/j.issn.1001-9081.2019060968
  • 17 Y. Miao, Y. Ji, S. Zhang, W. Cheng, and E. Peng, "Application of CNN-BiGRU model in Chinese short text sentiment analysis," Information Science, vol. 39, no. 4, pp. 85-91, 2021.
  • 18 Q. Yang, Y. Zhang, J. Zhu, and T. Wu, "Text sentiment analysis based on fusion of attention mechanism and BiGRU," Computer Science, vol. 48, no. 11, pp. 307-311, 2021. https://doi.org/10.11896/jsjkx.201000075
  • 19 L. Yan, X. Zhu, and X. Chen, "Emotional classification algorithm of comment text based on two-channel fusion and BiLSTM-attention," Journal of University of Shanghai for Science and Technology, vol. 43, no. 6, pp. 597-605, 2021. https://doi.org/10.13255/j.cnki.jusst.20210102001
  • 20 H. Fan and P. F. Li, "Sentiment analysis of short text based on FastText word vector and bidirectional GRU recurrent neural network: take the microblog comment text as an example," Information Science, vol. 39, no. 4, pp. 15-22, 2021. https://lib.cqvip.com/Qikan/Article/Detail?id=7104517819
  • 21 X. Yang, M. Guo, H. Hou, J. Yuan, X. Li, K. Li, W. Wang, S. He, and Z. Luo, "Improved BiLSTM-CNN+Attention sentiment classification algorithm fused with sentiment dictionary," Science Technology and Engineering, vol. 22, no. 20, pp. 8761-8770, 2022. http://www.stae.com.cn/jsygc/article/abstract/2112609?st=alljournals
  • 22 B. Shen, X. Yan, L. Zhou, G. Xu, and Y. Liu, "Microblog sentiment analysis based on ERNIE and dual attention mechanism," Journal of Yunnan University (Natural Science Edition), vol. 44, no. 3, pp. 480-489, 2022. https://doi.org/10.7540/j.ynu.20210263
  • 23 M. H. Ali Al-Abyadh, M. A. Iesa, H. A. Hafeez Abdel Azeem, D. P. Singh, P. Kumar, M. Abdulamir, and A. Jalali, "Deep sentiment analysis of Twitter data using a hybrid ghost convolution neural network model," Computational Intelligence and Neuroscience, vol. 2022, article no. 6595799, 2022. https://doi.org/10.1155/2022/6595799
  • 24 Y. Hu, T. Tong, X. Zhang, and J. Peng, "Self-attention-based BGRU and CNN for sentiment analysis," Computer Science, vol. 49, no. 1, pp. 252-258, 2022. https://doi.org/10.11896/jsjkx.210600063
  • 25 L. Khan, A. Amjad, N. Ashraf, and H. T. Chang, "Multi-class sentiment analysis of Urdu text using multilingual BERT," Scientific Reports, vol. 12, no. 1, article no. 5436, 2022. https://doi.org/10.1038/s41598-022-09381-9
  • 26 O. Wu, T. Yang, M. Li, and M. Li, "Two-level LSTM for sentiment analysis with lexicon embedding and polar flipping," IEEE Transactions on Cybernetics, vol. 52, no. 5, pp. 3867-3879, 2022. https://doi.org/10.1109/TCYB.2020.3017378
  • 27 Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998. https://doi.org/10.1109/5.726791
  • 28 S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997. https://doi.org/10.1162/neco.1997.9.8.1735

Table 1.

The experimental environment and environmental parameters
Experimental environment Environmental parameter
Operating system Windows 10
CPU Intel Core i7-6700HQ CPU
GPU NVIDIA GeForce GTX 965M (8 GB)
Internal storage 16 GB
Programming language Python 3.9
Deep learning framework PyTorch
Exploitation environment PyCharm

Table 2.

Model hyperparameters
Parameter name Value
Embedding dimension 300
The number of convolutional cores 200
Kernel size [1,3,5]
Number of training samples 32
Epoch 20
Learning rate 1e-3
Dropout rate 0.2

Table 3.

The experimental results
Model AP (%) AR (%) AF1 (%)
LSTM 92.98 92.95 92.95
BiLSTM 93.97 93.85 93.85
TextCNN 93.45 93.50 93.50
LSTM-ATT 93.38 93.35 93.35
BiLSTM-ATT 94.41 94.40 94.40
TextCNN-LSTM-ATT 93.76 93.75 93.75
TextCNN-BiLSTM-ATT 93.90 93.90 93.90
TextCNN-LSTM 94.11 94.05 94.05
Proposed model 94.75 94.75 94.75