1. Introduction
Searching for similar patterns among concepts is a major issue in medical information science and systems [1-4]. In this regard, some researchers have tried to discover useful patterns by measuring semantic similarity in Chinese medicine [5], biological knowledge discovery [6], and health record systems [7]. However, these methods are unable to represent and analyze the inner relationships between various semantic levels [8]. To address these relationships, concept lattice theory (CLT) was proposed to process domain knowledge with generalization and instantiation relationships among the attributes. However, a formal context may generate a large number of concepts and thus a considerable amount of information. In that case, CLT is combined with other models [9,10]. Each of the available approaches provides a way to assign conceptual weights to the attributes but cannot differentiate attributes based on their values in different applications, for example by assigning different weights according to the similarities between concepts. Therefore, a novel approach that computes the combined weight of the attributes in the domain context is introduced to obtain similarity distances among formal concepts.
At present, formal concept analysis (FCA)-based similarity methods are proposed to process the concepts and to characterize the properties of symptoms similar to known diseases in medical sets. This motivates the present paper to concentrate on similarity measurements among different entities. Considering the numerous research methods, we divide the main semantic calculation methods into four categories [11]. The distance-based approach is the most common model for calculating semantic distances among semantic nodes, where the main parameter is the path and its connection. The feature-based approach introduces the feature of attributes as a main variable [12]. The information content (IC)-based model exploits normalized entities via information classification knowledge, according to Shannon information theory [13]. The integrated approach combines the advantages of the previous measures. Gao et al. [14] considered the co-occurrence frequency of concepts to optimize the measurement of conceptual similarity distance. Ghafourian et al. [15] partitioned a given ontology according to its semantic structure by mixing the feature-based approach and the structural network approach to achieve reusability. Comparisons of these approaches are shown in Table 1. However, the approaches mentioned above have two drawbacks. First, these approaches rely on relatively clear classification information or a stable ontology structure. Specifically, the concepts in the semantic graph are represented in a fixed way without considering the nonbinary relationships between the concepts, so weights cannot be assigned to non-taxonomic relations between the concepts. Second, the feature-based approach does not flexibly differentiate concepts that belong to different classification relationships, and it is difficult to assign corresponding semantic weights based on different application contexts.
Comparisons among different approaches for similarity measurement
The objective of the current paper is to find more relevant concept sets in the electrocardiogram (ECG) domain ontology. To achieve this goal, the ECLisd is proposed to measure semantic similarity on the basis of an entropy-weighted concept lattice with inclusion degree and similarity distance, which introduces the ECG ontologies as a semantic representation to extract essential domain properties. Actually, considering specific application backgrounds, it is much easier to obtain the semantic relationship among entities in the domain ontology. Specifically, the ECLisd first computes the combined weight of the attributes of each concept by integrating its inclusion-degree importance and entropy-degree importance in the domain context; then, the weighted concept lattice with a hierarchical structure is constructed based on the formal concept lattice. Finally, the semantic similarity between weighted formal concepts is calculated.
The current paper focuses on the following steps to achieve the above goals:
(1) The specific classification of the combined weights of attributes is extracted and represented as a formal context;
(2) The inclusion degree and entropy degree are combined to obtain the importance degree of weighted attributes;
(3) The hierarchical relationship for weighted entities can be computed to analyze the semantic distance with an illustration.
The rest of this article is arranged as follows: a weighted concept lattice with combined weights is built on the basis of information entropy in Section 2. Section 3 measures the hierarchical distances of each entity. Section 4 calculates the similarity distances of formal concepts based on the ECLisd and discusses the results concisely. Finally, in Section 5, the contributions of our approach are summarized, and the direction of future work is outlined.
2. Weighted Concept Lattice
2.1 Extraction of the Combined Weight of Attributes
In this section, we mainly discuss the specific domain knowledge to make the demonstration clearer. Ontology is viewed as a useful method for representing knowledge in many studies [16]. In ECG information science, various scholars have managed to establish and extract essential attributes of ECG-related concepts from prominent ECG standards. The necessary attributes of ontologies based on categories are extracted from GB/T14396 (disease classification and coding standards), and the definitions of these categories are taken from GB/T27733-2011 (medical material reference standard). The ECG ontologies representing ECG-related concepts can effectively express the implicit semantic relationship between the ontology concepts and their attributes [17]. ECG-related concepts with attributes are listed in Table 2. The attributes with marks in Table 2 are listed in Table 3.
ECG-related concepts with attributes table
Attributes of Table 2 and their marks
Assigning and determining conceptual weights is important for obtaining precise results. In the decision table, properties with different weights are referred to as features, each of which is represented as an attribute with values. In our model, we compute the weight of an attribute by combining two factors with independent influences. We first introduce the inclusion-degree importance as the first factor to quantify the importance of different attributes. Then, the entropy-degree importance is proposed as the second factor, denoting the overall information of weighted concepts. By integrating these two factors, it is possible to determine how an attribute affects the semantic distance of the concept.
2.1.1 The inclusion-degree-importance of attributes
To obtain attributes of different importance through a decision table, the degree of attribute importance is introduced in rough set theory. Some necessary knowledge of rough set theory is briefly described as follows.
Definition 1 [18]. Let [TeX:] $$S=(U, A, V, f)$$ be an information system, where [TeX:] $$A=C \cup D$$ is a nonempty set of attributes, in which C is a set of conditional attributes and D is a set of decision properties; [TeX:] $$f: U \times A \rightarrow V$$ is a mapping function that designates a nonempty value to each attribute of its object. S can be denoted as a decision table only if [TeX:] $$C \cap D=\varnothing.$$
For instance, if Table 1 contains the decision properties that satisfy the conditions in Definition 1, it can be defined as a decision table, where U contains AMI, CAD, IE, AF, RHD and CHD and A includes time, amplitude, lead, waveform, segment, reference beat, and complex.
Definition 2 [18]. Let [TeX:] $$S=(U, C \cup D, V, f)$$ be a decision table, where C is a set of conditional attributes and D is a set of decision properties; then, [TeX:] $$\gamma_{C}(D)$$ denotes the degree of dependence of D with respect to C:
[TeX:] $$\gamma_{C}(D)=\frac{\left|\operatorname{POS}_{C}(D)\right|}{|U|}$$
where [TeX:] $$\operatorname{POS}_{C}(D)$$ represents the positive region of partition U / D with respect to C, and [TeX:] $$|\cdot|$$ represents the cardinality of a set.
For example, in Table 1, [TeX:] $$U /\{\text{time}\}=\{\{AMI, CAD, IE, AF, RHD\},\{CHD\}\} \text { and } U /\{\text{amplitude}\}=\{\{AMI, CAD, AF\},\{IE, RHD, CHD\}\}.$$ In addition, let [TeX:] $$B \subseteq C \text { with } B=\{\text{time}\}, \text { and let } D=\{\text{reference beat}\}$$ be a decision property; thus, [TeX:] $$\operatorname{POS}_{B}(D)=\{CAD, CHD\},\left|\operatorname{POS}_{B}(D)\right|=2,|U|=6 \text { and } \gamma_{B}(D)=1 / 3.$$
Definition 3 [18]. For a given decision table [TeX:] $$S=(U, C \cup D, V, f),$$ if [TeX:] $$B \subseteq C,$$ then [TeX:] $$\operatorname{sig}_{\gamma}(B, C, D)$$ denotes the importance of U / D with respect to C.
When B={a}, the importance of attribute a with respect to decision attribute D is:
[TeX:] $$\operatorname{sig}_{\gamma}(a, C, D)=\gamma_{C}(D)-\gamma_{C-\{a\}}(D)$$
For instance, in Table 1, [TeX:] $$\left|\operatorname{POS}_{C}(D)\right|=6,|U|=6 \text { and } \gamma_{C}(D)=1 ; \gamma_{C-\{\text{time}\}}(D)=\left|\operatorname{POS}_{C-\{\text{time}\}}(D)\right| /|U|=6 / 6=1 ; \text { thus, } \operatorname{sig}_{\gamma}(\{\text{time}\}, C, D)=0.$$
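The dependence degree and the attribute importance can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard rough-set definitions (a block of U / C belongs to the positive region when it is contained in a single block of U / D). The table `TABLE`, the attribute names `c1`, `c2`, `d`, and all function names are hypothetical and are not taken from Table 1.

```python
from collections import defaultdict

# Hypothetical toy decision table (NOT the paper's Table 1):
# conditional attributes c1, c2 and decision attribute d.
TABLE = {
    "x1": {"c1": 0, "c2": 0, "d": 0},
    "x2": {"c1": 0, "c2": 1, "d": 1},
    "x3": {"c1": 1, "c2": 0, "d": 1},
    "x4": {"c1": 1, "c2": 1, "d": 1},
}


def partition(table, attrs):
    """Return U / attrs: objects grouped by their values on attrs."""
    blocks = defaultdict(set)
    for obj, row in table.items():
        blocks[tuple(row[a] for a in attrs)].add(obj)
    return list(blocks.values())


def positive_region(table, cond, dec):
    """Union of the blocks of U / cond that fit inside one block of U / dec."""
    dec_blocks = partition(table, dec)
    pos = set()
    for block in partition(table, cond):
        if any(block <= d_block for d_block in dec_blocks):
            pos |= block
    return pos


def gamma(table, cond, dec):
    """Degree of dependence |POS_cond(dec)| / |U|."""
    return len(positive_region(table, cond, dec)) / len(table)


def sig(table, a, cond, dec):
    """Importance of attribute a: gamma_C(D) - gamma_{C - {a}}(D)."""
    reduced = [c for c in cond if c != a]
    return gamma(table, cond, dec) - gamma(table, reduced, dec)


C, D = ["c1", "c2"], ["d"]
print(gamma(TABLE, C, D))      # dependence of d on {c1, c2}
print(sig(TABLE, "c1", C, D))  # drop in dependence when c1 is removed
```

In this toy table, removing either conditional attribute halves the positive region, so both attributes receive the same importance; an importance of 0, as obtained for time in the example above, means that the remaining attributes already determine the decision.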
In our work, the decision table is constructed by introducing related domain specifications and distinguishing the decision properties in the object attribute table. For example, the decision properties can be obtained by extracting essential and electrophysiological properties and relations, based on which a decision table (Table 1) is built by tagging condition attributes and decision properties. Basically, domain specifications with related knowledge, such as the prominent ECG standards (AHA/MIT-BIH [19], SCP-ECG [20], and HL7 aECG [21]), can represent the decision properties, which reflect the general understanding of a specific context to a great extent. Consequently, the inclusion-degree importance of the attributes, which is calculated from the decision properties, reflects the degree of influence of this conditional property of the semantic concept relatively and precisely.
When there are redundant attributes in the decision table, the importance of the conditional attribute will decrease. Therefore, we introduce the inclusion degree and the importance of the inclusion degree to diminish the attributes that are unnecessary for the semantic representation.
Definition 4 [22]. Let [TeX:] $$S=(U, A, V, f)$$ be a decision table, in which [TeX:] $$A_{1}, A_{2} \subseteq A, \text { and } U / A_{1} \text { and } U / A_{2}$$ are partitions of U, defined as [TeX:] $$\left\{X_{1}, X_{2} \ldots X_{n}\right\} \text { and }\left\{Y_{1}, Y_{2}, \ldots, Y_{m}\right\},$$ respectively; [TeX:] $$C O N\left(A_{1} / A_{2}\right)$$ is denoted as the inclusion degree of [TeX:] $$U / A_{2} \text { to } U / A_{1}:$$
We assume that, for given sets X and Y, the function con satisfies the following: when [TeX:] $$X \not\subseteq Y, \operatorname{con}(X / Y)=0, \text { and if } X \subseteq Y, \operatorname{con}(X / Y)=1.$$ Under this condition, the range of the function satisfies the following: [TeX:] $$0 \leq CON\left(A_{1} / A_{2}\right) \leq n.$$
For instance, in Table 1, [TeX:] $$U /\{\text{time}\}=\{\{AMI, CAD, IE, AF, RHD\},\{CHD\}\} \text { and } U /\{\text{reference beat}\}=\{\{AMI, IE, AF, RHD\},\{CAD\},\{CHD\}\};$$ thus, [TeX:] $$CON(\text{time} / \text{reference beat})=1 \text { and } CON(\text{reference beat} / \text{time})=2.$$
Definition 5 [22]. Let [TeX:] $$S=(U, C \cup D, V, f)$$ be a decision table, where a is a property of the conditional properties C; [TeX:] $$\operatorname{SIG}_{\gamma}^{R}(a, C, D)$$ is denoted as the importance of the inclusion degree:
where
[TeX:] $$m_{a} \text { and } n_{a}$$ represent the influence of the positive region and the inclusion to the property importance, respectively. Notably, [TeX:] $$0 \leq m_{a}, n_{a} \leq 1.$$
For instance, in Table 1, [TeX:] $$m_{\text{time}}=(1+1 / 6) /(6+1)=0.1667 \text { and } n_{\text{time}}=(6+2 / 6) /(6+1)=0.905;$$ thus, [TeX:] $$\mathrm{SIG}_{\gamma}^{R}(\text {time}, C, D)=(0+6 * 0.1667-0.905+1) /(36+6+1)=0.0255.$$
2.1.2 The entropy-degree-importance of attributes
The partial order relation is needed to construct a concept lattice, especially when we employ semantic integration between ECG data standards. This relation can be achieved through attributes of reasonable importance. Thus, we propose another representation of combined weights to quantify the inclusion degree importance of conditional properties more precisely, which is called entropy-degree importance on the basis of Shannon entropy theory [23].
Definition 6 [23]. Let [TeX:] $$K_{m}=(O, P, I, R)$$ be a formal context, where [TeX:] $$O=\left\{o_{1}, o_{2}, \ldots, o_{n}\right\} \text { and } P=\left\{p_{1}, p_{2}, \ldots, p_{k}\right\};$$ then, the probability of the i-th object [TeX:] $$\left(o_{i}\right)$$ possessing the corresponding j-th attribute [TeX:] $$\left(p_{j}\right)$$ is given by [TeX:] $$P\left(p_{j} / o_{i}\right).$$ The average information entropy weight of attribute [TeX:] $$p_{j}$$ can be represented by [TeX:] $$E\left(p_{j}\right)$$ as follows:
where m represents the total number of attributes in the given formal context [TeX:] $$K_{m}.$$ For example, in Table 1, [TeX:] $$E(a)=-\sum_{j-1}^{10}(5 / 6) * \log _{2}(5 / 6)=0.219.$$
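The numerical value quoted for E(a) can be checked with a short Python sketch. It evaluates only a single Shannon term with probability 5/6, which is what reproduces 0.219; how such terms are aggregated over objects and attributes is governed by Eq. (5) and is not reconstructed here. The function name `entropy_term` is a hypothetical helper introduced for illustration.

```python
import math


def entropy_term(p):
    """Single Shannon term -p * log2(p); taken as 0 when p == 0."""
    return 0.0 if p == 0 else -p * math.log2(p)


# Attribute a holds for 5 of the 6 objects, as in the example above.
term = entropy_term(5 / 6)
print(round(term, 3))  # 0.219
```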
Definition 7. Let [TeX:] $$K_{m}=(O, P, I, R)$$ be a formal context with the conditional properties, where [TeX:] $$P=\left\{p_{1}, p_{2}, \ldots, p_{k}\right\}$$ denotes a set of conditional properties. Then, [TeX:] $$\operatorname{SIG}\left(p_{j}, C, D\right)$$ denotes the inclusion degree importance of a conditional property [TeX:] $$p_{j}:$$
where [TeX:] $$N\left(p_{j}, C, D\right)$$ represents a function to obtain the value of the conditional property [TeX:] $$p_{j} \in C$$ in decision table S, and [TeX:] $$\left|N\left(p_{j}, C, D\right)\right|$$ is the number of elements of [TeX:] $$N\left(p_{j}, C, D\right).$$
For instance, considering the decision table (Table 1), N(a, C, D) = N(b, C, D) = time; thus, [TeX:] $$|N(a, C, D)|=2.$$ Therefore, [TeX:] $$\operatorname{SIG}(a, C, D)=\operatorname{SIG}_{\gamma}^{R}(N(\text{time}, C, D)) /|N(\text {time}, C, D)|=0.0255 / 2=0.0128.$$
To evaluate the attribute importance of formal concepts, we define a set of the combined weights of the entropy-degree importance (Eq. (5)) and the inclusion-degree importance of a conditional property (Eq. (6)).
Definition 8. Let [TeX:] $$K_{m}=(O, P, I, R)$$ be a formal context, where [TeX:] $$P=\left\{p_{1}, p_{2}, \dots, p_{k}\right\} \text { and } i\left(p_{j}\right)(1 \leq j \leq k)$$ indicates the importance of an attribute [TeX:] $$p_{j} \text { in } P.$$ [TeX:] $$w_{c} \in W_{C}$$ is defined as the combined weight in [TeX:] $$K_{m}.$$ Then, a weighted formal context is defined as [TeX:] $$K_{m}^{\prime}=\left(O, P, I, R, W_{c}\right),$$ and the combined weight of [TeX:] $$p_{j}$$ can be denoted as follows:
where [TeX:] $$E\left(p_{j}\right) \text { and } \operatorname{SIG}\left(p_{j}, C, D\right)$$ can be calculated by using Eq. (5) and Eq. (6), respectively. For example, the combined weights of attribute a are calculated as follows:
2.2. Construction of the Weighted Concept Lattice
2.2.1 A formal context based on the decision table
Based on the analysis above, we first introduce a formal context consisting of an object set and an attribute set according to FCA. Within the formal context, whole concepts can be constructed into a concept lattice using a partial order relation.
Definition 9. A formal context is described by a quadruple [TeX:] $$K_{m}=(O, P, I, R),$$ where O and P are two nonempty sets of objects and attributes, respectively, W represents the attribute values, and [TeX:] $$R \subseteq O \times P \times I$$ represents a subset of the Cartesian product of O and [TeX:] $$P \ (I \subseteq O \times P).$$ R is a ternary relation between the three sets for which [TeX:] $$(o, p, i) \in R.$$ When [TeX:] $$o \in O \text { and } p \in P,$$ oRp means that object o has attribute p and that attribute p belongs to object o.
For example, Table 4 is a formal context, where [TeX:] $$O=\left\{m_{1}, m_{2}, m_{3}, m_{4}, m_{5}, m_{6}\right\} \text { and } P=\{a, b, c, d, e, f,g, h, i, j, k, l, m, n, o, p\}.$$ Table 4 illustrates partial objects of ECG concepts and their essential attributes by FCA in which each row and column of the formal context represent the disease related to the ECG diagnosis (extents) and their attributes (intents), according to the clinical effect of different elements on the diagnosis results. If the relation between an object and an attribute exists, then the relation is denoted as ''. On the basis of Definition 9, the value of the decision properties from the decision table can be transformed into attributes in the formal context. Therefore, ''marked in Table 4 can also be viewed as an attribute that holds for a specific concept.
A formal context in binary converted from the decision table
2.2.2 A weighted concept lattice
Based on Definition 9, we define the weighted formal context by assigning the combined weight to each attribute of the formal context. Then, the weighted concept lattice from the weighted formal context is constructed.
Definition 10. Given a weighted formal context [TeX:] $$K_{m}^{\prime}=\left(O, P, I, R, W_{c}\right)$$ and a complete concept lattice [TeX:] $$(L(O, P, I, R), \leq),$$ let [TeX:] $$C=(A, B)$$ be a formal concept in [TeX:] $$(L(O, P, I, R), \leq).$$ The triple [TeX:] $$C_{w}=\left(A, B, w_{c}(B)\right)$$ can be defined as a weighted formal concept with combined weights generated from [TeX:] $$(L(O, P, I, R), \leq),$$ in which [TeX:] $$w_{c}(B)$$ represents the weight of the attribute set B.
For instance, let [TeX:] $$C=\left(\left\{m_{2}, m_{4}\right\},\{a, c, p\}\right)$$ be a formal concept, where the sum of the combined weight of the attribute {a, c, p} is [TeX:] $$w_{c}(a, c, p)=0.1649.$$ Thus, [TeX:] $$C_{w}=\left(\left\{m_{2}, m_{4}\right\},\{a, c, p\}, 0.1649\right)$$ is a weighted formal concept.
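The weighted formal concept in this example can be reproduced with a short Python sketch. The per-attribute weights below are an assumption: they are inferred by subtraction from the concept weights listed in Section 4.1 (C1 = {a}: 0.0236, C4 = {a, c}: 0.0703, C5 = {a, p}: 0.1182), on the reading that the weight of an intent is the sum of the combined weights of its attributes. The names `ATTRIBUTE_WEIGHTS`, `concept_weight`, and `weighted_concept` are hypothetical.

```python
# Combined weights of single attributes, inferred by subtraction from the
# listed concept weights (an assumption, not values copied from a table).
ATTRIBUTE_WEIGHTS = {
    "a": 0.0236,           # weight of C1 = {a}
    "c": 0.0703 - 0.0236,  # C4 = {a, c} minus a
    "p": 0.1182 - 0.0236,  # C5 = {a, p} minus a
}


def concept_weight(intent, weights):
    """Weight of a concept: sum of the combined weights of its intent."""
    return sum(weights[attr] for attr in intent)


def weighted_concept(extent, intent, weights):
    """Build the triple (A, B, w_c(B)) with the weight rounded to 4 digits."""
    return (frozenset(extent), frozenset(intent),
            round(concept_weight(intent, weights), 4))


cw = weighted_concept({"m2", "m4"}, {"a", "c", "p"}, ATTRIBUTE_WEIGHTS)
print(cw[2])  # 0.1649
```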
3. Semantic Similarity Measurement
After constructing a weighted concept lattice according to a formal context, we calculate the weight of each weighted concept via Eq. (8). In this section, the semantic similarity between weighted concepts is measured to analyze the semantic distance in the weighted concept lattice. Moreover, partial order relations among the weighted concepts are represented by weighted concepts with the same attributes in the concept lattice. Before calculating the semantic similarity between the weighted concepts, we first introduce the hierarchical distance between formal concepts to calculate the semantic similarity according to the general special relation between the concepts.
3.1 Hierarchical Distance between Formal Concepts
Definition 11. Let [TeX:] $$(L(O, P, I, R), \leq)$$ be a complete concept lattice. Given that [TeX:] $$L_{\mathrm{sup}} \subseteq L$$ is a subset of L, [TeX:] $$\sup \left(L_{\text {sup }}\right)$$ is denoted as a supremum of the subset [TeX:] $$L_{\mathrm{sup}}.$$ In particular, sup(a, b) is defined as the supremum of concepts a and b only if [TeX:] $$L_{\mathrm{sup}}$$ contains two formal concepts [TeX:] $$\left(L_{\text {sup }}=\{a, b\}\right).$$
For instance, in the complete concept lattice [TeX:] $$(L, \leq)(\text { Fig. } 1), \sup (C 3, C 4)=\sup (C 4, C 3)=C 1.$$
A complete concept lattice.
Definition 12 [TeX:] $$\text { Let }(L(O, P, I, R), \leq)$$ be a complete concept lattice, where a and c are defined as two formal concepts belonging to L. Given that a is the subconcept of c, dis (a, c) is denoted as the shortest distance from a to c in the concept lattice, which is the least number of connecting edges from vertices a to c. In particular, we define dis (a, a) = 0.
For example, in Fig. 1, one can calculate [TeX:] $$\text { dis }(C 11, C 2)=2 \text { and } d i s(C 6, C 2)=1, \text { as } C 11<C 2 \text { and } C 6<C 2$$ in the concept lattice. Because the shortest distance in Definition 12 is merely defined from the lower vertex (subconcept) to its higher vertex (superconcept) or itself, [TeX:] $$\operatorname{dis}(C 2, C 11)$$ is not defined.
In the classic transformation model, the distance between concepts is normally an indicator that reflects their semantic similarities [24]. Meanwhile, the vertices representing concepts are usually measured by a specific type of distance, which is also used to reflect the similarity between these two concepts. Similarly, the shortest distance defined in Definition 12 possesses a function to calculate the semantic similarity according to the general special relation between the concepts. In the formal concept lattice, the intent set of the subconcept basically inherits the intent of its superconcept. As a result, the subconcepts always share the same subset of characteristics of the superconcept. For example, in Fig. 1, concepts C6 and C7 have the same superconcept C0; however, [TeX:] $$\operatorname{dis}(C 6, C 0)=2,$$ while [TeX:] $$\operatorname{dis}(C 7, C 0)=1.$$ Therefore, concept C7 has a smaller degree of difference from C0 than concept C6.
A general concept with a smaller depth is usually more abstract, which tends to cover less information content [25]. Conversely, if the underlying concept is more specific, it always inherits more information, and the probability of the upper information being shared between concepts is much higher. Therefore, the semantic similarity between underlying concepts is generally greater than that between high-level concepts. Based on the analysis above, we propose a comparative hierarchical distance in Definition 13 to distinguish the hierarchical differences of formal concepts.
Definition 13. [TeX:] $$\text { Let }(L(O, P, I, R), \leq)$$ be a complete concept lattice, where a and c are defined as two formal concepts belonging to [TeX:] $$\text { L. When } a \leq c, \operatorname{chd}(a, c)$$ is denoted as the comparative hierarchical distance from a to c in the concept lattice, which is defined as follows:
where [TeX:] $$\sup \left(L_{\text {sup}}\right)$$ is the supremum of the subset [TeX:] $$L_{\text {sup}}.$$ In particular, we define chd (a, a) = 0.
For instance, in Fig. 1, one can calculate chd(C11, C1) = 2 / (2+1) = 2/3, while chd(C2, C1) = 1 / (1+1) = 1/2, as [TeX:] $$C 11<C 1 \text { and } C 2<C 1$$ in the concept lattice. The comparative hierarchical distance in Definition 13 depends on the shortest distance parameter in Definition 12, which is only defined from the subconcept to its superconcept or itself. Therefore, chd(C1, C11) is not defined. The purpose of the comparative hierarchical distance is to distinguish the differences between concepts when they have equal shortest distances. For example, dis(C6, C1) = 1, and dis(C11, C6) = 1. However, the information shared between C6 and C1 differs from that shared between C11 and C6. Thus, by taking into consideration the hierarchical distance, chd(C6, C1) = 1/2 and chd(C11, C6) = 1/3, which indicates that the concept with a higher hierarchy often leads to a larger impact on the degree of difference.
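The shortest distance and the comparative hierarchical distance can be sketched in Python as follows. Two assumptions are made. First, the edge list `PARENTS` is a hypothetical fragment chosen only to agree with several distances quoted in the text; it is not the full lattice of Fig. 1. Second, chd(a, c) is read off the worked examples as dis(a, c) / (dis(a, c) + dis(c, root)), where the root is the top concept C0; this reading reproduces the four quoted values but is not copied from the paper's equation.

```python
from collections import deque
from fractions import Fraction

# Hypothetical lattice fragment: concept -> list of direct superconcepts.
PARENTS = {
    "C0": [],
    "C1": ["C0"],
    "C7": ["C0"],
    "C2": ["C1"],
    "C6": ["C1"],
    "C11": ["C6"],
}
ROOT = "C0"


def dis(a, c):
    """Fewest upward edges from a to c; None if c is not above a."""
    queue, seen = deque([(a, 0)]), {a}
    while queue:
        node, depth = queue.popleft()
        if node == c:
            return depth
        for parent in PARENTS[node]:
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, depth + 1))
    return None


def chd(a, c):
    """Assumed reading: dis(a, c) / (dis(a, c) + dis(c, ROOT))."""
    if a == c:
        return Fraction(0)
    d = dis(a, c)
    if d is None:  # c is not a superconcept of a
        return None
    return Fraction(d, d + dis(c, ROOT))


print(chd("C11", "C1"), chd("C2", "C1"))  # 2/3 1/2
print(chd("C6", "C1"), chd("C11", "C6"))  # 1/2 1/3
print(dis("C1", "C11"))                   # None: only upward paths count
```

Using breadth-first search over superconcept links keeps the distance directional, which matches the statement that dis and chd are defined only from a subconcept to its superconcept.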
3.2 Similarity Distance Model
Tversky measured the degree of similarity between concepts by using the shared feature sets of entities [26]. The computational model is as follows:
[TeX:] $$\operatorname{Sim}(x, y)=\frac{f(X \cap Y)}{f(X \cap Y)+\alpha f(X-Y)+\beta f(Y-X)}$$
where Sim(x, y) denotes the similarity between the concepts x and y; X and Y are the feature sets of x and y, respectively; f is a metric function of the feature sets; [TeX:] $$\alpha \text { and } \beta$$ are weighting parameters; (X - Y) represents the set of attributes that belong to x but not to y; and similarly, (Y - X) represents the set of attributes that belong to y but not to x.
In this section, we introduce the semantic ratio similarity between formal concepts in the entropy-weighted concept lattice, which makes it possible to measure, to some extent, the asymmetric property of concepts.
Definition 14. [TeX:] $$\text {Let }(L(O, P, I, R), \leq)$$ be a complete concept lattice, where a and b are defined as two formal concepts belonging to L. Given that [TeX:] $$a=\left(A_{1}, B_{1}\right) \text { and } b=\left(A_{2}, B_{2}\right)$$ are two formal concepts, sPath(a, b) is denoted as the weighted entropy distance from a to b in the concept lattice, which represents the weighted relative shortest path from concept a to concept b. In particular, we define sPath(a, a) = 0.
where [TeX:] $$w_{c(a)}\left(w_{c(b)}\right)$$ is the combined weight of the importance and entropy of the attributes that belong to its formal concept a (b); chd(a, b) is the comparative hierarchical distance from a to b.
For example, in Fig. 1, based on Eq. (11), one can calculate [TeX:] $$\operatorname{sPath}(C 11, C 6)=\left(w_{c(C 11)}-w_{c(C 6)}\right) * 1 / 2=0.048 \text { and } \operatorname{sPath}(C 6, C 1)=\left(w_{c(C 6)}-w_{c(C 1)}\right) * 1 / 2=0.0202,$$ which indicates that the concept with a higher weighted entropy distance often leads to a larger impact on the path difference.
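The arithmetic of this example can be replayed with a small Python sketch. It assumes that the weighted entropy distance is the difference of the two concept weights scaled by the comparative hierarchical distance, which is the form suggested by the worked example rather than a transcription of Eq. (11). The function name `s_path` is hypothetical, the weights 0.1600, 0.0640, and 0.0236 are the values listed for C11, C6, and C1, and the factor 1/2 is used exactly as quoted in the example.

```python
def s_path(w_a, w_b, chd_ab):
    """Assumed form: (w_c(a) - w_c(b)) * chd(a, b)."""
    return (w_a - w_b) * chd_ab


# Weights of C11, C6 and C1 with the factor 1/2 quoted in the example.
print(round(s_path(0.1600, 0.0640, 0.5), 4))  # 0.048
print(round(s_path(0.0640, 0.0236, 0.5), 4))  # 0.0202
```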
Accordingly, we propose asymmetric semantic similarity to evaluate the combined weight of each attribute.
Definition 15. [TeX:] $$\text { Let } K_{m}^{\prime}=\left(O, P, I, R, W_{c}\right)$$ be a weighted concept lattice, where a and b are defined as two formal concepts and r represents the root concept. Given [TeX:] $$c, r \in L \text { and } c=\sup (a, b), \operatorname{sim}(a, b)$$ is denoted as the asymmetric semantic similarity between a and b, which is defined as follows:
For instance, let the lattice shown in Fig. 1 be a weighted concept lattice, with the weighted concepts [TeX:] $$C 11=\left(\left\{m_{1}, m_{2}\right\} ;\{d, l, n\} ; 0.16\right) \text { and } C 6=\left(\left\{m_{1}, m_{3}\right\} ;\{l, n\} ; 0.064\right).$$ It is assumed that the combined weight of the importance and entropy of the attributes is as follows: [TeX:] $$\operatorname{sPath}(C 11, C 6)=0.048, \operatorname{sPath}(C 6, C 0)=0.427 \text { and } \operatorname{sPath}(C 6, C 6)=0.$$ Therefore, [TeX:] $$\operatorname{sim}(C 11, C 6)=0.735.$$
In Eq. (12), the entropy-weighted attributes and the comparative hierarchical distance between formal concepts are considered. The semantic similarity between two concepts has a symmetric property, which only depends on the semantic relationship of the same attributes that they share. Therefore, sim(a, a) = 1, and, in particular, sim(a, b) = 0.5 indicates that the similarity and dissimilarity distances between concepts a and b are equal.
4. Results and Discussion
4.1 Results
In Section 4, we first construct the attributes table (Table 1) on the basis of the essential properties extracted from the specifications for disease classification (GB/T14396). Afterwards, the decision property is labeled to obtain a decision table under the guidance of the definitions of these concepts from GB/T27733-2011. Eventually, a formal context (Table 3) in binary is converted from the decision table.
Using Eq. (4), the inclusion-degree importance of each attribute in the decision table is calculated, which is shown in Table 5. Then, we compute the entropy-degree importance of the attributes via Eq. (5) and Eq. (6) to substitute the value of each [TeX:] $$\mathrm{SIG}_{\gamma}^{R}(\text {property}, C, D)$$ in Table 4. The results of the combined weights of each attribute are calculated by using Eq. (7) in Table 6.
The inclusion-degree importance of each property in the decision table
Computed weight value of each attribute of the formal context
After computing all the combined weights of the formal concepts based on Eq. (8), the weighted formal concepts are demonstrated as follows.
[TeX:] $$\mathrm{C} 0=\left(\left\{m_{1}, m_{2}, m_{3}, m_{4}, m_{5}, m_{6}\right\} ; \varnothing ; 0\right)$$
[TeX:] $$\mathrm{C} 1=\left(\left\{m_{1}, m_{2}, m_{3}, m_{4}, m_{5}\right\} ;\{a\} ; 0.0236\right)$$
[TeX:] $$\mathrm{C} 2=\left(\left\{m_{1}, m_{3}, m_{4}, m_{5}\right\} ;\{a, l, j\} ; 0.0364\right)$$
[TeX:] $$\mathrm{C} 3=\left(\left\{m_{3}, m_{4}, m_{5}, m_{6}\right\} ;\{f\} ; 0.1712\right)$$
[TeX:] $$\mathrm{C} 4=\left(\left\{m_{1}, m_{2}, m_{4}\right\} ;\{a, c\} ; 0.0703\right)$$
[TeX:] $$\mathrm{C} 5=\left(\left\{m_{2}, m_{4}, m_{5}\right\} ;\{a, p\} ; 0.1182\right)$$
[TeX:] $$\mathrm{C} 6=\left(\left\{m_{1}, m_{2}, m_{3}\right\} ;\{a, g\} ; 0.0640\right)$$
[TeX:] $$\mathrm{C} 7=\left(\left\{m_{3}, m_{5}, m_{6}\right\} ;\{f, d\} ; 0.0831\right)$$
[TeX:] $$\mathrm{C} 8=\left(\left\{m_{3}, m_{4}, m_{5}\right\} ;\{a, f, j, l\} ; 0.2076\right)$$
[TeX:] $$\mathrm{C} 9=\left(\left\{m_{2}, m_{6}\right\} ;\{k\} ; 0.0555\right)$$
[TeX:] $$\mathrm{C} 10=\left(\left\{m_{2}, m_{4}\right\} ;\{a, c, p\} ; 0.1649\right)$$
[TeX:] $$\mathrm{C} 11=\left(\left\{m_{1}, m_{2}\right\} ;\{a, c, e, g\} ; 0.1600\right)$$
[TeX:] $$\mathrm{C} 12=\left(\left\{m_{1}, m_{4}\right\} ;\{a, l, j, c\} ; 0.2179\right)$$
[TeX:] $$\mathrm{C} 13=\left(\left\{m_{4}, m_{5}\right\} ;\{a, f, h, j, l, p\} ; 0.3448\right)$$
[TeX:] $$\mathrm{C} 14=\left(\left\{m_{1}, m_{3}\right\} ;\{a, g, j, l, o\} ; 0.3115\right)$$
[TeX:] $$\mathrm{C} 15=\left(\left\{m_{3}, m_{5}\right\} ;\{a, d, f, j, l\} ; 0.2543\right)$$
[TeX:] $$\mathrm{C} 16=\left(\left\{m_{2}\right\} ;\{a, c, e, g, k, n, p\} ; 0.4279\right)$$
[TeX:] $$\mathrm{C} 17=\left(\left\{m_{6}\right\} ;\{b, d, f, i, k, m\} ; 0.3376\right)$$
[TeX:] $$\mathrm{C} 18=\left(\left\{m_{4}\right\} ;\{a, c, f, h, j, l, p\} ; 0.3915\right)$$
[TeX:] $$\mathrm{C} 19=\left(\left\{m_{1}\right\} ;\{a, c, e, g, j, l, o\} ; 0.4075\right)$$
[TeX:] $$\mathrm{C} 20=\left(\left\{m_{5}\right\} ;\{a, d, f, h, j, l, p\} ; 0.3915\right)$$
[TeX:] $$\mathrm{C} 21=\left(\left\{m_{3}\right\} ;\{a, d, f, g, j, l, o\} ; 0.3946\right)$$
[TeX:] $$\mathrm{C} 22=(\varnothing ;\{a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p\} ; 1)$$
A weighted concept lattice.
We build a weighted concept lattice with the whole weighted formal concepts (Fig. 2) by using a partial order relation. It is easy to observe that the formal concepts C19, C16, C21, C18, C20, and C17 contain only one original ECG-related concept in their extents, representing the concepts of m1−6 in Table 4. Therefore, the semantic similarities of the concepts represent the weighted formal concepts in the weighted concept lattice.
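The list of formal concepts can be re-derived from the object intents with a brief Python sketch. It assumes that the intents of the single-object concepts C19, C16, C21, C18, C20, and C17 are the attribute rows of m1 to m6, and it uses the standard construction in which every concept intent is an intersection of object intents (plus the full attribute set). The names `CONTEXT`, `extent`, and `all_intents` are hypothetical.

```python
# Object intents taken from the single-object concepts listed above
# (C19, C16, C21, C18, C20, C17), assumed to be the rows of the context.
CONTEXT = {
    "m1": frozenset("acegjlo"),
    "m2": frozenset("acegknp"),
    "m3": frozenset("adfgjlo"),
    "m4": frozenset("acfhjlp"),
    "m5": frozenset("adfhjlp"),
    "m6": frozenset("bdfikm"),
}
ALL_ATTRS = frozenset().union(*CONTEXT.values())


def extent(intent):
    """Objects that possess every attribute of the intent."""
    return frozenset(o for o, attrs in CONTEXT.items() if intent <= attrs)


def all_intents():
    """Close the object intents under intersection, one object at a time."""
    intents = {ALL_ATTRS}
    for attrs in CONTEXT.values():
        intents |= {attrs & known for known in intents}
    return intents


CONCEPTS = {(extent(intent), intent) for intent in all_intents()}
print(len(CONCEPTS))  # 23, the same count as C0 to C22
print((frozenset({"m2", "m4"}), frozenset("acp")) in CONCEPTS)  # True (C10)
```

Ordering these pairs by extent inclusion gives the partial order used to draw the lattice; attaching the summed attribute weights to each intent would then yield the weighted concepts.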
To evaluate the effectiveness of the above method, we first calculate the degree of similarity between concepts via the shared feature sets according to Eq. (11) as a comparison, where the weighting parameters [TeX:] $$\alpha \text { and } \beta$$ of the Tversky model are both equal to 0.5. For instance, in Table 3, [TeX:] $$f\left(m_{1}\right)=7, \text { and } \operatorname{Sim}\left(m_{1}, m_{2}\right)=4 /(4+0.5 * 3+0.5 * 3)=0.57.$$ With this in mind, we compute the semantic similarities between each pair of formal concepts in Table 4 [TeX:] $$\left(m_{1-6}\right)$$ and those among various concepts of different levels in the weighted concept lattice (C1 and other concepts), which are listed in Tables 7 and 8, respectively.
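The feature-based baseline in this example can be checked with a minimal Python sketch of the Tversky ratio. It assumes that f counts the attributes of a set and that the attribute sets of m1 and m2 are the intents listed for C19 and C16; the function name `tversky` and its keyword arguments are hypothetical.

```python
def tversky(x, y, alpha=0.5, beta=0.5, f=len):
    """Tversky ratio f(X & Y) / (f(X & Y) + alpha f(X - Y) + beta f(Y - X)).

    Assumes that at least one of the two feature sets is non-empty.
    """
    common = f(x & y)
    return common / (common + alpha * f(x - y) + beta * f(y - x))


m1 = set("acegjlo")  # intent of C19
m2 = set("acegknp")  # intent of C16
print(len(m1))                     # 7
print(round(tversky(m1, m2), 2))   # 0.57
```

With alpha = beta = 0.5 the measure is symmetric, so any asymmetry between two concepts has to come from the weighted lattice rather than from this baseline.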
After obtaining the values of the formal concepts, we then compute the semantic similarities of the same weighted formal concepts based on Definitions 14 and 15. The semantic similarities of concepts [TeX:] $$m_{1-6}$$ and the semantic similarities of the concepts (C1 to other concepts) are demonstrated in Tables 9 and 10, respectively.
Semantic similarity of concepts in Table 3 (feature-based approach)
Semantic similarity of the concept C1 and other concepts (feature-based approach)
Semantic similarity of concepts in Table 3 (ECLisd approach)
Semantic similarity of the concept C1 and other concepts (ECLisd approach)
4.2 Comparative Analysis
In Table 7 and Table 9, the results of the semantic similarities between the original ECG-related concepts are listed based on the feature-based approach and ECLisd approach, respectively. When comparing the results, we can obtain the line chart in Fig. 3 and conclude that the distribution between the results is relatively similar. In addition, the correlation coefficient between the results is 0.941, which means that the two results are highly correlated. Meanwhile, we analyze some differences between these two results to prove the validity of the ECLisd approach.
Fig. 3 shows that the proposed approach has two main advantages. First, the ECLisd approach can reveal small differences in semantic similarity. According to the categories in GB/T14396, concepts that belong to the same supercategory have relatively larger semantic similarities to one another. For example, the objects [TeX:] $$A M I\left(m_{1}\right) \text { and } I E\left(m_{3}\right)$$ have the same supremum in the lattice (sup(C19, C21) = C14) and are also in the same supercategory in the specification; their semantic similarity is comparatively large (Sim(C19, C21) = 0.908). Similarly, the pair [TeX:] $$A F\left(m_{4}\right) \text { and } R H D\left(m_{5}\right)$$, which has the supremum C13 (sup(C18, C20) = C13), also has a large semantic similarity (Sim(C18, C20) = 0.926). However, the feature-based approach cannot reveal such differences. For instance, it gives Sim(C19, C18) = Sim(C21, C18) = 0.57, whereas the ECLisd approach gives Sim(C19, C18) = 0.692 and Sim(C21, C18) = 0.719. Second, the ECLisd approach can reveal the importance of certain attributes. For example, the similarity distances of the pairs [TeX:] $$A M I\left(m_{1}\right)-C H D\left(m_{6}\right) \text { and } C A D\left(m_{2}\right)-C H D\left(m_{6}\right)$$ are different. The reason is that the attribute shared by [TeX:] $$m_{1} \text { and } m_{6}$$ is Segment (j), while [TeX:] $$m_{2} \text { and } m_{6}$$ do not share the same attributes. The ECLisd approach captures this difference [TeX:] $$\left(\operatorname{Sim}\left(m_{1}, m_{6}\right)=0.345 \text { and } \operatorname{Sim}\left(m_{2}, m_{6}\right)=0.366\right),$$ whereas the feature-based approach does not [TeX:] $$\left(\operatorname{Sim}\left(m_{1}, m_{6}\right)=\operatorname{Sim}\left(m_{2}, m_{6}\right)=0.14\right).$$
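The supremum used in this comparison is the least upper bound of two concepts in the lattice order. The sketch below shows one way to find it from a covering relation. The `parents` map is a hypothetical fragment that only reproduces the two suprema quoted in the text; the node `TOP`, the direct links, and the function names are assumptions and do not describe the actual lattice in Fig. 2.

```python
def up_set(node, parents):
    """Return every node that lies above `node`, including `node` itself."""
    seen = {node}
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def supremum(x, y, parents):
    """Least upper bound of x and y, or None if no single one exists."""
    common = up_set(x, parents) & up_set(y, parents)
    for candidate in common:
        # The supremum is the common upper bound below all the others,
        # i.e. every common upper bound lies in the candidate's up-set.
        if common <= up_set(candidate, parents):
            return candidate
    return None


# Hypothetical covering relation (child -> direct parents).
parents = {
    "C19": ["C14"], "C21": ["C14"],
    "C18": ["C13"], "C20": ["C13"],
    "C14": ["TOP"], "C13": ["TOP"],
}

sup_ami_ie = supremum("C19", "C21", parents)
sup_af_rhd = supremum("C18", "C20", parents)
sup_cross = supremum("C19", "C18", parents)
print(sup_ami_ie, sup_af_rhd, sup_cross)  # C14 C13 TOP
```

In this fragment, pairs from the same branch meet lower in the hierarchy than pairs from different branches, which is the structural information the feature-based measure ignores.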
Line chart of semantic similarities from Tables 7 and 9.
Line chart of semantic similarities from Tables 8 and 10.
Moreover, to compare the two methods from another perspective, we examine the semantic relationships between concept nodes along the vertical dimension of the weighted concept lattice. We choose the concept node C1 and its associated nodes and compute their semantic similarities to reveal more implied information (Fig. 4). Here the ECLisd approach outperforms the feature-based approach in two further ways. On the one hand, it can reveal implicit domain knowledge of the classification. In Fig. 4, according to the feature-based approach, the similarities between C1 and other concepts that belong to the same category at different levels gradually decrease as the distances between the vertices (concept nodes) increase. For instance, Sim(C1, C6) = 0.67, Sim(C1, C11) = 0.4, and Sim(C1, C16) = 0.25. However, this trend does not always hold, especially when the attributes of the two concepts have different weights. For example, in the ECLisd approach it is reasonable that the semantic similarity between C1 and C16 (Sim(C1, C16) = 0.518) is greater than that between C1 and C11 (Sim(C1, C11) = 0.443), because the attributes involved have relatively high combined weights (Table 4) in the lattice. On the other hand, it considers semantic weights and feature distances together, which corrects certain misleading results. For instance, in the feature-based approach, Sim(C1, C3) = Sim(C1, C7) = Sim(C1, C9) = Sim(C1, C17) = 0, which is implausible because there is always an intrinsic association among universes in the same domain. With the proposed method, the semantic similarities among these concepts are quite different. In this respect, the ECLisd approach is a valid complement to the existing methods.
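The two feature-based behaviours discussed above (a steady decrease along one branch and a value of exactly zero for concepts with no shared attributes) follow directly from a ratio-form measure. The sketch below illustrates this under the same assumption about the form of Eq. (11). The attribute sets are hypothetical and were chosen only so that the nested sets reproduce the quoted values 0.67, 0.4, and 0.25; they are not the actual intents of C1, C6, C11, C16, or C3.

```python
def ratio_similarity(a, b, alpha=0.5, beta=0.5):
    """Ratio-form feature similarity; alpha and beta are assumed names."""
    shared = len(a & b)
    denom = shared + alpha * len(a - b) + beta * len(b - a)
    # Assumption: two empty feature sets are given similarity 0.
    return shared / denom if denom else 0.0


# Hypothetical nested attribute sets along one branch of the lattice.
c1 = {"a"}
c6 = {"a", "b"}
c11 = {"a", "b", "c", "d"}
c16 = {"a", "b", "c", "d", "e", "f", "g"}
# Hypothetical concept from another branch with no shared attribute.
c3 = {"x", "y"}

branch_sims = [round(ratio_similarity(c1, c), 2) for c in (c6, c11, c16)]
disjoint_sim = ratio_similarity(c1, c3)
print(branch_sims)   # [0.67, 0.4, 0.25]
print(disjoint_sim)  # 0.0
```

Because every attribute counts equally, the unweighted measure can only decrease as unshared attributes accumulate, and it has no way to express a residual association between concepts with disjoint attribute sets.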
5. Conclusions
This research attempts to construct an entropy-weighted concept lattice of ECG ontologies. In addition, the combined weights of attribute features are measured by introducing a decision table and entropy theory, according to the semantic similarity model and the specifications for disease classification (GB/T14396). Specifically, the proposed method (ECLisd) shows the semantic characteristics more intuitively and accurately by distinguishing the semantic weights of conceptual features and assigning various weights to the concepts.
The proposed method makes three main contributions. First, it merges the importance of the inclusion degree and the entropy degree of the concepts in the semantic similarity measurement. Second, it converts the impact of an ECG-related concept on the semantic assignment into the combined weights of the attributes. Third, it includes the hierarchical depth of the weighted concepts as an important parameter of the similarity calculation.
Although ECLisd can be applied to extract properties of concepts from the domain ontology, it is not easy to obtain essential properties from the relevant text or specification. Therefore, our future study will aim to develop a text analysis approach for extracting properties that can represent not only the concept classes with their entity objects but also their classification relationships. In addition, as ECG information is widely applied in different fields, semantic similarity should be adapted to measure domain concepts in various application areas. Furthermore, the algorithm used to construct the weighted concept lattice requires considerably more time and space if the context contains a large number of concepts. Thus, our future work will also focus on adapting and extending the calculation of semantic relatedness.
Acknowledgement
This work is supported by the Key Natural Science Projects (No. KJ2017A223, SK2018A1072, and KJ2019A0371).