Chantana Chantrapornchai*, Netnapa Bunlaw** and Chidchanok Choksuchat***
Semantic Image Search: Case Study for Western Region Tourism in Thailand
Abstract: Typical search engines may not be the most efficient means of returning images in accordance with user requirements. With the help of semantic web technology, it is possible to search through images more precisely in any required domain, because the images are annotated according to a custom-built ontology. With appropriate annotations, a search can then return images according to the context. This paper reports on the design of a tourism ontology relevant to touristic images. In particular, the image features and the meaning of the images are described using various properties, along with other types of information relevant to tourist attractions, using the OWL language. The methodology used is described, commencing with building an image and tourism corpus, creating the ontology, and developing the search engine. The system was tested through a case study involving the western region of Thailand. The user can search by specifying the class of image or by using text-based searches. The results are ranked using weighted scores based on the kinds of properties matched. The precision and recall of the prototype system were measured to show its efficiency. User satisfaction was also evaluated and was found to be high.
Keywords: Image Search, Ontology, Semantic Web, Tourism, Western Thailand
The western region of Thailand contains many forms of tourist attraction, such as national parks, historical places, temples, etc. It is also not far from Bangkok, the capital of Thailand, which also attracts many tourists from other parts of Thailand every year. Current Internet technology has the potential to stimulate tourists to visit the region and provide a large amount of information to them. However, they need to go through the results and to select the required information carefully.
If the information returned to the user is domain-specific, the search can be narrowed down to within those results, which can help the user to find satisfactory information quickly. Collecting domain-specific data from multiple sources and representing them properly are important in facilitating the searching process. Knowledge representation in the form of an ontology has become popular since it can present the meaning of each keyword term in a domain as well as the relations between them. Further, relationships between data can be inferred. Similarly, for image searches, if the only keywords searched are from the description of the image or the title, the results can include both correct and unwanted images.
Moreover, integrating related information can help tourists to search for images more precisely based on other information as well. For example, a search for the keyword “Sanam Chandra Palace” using traditional string matching would probably match documents containing the individual words “Sanam”, “Chandra” and “Palace”, and combinations of them. The results would display all findings, both related and unrelated. In this example, many unrelated images might be shown, such as those of fields (“Sanam”), palaces, the moon (“Chandra”) as well as other extraneous images. This is because the search engine does not check the semantics of the words but only uses straightforward word-for-word matching. The user then needs to browse through these results to identify the required images. With a semantic image search, the search engine would aim to identify a set of pictures of the palace itself in Nakhon Pathom province, or pictures of King Rama VI, who was the owner of the palace, or even of the Silpakorn University campus at that location.
In this work, the main contribution is to demonstrate the use of a tourism ontology schema for images relevant to tourists in western Thailand. Ontological data relating to each region, including images, was collected and analyzed in order to extract its properties, and the images were then annotated and recorded as data. To our knowledge, at the time the study was conducted, this was the first semantic search engine relevant to tourism in Thailand. In addition, the tourism ontology was adapted to a Thai cultural style and the search engine accepted user inputs in natural language as well as those consisting of a specific selection. The engine returns related images and information to the user with a ranking. The web application developed satisfies users’ needs for images based on the results of searches.
In this paper, the methodology used is described, including corpus building, data collection, the creation of the ontology, search and query mapping, and result ranking. User satisfaction was evaluated and the precision and recall of searches were measured.
Semantic webs can help users to find information on the Internet effectively by building relationships between data coming from different data sources, termed linked data. Typical languages used to represent semantic webs are the XML language, Resource Description Framework Schema (RDFS) and Web Ontology Language (OWL). There are two main issues in building a semantic web:
1. Collecting enough information resources relating to the given domain.
2. Classifying the information, relating it both vertically and hierarchically and discovering its properties.
2.1 Web Ontology Language
Ontology is the term used to describe all the concepts of interest in a specific domain without ambiguity, in such a way that human beings, computers or software applications can understand the connections in meaning among all the things described. A typical ontology exists in the form of a hierarchical data structure, employing relationships such as Part-of, Is-a, and their synonyms. Currently, ontologies are applied in many research areas such as Artificial Intelligence (AI), knowledge engineering, and Natural Language Processing (NLP).
An ontology is a particular way of representing knowledge. Its structure is created in such a way that it describes the existing knowledge in a field of interest and it can thus be extended to new knowledge added to the domain. It can also describe the details of and the relationships among things. The data described by an ontology can be openly distributed, creating networks of data to provide the means of simple and quick searches. Each relationship within an ontology is structured as a subject, predicate, and object, denoted as a triple.
OWL is a language which has been developed from RDFS and can be used to describe an ontology. It describes the data structure using the RDFS language, or represents the data values of features in an RDF Graph, including a collection of RDF triples in the OWL language. Fig. 1 shows a roadmap of the Semantic Web “Layer Cake” structure which can be modeled using OWL. It contains a logic layer, which can help to provide inferences for a particular query.
Fig. 2 shows a description of the resource at URI “#GOLDEN_LAKE_VIEW_RESORT”, which denotes the facility, with the property value “Meeting”. The resource in Fig. 2 can be described through OWL as shown in Fig. 3.
SPARQL is a language used for querying semantically and for searching meaningfully for information within an ontology, and the data can be accessed as triple structures (Subject, Predicate, Object). The concepts used in searching with the SPARQL language are comparable with those of the SQL language. Searches are divided into two parts: “SELECT” and “WHERE”. Variables are prefixed with “?”, and those for which result values are returned are listed under “SELECT”. “WHERE” defines the data on which the search is based. Fig. 4 shows an example of a SPARQL query to locate pictures relating to nature and wildlife, in which the variable ‘picture’ is searched for based on three components of the description of the image data in OWL. In each triple pattern, the Subject is ?At; the Predicates are the properties Contact, Image, and Picture Name; and the Objects are ?Province, ?Picture, and ?AccommodationName, from the top row down respectively (names abbreviated).
Semantic searches using OWL and traditional searches may return very different results, as illustrated in Fig. 5, which shows the search results for the term “attractive” in Google and Swoogle, where the number of search results from Google exceeds 130 million.
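The SELECT/WHERE idea can be illustrated without a SPARQL engine. The following is a minimal sketch in Python, not the system's implementation: it stores a few triples in a list and matches triple patterns against them. The property names (`hasFacility`, `hasProvince`, `hasImage`), the second resource and the image file names are assumptions made for illustration only.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple.
# Property names, the second resource and file names are illustrative assumptions.
TRIPLES = [
    ("#GOLDEN_LAKE_VIEW_RESORT", "hasFacility", "Meeting"),
    ("#GOLDEN_LAKE_VIEW_RESORT", "hasProvince", "Kanchanaburi"),
    ("#GOLDEN_LAKE_VIEW_RESORT", "hasImage", "golden_lake_01.jpg"),
    ("#ERAWAN_WATERFALL", "hasProvince", "Kanchanaburi"),
    ("#ERAWAN_WATERFALL", "hasImage", "erawan_01.jpg"),
]


def match_pattern(pattern, triple, binding):
    """Return an extended copy of `binding` if `pattern` matches `triple`, else None."""
    new_binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):  # variable, as in SPARQL
            if new_binding.get(term, value) != value:
                return None  # conflicts with an earlier binding
            new_binding[term] = value
        elif term != value:  # constant must match exactly
            return None
    return new_binding


def select(variables, where, triples=TRIPLES):
    """Evaluate the WHERE patterns in order and project the SELECT variables."""
    bindings = [{}]
    for pattern in where:
        extended = []
        for binding in bindings:
            for triple in triples:
                candidate = match_pattern(pattern, triple, binding)
                if candidate is not None:
                    extended.append(candidate)
        bindings = extended
    return [tuple(b[v] for v in variables) for b in bindings]


# "SELECT ?picture WHERE { ?at hasProvince Kanchanaburi . ?at hasImage ?picture }"
results = select(
    ["?picture"],
    [("?at", "hasProvince", "Kanchanaburi"), ("?at", "hasImage", "?picture")],
)
print(results)
```

The shared variable `?at` plays the role of the Subject that joins the patterns, which is what allows a picture to be found through properties of the attraction it belongs to.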
The Google search engine does not work well with documents encoded in RDF and OWL as it is designed to work with natural language keywords and does not recognize the semantic content. It does not understand the structural information encoded in documents and, thus, cannot exploit their advantages. Swoogle is a crawler-based indexing and retrieval system for semantic web documents, i.e., RDF or OWL documents without metadata. Documents are found using the system index by computing the similarity between information among available archives. It attributes levels of importance to documents and an ontology is used to organize them.
2.2 Image Annotation
Different image annotation tools process data differently. However, all of them entail two concepts: the domain concept and the image characteristics concept. The tools used for image annotation can be as simple as a plug-in for Protégé or some special tool that uses RDFS. These tools may also include the query user interface.
In Fig. 6, the images are annotated based on their sources and the image relationships. The two concepts are built in and are related. For example, the domain ontology and the annotation ontology may be as described in .
3. Relevant Literature
This section describes previous work relevant to semantic search engines, semantic image retrieval, and image annotation.
3.1 Semantic Search Engines
Several previous studies have described semantic search engines. For example, Ding et al.  presented a search and retrieval system to find and analyze semantic web documents on the web. The system can be used to support tools being developed by researchers to identify relevant ontologies, documents and terms using keywords and additional semantic constraints.
d'Aquin and Motta focused on individual RDF documents, which do not have embedded index formats, and Thomas et al. demonstrated the use of OntoSearch for searching ontologies and AKTiveRank for ontology ranking. These have multiple capabilities such as searching by types (OWL, RDF, etc.), searching by keywords, and searching sub-graphs.
There have been many studies of search engines for tourism which can be used to improve the visibility of tourism data. These include the Tourinflux project in France (http://tourinflux.univ-lr.fr/), which allows tourism stakeholders to manage their data and assess the public perception of their areas, and which aims at data reuse and interoperability. Among its main concepts are multimedia, rating, contact, organization, language, location, reservation and prices. Harmo-TEN is the European Union’s project to encourage on-line communication within the tourism sector, which exploits semantic web technologies. The study of Sadasivam et al. relating to the Indian tourism sector concentrated on ontology-based information retrieval for e-tourism. AcontoWeb is an Australian e-tourism initiative which uses a semantic web and includes both integration and utilization tools. These initiatives are similar in that they address issues related to tourism in specific domains and regions.
3.2 Semantic Image Retrieval
Several studies have been conducted relating to semantic image retrieval. Hyvonen et al. presented a case study of image retrieval at the Helsinki University Museum. The system described contains features such as annotation terminology, view-based searches, and semantic browsing where related images can be accessed from query keywords and image re-ranking. It is based on the visual features of the images and aims to improve the precision of image ranking after querying.
Magesh and Thangaraj  presented a system for semantic image retrieval. The development contained several typical processes: the annotation of images, ontology development, creating individuals, and an image ontology. Town  presented the OQUEL system for searching for images and videos. The interesting process in this system is the use of a machine learning method to reduce the semantic gap between the user and the system ontology notation. The image and video features are extracted by learned models and are mapped to the ontology with a probabilistic model.
Buitelaar and Eigner  evaluated the OntoSelect ontology library which is based on a document ontology search strategy. OntoSelect uses the GoogleAPI to find ontology data in the DAML, OWL or RDFS formats. Their evaluation was scored based on coverage, connectedness, and structure. Wang et al.  also presented an automatic annotation system for animal images, in which both images and text features are considered. Ranking of semantic relationships is then used to improve the search results. The study compared the search results from the system with the top 200 Google image search results. Table 1 summarizes the differences between a normal image search and a semantic image search.
3.3 Annotation Tool
For image queries, image annotation is one of the most important steps. Several studies have investigated image annotation within the ontology process. The work of Hyvonen et al.  demonstrated the linkage between the image and its content based on an ontology which includes the name of the image and information about it. The system uses Protégé 2000 as the editor for the ontology and annotates the images using RDFS. To make annotations, the annotator takes the photograph and creates an empty instance of the class image. Then the annotator fills in the empty fields with the image elements. If a new instance is needed, where this is the first image of its type to be annotated, the annotator can use the same instance again to annotate other images. The choice depends on how detailed the semantics need to be and on the annotator’s choices.
Koletsis and Petrakis developed the Semantic Image Annotation (SIA) framework, which employs four annotation steps. In Step 1, the ontology is built. Step 2 describes the image similarity, which is the relative importance of each low-level feature (color or texture). This is determined using a machine learning process utilizing a decision tree. Step 3 is image matching: the ontology is searched to retrieve the images that are most similar to the new image. Image matching is implemented using image content descriptions (color, texture and hybrid features). The combined similarity measure is calculated and used for image similarity. Finally, the images found are ranked in decreasing order of similarity. Step 4 is the image annotation. The new image is classified into one of the known semantic categories. The semantic image category with instances having the best ranks in the retrieved set is chosen. Finally, a description is assigned to the new image. To test the system, images of dog breeds were annotated.
Kallergi et al.  described an ontology viewer used for image annotation within ontologies. The tool uses a multi-modal bio-imaging database (CSIDx) which helps users to annotate images. Schreiber et al.  developed a tool for photograph annotation. They defined two types of annotation: annotation for image characteristics and annotation about the domain of the images. The system consists of an internal ontology consisting of a photo annotation ontology, a subject matter ontology, and a domain ontology and also provides a user interface for searching.
In this section, we present the methodology used in the development of our search system, including deriving a keyword corpus, designing a tourism ontology and developing a semantic search capability. Fig. 7 shows the process that takes place when the user searches for the text ‘Cave Kanchanaburi’. The system looks the text up in the ontology, which includes annotations of the province details and of the images. The tags related to the search text are matched, and the results are ordered by a ranking algorithm and returned to the user as an HTML response.
The first step in developing the system involved gathering keywords relating to tourist attractions in the Thai language. Several tourism websites related to the region were explored including tourism magazines, etc. The keywords about the attractions and their history were gathered together to create a corpus.
4.1 Ontology Design
First, the ontology was designed. It contains main classes such as: Attraction, Accommodation, Facility, ContactData, Activity, Travel, RoomRate, RoomType, ImageCategory as shown in Fig. 8 and is saved in OWL. Other classes, class properties and data types were also designed, for example: activity class, facility class, contact class.
4.2 Image Preparation
The image information is stored in an XML file. As shown in Fig. 9, 350 images were prepared for the case study and the ontology metrics were grouped into 168 classes, 155 subclasses, 30 object properties, 42 data properties, and 1,240 individuals.
4.3 Searching Method
The following steps are required to search for images: keyword selection, keyword synonym finding, ontology search and ranking of the results. The keywords were divided into five categories based on our corpus: (i) attraction name, (ii) accommodation name, (iii) details of image, (iv) accommodation description, and (v) image description. Taking an image query from the description as an example, first, the user inputs a query as text about an image, its description or details. The text is tokenized, and the tokenized text is used as keywords to search against categories (i), (iii) or (v).
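The tokenization and category lookup described above can be sketched as follows. This is a hedged illustration in Python rather than the system's code: the corpus contents and category keys are invented for the example, and greedy longest-match tokenization is an assumption chosen so that multi-word corpus terms such as “World War” stay together.

```python
# Hypothetical keyword corpus, keyed by three of the five categories in the text.
# The terms listed here are illustrative assumptions, not the study's corpus.
CORPUS = {
    "attraction_name": {"bridge", "dam", "cave"},
    "image_detail": {"japanese", "sunset"},
    "image_description": {"world war", "river"},
}


def tokenize(text, vocabulary):
    """Greedy longest-match tokenizer that keeps multi-word corpus terms intact."""
    words = text.lower().split()
    max_len = max((len(term.split()) for term in vocabulary), default=1)
    tokens, i = [], 0
    while i < len(words):
        # Try the longest candidate first; a single word is always accepted.
        for n in range(min(max_len, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if n == 1 or candidate in vocabulary:
                tokens.append(candidate)
                i += n
                break
    return tokens


def categorize(tokens, corpus):
    """Map each token to the corpus categories in which it occurs."""
    return {
        token: [cat for cat, terms in corpus.items() if token in terms]
        for token in tokens
    }


vocabulary = set().union(*CORPUS.values())
tokens = tokenize("Japanese bridge World War", vocabulary)
categories = categorize(tokens, CORPUS)
print(tokens)
print(categories)
```

A token that maps to no category can be dropped or passed on unchanged; which of these the system does is not stated in the text.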
The keywords are used to search for keyword synonyms. For example, the keyword, “Wachiralongkorn Dam”, is related to “Khaolam” (which is actually expressed as “owl:sameAs”). This returns the group of keyword resources related to “Wachiralongkorn Dam” in Table 2. All these synonyms will be used to search in the ontology to find related literals as shown in Table 3.
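Synonym finding through “owl:sameAs” can be viewed as collecting every name reachable through sameAs links. The sketch below, in Python, treats the links as an undirected graph and walks it breadth-first. The first pair comes from the example in the text; the second pair is a hypothetical alternative spelling added only to show that links are followed transitively.

```python
from collections import defaultdict, deque

# owl:sameAs links as (name, name) pairs.
SAME_AS = [
    ("Wachiralongkorn Dam", "Khaolam Dam"),  # from the example in the text
    ("Khaolam Dam", "Khao Laem Dam"),        # hypothetical extra spelling
]


def build_graph(pairs):
    """sameAs is symmetric, so store each link in both directions."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def expand_keyword(keyword, graph):
    """Return the keyword together with everything linked to it via sameAs."""
    seen = {keyword}
    queue = deque([keyword])
    while queue:
        current = queue.popleft()
        for neighbour in graph.get(current, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


graph = build_graph(SAME_AS)
synonyms = expand_keyword("Wachiralongkorn Dam", graph)
print(sorted(synonyms))
```

Each name in the expanded set would then be used as a search term against the ontology literals.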
As shown in Fig. 10, when the user enters “Wachiralongkorn” as the keyword, it will be searched within the property “Name”, and any other individuals with “owl:sameAs”. In this case the results return “Khaolam dam”. Next, other properties such as image, contactData, Description, Resource etc. are searched. SPARQL is used to search the ontology. For instance, to search for images in the same class, the following query would be used:
Fig. 11 shows all images retrieved in the same subclass, NatureAndWildlife for which the sample results are shown in Fig. 12.
In the following example, the query using the user keyword “Wachiralongkorn” is shown, which indicates that the user wants to find any dam in Kanchanaburi province that can be accessed by car and whose name contains the substring “Wachiralongkorn”.
In the listing, we search by properties: (1) any name with the substring “Wachiralongkorn”, (2) it is in “Kanchanaburi”, (3) it is a dam. The results should show image (5), how to get there (4), the contact number (6), etc., as depicted in Fig. 13. ?x is a dummy related to all resources. The answers for the variable ?Image in the SELECT clause are returned.
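A query of this shape could be assembled from the user's input roughly as follows. This Python sketch only builds the SPARQL text; it is not the query used in the study. The `ex:` namespace and the property names (`ex:Name`, `ex:hasProvince`, `ex:hasTravel`, `ex:hasImage`, `ex:hasContact`) are assumptions, and the numbered comments refer to the items listed in the paragraph above.

```python
def escape_literal(value):
    """Escape backslashes and double quotes for use inside a SPARQL string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_query(name_substring, province, attraction_class):
    """Build a SPARQL SELECT query; namespace and property names are hypothetical."""
    if not attraction_class.isidentifier():
        raise ValueError("class name must be a simple identifier")
    name = escape_literal(name_substring)
    prov = escape_literal(province)
    return f"""PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex: <http://example.org/tourism#>
SELECT ?Image ?Travel ?Contact
WHERE {{
  ?x rdf:type ex:{attraction_class} .   # (3) it is a dam
  ?x ex:Name ?name .
  ?x ex:hasProvince ?province .
  ?x ex:hasTravel ?Travel .             # (4) how to get there
  ?x ex:hasImage ?Image .               # (5) image
  ?x ex:hasContact ?Contact .           # (6) contact number
  FILTER CONTAINS(LCASE(STR(?name)), LCASE("{name}"))      # (1) name substring
  FILTER CONTAINS(LCASE(STR(?province)), LCASE("{prov}"))  # (2) province
}}"""


query = build_query("Wachiralongkorn", "Kanchanaburi", "Dam")
print(query)
```

`CONTAINS`, `LCASE` and `STR` are standard SPARQL 1.1 functions; using them instead of `regex` avoids having to escape regular-expression metacharacters in user input.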
4.4 Ranking Results
When a number of image results are returned it is necessary to rank the results. In the system designed in this work, the weight hierarchies are calculated based on classes, sub-classes, and attributes and the results are ranked based on these weights. Table 4 shows an example of the weights for class, sub-class, and property ranked according to specificity. In this example, the highest weighting is given to Feature which comprises the details of the attraction.
Fig. 14 illustrates the hierarchy of classes shown in Table 4. ContactLocation is a superclass of Amphoe and Province (subClassOf). Generally, the Amphoe and Province classes are specified together since they must appear in any address. Thus “Muang” is an instance of Amphoe, and it is also an instance of Province. The calculated weight of this instance is equal to 1 + 2 = 3. The total result of the very specific details of a tourist location is derived from the data properties of Name of the picture and Feature of the picture, whose weight scores are 7 and 11, respectively.
The user may enter a query based on text, for example, by typing the search text “Japanese bridge World War” and selecting Kanchanaburi as the province. The text is then tokenized into “Japanese”, “bridge”, and “World War”. Suppose “Japanese” is found in the property Feature, “World War” is found in the property NamePic and “bridge” is found in the property Name. These three keywords are checked against the categories (i) attraction, (iii) image detail and (v) image description, respectively, and found in the corpus categories as shown in Fig. 15. Then the query is generated based on these categories. In the query, the prefixes that belong to the ontology are highlighted. The boxes indicate the keywords extracted from the above text.
The query results are then ranked by summing the weights in Table 4, while Table 5 shows the example calculation. Fig. 16 shows the results of the query.
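The ranking step amounts to summing, for each candidate image, the weights of the properties in which a keyword matched, and sorting by that sum. The Python sketch below is illustrative only. The weights for Feature (11) and NamePic (7) are taken from the text; the weight for Name and the split of the combined Amphoe/Province weight of 3 into 2 and 1 are assumptions, as are the image names.

```python
# Property weights. Feature and NamePic come from the text; the others are assumed.
WEIGHTS = {
    "Feature": 11,
    "NamePic": 7,
    "Name": 5,      # assumption: the text does not state this weight
    "Amphoe": 2,    # assumption: only the sum Amphoe + Province = 3 is stated
    "Province": 1,  # assumption, see above
}


def score(matched_properties, weights=WEIGHTS):
    """Sum the weights of the distinct properties in which a keyword matched."""
    return sum(weights.get(prop, 0) for prop in set(matched_properties))


def rank(candidates, weights=WEIGHTS):
    """Return (image, score) pairs, highest score first, ties broken by name."""
    scored = [(image, score(props, weights)) for image, props in candidates.items()]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical matches for the query "Japanese bridge World War".
candidates = {
    "bridge_01.jpg": ["Feature", "NamePic", "Name"],
    "bridge_02.jpg": ["Name", "Province"],
    "museum_01.jpg": ["NamePic"],
}
ranking = rank(candidates)
print(ranking)
```

Because more specific properties carry larger weights, an image matched on Feature outranks one matched only on location properties, which mirrors the specificity ordering described for Table 4.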
The search system was evaluated using 100 sample images. These images were described, and the features were extracted and saved in RDF form in OWL. Twenty-five users then randomly searched the images using 50 distinct Thai phrases. Precision and recall were measured. The test cases were divided as follows.
1. The test case where the keyword in the phrase is Data Properties: NamePic.
2. The test case where the keyword in the phrase is a part of an attraction name: Data Properties: Name.
3. The test case where the keyword in the phrase is a part of an image name: Data Properties: NamePic.
Table 6 shows the precision and recall for these tests. From Table 6, cases 1 and 3 gave about the same results while case 2 outperformed the other two cases. Next, we tested the results for the case of using conditions rather than keywords. The conditions are specified by selecting the appropriate widgets in the user interface. Fig. 17 shows an example of the user interface. The user may specify the type of attractions (more than one type is possible).
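Precision and recall for a single test query can be computed as in the following Python sketch, using the standard definitions: precision is the fraction of retrieved images that are relevant, and recall is the fraction of relevant images that were retrieved. The image names and counts are made up for illustration and are not the study's data.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, treating both inputs as sets."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: five images returned, four of which are relevant,
# and all four relevant images in the collection were found.
retrieved = ["dam_01.jpg", "dam_02.jpg", "dam_03.jpg", "dam_04.jpg", "cave_01.jpg"]
relevant = ["dam_01.jpg", "dam_02.jpg", "dam_03.jpg", "dam_04.jpg"]
precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)
```

Averaging these per-query values over all test phrases gives figures comparable to the averages reported for the test cases.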
The tests for this user interface are of three types.
Type 1. Search by image feature (Data Properties: Feature) where the input phrases are tokenized. Then, we obtain the following cases:
Case 1: Attraction name with Data Properties: Name and image feature: Data Properties: Feature.
Case 2: Attraction name with Data Properties: Name and image feature: Data Properties: Feature or Data Properties: Name and name of the picture: Data Properties: NamePic, or Data Properties: NamePic and Data Properties: Feature.
Case 3: Attraction name: Data Properties: Name and name of the picture: Data Properties: NamePic and Data Properties: Feature.
Type 2. Next, we tested the second type of query where the types of attraction are considered. We obtained three cases.
Case 1: Only the attraction name is selected (Data Properties: Name).
Case 2: The attraction name is used with the name of the image (Data Properties: Name and Data Properties: NamePic) or the attraction name is used with the feature of the image (Data Properties: Name and Data Properties: Feature).
Case 3: The attraction name is used with the name of the image and the feature of the image (Data Properties: Name, Data Properties: NamePic, and Data Properties: Feature).
Type 3. For the last type, the types of attractions are given, and the user can specify more than one type. These types are used to perform the logical “AND” while for each subtype in it, the user can select more than one which is considered as the logical “OR” operator. There are three cases:
Case 1. The name of the attraction is given (Data Properties: Name)
Case 2. The name of the attraction is given and the name of the picture is given (Data Properties: Name and Data Properties: NamePic), or the name of the attraction is given and the features of the picture are given (Data Properties: Name and Data Properties: Feature)
Case 3. The name of the attraction is given, the name of the picture is given, and the features of the picture are given (Data Properties: Name, Data Properties: NamePic, and Data Properties: Feature).
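The combination of “AND” across attraction types and “OR” within the subtypes of each type, as described for Type 3, can be sketched in Python as follows. The type and subtype names and the image tags are hypothetical; only the AND/OR logic is taken from the text.

```python
# Hypothetical image annotations: attraction type -> set of subtypes.
IMAGES = {
    "erawan_01.jpg": {"Nature": {"Waterfall"}, "Activity": {"Swimming"}},
    "cave_01.jpg": {"Nature": {"Cave"}, "Activity": {"Trekking"}},
    "temple_01.jpg": {"Culture": {"Temple"}},
}


def matches(image_tags, selection):
    """AND over the selected types; OR over the subtypes chosen within each type."""
    return all(
        bool(image_tags.get(attraction_type, set()) & chosen_subtypes)
        for attraction_type, chosen_subtypes in selection.items()
    )


def search(images, selection):
    """Return the names of all images satisfying the selection, sorted by name."""
    return sorted(name for name, tags in images.items() if matches(tags, selection))


# The user ticks Waterfall OR Cave under Nature, AND Trekking under Activity.
selection = {"Nature": {"Waterfall", "Cave"}, "Activity": {"Trekking"}}
results = search(IMAGES, selection)
print(results)
```

In this sketch an empty selection matches every image; whether the actual interface behaves this way is not stated in the text.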
Table 7 presents the results for the three cases of each of these three types.
From all the above test cases, the average precision was 84% and the average recall was 100%. In addition, we measured the user satisfaction with the system based on the 25 users mentioned above. Their satisfaction was assessed with respect to the general operation of the system, the types of data and the relationships between them, and the user interface design. It was found that the users’ satisfaction with the system in all aspects was above 4 out of 5.
5.1 Comparison with Other Systems
Table 8 shows a comparison of the system developed with some other semantic web applications in the tourism domain based on its general usefulness within the domain, its image descriptions, query language, annotation language, results (either text or image), the number of images tested, the number of users tested, and the measurements. The domains of the work in  and  are national archives and natural attractions in Thailand respectively. In , the historical domain is in the area of the former Qin dynasty in China and the study is similar to our work in that it applies the results of searches for images. In that study, the number of images interrogated was 49 with 30 customized natural language queries (in Chinese), while the present study used 100 images queried with 50 Thai phrases. The number of users in this work was 25 while the Chinese study did not mention the number of people who contributed to the study. For , the authors adopted weights according to the level of the RDF properties to compare with the matching RDF keywords before returning the images as search results. In the present study, queries were implemented using SPARQL and the properties of the sub-classes and data types were measured based on a weighting hierarchy. This study also tested performance based on precision and recall (i.e., the accuracy with which repeated searches produced the same images) in user searches.
This research presented a case study of a prototype system to search for images using the semantic web concept to enhance the performance of search engines in searching for images. An ontology-based model was created describing tourist attractions and images, and the system was found to provide results meeting users’ needs. The system designed could be extended to apply to other domains in Thailand.
This research is limited to the scope of image data searches in Thailand and the system was built with the image information in the Thai language. The characteristics of Thai phrases were the natural language of the system which helped users to find the right image of tourist attractions.
Image classification depends on the emphasis. For example, we studied pictures of attractions in the region and the attractions may be divided into many kinds, such as natural parks, historic places, and culture or traditions. The sub-classes of each class for each image should be designed to accommodate this division. However, the general findings of this research relating to the extraction of an automatic ontology from images are nevertheless interesting.
The image ontology and the domain concept must be designed based on the domain in which it is intended to operate. When more images are added, it is possible that the original domain concepts will not be supported. Moreover, the issues of automatic mapping between images and the domain concepts would need to be considered in any future study.
She obtained her Bachelor’s degree (Computer Science) from Thammasat University of Thailand in 1991. She graduated from Northeastern University at Boston, College of Computer Science, in 1993 and University of Notre Dame, Department of Computer Science and Engineering, in 1999, for her Master’s and Ph.D. degrees respectively. Currently, she is an associate professor at the Dept. of Computer Engineering, Faculty of Engineering, Kasetsart University, Thailand. Her research interests include parallel computing, big data processing, semantic webs, computer architecture and fuzzy logic.
She received a Master of Science degree in Computer and Information Science from Silpakorn University, Thailand. She currently works as a government officer at Samut Prakan Provincial Land Development Depot. Her research interests include semantic search, image ontology, web ontology language, SPARQL, and sustainable tourism in the west of Thailand.
She received a Ph.D. degree in Computer and Information Science from Silpakorn University. She is currently a lecturer in the Information and Communication Technology Programme, Faculty of Science, Prince of Songkla University, Thailand. Her research interests include CUDA, Java concurrency, ontology engineering, SPARQL endpoints, linked open data, natural language toolkits, and the Internet of Things.