Interest in and usage of semantic annotations for providing structured, well-formed and semantically consistent content is growing rapidly. With this, the development of artificial intelligence applications that use semantic techniques and semantically annotated content is intensifying. Traditional websites are becoming increasingly difficult to find on the web and are therefore losing importance. To gain visibility and to be found by chatbots and personal digital assistants, such as Facebook Messenger chatbots, Amazon Alexa, Cortana, Google Home and Siri, the best option is to use modern technologies and semantic annotations.
Since 2011, schema.org, introduced by the major search engines Bing, Google, Yahoo! and Yandex, has become a de facto standard for annotating and publishing structured data on the web. Schema.org focuses on defining the item types and properties to describe “things” on web pages that are most valuable to automated agents (e.g. search engines, chatbots or personal assistant systems). This means search engines and dialog assistants get the structured information they need most to improve search results and give the required answer. For example, Google Search has rich features where events can be presented in a structured layout or recipes in a carousel. The core vocabulary currently consists of 615 Types, 901 Properties, and 114 Enumeration values (version 6.0 of schema.org). Schema.org regularly publishes new releases that improve the core vocabulary by adding more detailed descriptive vocabularies and extensions, such as healthcare-related schemas, e-commerce (GoodRelations project) and the accommodation extension (STI Accommodation Ontology). The widespread usage of schema.org motivates focusing on a specific domain in order to improve the vocabulary and provide patterns for semantic annotations. For our study, we chose the tourism domain, as it has a significant impact on the world's economy and represents a great opportunity to apply semantic technologies to tourist needs, especially in Austria.
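To make the idea of a schema.org annotation concrete, the following minimal Python sketch builds a JSON-LD description of an event and wraps it in the `script` element commonly used to embed JSON-LD in a web page. The event name, date and location are made-up illustration values, and the helper `to_script_tag` is a hypothetical name, not part of any library.

```python
import json

# Hypothetical event data: name, date and place are invented for illustration.
event = {
    "@context": "https://schema.org",
    "@type": "DanceEvent",
    "name": "Example Summer Dance Night",
    "startDate": "2019-07-20T19:00",
    "location": {
        "@type": "Place",
        "name": "Example Town Hall",
        "address": {
            "@type": "PostalAddress",
            "addressLocality": "Innsbruck",
            "addressCountry": "AT",
        },
    },
}


def to_script_tag(annotation: dict) -> str:
    """Wrap a JSON-LD annotation in the script element used to embed it in HTML."""
    body = json.dumps(annotation, indent=2, ensure_ascii=False)
    return f'<script type="application/ld+json">\n{body}\n</script>'


print(to_script_tag(event))
```

The printed element can be placed in the HTML of the page that describes the event, so that automated agents can read the structured description alongside the human-readable content.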
We analyzed the data of touristic service providers [1] and interactive maps [2] and annotated them semantically. The annotations covered a wide variety of information topics, including events, accommodation and accommodation offers, food establishments, ski areas, sports activity locations, the region, press release articles, the organization itself and a large variety of infrastructure information. Based on that, we defined domain specifications [4] to provide better-quality annotations and to make the annotation process easier for inexperienced users [3]. Each touristic domain specification describes a schema (pattern) with a set of types and properties (required and optional) that are needed for semantic annotations. All domain specifications are stored on the semantify.it platform [5]. To support the annotation process, a domain specification editor, an annotation editor and an evaluator have been developed.
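The idea of a domain specification as a pattern of required and optional properties can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout of `EVENT_SPEC` and the function `check_annotation` are hypothetical and do not reproduce the format or the evaluator used on the semantify.it platform. The sketch also compares the type literally and does not resolve schema.org subtypes.

```python
from typing import Any, Dict, List

# Hypothetical, simplified domain specification (not the semantify.it format).
EVENT_SPEC: Dict[str, Any] = {
    "type": "Event",
    "required": ["name", "startDate", "location"],
    "optional": ["endDate", "description", "offers"],
}


def check_annotation(annotation: Dict[str, Any], spec: Dict[str, Any]) -> List[str]:
    """Return a list of problems found when comparing an annotation to a spec."""
    problems: List[str] = []

    # "@type" may be a single string or a list of strings in JSON-LD.
    types = annotation.get("@type", [])
    if isinstance(types, str):
        types = [types]
    if spec["type"] not in types:
        problems.append(f"expected @type {spec['type']}")

    # Every required property must be present.
    for prop in spec["required"]:
        if prop not in annotation:
            problems.append(f"missing required property: {prop}")

    # Properties outside the selected set are reported; JSON-LD keywords are skipped.
    allowed = set(spec["required"]) | set(spec["optional"])
    for prop in annotation:
        if not prop.startswith("@") and prop not in allowed:
            problems.append(f"property not covered by the specification: {prop}")

    return problems


incomplete = {"@type": "Event", "name": "Example Concert"}
for problem in check_annotation(incomplete, EVENT_SPEC):
    print(problem)
```

An annotation that yields an empty list satisfies the pattern in this simplified sense; a real evaluator would additionally check value types and nested entities.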
The documentation of all available Domain Specifications is collected in the Domain Specification Documentation, which should be treated as the mandatory reference document.
The following is a list of the defined schemas for annotating touristic services:
- Domain Specification for annotating Article and Blog Posting. This Domain Specification represents a set of the most common classes, object properties and data properties needed for the semantic annotation of article and blog posting content, together with a detailed description of the selected set and markup examples in JSON-LD format. The aim of the Article and Blog Posting schema is to make the annotation process easier for users and to help them provide correct and valid semantic annotations.
- Domain Specification for annotating Event. This Domain Specification represents a set of the classes, object properties and data properties needed for the semantic annotation of different types of events (e.g. DanceEvent, BusinessEvent), including event offers (e.g. event tickets). The document also consists of a detailed description of the selected set and markup examples in JSON-LD format.
- Domain Specification for annotating Food Establishment. This Domain Specification describes the set of classes, object properties and data properties needed for the semantic annotation of food-related businesses, such as Bakery, BarOrPub, Brewery, CafeOrCoffeeShop, FastFoodRestaurant, IceCreamShop, Restaurant and Winery, and their menus. The document explains how to select types and properties and which of them are required or optional.
- Domain Specifications for annotating Infrastructure. This schema model represents the main infrastructure types, subclasses of LocalBusiness, such as:
• AutomotiveBusiness and AutoRental
• ChildCare
• DryCleaningOrLaundry
• EmergencyService with subtypes FireStation, Hospital and PoliceStation
• EntertainmentBusiness with subtypes AmusementPark, ArtGallery, Casino, ComedyClub, MovieTheater and NightClub
• FinancialService with subtypes AccountingService, AutomatedTeller, BankOrCreditUnion and InsuranceAgency
• GovernmentBuilding with subtypes CityHall, Courthouse, DefenceEstablishment, Embassy and LegislativeBuilding
• HealthAndBeautyBusiness with subtypes BeautySalon, DaySpa, HairSalon, HealthClub, NailSalon and TattooParlor
• Library
• MedicalOrganization with subtypes Dentist, Pharmacy and Physician
• SportsActivityLocation with subtypes BowlingAlley, ExerciseGym, GolfCourse, HealthClub, PublicSwimmingPool, SkiResort, SportsClub, StadiumOrArena and TennisComplex
• Store with subtypes AutoPartsStore, BikeStore, BookStore, ClothingStore, ComputerStore, ConvenienceStore, DepartmentStore, ElectronicsStore, Florist, FurnitureStore, GardenStore, GroceryStore, HardwareStore, HobbyShop, HomeGoodsStore, JewelryStore, LiquorStore, MensClothingStore, MobilePhoneStore, MovieRentalStore, MusicStore, OfficeEquipmentStore, OutletStore, PawnShop, PetStore, ShoeStore, SportingGoodsStore, TireShop, ToyStore and WholesaleStore
• TravelAgency
The aim of the Infrastructure Domain Specification is to give a common pattern for annotations, make the annotation process easier for users and help them provide correct and valid semantic annotations.
- Domain Specification for annotating Lodging Business. A lodging business, e.g. a hotel, hostel, resort, or a camping site, is essentially the place and local business that houses the actual units of the establishment (e.g. hotel rooms). The lodging business can encompass multiple buildings but in most cases is an individual location. The Lodging Business Domain Specification describes a set of selected types and properties for the semantic annotation of accommodation businesses, such as Campground, Hotel, and HotelRoom, and their Offers and OfferCatalog.
- Domain Specifications for annotating Points of Interest. Points of interest, that is, interesting places in a tourism region, can be very diverse. They include museums as well as the sites of historical events, beautiful vantage points and the venues of particular events. For the semantic annotation of these reference points, we selected the following types with their properties:
• CivicStructure with subtypes Airport, Aquarium, Beach, Bridge, BusStation, BusStop, Campground, Cemetery, EventVenue, GovernmentBuilding, CityHall, Hospital, MovieTheater, Museum, MusicVenue, Park and Church
• Landform with subtypes BodyOfWater (LakeBodyOfWater, RiverBodyOfWater, SeaBodyOfWater and Waterfall) and Mountain
• LandmarksOrHistoricalBuildings
• TouristAttraction
- Domain Specification for annotating Ski Resort. A ski resort is a self-contained commercial establishment developed for skiing, snowboarding, and other winter sports that endeavors to provide most of a vacationer's wants, such as food, drink, lodging, sports, entertainment, and shopping, on the premises. In schema.org, SkiResort is a subtype of SportsActivityLocation and inherits properties from LocalBusiness, Place, Organization, and Thing.
- Domain Specification for annotating Ski School and Sport School. A sport or ski school is an establishment that teaches, for example, football, volleyball, gymnastics, skiing or snowboarding (the latter typically in a ski resort). Usually, sport and ski schools are represented with the schema.org types LocalBusiness and EducationalOrganization. The document explains how to select types and properties, and which of them are required or optional, for valid and good-quality semantic annotations, and includes a markup example in JSON-LD format.
- Domain Specification for annotating Tourist Information Center. A tourist information center is an organization that promotes and represents the local interests of tourism, including leisure. This document presents the type TouristInformationCenter with the properties selected for annotation.
- Domain Specification for annotating Sports Activity Location, Trail and Ski Lift. A ski slope is a sloping surface that can be skied down, either on a snow-covered mountain or on a specially made structure. There is no specific type for it in schema.org, so the type SportsActivityLocation is used. A ski lift is a mechanism for transporting skiers up the slope. There is no specific schema.org class for a lift either, so the types CivicStructure and SportsActivityLocation are used for annotation.
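The last item in the list above relies on assigning more than one schema.org type to a single entity. As a minimal sketch, assuming made-up names and opening hours, a ski lift and a ski slope could be annotated as follows; in JSON-LD, multiple types are expressed by giving `@type` a list of values.

```python
import json

# Hypothetical ski lift: schema.org has no dedicated lift type, so two types are combined.
ski_lift = {
    "@context": "https://schema.org",
    "@type": ["CivicStructure", "SportsActivityLocation"],
    "name": "Example Chairlift",
    "description": "Four-seat chairlift from the valley station to the summit.",
    "openingHours": "Mo-Su 08:30-16:00",
}

# Hypothetical ski slope: annotated with the single type SportsActivityLocation.
ski_slope = {
    "@context": "https://schema.org",
    "@type": "SportsActivityLocation",
    "name": "Example Slope No. 5",
    "description": "Intermediate slope served by the Example Chairlift.",
}

for annotation in (ski_lift, ski_slope):
    print(json.dumps(annotation, indent=2))
```

Combining the two types lets the lift carry properties from both, for instance `openingHours`, while remaining recognizable as a sports-related place.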
These Domain Specifications improve the use of schema.org and tailor it to specific needs in different business application scenarios, while also making the annotation process easier, especially for non-technical users. We propose a solution with a set of defined patterns for different domains (e.g. Hotel, Restaurant, Article). It provides a standard for creating correct semantic annotations and increases the quality and quantity of structured data on the web. Future work is to provide more specialized vocabularies for the touristic and other domains that build upon the core, and to start the process of extending the schema.org standard.
References
- Akbar, Z., Kärle, E., Panasiuk, O., Şimşek, U., Toma, I. and Fensel, D. (Sep 2017). Complete Semantics to empower Touristic Service Providers. In OTM Confederated International Conferences "On the Move to Meaningful Internet Systems", pp. 353-370. Springer, Cham.
- Panasiuk O., Akbar Z., Gerrier T. and Fensel D. (2018). Representing GeoData for Tourism with Schema.org. In Proceedings of the 4th International Conference on Geographical Information Systems Theory, Applications and Management - Volume 1: GISTAM, ISBN 978-989-758-294-3, pp. 239-246. DOI: 10.5220/0006755102390246.
- Panasiuk, O., Kärle, E., Şimşek, U., Fensel, D. (Jan 2018). Defining tourism domains for semantic annotation of web content. e-Review of Tourism Research 9 research notes from the ENTER 2018 Conference on ICT in Tourism.
- Şimşek, U., Kärle, E., Holzknecht, O., and Fensel, D. (2018). Domain specific semantic validation of schema.org annotations. In International Andrei Ershov Memorial Conference on Perspectives of System Informatics, pp. 417-429. Springer, Cham.
- Kärle, E., Şimşek, U., & Fensel, D. (2017). semantify.it, a Platform for Creation, Publication and Distribution of Semantic Annotations. In: SEMAPRO 2017: The Eleventh International Conference on Advances in Semantic Processing, pp. 22-30, ISBN: 978-1-61208-600-2.