
Schemas (Design Patterns) for Semantic Annotation with schema.org

Interest in and use of semantic annotations for providing structured, well-formed and semantically consistent content is growing rapidly. With it, the development of artificial intelligence applications that rely on semantic techniques and semantically annotated content is intensifying. Traditional websites are becoming increasingly difficult to find on the web and are therefore losing importance. To gain visibility and to be found by chatbots and personal digital assistants such as Facebook Messenger chatbots, Amazon Alexa, Cortana, Google Home and Siri, the best option is to use modern technologies and semantic annotations.

Since 2011, schema.org, introduced by the major search engines Bing, Google, Yahoo! and Yandex, has become a de facto standard for annotating and publishing structured data on the web. Schema.org focuses on defining the item types and properties that describe "things" on web pages and that are most valuable to automated agents (e.g. search engines, chatbots or personal assistant systems). This means search engines and dialog assistants get the structured information they need most to improve search results and give the required answer. For example, Google Search has rich features where events can be presented in a structured layout, or recipes in a carousel. The core vocabulary of schema.org currently consists of 598 Types, 862 Properties, and 114 Enumeration values. Schema.org regularly publishes new releases that improve the core vocabulary by adding more detailed descriptive vocabularies and extensions, such as healthcare-related schemas, e-commerce (GoodRelations project) and the accommodation extension (STI Accommodation Ontology). The widespread use of schema.org motivates focusing on a specific domain in order to improve the vocabulary and provide patterns for semantic annotation. For our study we chose the tourism domain, as it has a significant impact on the world's economy and represents a great opportunity to apply semantic technologies to tourist needs, especially in Austria.
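
To make the idea of types and properties concrete, here is a minimal sketch of an Event annotation built in Python and serialized as JSON-LD. The helper names (build_event_annotation, to_script_tag) and all sample values are hypothetical and chosen only for illustration; the types and properties (Event, Place, PostalAddress, name, startDate, location, address) are taken from the schema.org vocabulary.

```python
import json


def build_event_annotation(name, start_date, venue_name, locality, country):
    """Return a schema.org Event annotation as a JSON-LD-ready dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": name,
        "startDate": start_date,  # ISO 8601 date or date-time string
        "location": {
            "@type": "Place",
            "name": venue_name,
            "address": {
                "@type": "PostalAddress",
                "addressLocality": locality,
                "addressCountry": country,
            },
        },
    }


def to_script_tag(annotation):
    """Wrap the annotation in the script element used to embed JSON-LD in HTML."""
    body = json.dumps(annotation, indent=2, ensure_ascii=False)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical sample data, not taken from any real provider.
event = build_event_annotation(
    name="Example Village Concert",
    start_date="2018-07-14T19:30",
    venue_name="Example Village Hall",
    locality="Example Village",
    country="AT",
)
print(to_script_tag(event))
```

The resulting script element can be placed in the HTML of the page that describes the event, which is one common way of publishing JSON-LD annotations.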

We analyzed the data of touristic service providers [1] and interactive maps [2] and semantically annotated them. The annotations covered a wide variety of information topics including events, accommodation and accommodation offers, food establishments, ski areas, sports activity locations, the region, press release articles, the organization itself and a large variety of infrastructure information. Based on that, we defined domain specifications [4] to provide better quality annotations and to make the annotation process easier for inexperienced users [3]. Each touristic domain specification describes a schema (pattern) with the set of types and properties, marked as required or optional, that are needed for semantic annotations. All domain specifications are stored on the semantify.it platform [5]. To support the annotation process, a domain specification editor, an annotation editor and a validator have been developed.
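
The role of a domain specification can be illustrated with a small sketch. The code below is a deliberately simplified model, not the format or the validator used on the platform: the DomainSpec class, the validate function, and the choice of required and optional properties for Hotel are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainSpec:
    """Hypothetical, simplified domain specification for one schema.org type."""
    target_type: str
    required: frozenset
    optional: frozenset = frozenset()


def validate(annotation, spec):
    """Return a list of problems found when checking an annotation against a spec."""
    problems = []

    # "@type" may be a single string or a list of strings in JSON-LD.
    types = annotation.get("@type", [])
    if isinstance(types, str):
        types = [types]
    if spec.target_type not in types:
        problems.append(f"expected type {spec.target_type}")

    # Keys starting with "@" are JSON-LD keywords, not schema.org properties.
    present = {key for key in annotation if not key.startswith("@")}
    for prop in sorted(spec.required - present):
        problems.append(f"missing required property: {prop}")
    for prop in sorted(present - spec.required - spec.optional):
        problems.append(f"property not covered by the specification: {prop}")
    return problems


# Assumed property selection, for illustration only.
hotel_spec = DomainSpec(
    target_type="Hotel",
    required=frozenset({"name", "address"}),
    optional=frozenset({"telephone", "priceRange"}),
)

incomplete = {
    "@context": "https://schema.org",
    "@type": "Hotel",
    "name": "Example Hotel",
    "checkinTime": "14:00",
}
for problem in validate(incomplete, hotel_spec):
    print(problem)
```

Reporting every problem at once, rather than stopping at the first one, suits the stated goal of helping inexperienced users: they can see everything that needs to be fixed in a single pass.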

The following is a list of the defined schemas for annotating touristic services:

  • Schema for annotating Article and Blog Posting. This schema model represents a set of the most common classes, object and data properties needed for the semantic annotation of article and blog posting content, together with a detailed description of the selected set and markup examples in JSON-LD format. The aim of the Article and Blog Posting Schema is to make the annotation process easier for users and to help them provide correct and valid semantic annotations.
  • Schema for annotating Event. This schema model represents a set of the classes, object and data properties needed for the semantic annotation of different types of events (e.g. DanceEvent, BusinessEvent), including event offers (e.g. event tickets). The document also consists of a detailed description of the selected set and markup examples in JSON-LD format.
  • Schema for annotating Food Establishment. This schema model describes the set of classes, object and data properties needed for the semantic annotation of food-related businesses, such as Bakery, BarOrPub, Brewery, CafeOrCoffeeShop, FastFoodRestaurant, IceCreamShop, Restaurant and Winery, and their menus. The document contains an explanation of how to select types and properties, and which of them are required or optional.
  • Schema for annotating Infrastructure. This schema model represents the main infrastructure types, subclasses of LocalBusiness, such as:
    •    AutomotiveBusiness with subtypes AutoBodyShop, AutoDealer, AutoPartsStore, AutoRental, AutoRepair, AutoWash, GasStation, MotorcycleDealer and MotorcycleRepair
    •    ChildCare
    •    DryCleaningOrLaundry
    •    EmergencyService with subtypes FireStation, Hospital and PoliceStation
    •    EntertainmentBusiness with subtypes AdultEntertainment, AmusementPark, ArtGallery, Casino, ComedyClub, MovieTheater and NightClub
    •    FinancialService with subtypes AccountingService, AutomatedTeller, BankOrCreditUnion and InsuranceAgency
    •    GovernmentBuilding with subtypes CityHall, Courthouse, DefenceEstablishment, Embassy and LegislativeBuilding
    •    HealthAndBeautyBusiness with subtypes BeautySalon, DaySpa, HairSalon, HealthClub, NailSalon and TattooParlor
    •    Library
    •    MedicalBusiness with subtypes Dentist, Pharmacy and Physician
    •    SportsActivityLocation with subtypes BowlingAlley, ExerciseGym, GolfCourse, HealthClub, PublicSwimmingPool, SkiResort, SportsClub, StadiumOrArena and TennisComplex
    •    Store with subtypes AutoPartsStore, BikeStore, BookStore, ClothingStore, ComputerStore, ConvenienceStore, DepartmentStore, ElectronicsStore, Florist, FurnitureStore, GardenStore, GroceryStore, HardwareStore, HobbyShop, HomeGoodsStore, JewelryStore, LiquorStore, MensClothingStore, MobilePhoneStore, MovieRentalStore, MusicStore, OfficeEquipmentStore, OutletStore, PawnShop, PetStore, ShoeStore, SportingGoodsStore, TireShop, ToyStore and WholesaleStore
    •    TravelAgency
    The aim of the Infrastructure Schema is to give a common model for annotations, to make the annotation process easier for users and to help them provide correct and valid semantic annotations.
  • Schema for annotating Lodging Business. A lodging business, e.g. a hotel, hostel, resort, or a camping site, is essentially the place and local business that houses the actual units of the establishment (e.g. hotel rooms). The lodging business can encompass multiple buildings but in most cases is an individual location. The Lodging Business Schema model describes a set of selected types and properties for the semantic annotation of accommodation companies such as BedAndBreakfast, Campground, Hostel, Hotel, Motel and Resort, and of their Offers.
  • Schema for annotating Points of Interest. Points of interest, that is, interesting places in a tourism region, can be very diverse. They include a museum as well as a historical event, a beautiful vantage point or the venue of a particular event. For the semantic annotation of these reference points we selected the following types with their properties:
  • Schema for annotating Ski Resort. A ski resort is a self-contained commercial establishment developed for skiing, snowboarding, and other winter sports that endeavors to provide most of a vacationer's wants, such as food, drink, lodging, sports, entertainment, and shopping, on the premises. In schema.org, SkiResort is a subtype of SportsActivityLocation and inherits properties from LocalBusiness, Place, Organization and Thing.
  • Schema for annotating Ski School and Sport School. A sport or ski school is an establishment that teaches football, volleyball, gymnastics, skiing or snowboarding (the latter two typically in a ski resort). Sports and ski schools are usually represented with the schema.org types LocalBusiness and EducationalOrganization. The document contains an explanation of how to select types and properties, and which of them are required or optional, for valid and good quality semantic annotations, as well as a markup example in JSON-LD format.
  • Schema for annotating Tourist Information Center. A tourist information center is an organization that promotes and represents local tourism interests, including leisure. This document presents the type TouristInformationCenter together with the properties selected for annotation.
  • Schema for annotating Sports Activity Location, Trail and Ski Lift. A ski slope is a sloping surface which you can ski down, either on a snow-covered mountain or on a specially made structure. There is no specific type in schema.org for annotating it, so the type SportsActivityLocation is used. A ski lift is a mechanism for transporting skiers up to the slope. There is no specific schema.org class for a lift either, so the types CivicStructure and SportsActivityLocation are used for annotation.
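
The workaround described for ski slopes and ski lifts, where no dedicated schema.org type exists, can be sketched as follows. JSON-LD allows "@type" to hold either a single value or a list, which is what makes the multi-typed lift annotation possible. The annotate helper, the sample names and the use of containedInPlace to link to a resort are assumptions made for this example, not part of the published schemas.

```python
import json


def annotate(types, name, **properties):
    """Build a JSON-LD node; one type is emitted as a string, several as a list."""
    type_value = types[0] if len(types) == 1 else list(types)
    node = {"@context": "https://schema.org", "@type": type_value, "name": name}
    node.update(properties)
    return node


# Hypothetical resort that the slope and the lift belong to.
resort_ref = {"@type": "SkiResort", "name": "Example Ski Resort"}

# A ski slope is annotated with the closest available type.
slope = annotate(
    ["SportsActivityLocation"],
    "Example Slope 7",
    containedInPlace=resort_ref,
)

# A ski lift is annotated with two types at once.
lift = annotate(
    ["CivicStructure", "SportsActivityLocation"],
    "Example Chairlift",
    containedInPlace=resort_ref,
)

print(json.dumps(slope, indent=2))
print(json.dumps(lift, indent=2))
```

A consumer that understands either of the two types can still process the lift annotation, at the cost of losing the more specific meaning that a dedicated type would carry.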

These Schemas improve the use of schema.org and tailor it to specific needs in different business application scenarios, while making the annotation process easier, especially for non-technical users. We propose a solution with a set of defined Schemas for different domains (e.g. Hotel, Restaurant, Article). It gives a standard for providing correct semantic annotations and increases the quality and quantity of structured data on the web. Future work is to provide more specialized vocabularies for the touristic and other domains that build upon the core, and to start the process of extending the schema.org standard.

References

  1. Akbar, Z., Kärle, E., Panasiuk, O., Şimşek, U., Toma, I. and Fensel, D. (Sep 2017). Complete Semantics to empower Touristic Service Providers. In OTM Confederated International Conferences "On the Move to Meaningful Internet Systems", pp. 353-370. Springer, Cham.
  2. Panasiuk, O., Akbar, Z., Gerrier, T. and Fensel, D. (2018). Representing GeoData for Tourism with Schema.org. In Proceedings of the 4th International Conference on Geographical Information Systems Theory, Applications and Management - Volume 1: GISTAM, ISBN 978-989-758-294-3, pp. 239-246. DOI: 10.5220/0006755102390246.
  3. Panasiuk, O., Kärle, E., Şimşek, U. and Fensel, D. (Jan 2018). Defining tourism domains for semantic annotation of web content. e-Review of Tourism Research 9 research notes from the ENTER 2018 Conference on ICT in Tourism.
  4. Şimşek, U., Kärle, E., Holzknecht, O. and Fensel, D. (2018). Domain specific semantic validation of schema.org annotations. In International Andrei Ershov Memorial Conference on Perspectives of System Informatics, pp. 417-429. Springer, Cham.
  5. Kärle, E., Şimşek, U. and Fensel, D. (2017). semantify.it, a Platform for Creation, Publication and Distribution of Semantic Annotations. In SEMAPRO 2017: The Eleventh International Conference on Advances in Semantic Processing, pp. 22-30, ISBN: 978-1-61208-600-2.