Semantic



General

See also Open social#Semantic

  • Semantic Web - a “Web of data,” the sort of data you find in databases. The ultimate goal of the Web of data is to enable computers to do more useful work and to develop systems that can support trusted interactions over the network. The term “Semantic Web” refers to W3C’s vision of the Web of linked data. Semantic Web technologies enable people to create data stores on the Web, build vocabularies, and write rules for handling data. Linked data are empowered by technologies such as RDF, SPARQL, OWL, and SKOS.
  1. Use URIs to denote things.
  2. Use HTTP URIs so that these things can be referred to and looked up ("dereferenced") by people and user agents.
  3. Provide useful information about the thing when its URI is dereferenced, leveraging standards such as RDF, SPARQL.
  4. Include links to other related things (using their URIs) when publishing data on the Web.

or

  1. All kinds of conceptual things now have names that start with HTTP.
  2. If I look up one of those HTTP names, I get important information back: data in a standard format with useful things somebody might like to know about that thing, about that event.
  3. The information I get back doesn't just have somebody's height and weight and when they were born; it has relationships. And whenever it expresses a relationship, the other thing it is related to is given one of those names that starts with HTTP.

On the Semantic Web, vocabularies define the concepts and relationships (also referred to as “terms”) used to describe and represent an area of concern. Vocabularies are used to classify the terms that can be used in a particular application, characterize possible relationships, and define possible constraints on using those terms. In practice, vocabularies can be very complex (with several thousands of terms) or very simple (describing one or two concepts only).

There is no clear division between what is referred to as “vocabularies” and “ontologies”. The trend is to use the word “ontology” for a more complex, and possibly quite formal, collection of terms, whereas “vocabulary” is used when such strict formalism is not necessarily used, or only in a very loose sense. Vocabularies are the basic building blocks for inference techniques on the Semantic Web.

  • Agile Knowledge Engineering and Semantic Web (AKSW) is hosted by the Chair of Business Information Systems (BIS) of the Institute of Computer Science (IfI) / University of Leipzig as well as the Institute for Applied Informatics (InfAI). Goals: Development of methods, tools and applications for adaptive Knowledge Engineering in the context of the Semantic Web. Research of underlying Semantic Web technologies and development of fundamental Semantic Web tools and applications. Maturation of strategies for fruitfully combining the Social Web paradigms with semantic knowledge representation techniques.
  • Sindice - Data Web Services. Millions of websites mark up their content using RDF, Microformats, Microdata, Schema.org, RDFa, Opengraph and more. Sindice helps you find, understand and integrate with their content.


News

Linked Open Data

  • W3C: Linking Open Data - The Open Data Movement aims at making data freely available to everyone. There are already various interesting open data sets available on the Web. Examples include Wikipedia, Wikibooks, Geonames, MusicBrainz, WordNet, the DBLP bibliography and many more which are published under Creative Commons or Talis licenses. The goal of the W3C SWEO Linking Open Data community project is to extend the Web with a data commons by publishing various open data sets as RDF on the Web and by setting RDF links between data items from different data sources. RDF links enable you to navigate from a data item within one data source to related data items within other sources using a Semantic Web browser. RDF links can also be followed by the crawlers of Semantic Web search engines, which may provide sophisticated search and query capabilities over crawled data. As query results are structured data and not just links to HTML pages, they can be used within other applications.
  • Linked Data is about using the Web to connect related data that wasn't previously linked, or using the Web to lower the barriers to linking data currently linked using other methods. More specifically, Wikipedia defines Linked Data as "a term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF." This site exists to provide a home for, or pointers to, resources from across the Linked Data community.
  • LodLive project provides a demonstration of the use of Linked Data standards (RDF, SPARQL) to browse RDF resources. The application aims to spread linked data principles using a simple and friendly interface with reusable techniques.

RDF

  • RDF data are sets of ‘triples’ (aka ‘statements’) of the form (Subject, Property, Object)
  • RDF data are seen as (unranked, node- and edge-labeled) directed graphs
    • nodes of which are statements' subjects and objects and are labeled either
      • by URIs, thus representing Web resources,
      • by literals, such as strings or numbers, thus representing literal resources, or
      • by ‘local’ identifiers, thus representing ‘anonymous’ or ‘blank’ nodes.
    • arcs of which correspond to statements' properties
  • Properties are also called ‘predicates’ (statement analogy)
  • Blank nodes are commonly used to aggregate or group statements
    • e.g., in containers or collections
    • or for n-ary relations (see the Turtle sketch below)
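
For instance, a blank node can group the parts of an n-ary relation, such as an address, into a single object. A minimal sketch in Turtle (the syntax is shown in full further below; the terms are hypothetical):

@prefix ex: <http://www.example.org/#> .

# The address is a blank node: it has no URI of its own,
# it only groups the street and city statements together.
ex:john ex:hasAddress [
    ex:street "1 Example Street" ;
    ex:city   "Exampleville"
] .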

RDF is a general method to decompose any type of knowledge into small pieces, with some rules about the semantics, or meaning, of those pieces. The point is to have a method so simple that it can express any fact, and yet so structured that computer applications can do useful things with it.

The basic unit of RDF is a statement called a triple. One can think of a triple as a type of sentence that states a single "fact" about a resource. RDF allows you to define statements about things (or resources), in the form of subject-predicate-object expressions (known as RDF-triples due to the 3 constituent parts).

The different formats for representing RDF data are:

  • RDF/XML
  • Notation-3 (N3)
  • Turtle - a simplified, RDF-only subset of N3.
  • N-Triples
  • RDFa
  • TriX
  • TriG
  • JSON-LD

RDF/XML

Here's some RDF/XML:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ns="http://www.example.org/#">
 <ns:Person rdf:about="http://www.example.org/#john">
   <ns:hasMother rdf:resource="http://www.example.org/#susan" />
   <ns:hasFather>
     <rdf:Description rdf:about="http://www.example.org/#richard">
       <ns:hasBrother rdf:resource="http://www.example.org/#luke" />
     </rdf:Description>
   </ns:hasFather>
 </ns:Person>
</rdf:RDF>

N3

Here's some N3 RDF:

@prefix : <http://www.example.org/> .
:john    a           :Person .
:john    :hasMother  :susan .
:john    :hasFather  :richard .
:richard :hasBrother :luke .

N-Triple
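
For comparison, a minimal sketch of the example data above as N-Triples: one triple per line, full URIs in angle brackets, each statement terminated by a period.

<http://www.example.org/#john> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.example.org/#Person> .
<http://www.example.org/#john> <http://www.example.org/#hasMother> <http://www.example.org/#susan> .
<http://www.example.org/#john> <http://www.example.org/#hasFather> <http://www.example.org/#richard> .
<http://www.example.org/#richard> <http://www.example.org/#hasBrother> <http://www.example.org/#luke> .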

Turtle
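
Turtle is essentially the RDF-only part of the N3 example above. A minimal sketch of the same data, using Turtle's semicolon shorthand to repeat a subject:

@prefix ns: <http://www.example.org/#> .

ns:john a ns:Person ;
    ns:hasMother ns:susan ;
    ns:hasFather ns:richard .

ns:richard ns:hasBrother ns:luke .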

TriG

  • TriG - RDF Dataset Language. A concrete syntax for RDF as defined in the RDF Concepts and Abstract Syntax ([rdf11-concepts]). TriG extends Turtle ([turtle]) to support representing a complete RDF Dataset.
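
A minimal sketch of the same example data in TriG, wrapping the triples in a named graph (the graph name is hypothetical):

@prefix ns: <http://www.example.org/#> .

<http://www.example.org/graphs/family> {
    ns:john a ns:Person ;
        ns:hasMother ns:susan ;
        ns:hasFather ns:richard .
    ns:richard ns:hasBrother ns:luke .
}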

RDFa

RDFa was first proposed in 2004; RDFa 1.1 reached W3C Recommendation status in June 2012.

  • RDFa is an extension to HTML5 that helps you markup things like People, Places, Events, Recipes and Reviews. Search Engines and Web Services use this markup to generate better search listings and give you better visibility on the Web, so that people can find your website more easily.
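
A minimal sketch of RDFa markup using the schema.org vocabulary (the names are made up for illustration):

<div vocab="http://schema.org/" typeof="Person">
  <span property="name">John Example</span> lives in
  <span property="homeLocation" typeof="Place">
    <span property="name">Exampleville</span>
  </span>.
</div>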

JSON-LD

JSON-LD was created by people who have been directly involved in the Linked Data, lowercase semantic web, uppercase Semantic Web, Microformats, Microdata, and RDFa work. It has proven to be useful to them. There are a number of very large technology companies that have adopted JSON-LD, further underscoring its utility.
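
A minimal sketch of the earlier example data as JSON-LD; the @context maps the JSON keys onto the example vocabulary and marks the values of hasMother and hasFather as IRIs:

{
  "@context": {
    "ns": "http://www.example.org/#",
    "hasMother": { "@id": "ns:hasMother", "@type": "@id" },
    "hasFather": { "@id": "ns:hasFather", "@type": "@id" }
  },
  "@id": "ns:john",
  "@type": "ns:Person",
  "hasMother": "ns:susan",
  "hasFather": "ns:richard"
}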

Other

  • RDF HDT (Header, Dictionary, Triples) is a compact data structure and binary serialization format for RDF that keeps big datasets compressed to save space while maintaining search and browse operations without prior decompression. This makes it an ideal format for storing and sharing RDF datasets on the Web.


RDFS

  • RDF/S allows so-called RDF Schemas (or ontologies) similar to object-oriented class hierarchies or taxonomies
  • Inheritance model of RDF/S exhibits the following peculiarities:
    • same resource may be classified in different, unrelated classes
    • the class hierarchy may be cyclic → all classes on a cycle are equivalent
    • properties are first-class
      • range and domain are associated with the property itself, rather than declaring which properties a class can carry (see the Turtle sketch after this list)
  • Inference rules are used to define the semantics (or entailment) of an RDF/S schema
    • e.g., transitivity of the class hierarchy or
    • inferred type of an untyped resource in the domain of a property
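
A minimal RDFS sketch in Turtle, using hypothetical terms: the domain and range sit on the property, and a reasoner can use them to infer the types of otherwise untyped resources.

@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://www.example.org/#> .

ex:Person a rdfs:Class .
ex:Parent a rdfs:Class ;
    rdfs:subClassOf ex:Person .

# Range and domain are attached to the property itself.
ex:hasMother a rdf:Property ;
    rdfs:domain ex:Person ;
    rdfs:range  ex:Parent .

# From  ex:john ex:hasMother ex:susan .  a reasoner can infer
# that ex:john is an ex:Person and ex:susan is an ex:Parent.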

Ontologies


  • vocab.org is intended to be an open URI space for vocabularies such as RDF Schemas or XML Namespace documents. The PURL http://purl.org/vocab/ is mapped to this domain. It is recommended that all vocabularies hosted here define their term URIs using the PURL rather than the vocab.org domain. The PURL is expected to persist longer than vocab.org, although every effort will be made to ensure that vocab.org persists for as long as possible.
  • UMBEL Vocabulary and Reference Concept Ontology (namespace: umbel). UMBEL is the Upper Mapping and Binding Exchange Layer, designed to help content interoperate on the Web.

Dublin Core

MetaVocab

Creative Commons

Web of Trust

SKOS

DOAP

SIOC

  • SIOC initiative (Semantically-Interlinked Online Communities) aims to enable the integration of online community information. SIOC provides a Semantic Web ontology for representing rich data from the Social Web in RDF. It has recently achieved significant adoption through its usage in a variety of commercial and open-source software applications, and is commonly used in conjunction with the FOAF vocabulary for expressing personal profile and social networking information. By becoming a standard way for expressing user-generated content from such sites, SIOC enables new kinds of usage scenarios for online community site data, and allows innovative semantic applications to be built on top of the existing Social Web. The SIOC ontology was recently published as a W3C Member Submission, which was submitted by 16 organisations.

ResumeRDF

SCOT

  • SCOT is an acronym for Social Semantic Cloud of Tags. The name was chosen to emphasise the goal of providing a consistent framework for expressing social tagging at a semantic level in machine-understandable way. The SCOT ontology provides a model for expressing the main concepts and properties required to describe information for tagging activities (e.g., users, tags, resources, etc.) on the Semantic Web. This document contains a detailed description of the SCOT Ontology.

Data Cube

  • RDF Data Cube Vocabulary - There are many situations where it would be useful to be able to publish multi-dimensional data, such as statistics, on the web in such a way that it can be linked to related data sets and concepts. The Data Cube vocabulary provides a means to do this using the W3C RDF (Resource Description Framework) standard. The model underpinning the Data Cube vocabulary is compatible with the cube model that underlies SDMX (Statistical Data and Metadata eXchange), an ISO standard for exchanging and sharing statistical data and metadata among organizations. The Data Cube vocabulary is a core foundation which supports extension vocabularies to enable publication of other aspects of statistical data flows or other multi-dimensional data sets.
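
A minimal sketch of a single observation using the Data Cube vocabulary; the dimension, measure and dataset terms are hypothetical:

@prefix qb: <http://purl.org/linked-data/cube#> .
@prefix ex: <http://www.example.org/#> .

ex:refArea          a qb:DimensionProperty .
ex:unemploymentRate a qb:MeasureProperty .

ex:unemploymentData a qb:DataSet .

ex:obs1 a qb:Observation ;
    qb:dataSet ex:unemploymentData ;
    ex:refArea ex:Leipzig ;
    ex:unemploymentRate 7.2 .

A full cube would also declare a qb:DataStructureDefinition listing the dataset's components; this sketch omits it.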

Other

  • Vocabulary of Interlinked Datasets (VoID) is an RDF Schema vocabulary for expressing metadata about RDF datasets. It is intended as a bridge between the publishers and users of RDF data, with applications ranging from data discovery to cataloging and archiving of datasets. This document provides a formal definition of the new RDF classes and properties introduced for VoID. It is a companion to the main specification document for VoID, Describing Linked Datasets with the VoID Vocabulary.
  • Schema.RDFS.org - In early June 2011, the three big search engines Bing, Google and Yahoo! introduced Schema.org, a collection of terms that webmasters can use to mark up their pages to improve the display of search results. This site is a complementary effort by people from the Linked Data community to support Schema.org deployment and usage with a special focus on Linked Data.
  • Common Tag is an open tagging format developed to make content more connected, discoverable and engaging. Unlike free-text tags, Common Tags are references to unique, well-defined concepts, complete with metadata and their own URLs. With Common Tag, site owners can more easily create topic hubs, cross-promote their content, and enrich their pages with free data, images and widgets.
  • Music Ontology is an attempt to provide a vocabulary for linking a wide range of music-related information, and to provide a democratic mechanism for doing so
  • BIO: A vocabulary for biographical information, used alongside FOAF
  • oeGOV is making and publishing W3C OWL ontologies for eGovernment.
  • http://ontorule-project.eu/ ONTORULE (ONTOlogies meet business RULEs) is a large-scale integrating project (IP) partially funded by the European Union's 7th Framework Programme under the Information and Communication Technologies theme.
  • Product Types Ontology - High-precision identifiers for product types based on Wikipedia. Provides ca. 300,000 precise definitions for types of product or services that extend the schema.org and GoodRelations standards for e-commerce markup.

VOAF

  • VOAF is a vocabulary specification providing elements allowing the description of vocabularies (RDFS vocabularies or OWL ontologies) used in the Linked Data Cloud. In particular it provides properties expressing the different ways such vocabularies can rely on, extend, specify, annotate or otherwise link to each other. It relies itself on Dublin Core and voiD. The name of the vocabulary makes an explicit reference to FOAF because VOAF can be used to define networks of vocabularies in a way similar to the one FOAF is used to define networks of people.

LOV

The LOV (Linked Open Vocabularies) objective is to provide easy access to this ecosystem of vocabularies, in particular by making explicit the ways they link to each other and by providing metrics on how they are used in the linked data cloud, helping to improve their understanding, visibility, usability and overall quality.

Ontology languages

OWL

  • OWL is the Web Ontology Language. Where earlier languages have been used to develop tools and ontologies for specific user communities (particularly in the sciences and in company-specific e-commerce applications), they were not defined to be compatible with the architecture of the World Wide Web in general, and the Semantic Web in particular. OWL uses both URIs for naming and the description framework for the Web provided by RDF to add the following capabilities to ontologies: ability to be distributed across many systems, scalability to Web needs, compatibility with Web standards for accessibility and internationalization, and openness and extensibility. OWL builds on RDF and RDF Schema and adds more vocabulary for describing properties and classes: among others, relations between classes (e.g. disjointness), cardinality (e.g. "exactly one"), equality, richer typing of properties, characteristics of properties (e.g. symmetry), and enumerated classes.
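
A minimal OWL sketch in Turtle illustrating a few of those capabilities (disjointness, a property characteristic and a cardinality restriction); the example terms are hypothetical:

@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://www.example.org/#> .

ex:Person       a owl:Class .
ex:Organisation a owl:Class ;
    owl:disjointWith ex:Person .            # nothing is both a Person and an Organisation

ex:hasSibling a owl:SymmetricProperty .     # property characteristic

# "exactly one" cardinality: every Person has exactly one mother.
ex:Person rdfs:subClassOf [
    a owl:Restriction ;
    owl:onProperty  ex:hasMother ;
    owl:cardinality "1"^^xsd:nonNegativeInteger
] .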

older;

newer;

Query languages

SPARQL
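
SPARQL is the W3C query language for RDF. A minimal sketch of a query over the example data used earlier on this page:

PREFIX ns: <http://www.example.org/#>

SELECT ?person ?mother
WHERE {
  ?person a ns:Person ;
          ns:hasMother ?mother .
}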



RDQL

GRDDL

  • GRDDL is a mechanism for Gleaning Resource Descriptions from Dialects of Languages. It is a technique for obtaining RDF data from XML documents and in particular XHTML pages. Authors may explicitly associate documents with transformation algorithms, typically represented in XSLT, using a link element in the head of the document. Alternatively, the information needed to obtain the transformation may be held in an associated metadata profile document or namespace document.
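
A minimal sketch of the link-element mechanism in an XHTML document head; the XSLT URL is hypothetical:

<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <title>Example page</title>
    <!-- A GRDDL-aware agent applies this transformation to obtain RDF. -->
    <link rel="transformation" href="http://www.example.org/xhtml2rdf.xsl" />
  </head>
  <body>...</body>
</html>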

Description Logic

  • Description Logic workshops are the main international event of the description logic research community. They take place annually and aim at being an informal get-together that allows researchers to discuss the current developments in the area. The workshops explicitly welcome submissions from researchers that are new to the area and provide quality feedback via peer-reviewing while at the same time being of an "inclusive" nature with a very high acceptance rate. There are only informal (electronic) proceedings and "publication" at the workshop is not supposed to preclude publication at conferences.

Future

Web Observatory

P2P

REST

See WebDev#API

Hydra

  • Hydra is an effort to simplify the development of interoperable, hypermedia-driven Web APIs. The two fundamental building blocks of Hydra are JSON-LD and the Hydra Core Vocabulary.

JSON-LD is the serialization format used in the communication between the server and its clients. The Hydra Core Vocabulary represents the shared vocabulary between them. By specifying a number of concepts which are commonly used in Web APIs, it can be used as the foundation to build Web services that share REST's benefits in terms of loose coupling, maintainability, evolvability, and scalability. Furthermore, it enables the creation of generic API clients instead of requiring specialized clients for every single API.
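
A minimal sketch of a Hydra collection response in JSON-LD, assuming the published Hydra context document; the API URLs are hypothetical:

{
  "@context": "http://www.w3.org/ns/hydra/context.jsonld",
  "@id": "http://api.example.org/people/",
  "@type": "Collection",
  "totalItems": 1,
  "member": [
    { "@id": "http://api.example.org/people/john" }
  ]
}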


Validation

Search


Clustering

Access control

See also Open social#WebID

Software

Libs

http://rhizomik.net/html/redefer/

Python

JavaScript

Tools

  • RDF Translator is a multi-format conversion tool for structured markup. It provides translations between data formats ranging from RDF/XML to RDFa or Microdata. The service allows for conversions triggered either by URI or by direct text input. Furthermore it comes with a straightforward REST API for developers.
  • Simple javascript RDF Parser and query thingy - This RDF parser is designed to run in a web browser or SVG browser, allowing you to process RDF on the client. The parser isn't complete: there's no support for various bits of the spec, and it isn't all that fast, especially with large XML/RDF files. I've found it quite useful, though, for simple querying.
  • RDF Gravity is a tool for visualising RDF/OWL graphs and ontologies. Its main features are: graph visualisation, global and local filters (enabling specific views on a graph), full-text search, generating views from RDQL queries, and visualising multiple RDF files. RDF Gravity is implemented using the JUNG Graph API and the Jena semantic web toolkit.
  • RDFaCE - A Semantic content editor based on the TinyMCE WYSIWYG editor. RDFaCE was created as a proof of concept for the WYSIWYM (What You See Is What You Mean) approach, which aims to enable end-users to easily annotate their content using RDFa and Microdata markup. RDFaCE employs external NLP APIs to suggest namespaces, properties and URIs, and to automatically annotate content.
  • W3C: RDFImportersAndAdapters
  • W3C: ConverterToRdf converts application data from an application-specific format into RDF for use with RDF tools and integration with other data. Converters may be part of a one-time migration effort, or part of a running system which provides a semantic web view of a given application.
  • Redland is a set of free software C libraries that provide support for the Resource Description Framework (RDF).
    • Raptor is a free software / Open Source C library that provides a set of parsers and serializers that generate Resource Description Framework (RDF) triples by parsing syntaxes or serialize the triples into a syntax. The supported parsing syntaxes are RDF/XML, N-Quads, N-Triples, TRiG, Turtle, RDFa 1.0 and 1.1, RSS tag soup including all versions of RSS, Atom 1.0 and 0.3, GRDDL and microformats for HTML, XHTML and XML. The serializing syntaxes are RDF/XML (regular, and abbreviated), Atom 1.0, GraphViz, JSON, N-Quads, N-Triples, RSS 1.0 and XMP.
  • rdfstore-js is a pure JavaScript implementation of an RDF graph store with support for the SPARQL query and data manipulation language. It runs on node.js.
  • CumulusRDF is an RDF store on cloud-based architectures. CumulusRDF provides a REST-based API with CRUD operations to manage RDF data. The current version uses Apache Cassandra as storage backend. A previous version is built on Google's AppEngine. CumulusRDF is licensed under GNU Affero General Public License.
  • ARC2 is a PHP 5.3 library for working with RDF. It also provides a MySQL-based triplestore with SPARQL support. Feature-wise, ARC2 is now in a stable state with no further feature additions planned. Issues are still being fixed and Pull Requests are welcome, though.
  • Graphite is a PHP Library, built on top of ARC2, to make it easy to do stuff with RDF data really quickly, without having to naff around with databases. It is not intended to be scalable, or a way of authoring RDF data.
  • Q&D RDF Browser is powered by Graphite and ARC2 and hosted by ECS at the University of Southampton.


  • prefix.cc - namespace lookup for RDF developers

Tabulator

RWW.IO

  • RWW.IO is a personal Linked Data store, intended to be used as a backend service for your Linked Data applications, and it supports the latest standards and recommendations: RDF, JSON-LD, SPARQL 1.1 Update, WebID. All data stores (endpoints) interpret the HTTP request URI as the base URI for RDF operations and the default-graph URI for SPARQL operations. When using the service as a backend, you need to follow two basic rules: specify the media type of your request data with a Content-Type HTTP header, and specify your response type preference with an Accept HTTP header (see the sketch below).
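
A sketch of what following those two rules might look like when writing Turtle to such a store; the endpoint and data are hypothetical:

PUT /profile HTTP/1.1
Host: data.example.org
Content-Type: text/turtle
Accept: text/turtle

<#me> a <http://xmlns.com/foaf/0.1/Person> ;
    <http://xmlns.com/foaf/0.1/name> "John Example" .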

Other

See also MediaWiki#Semantic

  • Apache Jena - A free and open source Java framework for building Semantic Web and Linked Data applications.
  • PublishMyData is a Linked Data publishing platform. It lets you serve your 5-star Open Data on the web in a format that’s easy to understand, but it’s also machine readable so data experts can exploit it.
  • URIBurner - A data virtualization service that transforms data hosted in a variety of data spaces and formats into standards compliant Linked Data Objects for uniform access, integration and management. The underlying technology is Virtuoso's in-built Linked Data Middleware (aka Sponger) that uses URLs as data source names for its powerful data ingestion and transformation services that result in highly navigable Linked Data Object graphs. Post transformation, each Data Object is endowed with a dereferenceable identifier (Name) that resolves to its actual representation via its URL (Address). The Sponger then re-presents Data Object descriptions via HTML documents (the default behavior) or in a variety of raw data graph forms that include: CSV, N-Triples, Turtle, N3, RDF/XML, JSON, CXML, OData (Atom and JSON) etc.

Other