From Things and Stuff Wiki


See also Feeds (for RSS, Atom and Activity Streams), Open social, Data, Documents, HTML/CSS

note: slowly cleaning this page up

  • W3C Data Activity - Building the Web of Data - 5 stars for Linked Data; Data is increasingly important to society and W3C has a mature suite of Web standards with plans for further work on making it easier for average developers to work with graph data and knowledge graphs. Linked Data is about the use of URIs as names for things, the ability to dereference these URIs to get further information and to include links to other data. There are ever increasing sources of Linked Open Data on the Web, as well as data services that are restricted to the suppliers and consumers of those services. The digital transformation of industry is seeking to exploit advanced digital technologies. This will help businesses integrate horizontally along the supply and value chains, and vertically from the factory floor to the office floor. W3C is seeking to make it easier to support enterprise-wide data management and governance, reflecting the strategic importance of data to modern businesses. Traditional approaches to data have focused on tabular databases (SQL/RDBMS), Comma Separated Value (CSV) files, and data embedded in PDF documents and spreadsheets. We're now in the midst of a major shift to graph data with nodes and labelled directed links between them. Graph data is: faster than using SQL and associated JOIN operations; better suited to integrating data from heterogeneous sources; better suited to situations where the data model is evolving.

  • Semantic technology - aims to help machines understand data. Well-known technologies for encoding semantics with the data are RDF (Resource Description Framework) and OWL (Web Ontology Language). These technologies formally represent the meaning involved in information. For example, an ontology can describe concepts, relationships between things, and categories of things. Embedding semantics with the data offers significant advantages such as reasoning over data and dealing with heterogeneous data sources.

  • Semantic network - or frame network, a knowledge base that represents semantic relations between concepts in a network. This is often used as a form of knowledge representation. It is a directed or undirected graph consisting of vertices, which represent concepts, and edges, which represent semantic relations between concepts, mapping or connecting semantic fields. A semantic network may be instantiated as, for example, a graph database or a concept map. Typical standardized semantic networks are expressed as semantic triples.

Semantic networks are used in natural language processing applications such as semantic parsing and word-sense disambiguation. Semantic networks can also be used as a method to analyze large texts and identify the main themes and topics (e.g., of social media posts), to reveal biases (e.g., in news coverage), or even to map an entire research field.
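
A minimal sketch of the idea, assuming a semantic network stored as plain Python tuples; the concepts and relation names (`canary`, `is_a`, `can`) are made up for illustration and do not come from any standard vocabulary.

```python
from collections import defaultdict

# Hypothetical example data: each triple is (concept, relation, concept).
TRIPLES = [
    ("canary", "is_a", "bird"),
    ("bird", "is_a", "animal"),
    ("bird", "can", "fly"),
    ("canary", "has_color", "yellow"),
]


def build_network(triples):
    """Index the labelled, directed edges by their source concept."""
    network = defaultdict(list)
    for subject, relation, obj in triples:
        network[subject].append((relation, obj))
    return network


def ancestors(network, concept):
    """Follow 'is_a' edges upward and collect every more general concept."""
    found = []
    stack = [concept]
    while stack:
        current = stack.pop()
        for relation, target in network.get(current, []):
            if relation == "is_a" and target not in found:
                found.append(target)
                stack.append(target)
    return found


network = build_network(TRIPLES)
print(ancestors(network, "canary"))
```

Storing the edges as triples keeps the vertices (concepts) and the edge labels (relations) in one flat structure, which is the same shape that RDF later standardizes.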

  • Frame language - a technology used for knowledge representation in artificial intelligence. Frames are stored as ontologies of sets and subsets of the frame concepts. They are similar to class hierarchies in object-oriented languages although their fundamental design goals are different. Frames are focused on explicit and intuitive representation of knowledge whereas objects focus on encapsulation and information hiding. Frames originated in AI research and objects primarily in software engineering. However, in practice the techniques and capabilities of frame and object-oriented languages overlap significantly.

  • Semantic field - a lexical set of words grouped semantically (by meaning) that refers to a specific subject. The term is also used in anthropology, computational semiotics, and technical exegesis.

  • Semantic properties - or meaning properties, are those aspects of a linguistic unit, such as a morpheme, word, or sentence, that contribute to the meaning of that unit. Basic semantic properties include being meaningful or meaningless (for example, whether a given word is part of a language's lexicon with a generally understood meaning); polysemy, having multiple, typically related, meanings; ambiguity, having meanings which aren't necessarily related; and anomaly, where the elements of a unit are semantically incompatible with each other, although possibly grammatically sound. Beyond the expression itself, there are higher-level semantic relations that describe the relationship between units: these include synonymy, antonymy, and hyponymy.

Besides basic properties of semantics, semantic property is also sometimes used to describe the semantic components of a word, such as man assuming that the referent is human, male, and adult, or female being a common component of girl, woman, and actress. In this sense, semantic properties are used to define the semantic field of a word or set of words.

  • Semantic class - contains words that share a semantic feature. For example, within nouns there are two subclasses, concrete nouns and abstract nouns. The concrete nouns include people, plants, animals, materials and objects, while the abstract nouns refer to concepts such as qualities, actions, and processes. Nouns are categorized into different semantic classes according to their nature. Semantic classes may intersect. The intersection of female and young can be girl.

  • Semantic queries - allow for queries and analytics of associative and contextual nature. Semantic queries enable the retrieval of both explicitly and implicitly derived information based on syntactic, semantic and structural information contained in data. They are designed to deliver precise results (possibly the distinctive selection of one single piece of information) or to answer more fuzzy and wide open questions through pattern matching and digital reasoning. Semantic queries work on named graphs, linked data or triples. This enables the query to process the actual relationships between information and infer the answers from the network of data. This is in contrast to semantic search, which uses semantics (the science of meaning) in unstructured text to produce a better search result. (See natural language processing.) From a technical point of view semantic queries are precise relational-type operations much like a database query. They work on structured data and can therefore use comprehensive features like operators (e.g. >, < and =), namespaces, pattern matching, subclassing, transitive relations, semantic rules and contextual full text search. The semantic web technology stack of the W3C offers SPARQL to formulate semantic queries in a syntax similar to SQL. Semantic queries are used in triplestores, graph databases, semantic wikis, natural language and artificial intelligence systems.
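
The pattern matching described above can be sketched in plain Python. This is not SPARQL and not any library's API; it is a small stand-in, assuming triples held in a list and a made-up convention that terms starting with `?` are variables. The data and the `match`/`query` helpers are hypothetical.

```python
# Hypothetical data: (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "age", 34),
    ("bob", "age", 27),
]


def match(pattern, triples):
    """Yield variable bindings for one triple pattern.

    Terms starting with '?' are variables; everything else must match exactly.
    """
    for triple in triples:
        bindings = {}
        for term, value in zip(pattern, triple):
            if isinstance(term, str) and term.startswith("?"):
                # A variable used twice in one pattern must bind to the same value.
                if bindings.get(term, value) != value:
                    break
                bindings[term] = value
            elif term != value:
                break
        else:
            yield bindings


def query(patterns, triples):
    """Join several patterns by substituting earlier bindings into later ones."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for solution in solutions:
            bound = tuple(
                solution.get(t, t) if isinstance(t, str) else t for t in pattern
            )
            for bindings in match(bound, triples):
                next_solutions.append({**solution, **bindings})
        solutions = next_solutions
    return solutions


# "Whom does alice know, and how old are they?", then filter with an operator.
results = query([("alice", "knows", "?p"), ("?p", "age", "?a")], TRIPLES)
under_30 = [r for r in results if r["?a"] < 30]
print(under_30)
```

The join over the shared variable `?p` is what lets the query follow a relationship from one triple to the next, rather than matching text.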

  • Authority control - a process that organizes bibliographic information, for example in library catalogs, by using a single, distinct spelling of a name (heading) or a numeric identifier for each topic. The word authority in authority control derives from the idea that the names of people, places, things, and concepts are authorized, i.e., they are established in one particular form. These one-of-a-kind headings or identifiers are applied consistently throughout catalogs which make use of the respective authority file, and are applied for other methods of organizing data such as linkages and cross references.

Each controlled entry is described in an authority record in terms of its scope and usage, and this organization helps the library staff maintain the catalog and make it user-friendly for researchers. Catalogers assign each subject—such as author, topic, series, or corporation—a particular unique identifier or heading term which is then used consistently, uniquely, and unambiguously for all references to that same subject, which obviates variations from different spellings, transliterations, pen names, or aliases. The unique header can guide users to all relevant information including related or collocated subjects.

Authority records can be combined into a database and called an authority file, and maintaining and updating these files as well as "logical linkages" to other files within them is the work of librarians and other information catalogers. Accordingly, authority control is an example of controlled vocabulary and of bibliographic control. While in theory any piece of information is amenable to authority control such as personal and corporate names, uniform titles, series names, and subjects, library catalogers typically focus on author names and titles of works. Subject headings from the Library of Congress fulfill a function similar to authority records, although they are usually considered separately. As time passes, information changes, prompting needs for reorganization. According to one view, authority control is not about creating a perfect seamless system but rather it is an ongoing effort to keep up with these changes and try to bring "structure and order" to the task of helping users find information.
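
As a rough illustration of how an authority file resolves variant names to one heading, here is a small Python sketch. The record identifier `auth:0001`, the record layout, and the helper names are assumptions made for the example; real authority files carry much richer records.

```python
# Hypothetical authority file: one record per controlled entry.
AUTHORITY_FILE = {
    "auth:0001": {
        "heading": "Twain, Mark, 1835-1910",
        "variants": ["Clemens, Samuel Langhorne", "Mark Twain", "Samuel Clemens"],
    },
}


def normalize(name):
    """Lower-case the name, drop commas, and collapse whitespace before lookup."""
    return " ".join(name.lower().replace(",", " ").split())


def build_index(authority_file):
    """Map every known form of a name to the identifier of its authority record."""
    index = {}
    for record_id, record in authority_file.items():
        for form in [record["heading"], *record["variants"]]:
            index[normalize(form)] = record_id
    return index


def authorized_heading(name, authority_file, index):
    """Return the single authorized heading for a name, or None if uncontrolled."""
    record_id = index.get(normalize(name))
    return authority_file[record_id]["heading"] if record_id else None


index = build_index(AUTHORITY_FILE)
print(authorized_heading("samuel clemens", AUTHORITY_FILE, index))
```

Every variant spelling or pen name collapses to the same record, so all references to the subject can be collocated under one heading.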


One World Language

  • William A. Martin's OWL System - In the early 1970s, Prof. William A. (Bill) Martin of MIT's Project MAC (later the Lab for Computer Science) began development of a powerful knowledge representation language called OWL. Although the acronym was never officially defined, many of us who participated in the project believed it to stand for "One World Language." Throughout several incarnations, OWL was used to support research on: automatic programming (including declarative specification of computational procedures); natural language understanding; business modeling and problem solving; medical diagnosis and therapy management; generation of natural language explanations from knowledge and program behavior. The heart of the system is a formal language for knowledge representation that took a Wittgenstein-like approach to knowledge. Approximately, this says that the meaning of any concept in the language is the totality of all the other concepts linked to it. Concepts were formed by specialization; OWL included a derivative subclassifier that made certain taxonomic inferences, and a very general notion of characterization allowed great flexibility in expressing assertional knowledge. Critically, all characterizations could be reified as concepts, allowing the system to use meta-level descriptions of its own features. Inference methods could be specified within the language itself, or could be implemented as Lisp procedures for efficiency.


  • REC-PICS-labels-961031 - This document has been prepared for the technical subcommittee of PICS (Platform for Internet Content Selection). It defines a general format for labels and three methods by which these labels may be transmitted: In an HTML document; With a document transported via a protocol that uses RFC-822 headers; Separately from the document.
  • Platform for Internet Content Selection (PICS) - a specification created by W3C that used metadata to label webpages to help parents and teachers control what children and students could access on the Internet. The W3C Protocol for Web Description Resources project integrates PICS concepts with RDF. PICS was superseded by POWDER, which itself is no longer actively developed. PICS often used content labeling from the Internet Content Rating Association, which has also been discontinued by the Family Online Safety Institute's board of directors. An alternative self-rating system, named Voluntary Content Rating, was devised by Solid Oak Software in 2010, in response to the perceived complexity of PICS. Internet Explorer 3, released in 1996, was one of the early web browsers to offer support for PICS. Internet Explorer 5 added a feature called approved sites, which allowed extra sites to be added to the list in addition to the PICS list when it was being used.


  • Protocol for Web Description Resources (POWDER): Primer - POWDER, the Protocol for Web Description Resources, provides a mechanism to describe and discover Web resources and helps users decide whether a given resource is of interest. There are a variety of use cases: from providing a better means of describing Web resources and creating trustmarks to aiding content discovery, child protection and Semantic Web searches. There are two varieties of POWDER: a complex, semantically rich variety, called POWDER-S, and a much simpler version, just called POWDER, which is intended as the primary transport mechanism for Description Resources. POWDER-S can be generated automatically from POWDER.
  • POWDER - the W3C recommended method for describing Web resources. It specifies a protocol for publishing metadata about Web resources using RDF, OWL, and HTTP. The initial working party was formed in February 2007 with the W3C Content Label Incubator Group's 2006 work as an input. On 1 September 2009 POWDER became a W3C recommendation and the Working Group is now closed.


  • An MCF[/XML Tutorial] - a tool to provide information about information. The primary goal is to make the Web (Internet or Intranet) more like a library and less like a messy heap of books on the floor. In order to understand MCF, there are three things that you'll need to learn: Objects, Categories, and Properties, the conceptual building blocks; the XML syntax in which MCF is stored; and the Directed Linked Graph mathematical model which lies behind MCF, which can be used by computer programmers to build efficient MCF implementations.

  • Meta Content Framework (MCF) - a specification of a content format for structuring metadata about web sites and other data. MCF was developed by Ramanathan V. Guha at Apple Computer's Advanced Technology Group between 1995 and 1997. Rooted in knowledge-representation systems such as CycL, KRL, and KIF, it sought to describe objects, their attributes, and the relationships between them. When the research project was discontinued, Guha left Apple for Netscape, where, in collaboration with Tim Bray, he adapted MCF to use XML and created the first version of the Resource Description Framework (RDF).


  • Simple HTML Ontology Extensions (SHOE) - a small set of HTML extensions designed to give web pages semantic meaning by allowing information such as class, subclass and property relationships. SHOE was developed around 1996 by Sean Luke, Lee Spector, James Hendler, Jeff Heflin, and David Rager at the University of Maryland, College Park.

Semantic web

  • Semantic Web - a “Web of data,” the sort of data you find in databases. The ultimate goal of the Web of data is to enable computers to do more useful work and to develop systems that can support trusted interactions over the network. The term “Semantic Web” refers to W3C’s vision of the Web of linked data. Semantic Web technologies enable people to create data stores on the Web, build vocabularies, and write rules for handling data. Linked data are empowered by technologies such as RDF, SPARQL, OWL, and SKOS.

  • Structured Data on the Web - More and more of the world's data is moving onto the Web. We want to share, re-mix and use this data to build more awesome Web applications. Using structured data technologies to mark up people, places, events, recipes, ratings, music, movies and products on the Web makes everybody's life easier. This site will help you learn about big data, the semantic web, and the practical application of technologies such as Microformats, RDFa, Microdata and JSON-LD.

  • Web syndication - a form of syndication in which content is made available from one website to other sites. Most commonly, websites are made available to provide either summaries or full renditions of a website's recently added content. The term may also describe other kinds of content licensing for reuse.

  • Semantic Web Case Studies and Use Cases - Case studies include descriptions of systems that have been deployed within an organization, and are now being used within a production environment. Use cases include examples where an organization has built a prototype system, but it is not currently being used by business functions. Note: absolutely terrible use cases.

Linked Data principles, as set out by Tim Berners-Lee:

  1. Use URIs to denote things.
  2. Use HTTP URIs so that these things can be referred to and looked up ("dereferenced") by people and user agents.
  3. Provide useful information about the thing when its URI is dereferenced, leveraging standards such as RDF and SPARQL.
  4. Include links to other related things (using their URIs) when publishing data on the Web.
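
The four principles above can be sketched without touching the network. In this minimal Python illustration a dictionary stands in for the Web: the `example.org` URIs, the data, and the helper names are all hypothetical, and a real client would issue HTTP requests instead of dictionary lookups. The two FOAF property URIs are the only real identifiers used.

```python
# A stand-in for the Web: each HTTP URI maps to the triples returned when it is
# dereferenced. All example.org URIs and data here are hypothetical.
FOAF = "http://xmlns.com/foaf/0.1/"
ALICE = "http://example.org/id/alice"
BOB = "http://example.org/id/bob"

FAKE_WEB = {
    ALICE: [
        (ALICE, FOAF + "name", "Alice"),
        (ALICE, FOAF + "knows", BOB),
    ],
    BOB: [
        (BOB, FOAF + "name", "Bob"),
    ],
}


def dereference(uri):
    """Principles 2 and 3: looking up an HTTP URI yields data about the thing."""
    return FAKE_WEB.get(uri, [])


def linked_uris(triples):
    """Principle 4: objects that are themselves HTTP URIs are links to follow."""
    return [o for _, _, o in triples if isinstance(o, str) and o.startswith("http://")]


def crawl(start):
    """Follow links from one URI and collect every triple that can be reached."""
    seen, queue, collected = set(), [start], []
    while queue:
        uri = queue.pop(0)
        if uri in seen:
            continue
        seen.add(uri)
        triples = dereference(uri)
        collected.extend(triples)
        queue.extend(linked_uris(triples))
    return collected


print(len(crawl(ALICE)))
```

Because the object of `foaf:knows` is itself an HTTP URI, a client that starts at Alice discovers Bob's description without being told where it lives.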


Tim Berners-Lee's restatement of these principles in his 2009 TED talk:

  1. All kinds of conceptual things, they have names now that start with HTTP.
  2. I get important information back. I will get back some data in a standard format which is kind of useful data that somebody might like to know about that thing, about that event.
  3. I get back that information it's not just got somebody's height and weight and when they were born, it's got relationships. And when it has relationships, whenever it expresses a relationship then the other thing that it's related to is given one of those names that starts with HTTP.

On the Semantic Web, vocabularies define the concepts and relationships (also referred to as “terms”) used to describe and represent an area of concern. Vocabularies are used to classify the terms that can be used in a particular application, characterize possible relationships, and define possible constraints on using those terms. In practice, vocabularies can be very complex (with several thousands of terms) or very simple (describing one or two concepts only).

There is no clear division between what is referred to as “vocabularies” and “ontologies”. The trend is to use the word “ontology” for more complex, and possibly quite formal collection of terms, whereas “vocabulary” is used when such strict formalism is not necessarily used or only in a very loose sense. Vocabularies are the basic building blocks for inference techniques on the Semantic Web.

  • The Self-Describing Web - The Web is designed to support flexible exploration of information by human users and by automated agents. For such exploration to be productive, information published by many different sources and for a variety of purposes must be comprehensible to a wide range of Web client software, and to users of that software. HTTP and other Web technologies can be used to deploy resource representations that are self-describing: information about the encodings used for each representation is provided explicitly within the representation. Starting with a URI, there is a standard algorithm that a user agent can apply to retrieve and interpret such representations. Furthermore, representations can be what we refer to as grounded in the Web, by ensuring that specifications required to interpret them are determined unambiguously based on the URI, and that explicit references connect the pertinent specifications to each other. Web-grounding ensures that the specifications needed to interpret information on the Web can be identified unambiguously. When such self-describing, Web-grounded resources are linked together, the Web as a whole can support reliable, ad hoc discovery of information. This finding describes how document formats, markup conventions, attribute values, and other data formats can be designed to facilitate the deployment of self-describing, Web-grounded Web content.
  • Named graph - a key concept of Semantic Web architecture in which a set of Resource Description Framework statements (a graph) are identified using a URI, allowing descriptions to be made of that set of statements such as context, provenance information or other such metadata. Named graphs are a simple extension of the RDF data model through which graphs can be created but the model lacks an effective means of distinguishing between them once published on the Web at large.
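
One way to picture a named graph is as a fourth element added to each triple. The sketch below assumes quads stored as Python tuples; the graph URIs, the `ex:` terms, and the values are invented for illustration.

```python
# Hypothetical quads: (graph URI, subject, predicate, object).
REGISTRY = "http://example.org/graph/registry"
BLOG = "http://example.org/graph/blog"
META = "http://example.org/graph/meta"

QUADS = [
    (REGISTRY, "ex:alice", "ex:age", 34),
    (BLOG, "ex:alice", "ex:age", 33),
    # Statements *about* a graph use the graph's URI as their subject.
    (META, REGISTRY, "ex:source", "registry"),
    (META, BLOG, "ex:source", "blog"),
]


def graph(quads, name):
    """Return the plain triples that belong to one named graph."""
    return [(s, p, o) for g, s, p, o in quads if g == name]


def provenance(quads, subject, predicate):
    """For each asserted value, report the recorded source of the graph that said it."""
    sources = {s: o for _, s, p, o in quads if p == "ex:source"}
    return [
        (o, sources.get(g))
        for g, s, p, o in quads
        if s == subject and p == predicate
    ]


print(provenance(QUADS, "ex:alice", "ex:age"))
```

Two graphs can disagree about the same subject, and the graph URI is what lets a consumer attach context or provenance to each claim.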

  • [2003.02320] Knowledge Graphs - In this paper we provide a comprehensive introduction to knowledge graphs, which have recently garnered significant attention from both industry and academia in scenarios that require exploiting diverse, dynamic, large-scale collections of data. After a general introduction, we motivate and contrast various graph-based data models and query languages that are used for knowledge graphs. We discuss the roles of schema, identity, and context in knowledge graphs. We explain how knowledge can be represented and extracted using a combination of deductive and inductive techniques. We summarise methods for the creation, enrichment, quality assessment, refinement, and publication of knowledge graphs. We provide an overview of prominent open knowledge graphs and enterprise knowledge graphs, their applications, and how they use the aforementioned techniques. We conclude with high-level future research directions for knowledge graphs. [4]

  • YouTube: Knowledge Engineering with Semantic Web Technologies by Dr. Harald Sack - The web has become an object of our daily life and the amount of information in the web is ever growing. Besides plain texts, especially multimedia information such as graphics, audio or video have become a predominant part of the web's information traffic. But, how can we find useful information within this huge information space? How can we make use of the knowledge contained in those web documents? Traditional search engines for example will reach the limits of their power, when it comes to understanding information content. The Semantic Web is an extension of the traditional web in the sense that information in the form of natural language text in the web will be complemented by its explicit semantics based on a formal knowledge representation. Thus, the meaning of information expressed in natural language can be accessed in an automated way and interpreted correctly, i.e. it can be ‘understood’ by machines.

  • Metaclass - a class whose instances are themselves classes. Similar to their role in programming languages, metaclasses in Semantic Web languages can have properties otherwise applicable only to individuals, while retaining the same class's ability to be classified in a concept hierarchy. This enables knowledge about instances of those metaclasses to be inferred by semantic reasoners using statements made in the metaclass. Metaclasses thus enhance the expressivity of knowledge representations in a way that can be intuitive for users. While classes are suitable to represent a population of individuals, metaclasses can, as one of their features, be used to represent the conceptual dimension of an ontology. Metaclasses are supported in the ontology language OWL and the data-modeling vocabulary RDFS.
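
Since the entry draws the parallel with programming languages, a Python metaclass gives a rough feel for the idea. This is an analogy only, not how OWL or RDFS implement metaclasses; the class names and the status value are illustrative.

```python
class Species(type):
    """A metaclass: every instance of Species is itself a class."""

    def describe(cls):
        # Available on classes such as GoldenEagle, not on individual eagles.
        return f"{cls.__name__} (conservation status: {cls.conservation_status})"


class Animal:
    """An ordinary class, used to place species in a class hierarchy."""


class GoldenEagle(Animal, metaclass=Species):
    # A property of the class as a whole, not of any single bird.
    conservation_status = "least concern"


harry = GoldenEagle()

print(isinstance(harry, GoldenEagle))    # harry is an individual eagle
print(isinstance(GoldenEagle, Species))  # the class is an instance of the metaclass
print(issubclass(GoldenEagle, Animal))   # and still sits in the class hierarchy
print(GoldenEagle.describe())
```

`GoldenEagle` plays two roles at once: it is a class with individuals such as `harry`, and it is itself an individual of `Species` that can carry its own properties.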

  • Agile Knowledge Engineering and Semantic Web (AKSW) is hosted by the Chair of Business Information Systems (BIS) of the Institute of Computer Science (IfI) / University of Leipzig as well as the Institute for Applied Informatics (InfAI). Goals: Development of methods, tools and applications for adaptive Knowledge Engineering in the context of the Semantic Web. Research of underlying Semantic Web technologies and development of fundamental Semantic Web tools and applications. Maturation of strategies for fruitfully combining the Social Web paradigms with semantic knowledge representation techniques.

  • Sindice - Data Web Services. Millions of websites mark up their content using RDF, Microformats, Microdata, RDFa, Opengraph and more. Sindice helps you find, understand and integrate with their content.

  • The Registry! - This is the home page for the Open Metadata Registry (formerly the NSDL Registry). The Metadata Registry provides services to developers and consumers of controlled vocabularies and is one of the first production deployments of the RDF-based Semantic Web Community's Simple Knowledge Organization System (SKOS).
    • The Registry! :: RDA - This page provides quick links for the Registered RDA Element Sets and Value Vocabularies. Each set of elements or vocabulary concepts has a link to the general description as well as a link to a list of elements or concepts.

  • Ontopia - Open source tools for building, maintaining and deploying Topic Maps-based applications

  • Curing the Web's Identity Crisis - This paper describes the crisis of identity facing the World Wide Web and, in particular, the RDF community. It shows how that crisis is rooted in a lack of clarity about the nature of "resources" and how concepts developed during the XML Topic Maps effort can provide a solution that works not only for Topic Maps, but also for RDF and semantic web technologies in general.

  • Semantic Web Stack - also known as Semantic Web Cake or Semantic Web Layer Cake, illustrates the architecture of the Semantic Web. The Semantic Web is a collaborative movement led by international standards body the World Wide Web Consortium (W3C). The standard promotes common data formats on the World Wide Web. By encouraging the inclusion of semantic content in web pages, the Semantic Web aims at converting the current web, dominated by unstructured and semi-structured documents, into a "web of data". The Semantic Web stack builds on the W3C's Resource Description Framework (RDF).

  • Semantic reasoner - also called a reasoning engine, rules engine, or simply a reasoner, a piece of software able to infer logical consequences from a set of asserted facts or axioms. The notion of a semantic reasoner generalizes that of an inference engine, by providing a richer set of mechanisms to work with. The inference rules are commonly specified by means of an ontology language, and often a description logic language. Many reasoners use first-order predicate logic to perform reasoning; inference commonly proceeds by forward chaining and backward chaining. There are also examples of probabilistic reasoners, including Pei Wang's non-axiomatic reasoning system, and probabilistic logic networks.
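
Forward chaining, as mentioned above, can be sketched in a few lines of Python. This toy reasoner assumes facts held as a set of triples and hard-codes only two rules, subclass transitivity and type inheritance through `rdfs:subClassOf`; the `ex:` facts are made up, and real reasoners support far richer rule sets.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

# Hypothetical asserted facts.
FACTS = {
    ("ex:Canary", SUBCLASS, "ex:Bird"),
    ("ex:Bird", SUBCLASS, "ex:Animal"),
    ("ex:tweety", RDF_TYPE, "ex:Canary"),
}


def apply_rules(facts):
    """Run both rules once over all pairs of facts and return only the new triples."""
    derived = set()
    for s, p, o in facts:
        for s2, p2, o2 in facts:
            if o != s2 or p2 != SUBCLASS:
                continue
            if p == SUBCLASS:
                # A subClassOf B, B subClassOf C  =>  A subClassOf C
                derived.add((s, SUBCLASS, o2))
            elif p == RDF_TYPE:
                # x type A, A subClassOf B  =>  x type B
                derived.add((s, RDF_TYPE, o2))
    return derived - facts


def forward_chain(facts):
    """Apply the rules repeatedly until no new facts appear (a fixed point)."""
    closure = set(facts)
    while True:
        new_facts = apply_rules(closure)
        if not new_facts:
            return closure
        closure |= new_facts


closure = forward_chain(FACTS)
print(("ex:tweety", RDF_TYPE, "ex:Animal") in closure)
```

Forward chaining starts from the asserted facts and materializes every consequence up front; backward chaining would instead start from a question and work back to the facts that support it.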

  • PublishMyData - Digital Marketplace - an end-to-end solution for publishing and connecting data on the web, helping public sector organisations work more efficiently and to deliver improved services. PublishMyData incorporates a polished end-user website, powerful management interface, performance-tuned graph store and programming interfaces - delivered together with hosting maintenance, technical support and consultancy.
  • PublishMyData | Swirrl

  • TwitLogic - a tool for data integration which translates streaming social data using common, RDF-based vocabularies and mashes it up with Linked Data, enabling new kinds of real-time social applications which take advantage of the rich background knowledge of the Semantic Web.


  • RDF 1.2 Primer - This primer is designed to provide the reader with the basic knowledge required to effectively use RDF. It introduces the basic concepts of RDF and shows concrete examples of the use of RDF.

  • W3C:RDF 1.2 Concepts and Abstract Syntax - The Resource Description Framework (RDF) is a framework for representing information in the Web. This document defines an abstract syntax (a data model) which serves to link all RDF-based languages and specifications, including: the formal model-theoretic semantics for RDF [RDF12-SEMANTICS]; serialization syntaxes for storing and exchanging RDF such as RDF 1.2 Turtle [RDF12-TURTLE] and JSON-LD 1.1 [JSON-LD11]; the SPARQL 1.2 Query Language [SPARQL12-QUERY]; the RDF 1.2 Schema [

  • Resource Description Framework (RDF) - a family of World Wide Web Consortium (W3C) specifications originally designed as a metadata data model. It has come to be used as a general method for conceptual description or modeling of information that is implemented in web resources, using a variety of syntax notations and data serialization formats. It is also used in knowledge management applications. RDF was adopted as a W3C recommendation in 1999. The RDF 1.0 specification was published in 2004, the RDF 1.1 specification in 2014.

The RDF data model is similar to classical conceptual modeling approaches (such as entity–relationship or class diagrams). It is based on the idea of making statements about resources (in particular web resources) in expressions of the form subject–predicate–object, known as triples. The subject denotes the resource, and the predicate denotes traits or aspects of the resource, and expresses a relationship between the subject and the object. For example, one way to represent the notion "The sky has the color blue" in RDF is as the triple: a subject denoting "the sky", a predicate denoting "has the color", and an object denoting "blue". Therefore, RDF uses subject instead of object (or entity) in contrast to the typical approach of an entity–attribute–value model in object-oriented design: entity (sky), attribute (color), and value (blue).

RDF is an abstract model with several serialization formats (i.e. file formats), so the particular encoding for resources or triples varies from format to format. This mechanism for describing resources is a major component in the W3C's Semantic Web activity: an evolutionary stage of the World Wide Web in which automated software can store, exchange, and use machine-readable information distributed throughout the Web, in turn enabling users to deal with the information with greater efficiency and certainty. RDF's simple data model and ability to model disparate, abstract concepts has also led to its increasing use in knowledge management applications unrelated to Semantic Web activity. A collection of RDF statements intrinsically represents a labeled, directed multi-graph. This in theory makes an RDF data model better suited to certain kinds of knowledge representation than are other relational or ontological models. However, in practice, RDF data is often stored in relational databases or native representations (also called triplestores, or quad stores if context such as the named graph is also stored for each RDF triple). As RDFS and OWL demonstrate, one can build additional ontology languages upon RDF.
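
A small Python sketch of the data model just described, assuming triples as named tuples. The `ex:` identifiers are placeholders rather than real vocabulary terms, and the helper names are invented for the example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    object: str


# Hypothetical statements; "ex:" stands for some example namespace.
GRAPH = [
    Triple("ex:sky", "ex:hasColor", "ex:blue"),
    Triple("ex:alice", "ex:knows", "ex:bob"),
    Triple("ex:alice", "ex:worksWith", "ex:bob"),
]


def edges_between(graph, source, target):
    """List every edge label from source to target (more than one is allowed)."""
    return [t.predicate for t in graph if t.subject == source and t.object == target]


def describe(graph, subject):
    """Group a subject's statements by predicate, like an entity-attribute-value view."""
    description = {}
    for t in graph:
        if t.subject == subject:
            description.setdefault(t.predicate, []).append(t.object)
    return description


print(edges_between(GRAPH, "ex:alice", "ex:bob"))
print(describe(GRAPH, "ex:sky"))
```

The two edges from `ex:alice` to `ex:bob` show why a set of triples is a multi-graph: the same pair of nodes may be connected by several differently labelled arcs.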

  • W3C: RDF 1.2 Semantics - This document describes a precise semantics for the Resource Description Framework 1.1 [RDF11-CONCEPTS] and RDF Schema [RDF11-SCHEMA]. It defines a number of distinct entailment regimes and corresponding patterns of entailment. It is part of a suite of documents which comprise the full specification of RDF 1.1.
  • RDF data are sets of ‘triples’ (aka ‘statements’) of the form (Subject, Property, Object)
  • RDF data are seen as (unranked, node- and edge-labeled) directed graphs
    • nodes of which are statement's subjects and objects and are either labeled
      • by URIs and thus representing Web resources
      • by literals, such as strings or numbers, thus representing literal resources
      • by ‘local’ identifiers thus representing ‘anonymous’ or ‘blank’ nodes.
    • arcs of which correspond to statement's properties
  • Properties are also called ‘predicates’ (statement analogy)
  • Blank nodes are commonly used to aggregate or group statements
    • e.g., in containers or collections
    • or for n-ary relations

  • W3C: RDF 1.1: On Semantics of RDF Datasets - RDF defines the concept of RDF datasets, a structure composed of a distinguished RDF graph and zero or more named graphs, being pairs comprising an IRI or blank node and an RDF graph. While RDF graphs have a formal model-theoretic semantics that determines what arrangements of the world make an RDF graph true, no agreed formal semantics exists for RDF datasets. This document presents some issues to be addressed when defining a formal semantics for datasets, as they have been discussed in the RDF 1.1 Working Group, and specify several semantics in terms of model theory, each corresponding to a certain design choice for RDF datasets.

RDF is a general method to decompose any type of knowledge into small pieces, with some rules about the semantics, or meaning, of those pieces. The point is to have a method so simple that it can express any fact, and yet so structured that computer applications can do useful things with it.

The basic unit of RDF is a statement called a triple. One can think of a triple as a type of sentence that states a single "fact" about a resource. RDF allows you to define statements about things (or resources), in the form of subject-predicate-object expressions (known as RDF-triples due to the 3 constituent parts).

  • Blank node - a node in an RDF graph representing a resource for which a URI or literal is not given. The resource represented by a blank node is also called an anonymous resource. According to the RDF standard a blank node can only be used as subject or object of an RDF triple.

The different forms for representing RDF data include:

  • Notation-3 (N3)
  • Turtle - a simplified, RDF-only subset of N3.
  • N-Triples
  • RDFa
  • TriX
  • TriG

  • XML namespaces - used for providing uniquely named elements and attributes in an XML document. They are defined in a W3C recommendation. An XML instance may contain element or attribute names from more than one XML vocabulary. If each vocabulary is given a namespace, the ambiguity between identically named elements or attributes can be resolved. A simple example would be to consider an XML instance that contained references to a customer and an ordered product. Both the customer element and the product element could have a child element named id. References to the id element would therefore be ambiguous; placing them in different namespaces would remove the ambiguity.
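
The customer/product case can be tried out with Python's standard `xml.etree.ElementTree`. The two namespace URIs and the element layout below are invented for the example; only the idea (two `id` elements told apart by namespace) comes from the text above.

```python
import xml.etree.ElementTree as ET

# Two vocabularies, each with its own "id" element (namespace URIs are hypothetical).
doc = """
<order xmlns:c="http://example.org/customer" xmlns:p="http://example.org/product">
  <c:customer><c:id>42</c:id></c:customer>
  <p:product><p:id>SKU-7</p:id></p:product>
</order>
"""

root = ET.fromstring(doc)
ns = {"c": "http://example.org/customer", "p": "http://example.org/product"}

# The prefix in the path selects which vocabulary's "id" is meant.
customer_id = root.find("c:customer/c:id", ns).text
product_id = root.find("p:product/p:id", ns).text
```

Internally ElementTree stores each name together with its namespace URI, so the two `id` elements never collide.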

  • A tidyverse lover’s intro to RDF - In the world of data science, RDF is a bit of an ugly duckling. Like XML and Java, only without the massive-adoption-that-refuses-to-die part. In fact RDF is most frequently expressed in XML, and RDF tools are written in Java, which help give RDF the aesthetics of steampunk, of some technology for some futuristic Semantic Web in a toolset that feels about as lightweight and modern as an iron dreadnought. But don’t let these appearances deceive you. RDF really is cool.

  • RDF Dataset Normalization - RDF describes a graph-based data model for making claims about the world and provides the foundation for reasoning upon that graph of information. At times, it becomes necessary to compare the differences between sets of graphs, digitally sign them, or generate short identifiers for graphs via hashing algorithms. This document outlines an algorithm for normalizing RDF datasets such that these operations can be performed.
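
To see why a canonical form matters for hashing and signing, here is a deliberately simplified sketch. It is not the normalization algorithm from the document above: it assumes the graph contains no blank nodes (relabeling blank nodes deterministically is the hard part the real algorithm solves), does no literal escaping, and the function names are invented.

```python
import hashlib

def ntriples_line(s, p, o):
    """Render one triple as an N-Triples-style line (literals are passed pre-quoted)."""
    def term(t):
        return t if t.startswith('"') else f"<{t}>"
    return f"{term(s)} {term(p)} {term(o)} ."

def graph_hash(triples):
    """SHA-256 over the sorted, de-duplicated lines, so statement order does not matter."""
    lines = sorted({ntriples_line(*t) for t in triples})
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()

a = ("http://example.org/john", "http://example.org/hasMother", "http://example.org/susan")
b = ("http://example.org/john", "http://example.org/name", '"John"')
```

With this, `graph_hash([a, b])` and `graph_hash([b, a])` agree, while any change to a statement changes the digest.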

  • R2RML - RDB to RDF Mapping Language. This document describes R2RML, a language for expressing customized mappings from relational databases to RDF datasets. Such mappings provide the ability to view existing relational data in the RDF data model, expressed in a structure and target vocabulary of the mapping author's choice. R2RML mappings are themselves RDF graphs and written down in Turtle syntax. R2RML enables different types of mapping implementations. Processors could, for example, offer a virtual SPARQL endpoint over the mapped relational data, or generate RDF dumps, or offer a Linked Data interface.
  • RML - an extension of R2RML. RML rules are executed by processors. To ease the creation and execution of these rules, we developed graphical user interfaces and YARRRML, a human-readable text-based representation based on YAML, together with the relevant tooling. We created wrappers to ease the use of our processors in your development environment. It is also possible to validate your RML rules to improve the quality of your resulting knowledge graphs.
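
What such a mapping does can be imitated by hand with the standard `sqlite3` module. This is a hard-coded sketch, not an R2RML or RML processor: the `emp` table, the `http://example.org/` namespace and the subject template are assumptions made for the example, whereas a real mapping would be declared in Turtle and interpreted by a processor.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical target namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT)")
conn.executemany("INSERT INTO emp VALUES (?, ?)", [(7369, "SMITH"), (7499, "ALLEN")])

def map_rows(connection):
    """Yield (subject, predicate, object) triples for every row of emp."""
    query = "SELECT empno, ename FROM emp ORDER BY empno"
    for empno, ename in connection.execute(query):
        subject = f"{BASE}employee/{empno}"            # subject built from the key column
        yield (subject, RDF_TYPE, BASE + "Employee")   # every row becomes an Employee
        yield (subject, BASE + "name", f'"{ename}"')   # column value becomes a literal

triples = list(map_rows(conn))
```

Each row yields one typing triple and one triple per mapped column, which is the "view relational data as RDF" idea in miniature.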


See also Documents



  • RDF 1.1 XML Syntax - This document defines an XML syntax for RDF called RDF/XML in terms of Namespaces in XML, the XML Information Set and XML Base.

Here's some RDF XML:

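<!-- The rdf namespace is the standard one; the example.org IRIs are placeholders for IRIs missing here -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ns="http://example.org/#">
  <ns:Person rdf:about="http://example.org/#john">
    <ns:hasMother rdf:resource="http://example.org/#susan" />
    <ns:hasFather>
      <rdf:Description rdf:about="http://example.org/#richard">
        <ns:hasBrother rdf:resource="http://example.org/#luke" />
      </rdf:Description>
    </ns:hasFather>
  </ns:Person>
</rdf:RDF>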

  • XML schema - a description of a type of XML document, typically expressed in terms of constraints on the structure and content of documents of that type, above and beyond the basic syntactical constraints imposed by XML itself. These constraints are generally expressed using some combination of grammatical rules governing the order of elements, Boolean predicates that the content must satisfy, data types governing the content of elements and attributes, and more specialized rules such as uniqueness and referential integrity constraints. There are languages developed specifically to express XML schemas. The document type definition (DTD) language, which is native to the XML specification, is a schema language that is of relatively limited capability, but that also has other uses in XML aside from the expression of schemas. Two more expressive XML schema languages in widespread use are XML Schema (with a capital S) and RELAX NG. The mechanism for associating an XML document with a schema varies according to the schema language. The association may be achieved via markup within the XML document itself, or via some external means.

  • Document type definition - or DTD, a set of markup declarations that define a document type for an SGML-family markup language (GML, SGML, XML, HTML). A DTD defines the valid building blocks of an XML document. It defines the document structure with a list of validated elements and attributes. A DTD can be declared inline inside an XML document, or as an external reference. XML uses a subset of SGML DTD. As of 2009, newer XML namespace-aware schema languages (such as W3C XML Schema and ISO RELAX NG) have largely superseded DTDs. A namespace-aware version of DTDs is being developed as Part 9 of ISO DSDL. DTDs persist in applications that need special publishing characters, such as the XML and HTML Character Entity References, which derive from larger sets defined as part of the ISO SGML standard effort.


  • Notation3 (N3): A readable RDF syntax - This document defines Notation 3 (also known as N3), an assertion and logic language which is a superset of RDF. N3 extends the RDF datamodel by adding formulae (literals which are graphs themselves), variables, logical implication, and functional predicates, as well as providing a textual syntax alternative to RDF/XML.

Here's some N3 RDF:

@prefix : <> .
:john    a           :Person .
:john    :hasMother  :susan .
:john    :hasFather  :richard .
:richard :hasBrother :luke .
  • RDF for "Little Languages" - This note describes an experimental software development in which RDF/N3 is used to code query and report generation functions performed on RDF data.


  • RDF 1.1 N-Quads - N-Quads is a line-based, plain text format for encoding an RDF dataset.
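
Because the format is line-based, writing it by hand is straightforward. The helper below is a rough sketch with invented names: it does no escaping of special characters in literals or IRIs, and it expects literals to arrive already quoted.

```python
def term(t):
    """IRIs go in angle brackets; blank nodes (_:x) and quoted literals pass through."""
    return t if t.startswith(("_:", '"')) else f"<{t}>"

def nquad(s, p, o, g=None):
    """One statement per line; the fourth term (the graph name) is optional."""
    parts = [term(s), term(p), term(o)]
    if g is not None:
        parts.append(term(g))
    return " ".join(parts) + " ."

line = nquad("http://example.org/john", "http://example.org/name", '"John"',
             "http://example.org/graph1")
```

Leaving out the graph name gives a plain triple line, which places the statement in the default graph of the dataset.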


  • RDF 1.2 Turtle - This document defines a textual syntax for RDF called Turtle that allows an RDF graph to be completely written in a compact and natural text form, with abbreviations for common usage patterns and datatypes. Turtle provides levels of compatibility with the N-Triples [N-TRIPLES] format as well as the triple pattern syntax of the SPARQL W3C Recommendation.


  • TriG - RDF Dataset Language. A concrete syntax for RDF as defined in the RDF Concepts and Abstract Syntax ([rdf11-concepts]). TriG is an extension of Turtle ([turtle]), extended to support representing a complete RDF Dataset.


  • TriX : RDF Triples in XML - Abstract: Many approaches to writing RDF in XML have been proposed. The revised standard RDF/XML still has many known problems. It is not intrinsically difficult to have a clear serialization of RDF in XML, and we present a simple solution. We add the ability to name graphs, noting that in practice this is already widely used. We use XSLT as a general syntactic extensibility mechanism to provide human friendly macros for our syntax. Notes: Patrick Stickler, Nokia, Tampere, Finland
  • TriX - a serialization format for RDF (Resource Description Framework) graphs. It is an XML format for serializing Named Graphs and RDF Datasets which offers a compact and readable alternative to the XML-based RDF/XML syntax. It was jointly created by HP Labs and Nokia. It has been suggested that digital artifacts that depend on the serialization format need a means of verifying immutability; otherwise digital artifacts, including datasets, code, texts, and images, are neither verifiable nor permanent. Embedding cryptographic hash values in the applied URIs has been suggested for structured data files such as nano-publications.


Notation3 (N3) is not a real spec.

  • Notation3, or N3 as it is more commonly known - a shorthand non-XML serialization of Resource Description Framework models, designed with human-readability in mind: N3 is much more compact and readable than XML RDF notation. The format is being developed by Tim Berners-Lee and others from the Semantic Web community. A formalization of the logic underlying N3 was published by Berners-Lee and others in 2008. N3 has several features that go beyond a serialization for RDF models, such as support for RDF-based rules. Turtle is a simplified, RDF-only subset of N3.


  • JSON-LD 1.1 - a useful data serialization and messaging format. This specification defines JSON-LD, a JSON-based format to serialize Linked Data. The syntax is designed to easily integrate into deployed systems that already use JSON, and provides a smooth upgrade path from JSON to JSON-LD. It is primarily intended to be a way to use Linked Data in Web-based programming environments, to build interoperable Web services, and to store Linked Data in JSON-based storage engines.
  • W3C JSON-LD Working Group - The mission of the JSON-LD Working Group is to update the JSON-LD 1.0 specifications to address specific usability or technical issues based on the community’s experiences, implementer feedback, and requests for new features.
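
The "smooth upgrade path from JSON" is easiest to see on a small document: ordinary JSON keys plus an `@context` that maps them to IRIs. The sketch below uses only the standard `json` module; `expand_terms` is a made-up helper that handles flat documents with simple string mappings, and is far from the full JSON-LD expansion algorithm. The person data is invented.

```python
import json

doc = json.loads("""
{
  "@context": {
    "name": "http://schema.org/name",
    "homepage": "http://schema.org/url"
  },
  "@id": "http://example.org/people/jane",
  "name": "Jane Doe",
  "homepage": "http://example.org/jane"
}
""")

def expand_terms(document):
    """Replace short keys by the IRIs the @context maps them to (flat documents only)."""
    context = document.get("@context", {})
    return {context.get(key, key): value
            for key, value in document.items() if key != "@context"}

expanded = expand_terms(doc)
```

A consumer that ignores `@context` still sees plain JSON with `name` and `homepage`, which is the point of the design.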

  • JSON-LD 1.1 Framing - allows developers to query by example and force a specific tree layout to a JSON-LD document. This specification describes a superset of the features defined in JSON-LD Framing 1.0 [JSON-LD10-FRAMING] and, except where noted, the algorithms described in this specification are fully compatible with documents created using the previous community standard.

  • JSON-LD Playground - Play around with JSON-LD markup by typing out some JSON below and seeing what gets generated from it at the bottom of the page. Pick any of the examples below to get started.

  • - "JSON-LD was created by people that have been directly involved in the Linked Data, lowercase semantic web, uppercase Semantic Web, Microformats, Microdata, and RDFa work. It has proven to be useful to them. There are a number of very large technology companies that have adopted JSON-LD, further underscoring its utility."

"Full Disclosure: I am one of the primary creators of JSON-LD, lead editor on the JSON-LD 1.0 specification, and chair of the JSON-LD Community Group. These are my personal opinions and not the opinions of the W3C, JSON-LD Community Group, or my company. ... TL;DR: The desire for better Web APIs is what motivated the creation of JSON-LD, not the Semantic Web. If you want to make the Semantic Web a reality, stop making the case for it and spend your time doing something more useful, like actually making machines smarter or helping people publish data in a way that’s useful to them."


  • Linked Data Fragments - A huge amount of Linked Data is available on the Web. But can live applications use it? SPARQL endpoints are expensive for the server, and not always available for all datasets. Downloadable dumps are expensive for clients, and do not allow live querying on the Web. With Linked Data Fragments, and specifically the Triple Pattern Fragments interface, we aim to explore what happens when we redistribute the load between clients and servers. We then measure the impact of such interfaces on clients, servers, and caches. Between the data dump (high client cost, high availability, high bandwidth) and the SPARQL endpoint (high server cost, low availability, low bandwidth), have we explored all options in-between? Such solutions allow you to reliably execute queries against live Linked Data on the Web. You can even perform federated querying, all in your browser.
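
The division of labour behind Triple Pattern Fragments can be mimicked locally: the "server" only answers single triple patterns, and the client performs the join. In this sketch `match` stands in for the server (no HTTP, paging or metadata is modelled), and the data and function names are invented.

```python
def match(triples, s=None, p=None, o=None):
    """Return the triples matching one pattern; None acts as a variable."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

DATA = [
    ("ex:john", "ex:hasFather", "ex:richard"),
    ("ex:richard", "ex:hasBrother", "ex:luke"),
    ("ex:john", "ex:hasMother", "ex:susan"),
]

def uncles_of(person):
    """Client-side join of two triple-pattern requests."""
    return [brother
            for _, _, father in match(DATA, s=person, p="ex:hasFather")
            for _, _, brother in match(DATA, s=father, p="ex:hasBrother")]
```

The server-side work per request stays trivial, at the price of more requests and more client computation, which is the trade-off the text describes.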





  • hAtom – for marking up Atom feeds from within standard HTML
  • hCalendar – for events
  • hCard – for contact information; includes:
    • adr – for postal addresses
    • geo – for geographical coordinates (latitude, longitude)
  • hMedia – for audio/video content
  • hNews – for news content
  • hProduct – for products
  • hRecipe – for recipes and foodstuffs
  • hResume – for resumes or CVs
  • hReview – for reviews
  • rel-directory – for distributed directory creation and inclusion
  • rel-enclosure – for multimedia attachments to web pages
  • rel-license – specification of copyright license
  • rel-nofollow – an attempt to discourage third-party content spam (e.g. spam in blogs)
  • rel-tag – for decentralized tagging (Folksonomy)
  • xFolk – for tagged links
  • XHTML Friends Network (XFN) – for social relationships
  • XOXO – for lists and outlines


RDFa 1.1 reached recommendation status in June 2012.

  • RDFa - an extension to HTML5 that helps you markup things like People, Places, Events, Recipes and Reviews. Search Engines and Web Services use this markup to generate better search listings and give you better visibility on the Web, so that people can find your website more easily.

  • XHTML+RDFa - an extended version of the XHTML markup language for supporting RDF through a collection of attributes and processing rules in the form of well-formed XML documents. XHTML+RDFa is one of the techniques used to develop Semantic Web content by embedding rich semantic markup. Version 1.1 of the language is a superset of XHTML 1.1, integrating the attributes according to RDFa Core 1.1. In other words, it provides RDFa support through XHTML Modularization. RDFa in XHTML version 1.0 became a World Wide Web Consortium (W3C) Recommendation on 14 October 2008. The current recommendation is RDFa+XHTML version 1.1, which became a W3C Recommendation on 7 June 2012 and was updated with a "Second Edition" on 22 August 2013 and a "Third Edition" on 17 March 2015. Version 1.1 is based on XHTML 1.1 - Module-based XHTML - Second Edition. Version 1.0 was based on the first edition.



  • Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site.
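
A minimal Sitemap of that shape can be produced with the standard library. The element names (`urlset`, `url`, `loc`, `lastmod`, `changefreq`, `priority`) and the namespace follow the sitemaps.org protocol; the `build_sitemap` helper and the sample entry are invented for this sketch, which does no validation of dates or priorities.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """entries: iterable of dicts with 'loc' and optional lastmod/changefreq/priority."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        for field in ("loc", "lastmod", "changefreq", "priority"):
            if field in entry:
                ET.SubElement(url, field).text = str(entry[field])
    return ET.tostring(urlset, encoding="unicode")

xml_text = build_sitemap([
    {"loc": "http://example.org/", "lastmod": "2024-01-01",
     "changefreq": "monthly", "priority": 0.8},
])
```

Only `loc` is required per URL; the other three fields are the optional metadata mentioned above.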





  • Microdata to RDF – Second Edition - HTML microdata is an extension to HTML used to embed machine-readable data into HTML documents. Whereas the microdata specification describes a means of markup, the output format is JSON. This specification describes processing rules that may be used to extract RDF [RDF11-CONCEPTS] from an HTML document containing microdata.


Open Graph Protocol



  • provides a collection of schemas, i.e., html tags, that webmasters can use to markup their pages in ways recognized by major search providers. Search engines including Bing, Google, Yahoo! and Yandex rely on this markup to improve the display of search results, making it easier for people to find the right web pages.

Twitter Cards


  • Twitter Cards make it possible for you to attach media experiences to Tweets that link to your content. Simply add a few lines of HTML to your webpages, and users who Tweet links to your content will have a "card" added to the Tweet that’s visible to all of their followers.

Web Credits


  • ValueFlows - a set of common vocabularies to describe flows of economic resources of all kinds within distributed economic ecosystems. Purpose: to enable internetworking among many different software projects for resource planning and accounting within fractal networks of people and groups. The vocabulary will work for any kind of economic activity, but the focus is to facilitate groups experimenting with solidarity / cooperative / collaborative / small business ecosystem / commons based peer production / any transitional economies. Or, with fewer buzzwords, "let's help a lot of alternative economic software projects that are solving different pieces of the same puzzle be able to work together". One of the purposes of this vocab is to support resource flows connecting many websites. These flows may be oriented around Processes, Exchanges, or combinations of both. We want to support RDF based and non-RDF based use of the vocabulary, basically any way that people want to use software and data on the internet to help create economic networks.


  • Gleaning Resource Descriptions from Dialects of Languages (GRDDL) - a mechanism for Gleaning Resource Descriptions from Dialects of Languages. This GRDDL specification introduces markup based on existing standards for declaring that an XML document includes data compatible with the Resource Description Framework (RDF) and for linking to algorithms (typically represented in XSLT), for extracting this data from the document. The markup includes a namespace-qualified attribute for use in general-purpose XML documents and a profile-qualified link relationship for use in valid XHTML documents. The GRDDL mechanism also allows an XML namespace document (or XHTML profile document) to declare that every document associated with that namespace (or profile) includes gleanable data and for linking to an algorithm for gleaning the data.

  • GRDDL Primer - a technique for obtaining RDF data from XML documents and in particular XHTML pages. Authors may explicitly associate documents with transformation algorithms, typically represented in XSLT, using a link element in the head of the document. Alternatively, the information needed to obtain the transformation may be held in an associated metadata profile document or namespace document.

to sort

  • RDF HDT (Header, Dictionary, Triples) is a compact data structure and binary serialization format for RDF that keeps big datasets compressed to save space while maintaining search and browse operations without prior decompression. This makes it an ideal format for storing and sharing RDF datasets on the Web.
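
The Dictionary and Triples parts rest on a simple idea: store each distinct term once and represent triples as small integers. The sketch below shows only that idea. Real HDT uses separate dictionary sections and compact bitmap structures, none of which is modelled here, and the function names are invented.

```python
def encode(triples):
    """Map every distinct term to an integer ID and store triples as ID tuples."""
    terms = sorted({term for triple in triples for term in triple})
    ids = {term: i for i, term in enumerate(terms, start=1)}
    encoded = sorted((ids[s], ids[p], ids[o]) for s, p, o in triples)
    return terms, encoded

def decode(terms, encoded):
    """Invert encode(): look each ID up in the dictionary (IDs are 1-based)."""
    return [tuple(terms[i - 1] for i in triple) for triple in encoded]

data = [
    ("ex:john", "ex:hasFather", "ex:richard"),
    ("ex:richard", "ex:hasBrother", "ex:luke"),
]
terms, encoded = encode(data)
```

Long IRIs that recur in many triples are stored once in `terms`, and because `encoded` is sorted it can be searched without first turning IDs back into strings.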

  • XDI - short for "eXtensible Data Interchange", a semantic data interchange format and protocol under development by the OASIS XDI Technical Committee. The name comes from the addressable graph model XDI uses: every node in the XDI graph is its own RDF graph that is uniquely addressable.

  • Semantic web service - like conventional web services, the server end of a client–server system for machine-to-machine interaction via the World Wide Web. Semantic services are a component of the semantic web because they use markup which makes data machine-readable in a detailed and sophisticated way (as compared with human-readable HTML which is usually not easily "understood" by computer programs).

  • Web Application Description Language - This specification describes the Web Application Description Language (WADL). An increasing number of Web-based enterprises (Google, Yahoo, Amazon, Flickr to name but a few) are developing HTTP-based applications that provide programmatic access to their internal data. Typically these applications are described using textual documentation that is sometimes supplemented with more formal specifications such as XML schema for XML-based data formats. WADL is designed to provide a machine process-able description of such HTTP-based Web applications.

Description Logic

  • Description logics (DL) - a family of formal knowledge representation languages. Many DLs are more expressive than propositional logic but less expressive than first-order logic. In contrast to the latter, the core reasoning problems for DLs are (usually) decidable, and efficient decision procedures have been designed and implemented for these problems. There are general, spatial, temporal, spatiotemporal, and fuzzy description logics, and each description logic features a different balance between expressive power and reasoning complexity by supporting different sets of mathematical constructors. DLs are used in artificial intelligence to describe and reason about the relevant concepts of an application domain (known as terminological knowledge). They are of particular importance in providing a logical formalism for ontologies and the Semantic Web: the Web Ontology Language (OWL) and its profiles are based on DLs. The most notable application of DLs and OWL is in biomedical informatics where DL assists in the codification of biomedical knowledge.

  • ABox - an "assertion component": a fact associated with a conceptual model or ontologies within a knowledge base. The terms "ABox" and "TBox" are used to describe two different types of statements in knowledge bases. TBox statements describe a domain of interest by defining classes and properties as a domain vocabulary. ABox statements are TBox-compliant statements that use the vocabulary. TBox statements are sometimes associated with object-oriented classes and ABox statements associated with instances of those classes.
  • TBox - a "terminological component": a conceptualization associated with a set of facts, known as an ABox. TBox statements describe a conceptualization of a domain of interest by defining different sets of individuals described in terms of their characteristics (properties). ABox statements are TBox-compliant statements about individuals belonging to these sets. For instance, a specific student is an individual in the set called "Student". This set can be defined as a subset of all people that attend some educational institution, making it possible to state the specific educational institution each individual attends.
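
The split between the two kinds of statements can be made concrete with the student example. Everything below is an invented toy: a TBox holding the declared vocabulary, an ABox of assertions about individuals, and a check that the ABox only uses vocabulary the TBox declares. No real description logic reasoner works on such a simplified structure.

```python
# TBox: the vocabulary (classes, properties, and a subclass axiom).
TBOX = {
    "classes": {"Person", "Student", "Institution"},
    "properties": {"attends"},
    "subclass_of": {("Student", "Person")},
}

# ABox: assertions about individuals, using that vocabulary.
ABOX = [
    ("alice", "type", "Student"),
    ("uni1", "type", "Institution"),
    ("alice", "attends", "uni1"),
]

def undeclared_terms(abox, tbox):
    """Classes and properties used by ABox assertions but missing from the TBox."""
    missing = set()
    for _, predicate, obj in abox:
        if predicate == "type":
            if obj not in tbox["classes"]:
                missing.add(obj)
        elif predicate not in tbox["properties"]:
            missing.add(predicate)
    return missing
```

An empty result means the ABox is "TBox-compliant" in the loose sense used here, that is, it stays within the declared vocabulary.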

  • F-logic - a knowledge representation and ontology language. F-logic combines the advantages of conceptual modeling with object-oriented, frame-based languages and offers a declarative, compact and simple syntax, as well as the well-defined semantics of a logic-based language. Features include, among others, object identity, complex objects, inheritance, polymorphism, query methods, encapsulation. F-logic stands in the same relationship to object-oriented programming as classical predicate calculus stands to relational database programming.

Ontology languages

  • Ontology languages - formal languages used to construct ontologies. They allow the encoding of knowledge about specific domains and often include reasoning rules that support the processing of that knowledge. Ontology languages are usually declarative languages and are almost always generalizations of frame languages.


  • DAML (DARPA Agent Markup Language) - the name of a US funding program at the US Defense Advanced Research Projects Agency (DARPA) started in 1999 by then-Program Manager James Hendler, and later run by Murray Burke, Mark Greaves and Michael Pagels. The program focused on the creation of machine-readable representations for the Web. One of the Investigators working on the program was Tim Berners-Lee and to a great degree through his influence, working with the program managers, the effort worked to create technologies and demonstrations for what is now called the Semantic Web and this in turn led to the growth of Knowledge Graph technology. A primary outcome of the DAML program was the DAML language, an agent markup language based on RDF. This language was then followed by an extension entitled DAML+OIL which included researchers outside of the DARPA program in the design. The 2002 submission of the DAML+OIL language to the World Wide Web Consortium (W3C) captures the work done by DAML contractors and the EU/U.S. ad hoc Joint Committee on Markup Languages. This submission was the starting point for the language (later called OWL) to be developed by W3C's web ontology working group, WebOnt.

DAML+OIL was a syntax, layered on RDF and XML, that could be used to describe sets of facts making up an ontology. DAML+OIL had its roots in three main languages - DAML, as described above, OIL (Ontology Inference Layer) and SHOE, an earlier US research project. A major innovation of the languages was to use RDF and XML for a basis, and to use RDF namespaces to organize and assist with the integration of arbitrarily many different and incompatible ontologies. Articulation ontologies can link these competing ontologies through codification of analogous subsets in a neutral point of view, as is done in the Wikipedia. Current ontology research derived in part from DAML is leading toward the expression of ontologies and rules for reasoning and action. Much of the work in DAML has now been incorporated into RDF Schema, OWL, and their successor languages and technologies.

  • OIL (Ontology Inference Layer) - can be regarded as an ontology infrastructure for the Semantic Web. OIL is based on concepts developed in Description Logic (DL) and frame-based systems and is compatible with RDFS. OIL was developed by Dieter Fensel, Frank van Harmelen (Vrije Universiteit, Amsterdam) and Ian Horrocks (University of Manchester) as part of the IST OntoKnowledge project. Much of the work in OIL was subsequently incorporated into DAML+OIL and the Web Ontology Language (OWL).

RDF Schema / RDFS

  • RDF Schema - Resource Description Framework Schema (variously abbreviated as RDFS, RDF(S), RDF-S, or RDF/S) is a set of classes with certain properties using the RDF extensible knowledge representation data model, providing basic elements for the description of ontologies, otherwise called RDF vocabularies, intended to structure RDF resources. These resources can be saved in a triplestore to reach them with the query language SPARQL. The first version was published by the World-Wide Web Consortium (W3C) in April 1998, and the final W3C recommendation was released in February 2004. Many RDFS components are included in the more expressive Web Ontology Language (OWL).

  • RDF/S allows so-called RDF Schemas (or ontologies) similar to object-oriented class hierarchies or taxonomies
  • Inheritance model of RDF/S exhibits the following peculiarities:
    • same resource may be classified in different, unrelated classes
    • class hierarchy may be cyclic → all classes on cycle equivalent
    • properties are first-class
      • associates range and domain to property, rather than which properties a class can carry
  • Inference rules are used to define the semantics (or entailment) of an RDF/S schema
    • e.g., transitivity of the class hierarchy or
    • inferred type of an untyped resource in the domain of a property
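
The two inference examples above (transitivity of the class hierarchy, and typing a resource through the domain of a property) can be sketched as a naive fixpoint computation. This covers only three RDFS-style rules, uses prefixed names as plain strings, and the `ex:` data is invented; it is not a complete RDFS entailment implementation.

```python
TYPE, SUBCLASS, DOMAIN = "rdf:type", "rdfs:subClassOf", "rdfs:domain"

def rdfs_closure(triples):
    """Apply three RDFS-style rules until no new triple appears (naive fixpoint)."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            if p == SUBCLASS:
                # transitivity: A subClassOf B, B subClassOf C  =>  A subClassOf C
                new |= {(s, SUBCLASS, c) for b, q, c in inferred
                        if q == SUBCLASS and b == o}
                # type propagation: x type A, A subClassOf B  =>  x type B
                new |= {(x, TYPE, o) for x, q, a in inferred
                        if q == TYPE and a == s}
            elif p == DOMAIN:
                # domain typing: P domain C, x P y  =>  x type C
                new |= {(x, TYPE, o) for x, q, _ in inferred if q == s}
        if new <= inferred:
            return inferred
        inferred |= new

schema_and_data = {
    ("ex:Student", SUBCLASS, "ex:Person"),
    ("ex:Person", SUBCLASS, "ex:Agent"),
    ("ex:attends", DOMAIN, "ex:Student"),
    ("ex:alice", "ex:attends", "ex:uni"),   # alice has no explicit type
}
closure = rdfs_closure(schema_and_data)
```

Note that `ex:alice` is never declared a student: her type follows from using a property whose domain is `ex:Student`, which is the "properties are first-class" point made above.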


  • OWL - a Web Ontology language. Where earlier languages have been used to develop tools and ontologies for specific user communities (particularly in the sciences and in company-specific e-commerce applications), they were not defined to be compatible with the architecture of the World Wide Web in general, and the Semantic Web in particular. OWL uses both URIs for naming and the description framework for the Web provided by RDF to add the following capabilities to ontologies: Ability to be distributed across many systems, Scalability to Web needs, Compatibility with Web standards for accessibility and internationalization, Openness and extensibility. OWL builds on RDF and RDF Schema and adds more vocabulary for describing properties and classes: among others, relations between classes (e.g. disjointness), cardinality (e.g. "exactly one"), equality, richer typing of properties, characteristics of properties (e.g. symmetry), and enumerated classes.


  • RDFS vs Owl - RDFS allows you to express the relationships between things by standardizing on a flexible, triple-based format and then providing a vocabulary (“keywords” such as rdf:type or rdfs:subClassOf) which can be used to say things. OWL is similar, but bigger, better, and badder. OWL lets you say much more about your data model, it shows you how to work efficiently with database queries and automatic reasoners, and it provides useful annotations for bringing your data models into the real world.

  • OWL 2 Web Ontology Language Mapping to RDF Graphs (Second Edition) - The OWL 2 Web Ontology Language, informally OWL 2, is an ontology language for the Semantic Web with formally defined meaning. OWL 2 ontologies provide classes, properties, individuals, and data values and are stored as Semantic Web documents. OWL 2 ontologies can be used along with information written in RDF, and OWL 2 ontologies themselves are primarily exchanged as RDF documents. The OWL 2 Document Overview describes the overall state of OWL 2, and should be read before other OWL 2 documents. This document defines the mapping of OWL 2 ontologies into RDF graphs, and vice versa.

  • Synonym ring - or synset, a group of data elements that are considered semantically equivalent for the purposes of information retrieval. These data elements are frequently found in different metadata registries. Although a group of terms can be considered equivalent, metadata registries store the synonyms at a central location called the preferred data element. According to WordNet, a synset or synonym set is defined as a set of one or more synonyms that are interchangeable in some context without changing the truth value of the proposition in which they are embedded. A synonym ring can be expressed by a series of statements in the Web Ontology Language (OWL) using owl:equivalentClass, owl:equivalentProperty, or, for instances, the owl:sameAs property.
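
Since equivalence statements of this kind are symmetric and transitive, a set of pairwise statements partitions the terms into rings. A small union-find sketch recovers those rings; the function name and the sample terms are invented, and the pairs stand in for equivalence triples.

```python
def synonym_rings(same_as_pairs):
    """Group terms into rings, treating each pair as a symmetric, transitive link."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        parent[find(a)] = find(b)          # merge the two rings

    rings = {}
    for term in list(parent):
        rings.setdefault(find(term), set()).add(term)
    return sorted(sorted(ring) for ring in rings.values())

rings = synonym_rings([
    ("car", "automobile"),
    ("automobile", "motorcar"),
    ("bike", "bicycle"),
])
```

"car" and "motorcar" end up in the same ring although no statement links them directly.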

  • PR-OWL - an open research work aimed to extend the OWL ontology Web language so it can represent probabilistic ontologies. In other words, it is a probabilistic extension to OWL that provides a framework for authoring probabilistic ontologies and is based on the Bayesian first-order logic called Multi-Entity Bayesian Networks (MEBN).
  • - an ontology representation language that enables such perceptual modeling. It assumes a causal model of the world, where observable media features are caused by underlying concepts. In MOWL, it is possible to associate different types of media features in different media format and at different levels of abstraction with the concepts in a closed domain. The associations are probabilistic in nature to account for inherent uncertainties in observation of media patterns. The spatial and temporal relations between the media properties characterizing a concept (or, event) can also be expressed using MOWL. Often the concepts in a domain inherit the media properties of some related concepts, such as a historic monument inheriting the color and texture properties of its building material. It is possible to reason with the media properties of the concepts in a domain to derive an Observation Model for a concept. Finally, MOWL supports an abductive reasoning framework using Bayesian networks, that is robust against imperfect observations of media data.

  • Visual Data Web - Visually Experiencing the Data Web - provides an overview of our attempts at a more visual Data Web. The term Data Web refers to the evolution of a mainly document-centric Web toward a more data-oriented Web. In its narrow sense, the term describes pragmatic approaches of the Semantic Web, such as RDF and Linked Data. In a broader sense, it also includes less formal data structures, such as microformats, microdata, tagging, and folksonomies. The term Visual Data Web reflects our goal of making the Data Web visually more experienceable, also for average Web users with little to no knowledge about the underlying technologies. This website presents developments, related publications, and current activities to generate new ideas, methods, and tools that help make the Data Web more easily accessible, more visible, and thus more attractive.

  • Binary OWL - "This paper presents a binary format for both storing OWL ontologies and describing changes in OWL ontologies. The format is designed to be a fast to parse and serialise format. It is intended as a low level storage and transmission mechanism rather than an end user exchange syntax. Software to parse and serialise binary OWL has been implemented in the form of OWL API parsers and renderers. Some initial experiments seem to indicate that a Binary OWL ontology document can be parsed roughly an order of magnitude faster than the corresponding RDF/XML document."



  • SWRL (Semantic Web Rule Language) - a proposed language for the Semantic Web that can be used to express rules as well as logic, combining OWL DL or OWL Lite with a subset of the Rule Markup Language (itself a subset of Datalog). The specification was submitted in May 2004 to the W3C by the National Research Council of Canada, Network Inference (since acquired by webMethods), and Stanford University in association with the Joint US/EU ad hoc Agent Markup Language Committee. The specification was based on an earlier proposal for an OWL rules language. SWRL has the full power of OWL DL, but at the price of decidability and practical implementations. However, decidability can be regained by restricting the form of admissible rules, typically by imposing a suitable safety condition.

RDF Lists

  • PDF: An Ordered RDF List - a proposal to introduce a fundamental concept of ordered lists in RDF

  • PDF: Modelling and Querying Lists in RDF. A Pragmatic Study - Many Linked Data datasets model elements in their domains in the form of lists: a countable number of ordered resources. When publishing these lists in RDF, an important concern is making them easy to consume. Therefore, a well-known recommendation is to find an existing list modelling solution, and reuse it. However, a specific domain model can be implemented in different ways and vocabularies may provide alternative solutions. In this paper, we argue that a wrong decision could have a significant impact in terms of performance and, ultimately, the availability of the data. We take the case of RDF Lists and make the hypothesis that the efficiency of retrieving sequential linked data depends primarily on how they are modelled (triple-store invariance hypothesis). To demonstrate this, we survey different solutions for modelling sequences in RDF, and propose a pragmatic approach for assessing their impact on data availability. Finally, we derive good (and bad) practices on how to publish lists as linked open data. By doing this, we sketch the foundations of an empirical, task-oriented methodology for benchmarking linked data modelling solutions.

  • List.MID: A MIDI-Based Benchmark for Evaluating RDF Lists | Zenodo - The RDF list data is coherently generated from a large, community-curated base collection of Web MIDI files, rich in lists of musical events of arbitrary length. We describe the List.MID benchmark, and discuss its impact and adoption, reusability, design, and availability.
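
To make the modelling question above more concrete, here is a minimal Python sketch of the standard rdf:first / rdf:rest collection encoding, using plain tuples instead of a real triple store. The helper names (encode_list, read_list), the blank node labels (_:b0, _:b1, ...) and the note names used as list items are assumptions made for illustration only; they are not taken from the papers or the benchmark listed here.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FIRST, REST, NIL = RDF + "first", RDF + "rest", RDF + "nil"


def encode_list(items):
    """Return (head, triples) encoding items as an RDF collection.

    Each element gets one blank node carrying an rdf:first triple (the value)
    and an rdf:rest triple (the link to the next node, or rdf:nil at the end).
    """
    if not items:
        return NIL, []
    nodes = [f"_:b{i}" for i in range(len(items))]  # hypothetical blank node labels
    triples = []
    for i, item in enumerate(items):
        next_node = nodes[i + 1] if i + 1 < len(nodes) else NIL
        triples.append((nodes[i], FIRST, item))
        triples.append((nodes[i], REST, next_node))
    return nodes[0], triples


def read_list(head, triples):
    """Walk the rdf:rest links from head, collecting rdf:first values in order."""
    first = {s: o for s, p, o in triples if p == FIRST}
    rest = {s: o for s, p, o in triples if p == REST}
    values = []
    node = head
    while node != NIL:
        values.append(first[node])
        node = rest[node]
    return values


head, triples = encode_list(["C4", "E4", "G4"])
print(len(triples))               # two triples per element
print(read_list(head, triples))   # elements come back in their original order
```

In this encoding, reaching the n-th element means following n rdf:rest links, which hints at why the choice of list model can matter for retrieval performance.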


  • Shapes Constraint Language (SHACL) - This document specifies SHACL (Shapes Constraint Language), a language for describing and validating RDF graphs. This section introduces SHACL with an overview of the key terminology and an example to illustrate basic concepts.
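
As a rough idea of what validating an RDF graph against a shape involves, the following Python sketch mimics two constraint kinds that SHACL defines, sh:minCount and sh:datatype, over a handful of tuples. It is not a SHACL processor: the dictionary layout of the shape, the validate helper and the example.org IRIs are all hypothetical, and a real shape would itself be written in RDF.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical namespace

# Literals are modelled as (lexical_value, datatype_iri) pairs; IRIs as plain strings.
data = [
    (EX + "alice", RDF_TYPE, EX + "Person"),
    (EX + "alice", EX + "name", ("Alice", XSD_STRING)),
    (EX + "bob", RDF_TYPE, EX + "Person"),  # no name: should be reported
]

# Simplified stand-in for a node shape with one property shape.
shape = {
    "targetClass": EX + "Person",
    "properties": [
        {"path": EX + "name", "minCount": 1, "datatype": XSD_STRING},
    ],
}


def validate(triples, shape):
    """Return a list of (focus_node, path, violated_constraint) tuples."""
    violations = []
    focus_nodes = sorted(
        s for s, p, o in triples if p == RDF_TYPE and o == shape["targetClass"]
    )
    for node in focus_nodes:
        for prop in shape["properties"]:
            values = [o for s, p, o in triples if s == node and p == prop["path"]]
            if len(values) < prop.get("minCount", 0):
                violations.append((node, prop["path"], "minCount"))
            for value in values:
                is_typed = isinstance(value, tuple) and value[1] == prop.get("datatype")
                if "datatype" in prop and not is_typed:
                    violations.append((node, prop["path"], "datatype"))
    return violations


report = validate(data, shape)
print(report)
```

Here only the node for bob is reported, because it is typed as a Person but has no name value.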

Ontologies / vocabularies

There is no clear division between what is referred to as “vocabularies” and “ontologies”. The trend is to use the word “ontology” for a more complex, and possibly quite formal, collection of terms, whereas “vocabulary” is used when such strict formalism is not necessarily used or only in a very loose sense. Vocabularies are the basic building blocks for inference techniques on the Semantic Web.

  • - namespace lookup for RDF developers

  • - "The use of URIs in RDF facilitates a marketplace of terms and vocabularies. This is not a centralized directory of RDF vocabularies; there is no such thing. But let's use this as a place to advertise our work (though it's not a substitute for MailingLists and such) and a place to look when you're considering whether to BuildOrBuyTerms, and as a place to compare and contrast, and perhaps to facilitate sharing (especially in SeedApplications) vs. supporting redundant vocabularies. "
  • - "At the moment, the methods used in practice to locate an adequate vocabulary for describing one's data in RDF are more akin to dowsing than to an educated, technically-guided choice, supported by scientific tools and methodologies. While the situation is improving with the progress of Semantic Web search engines and better education, oftentimes data publishers still rely on informal criteria such as word-of-mouth, reputation or follow-your-nose strategies. This page tries to identify methods, tools, applications, websites or communities that can help Linked Data publishers to discover or build the right vocabulary they need. The tools identified below are sorted from the ones that require less time and efforts from the publisher's side to those that require hard work."

  • - intended to be an open URI space for vocabularies such as RDF Schemas or XML Namespace documents. The PURL is mapped to this domain. It is recommended that all vocabularies hosted here define their term URIs using the PURL rather than the domain. The PURL is expected to persist longer than, although every effort will be made to ensure that persists for as long as possible.

  • UMBEL - Vocabulary and Reference Concept Ontology (namespace: umbel). UMBEL is the Upper Mapping and Binding Exchange Layer, designed to help content interoperate on the Web.


  • XSD - XML Schema Definition, a recommendation of the World Wide Web Consortium (W3C), specifies how to formally describe the elements in an Extensible Markup Language (XML) document. It can be used by programmers to verify each piece of item content in a document, to assure it adheres to the description of the element it is placed in. Like all XML schema languages, XSD can be used to express a set of rules to which an XML document must conform to be considered "valid" according to that schema. However, unlike most other schema languages, XSD was also designed with the intent that determination of a document's validity would produce a collection of information adhering to specific data types. Such a post-validation infoset can be useful in the development of XML document processing software.

XML Schema, published as a W3C recommendation in May 2001, is one of several XML schema languages. It was the first separate schema language for XML to achieve Recommendation status by the W3C. Because of confusion between XML Schema as a specific W3C specification, and the use of the same term to describe schema languages in general, some parts of the user community referred to this language as WXS, an initialism for W3C XML Schema, while others referred to it as XSD, an initialism for XML Schema Definition. In Version 1.1 the W3C has chosen to adopt XSD as the preferred name, and that is the name used in this article. The XSD 1.0 specification was originally published in 2001, with a second edition following in 2004 to correct large numbers of errors. XSD 1.1 became a W3C Recommendation in April 2012.

In its appendix of references, the XSD specification acknowledges the influence of DTDs and other early XML schema efforts such as DDML, SOX, XML-Data, and XDR. It has adopted features from each of these proposals but is also a compromise among them. Of those languages, XDR and SOX continued to be used and supported for a while after XML Schema was published. A number of Microsoft products supported XDR until the release of MSXML 6.0 (which dropped XDR in favor of XML Schema) in December 2006. Commerce One, Inc. supported its SOX schema language until declaring bankruptcy in late 2004. The most obvious features offered in XSD that are not available in XML's native Document Type Definitions (DTDs) are namespace awareness and datatypes, that is, the ability to define element and attribute content as containing values such as integers and dates rather than arbitrary text.

  • XML Schema Datatypes in RDF and OWL - The RDF and OWL Recommendations use the simple types from XML Schema. This document addresses three questions left unanswered by these Recommendations: Which URIref should be used to refer to a user defined datatype? Which values of which XML Schema simple types are the same? How to use the problematic xsd:duration in RDF and OWL? In addition, we further describe how to integrate OWL DL with user defined datatypes (in appendix B).

Dublin Core

  • DCMI: Dublin Core™ - The original Dublin Core™ of thirteen (later fifteen) elements was first published in the report of a workshop in 1995. In 1998, this was formalized in the Internet Engineering Task Force standard RFC 2413, and discussions began about making it a standard of the (US) National Information Standards Organization (NISO). This led to the publication of ANSI/NISO Z39.85-2001 and the International Standards Organization Standard 15836-2003. The most recent updates of these standards are RFC 5013 (2007), Z39.85-2012, and ISO 15836-1:2017. Publication of a Part 2 to the ISO standard, covering several dozen properties and classes that have been added to DCMI namespaces since 1999, is expected in 2019. Starting in 2002, DCMI grew into the role of "de facto" standards agency by maintaining its own, updated documentation for DCMI Metadata Terms. The DCMI Usage Board currently serves as the maintenance agency for ISO 15836. In addition to these semantic specifications, DCMI working groups have developed specifications on other topics of relevance to metadata, such as encoding syntaxes, usage guidelines, and metadata models. In the rapidly evolving environment of the World Wide Web, most of these specifications have been superseded over time, sometimes after influencing subsequent work by other technical communities, notably the World Wide Web Consortium.
  • DCMI: DCMI Metadata Terms - This document is an up-to-date specification of all metadata terms maintained by the Dublin Core Metadata Initiative, including properties, vocabulary encoding schemes, syntax encoding schemes, and classes.

  • Darwin Core - often abbreviated to DwC, is an extension of Dublin Core for biodiversity informatics. It is meant to provide a stable standard reference for sharing information on biological diversity (biodiversity). The terms described in this standard are a part of a larger set of vocabularies and technical specifications under development and maintained by Biodiversity Information Standards (TDWG) (formerly the Taxonomic Databases Working Group).


  • SKOS: Simple Knowledge Organization System - an area of work developing specifications and standards to support the use of knowledge organization systems (KOS) such as thesauri, classification schemes, subject heading systems and taxonomies within the framework of the Semantic Web. SKOS provides a standard way to represent knowledge organization systems using the Resource Description Framework (RDF). Encoding this information in RDF allows it to be passed between computer applications in an interoperable way. Using RDF also allows knowledge organization systems to be used in distributed, decentralised metadata applications. Decentralised metadata is becoming a typical scenario, where service providers want to add value to metadata harvested from multiple sources.

  • SKOS Simple Knowledge Organization System Primer - SKOS—Simple Knowledge Organization System—provides a model for expressing the basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, folksonomies, and other similar types of controlled vocabulary. As an application of the Resource Description Framework (RDF), SKOS allows concepts to be composed and published on the World Wide Web, linked with data on the Web and integrated into other concept schemes.
  • SKOS Simple Knowledge Organization System Reference - The SKOS data model provides a standard, low-cost migration path for porting existing knowledge organization systems to the Semantic Web. SKOS also provides a lightweight, intuitive language for developing and sharing new knowledge organization systems. It may be used on its own, or in combination with formal knowledge representation languages such as the Web Ontology Language (OWL). This document is the normative specification of the Simple Knowledge Organization System. It is intended for readers who are involved in the design and implementation of information systems, and who already have a good understanding of Semantic Web technology, especially RDF and OWL.
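
As an illustration of how a concept scheme is expressed in RDF, the sketch below stores a tiny, made-up thesaurus as tuples using the skos:prefLabel and skos:broader properties, then walks the skos:broader links upward. The example.org concept IRIs and the ancestors helper are assumptions for illustration. Note that skos:broader itself is not declared transitive in SKOS, so the upward walk is something the application chooses to do.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
EX = "http://example.org/concepts/"  # hypothetical concept scheme namespace

triples = [
    (EX + "animals", SKOS + "prefLabel", "Animals"),
    (EX + "mammals", SKOS + "prefLabel", "Mammals"),
    (EX + "cats", SKOS + "prefLabel", "Cats"),
    (EX + "cats", SKOS + "broader", EX + "mammals"),
    (EX + "mammals", SKOS + "broader", EX + "animals"),
]


def ancestors(concept, triples):
    """Collect every concept reachable from `concept` via skos:broader links."""
    broader = {}
    for s, p, o in triples:
        if p == SKOS + "broader":
            broader.setdefault(s, []).append(o)
    seen = []
    stack = [concept]
    while stack:
        current = stack.pop()
        for parent in broader.get(current, []):
            if parent not in seen:  # also guards against cycles in bad data
                seen.append(parent)
                stack.append(parent)
    return seen


labels = {s: o for s, p, o in triples if p == SKOS + "prefLabel"}
print([labels[c] for c in ancestors(EX + "cats", triples)])
```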


  • The FOAF Project - a computer language defining a dictionary of people-related terms that can be used in structured data (e.g. RDFa, JSON-LD, Linked Data).
  • FOAF Vocabulary Specification - a project devoted to linking people and information using the Web. Regardless of whether information is in people's heads, in physical or digital documents, or in the form of factual data, it can be linked. FOAF integrates three kinds of network: social networks of human collaboration, friendship and association; representational networks that describe a simplified view of a cartoon universe in factual terms, and information networks that use Web-based linking to share independently published descriptions of this inter-connected world. FOAF does not compete with socially-oriented Web sites; rather it provides an approach in which different sites can tell different parts of the larger story, and by which users can retain some control over their information in a non-proprietary format.

Web of Trust

  • WOT Schema - RDF documents can make any number of statements. Without some kind of signature or other similar verification mechanism, there is no way to understand who made these statements. One way to document who made a set of statements is via the use of Digital Signatures: signing a document using Public Key Cryptography. The WOT, or Web Of Trust, schema is designed to facilitate the use of Public Key Cryptography tools such as PGP or GPG to sign RDF documents and document these signatures.


Creative Commons



  • SIOC - Semantically-Interlinked Online Communities, aims to enable the integration of online community information. SIOC provides a Semantic Web ontology for representing rich data from the Social Web in RDF. It has recently achieved significant adoption through its usage in a variety of commercial and open-source software applications, and is commonly used in conjunction with the FOAF vocabulary for expressing personal profile and social networking information. By becoming a standard way for expressing user-generated content from such sites, SIOC enables new kinds of usage scenarios for online community site data, and allows innovative semantic applications to be built on top of the existing Social Web. The SIOC ontology was recently published as a W3C Member Submission, which was submitted by 16 organisations.




  • SCOT is an acronym for Social Semantic Cloud of Tags. The name was chosen to emphasise the goal of providing a consistent framework for expressing social tagging at a semantic level in a machine-understandable way. The SCOT ontology provides a model for expressing the main concepts and properties required to describe information for tagging activities (e.g., users, tags, resources, etc.) on the Semantic Web. This document contains a detailed description of the SCOT Ontology.


  • Universal Data Element Framework (UDEF) - was a controlled vocabulary developed by The Open Group. It provided a framework for categorizing, naming, and indexing data. It assigned to every item of data a structured alphanumeric tag plus a controlled vocabulary name that describes the meaning of the data. This allowed relating data elements to similar elements defined by other organizations. UDEF defined a Dewey-decimal-like code for each concept. For example, an "employee number" is often used in human resource management. It has a UDEF tag a.5_12.35.8 and a controlled vocabulary description "Employee.PERSON_Employer.Assigned.IDENTIFIER". UDEF has been superseded by the Open Data Element Framework (O-DEF).

  • Universal Data Element Framework (UDEF) - a cross-industry metadata identification strategy designed to facilitate convergence and interoperability among e-business and other standards. The objective of the UDEF is to provide a means of real-time identification for semantic equivalency, as an attribute to data elements within e-business document and integration formats. The supporters of the UDEF hope that it can be seen as the "Dewey Decimal System" across standards. The UDEF can be seen as only an attribute in the data element. There are no process, validation, or handling requirements. The intent is to communicate in a standard and repeatable way, the exact concept that the data element represents. There is very little about context, just enough to identify the data element exactly.


  • Open Data Element Framework (O-DEF) workshops - a standard published by The Open Group as a replacement for the UDEF. It enables information interoperability by common categorization of basic units of data. This simplifies the development of interface software and contributes to improved management and organization of data.

CSV / Tabular Data

  • CSVW Namespace Vocabulary Terms - This document describes the CSVW Namespace Vocabulary Terms and Term definitions used for creating Metadata descriptions for Tabular Data. This document provides the RDFS [RDF-SCHEMA] vocabulary definition for terms defined in [tabular-metadata] and a description of the JSON-LD context definition for use with defining metadata documents. Alternate versions of the vocabulary definition exist in Turtle and JSON-LD, which also includes the @context required for metadata descriptions. These versions may also be retrieved from using an appropriate HTTP Accept header.

  • Metadata Vocabulary for Tabular Data - Validation, conversion, display, and search of tabular data on the web requires additional metadata that describes how the data should be interpreted. This document defines a vocabulary for metadata that annotates tabular data. This can be used to provide metadata at various levels, from groups of tables and how they relate to each other down to individual cells within a table. The metadata defined in this specification is used to provide annotations on an annotated table or group of tables, as defined in [tabular-data-model]. Annotated tables form the basis for all further processing, such as validating, converting, or displaying the tables.

  • Generating RDF from Tabular Data on the Web - This document defines the procedures and rules to be applied when converting tabular data into RDF. Tabular data may be complemented with metadata annotations that describe its structure, the meaning of its content and how it may form part of a collection of interrelated tabular data. This document specifies the effect of this metadata on the resulting RDF.

  • CSV-LD: Spreadsheet-based Linked Data - Comma-separated-values (CSV) is a useful data serialization and sharing format. This talk introduces the idea of CSV-LD as a CSV-based format to serialize Linked Data, mirroring the way that JSON-LD is a JSON-based format to serialize Linked Data. "CSV" here includes any dialect that uses a different delimiter, such as tab-separated-values (TSV). The syntax of CSV-LD is designed to easily integrate into workflows that already use CSV, and provides a smooth upgrade path from CSV to CSV-LD. It is primarily intended to be a way to use Linked Data as part of spreadsheet-based data entry; to facilitate data validation, display, and conversion of CSV into other formats via use of CSV on the Web (CSVW) metadata; and to build FAIR data services. The term "CSV-LD" was previously used to describe a now-obsoleted precursor to the CSVW specifications; both approaches require a second file, a JSON-LD template document, to be shared along with a CSV file. The approach described here, in contrast, requires only a CSV file from the data producer, one that includes links to CSVW-powered metadata.

  • csvcubed: a new tool for creating CSVWs | ONS Digital - a Python library and command line interface (CLI) tool for people with statistical or observational data to share. Our focus has been to remove the steep learning curve that makes it hard to use open standards for linked data. Our well documented command line interface makes it possible for users to create 4☆ open data from a single CSV file, or 5☆ linked data by providing a little configuration.

  • - an express-based server that provides an interface to interact with data following the CSV on the Web (CSVW) format. It is designed to support various content negotiation options to deliver data in the most suitable format for the consumer.
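
To show the general idea behind turning annotated tabular data into RDF, here is a minimal Python sketch that maps each CSV row to a subject IRI and each column to a predicate. It is only loosely in the spirit of the CSVW specifications listed above and does not implement their conversion algorithm. The aboutUrl key borrows a CSVW property name, but the flat columns mapping, the rows_to_triples helper, the sample data and the example.org IRIs are simplifying assumptions.

```python
import csv
import io

# Hypothetical sample data; in practice this would be read from a CSV file.
CSV_TEXT = "id,name,population\n1,Oslo,709000\n2,Bergen,289000\n"

# Simplified stand-in for a metadata document describing the table.
metadata = {
    "aboutUrl": "http://example.org/city/{id}",  # template for row subjects
    "columns": {
        "name": "http://example.org/vocab/name",
        "population": "http://example.org/vocab/population",
    },
}


def rows_to_triples(csv_text, metadata):
    """Produce (subject, predicate, value) tuples, one per mapped cell."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Fill the subject template from the row's own cell values.
        subject = metadata["aboutUrl"].format(**row)
        for column, predicate in metadata["columns"].items():
            triples.append((subject, predicate, row[column]))
    return triples


for triple in rows_to_triples(CSV_TEXT, metadata):
    print(triple)
```

Cell values stay plain strings here. A fuller treatment would also attach datatypes to values, which is part of what the metadata vocabulary is for.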

Data Cube

  • RDF Data Cube Vocabulary - There are many situations where it would be useful to be able to publish multi-dimensional data, such as statistics, on the web in such a way that it can be linked to related data sets and concepts. The Data Cube vocabulary provides a means to do this using the W3C RDF (Resource Description Framework) standard. The model underpinning the Data Cube vocabulary is compatible with the cube model that underlies SDMX (Statistical Data and Metadata eXchange), an ISO standard for exchanging and sharing statistical data and metadata among organizations. The Data Cube vocabulary is a core foundation which supports extension vocabularies to enable publication of other aspects of statistical data flows or other multi-dimensional data sets.


  • Vocabulary of Interlinked Datasets (VoID) is an RDF Schema vocabulary for expressing metadata about RDF datasets. It is intended as a bridge between the publishers and users of RDF data, with applications ranging from data discovery to cataloging and archiving of datasets. This document provides a formal definition of the new RDF classes and properties introduced for VoID. It is a companion to the main specification document for VoID, Describing Linked Datasets with the VoID Vocabulary.

  • - In early June 2011, the three big search engines Bing, Google and Yahoo! introduced, a collection of terms that webmasters can use to markup their pages to improve the display of search results. This site is a complementary effort by people from the Linked Data community to support deployment and usage with a special focus on Linked Data:


  • OMV Homepage - Ontologies have seen quite an enormous development and application in many domains within the last years, especially in the context of the next web generation, the Semantic Web. Besides the work of countless researchers across the world, industry starts developing ontologies to support their daily operative business. Currently, most ontologies exist in pure form without any additional information, e.g. authorship information, such as provided by Dublin Core for text documents. This burden makes it difficult for academia and industry e.g. to identify, find and apply - basically meaning to reuse - ontologies effectively and efficiently. Our contribution consists of a proposal for a metadata standard, so called Ontology Metadata Vocabulary OMV.


  • - Metadata for Ontology Description and Publication Ontology. This project consists in building an OWL ontology and application profile to capture metadata information for ontologies, vocabularies or semantic resources in general.


  • Data Catalog Vocabulary (DCAT) - Version 2 - DCAT is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web. This document defines the schema and provides examples for its use. DCAT enables a publisher to describe datasets and data services in a catalog using a standard model and vocabulary that facilitates the consumption and aggregation of metadata from multiple catalogs. This can increase the discoverability of datasets and data services. It also makes it possible to have a decentralized approach to publishing data catalogs and makes federated search for datasets across catalogs in multiple sites possible using the same query mechanism and structure. Aggregated DCAT metadata can serve as a manifest file as part of the digital preservation process. The namespace for DCAT terms is The suggested prefix for the DCAT namespace is dcat

Common Tag

  • Common Tag - an open tagging format developed to make content more connected, discoverable and engaging. Unlike free-text tags, Common Tags are references to unique, well-defined concepts, complete with metadata and their own URLs. With Common Tag, site owners can more easily create topic hubs, cross-promote their content, and enrich their pages with free data, images and widgets.

Music Ontology

  • Music Ontology is an attempt to provide a vocabulary for linking a wide range of music-related information, and to provide a democratic mechanism for doing so


  • BIO: A vocabulary for biographical information, used in FOAF




  • - The CIDOC Conceptual Reference Model (CRM), provides an extensible ontology for concepts and information in cultural heritage and museum documentation. It is the international standard (ISO 21127:2014) for the controlled exchange of cultural heritage information. Galleries, libraries, archives, museums (GLAMs), and other cultural institutions are encouraged to use the CIDOC CRM to enhance accessibility to museum-related information and knowledge.

Getty Vocabulary Program

  • - a department within the Getty Research Institute at the Getty Center in Los Angeles, California. It produces and maintains the Getty controlled vocabulary databases, Art and Architecture Thesaurus, Union List of Artist Names, and Getty Thesaurus of Geographic Names. They are compliant with ISO and NISO standards for thesaurus construction. The Getty vocabularies are the premier references for categorizing works of art, architecture, material culture, and the names of artists, architects, and geographic names. They have been the life work of many people and continue to be critical contributions to cultural heritage information management and documentation. They contain terms, names, and other information about people, places, things, and concepts relating to art, architecture, and material culture. They can be accessed online free of charge on the Getty website.

Cultural Objects Name Authority / CONA

  • - abbreviated CONA, is a project by the Getty Research Institute to create a controlled vocabulary containing authority records for cultural works, including architecture and movable works such as paintings, sculpture, prints, drawings, manuscripts, photographs, textiles, ceramics, furniture, other visual media such as frescoes and architectural sculpture, performance art, archaeological artifacts, and various functional objects that are from the realm of material culture and of the type collected by museums. The focus of CONA is works cataloged in scholarly literature, museum collections, visual resources collections, archives, libraries, and indexing projects with a primary emphasis on art, architecture, or archaeology. The target users are the visual resources, academic, and museum communities.

Getty Thesaurus of Geographic Names / TGN

  • - abbreviated TGN, is a product of the J. Paul Getty Trust included in the Getty Vocabulary Program. The TGN includes names and associated information about places. Places in TGN include administrative political entities (e.g., cities, nations) and physical features (e.g., mountains, rivers). Current and historical places are included. Other information related to history, population, culture, art and architecture is included. The resource is available to museums, art libraries, archives, visual resource collection catalogers, and bibliographic projects through private license, and to members of the general public for free on the Getty Vocabulary website.

Union List of Artist Names / ULAN

  • - a free online database of the Getty Research Institute using a controlled vocabulary, which by 2018 contained over 300,000 artists and over 720,000 names for them, as well as other information about artists. Names in ULAN may include given names, pseudonyms, variant spellings, names in multiple languages, and names that have changed over time (e.g., married names). Among these names, one is flagged as the preferred name. Although it is displayed as a list, ULAN is structured as a thesaurus, compliant with ISO and NISO standards for thesaurus construction; it contains hierarchical, equivalence, and associative relationships.

Art & Architecture Thesaurus / AAT

  • - AAT, is a controlled vocabulary used for describing items of art, architecture, and material culture. The AAT contains generic terms, such as "cathedral", but no proper names, such as "Cathedral of Notre Dame." The AAT is used by, among others, museums, art libraries, archives, catalogers, and researchers in art and art history. The AAT is a thesaurus in compliance with ISO and NISO standards including ISO 2788, ISO 25964 and ANSI/NISO Z39.19. The AAT is a structured vocabulary of 55,661 concepts (as of January 2020), including 131,000 terms, descriptions, bibliographic citations, and other information relating to fine art, architecture, decorative arts, archival materials, and material culture.

Geographic / observations

Basic Geo


  • GeoNames Ontology - makes it possible to add geospatial semantic information to the World Wide Web. All over 11 million geonames toponyms now have a unique URL with a corresponding RDF web service. Other services describe the relation between toponyms.

Observations and Measurements

  • - an international standard which defines a conceptual schema encoding for observations, and for features involved in sampling when making observations. While the O&M standard was developed in the context of geographic information systems, the model is derived from generic patterns proposed by Fowler and Odell, and is not limited to geospatial information. O&M is one of the core standards in the OGC Sensor Web Enablement suite, providing the response model for Sensor Observation Service (SOS).


  • - a type of sensor network that heavily utilizes the World Wide Web and is especially suited for environmental monitoring. OGC's Sensor Web Enablement (SWE) framework defines a suite of web service interfaces and communication protocols abstracting from the heterogeneity of sensor (network) communication.
  • - a marriage of sensor web and semantic Web technologies. The encoding of sensor descriptions and sensor observation data with Semantic Web languages enables more expressive representation, advanced access, and formal analysis of sensor resources. The SSW annotates sensor data with spatial, temporal, and thematic semantic metadata. This technique builds on current standardization efforts within the Open Geospatial Consortium's Sensor Web Enablement (SWE) and extends them with Semantic Web technologies to provide enhanced descriptions and access to sensor data.


  • - or TML is a retired Open Geospatial Consortium standard developed to describe any transducer (sensor or transmitter) in terms of a common model, including characterizing not only the data but XML formed metadata describing the system producing that data.


  • - an approved Open Geospatial Consortium standard and an XML encoding for describing sensors and measurement processes. SensorML can be used to describe a wide range of sensors, including both dynamic and stationary platforms and both in-situ and remote sensors.

Semantic Sensor Network Ontology / SSN

  • Semantic Sensor Network Ontology - The Semantic Sensor Network (SSN) ontology is an ontology for describing sensors and their observations, the involved procedures, the studied features of interest, the samples used to do so, and the observed properties, as well as actuators. SSN follows a horizontal and vertical modularization architecture by including a lightweight but self-contained core ontology called SOSA (Sensor, Observation, Sample, and Actuator) for its elementary classes and properties. With their different scope and different degrees of axiomatization, SSN and SOSA are able to support a wide range of applications and use cases, including satellite imagery, large-scale scientific monitoring, industrial and household infrastructures, social sensing, citizen science, observation-driven ontology engineering, and the Web of Things. Both ontologies are described below, and examples of their usage are given.


  • PROV-Overview - Provenance is information about entities, activities, and people involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness. The PROV Family of Documents defines a model, corresponding serializations and other supporting definitions to enable the inter-operable interchange of provenance information in heterogeneous environments such as the Web. This document provides an overview of this family of documents.

  • PROV Model Primer - This primer document provides an accessible introduction to the PROV data model for provenance interchange on the Web. The provenance of digital objects represents their origins. PROV is a specification to express provenance records, which contain descriptions of the entities and activities involved in producing and delivering or otherwise influencing a given object. Provenance can be used for many purposes, such as understanding how data was collected so it can be meaningfully used, determining ownership and rights over an object, making judgements about information to determine whether to trust it, verifying that the process and steps used to obtain a result complies with given requirements, and reproducing how something was generated.

  • - After the successful publication of the PROV Ontology and related Notes (see PROV Overview for more details), this group was closed on 19 June 2013. For more information, related tools, etc., consult the PROV page of the Semantic Web Wiki, which is maintained by the community.

Simple News and Press Ontologies



  • oeGOV - Ontologies for e-Government, making and publishing W3C OWL ontologies for eGovernment. This initiative is born out of the idea: "Use small OWL ontologies to model recovery and deploy across all government" posted at and Tim Berners-Lee's vision of "Linked Open Data". oeGOV is an initiative started by TopQuadrant. As far back as 2003 we have been building ontologies for eGovernment. TopQuadrant was co-organizer of the first eGovernment conference at the White House Conference Center on the importance of Semantic Web Technologies for the sharing of data using the web infrastructure. The first eGovernment ontologies were the FEA Ontologies. Currently the focus is on ontologies of Government, datasets of US Government branches, agencies, departments, offices and state governments, and an ontology of Archived.


  • - e-GMS is the UK e-Government Metadata Standard. It defines how UK public sector bodies should label content such as web pages and documents to make such information more easily managed, found and shared. The metadata standard is an application profile of the Dublin Core Metadata Element Set and consists of mandatory, recommended and optional metadata elements such as title, date created and description. The e-GMS formed part of the e-Government Metadata Framework (e-GMF) and the e-Government Interoperability Framework (e-GIF). The standard helps provide a basis for the adoption of XML schemas for data exchange. Within the UK, e-GIF was replaced by the "Open Source, Open Standards And Re-Use: Government Action Plan", published by the Cabinet Office in February 2009. This is now embodied in the Open Standards principles policy. The UK e-GIF documentation and UK Government Data Standards Catalogue have been archived and are available for reference only. Although now deprecated by the Cabinet Office, many systems have been built around the framework and continue to use e-GIF components, notably within the NHS. E-GMS has been mapped to the IEEE/LOM. IPSV has been mapped to the Local Government Classification Scheme.

  • - was a type of controlled vocabulary called a taxonomy, for use in choosing Subject metadata and keywords, primarily for indexing government web pages. The use of GCL terms in the metadata of all government resources was intended to facilitate, encourage and simplify automatic categorisation. The Government Category List was superseded during 2006 by the Integrated Public Sector Vocabulary (IPSV), which incorporates terms from GCL as well as from other controlled vocabularies, and is designed to enable semantic interoperability of systems and web resources across the UK public sector.


  • - IPSV, Integrated Public Sector Vocabulary, is a controlled vocabulary for describing subjects and was first released in April 2005, building on developments of the subject element introduced with version 3.0 of e-GMS. It merged three earlier lists: the GCL (Government Category List), LGCL (Local Government Category List), and the seamless. It had 2732 preferred terms and 4230 non-preferred terms. The current version, version 2, was released in April 2006. It is much bigger, with 3080 preferred terms and 4843 non-preferred terms, and covers internal-facing as well as public-oriented topics. The Internal Vocabulary was released as a separate subset containing 756 preferred terms and 1333 non-preferred terms. An abridged version of the IPSV was also released containing 549 preferred terms and 1472 non-preferred terms and remains compliant with the e-GMS. The Public Sector Information Domain – Metadata Standards Working Group subsequently agreed to recommend this change to eGMS on the use of subject metadata from October 2012.

to sort

  • ONTORULE (ONTOlogies meet business RULEs) is a large-scale integrating project (IP) partially funded by the European Union's 7th Framework Programme under the Information and Communication Technologies


  • VOAF is a vocabulary specification providing elements allowing the description of vocabularies (RDFS vocabularies or OWL ontologies) used in the Linked Data Cloud. In particular it provides properties expressing the different ways such vocabularies can rely on, extend, specify, annotate or otherwise link to each other. It relies itself on Dublin Core and voiD. The name of the vocabulary makes an explicit reference to FOAF because VOAF can be used to define networks of vocabularies in a way similar to the one FOAF is used to define networks of people.


The objective of LOV is to provide easy access to this ecosystem of vocabularies. In particular, by making explicit the ways they link to each other and providing metrics on how they are used in the linked data cloud, it aims to improve their understanding, visibility, usability, and overall quality.



  • - a framework for describing, analyzing and studying games. It is a hierarchy of concepts abstracted from an analysis of many specific games. GOP borrows concepts and methods from prototype theory as well as grounded theory to achieve a framework that is always growing and changing as new games are analyzed or particular research questions are explored.


  • GoodRelations - powerful vocabulary for publishing all of the details of your products and services in a way friendly to search engines, mobile applications, and browser extensions. By adding a bit of extra code to your Web content, you make sure that potential customers realize all the great features and services and the benefits of doing business with you, because their computers can extract and present this information with ease.

Product Types

  • The Product Types Ontology - Use Wikipedia pages for describing products or services with GoodRelations and This service provides ca. 300,000 precise definitions for types of product or services that extend the and GoodRelations standards for e-commerce markup.

  • - a collaborative project to provide stable digital representations of numismatic concepts according to the principles of Linked Open Data. These take the form of http URIs that also provide access to reusable information about those concepts, along with links to other resources. The canonical format of is RDF/XML, with serializations available in JSON-LD (including geoJSON-LD for complex geographic features), Turtle, KML (when applicable), and HTML5+RDFa 1.1.


  • PROV-O: The PROV Ontology - The PROV Ontology (PROV-O) expresses the PROV Data Model [PROV-DM] using the OWL2 Web Ontology Language (OWL2) [OWL2-OVERVIEW]. It provides a set of classes, properties, and restrictions that can be used to represent and interchange provenance information generated in different systems and under different contexts. It can also be specialized to create new classes and properties to model provenance information for different applications and domains. The PROV Document Overview describes the overall state of PROV, and should be read before other PROV documents.
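
To make the provenance model a little more concrete, here is a minimal sketch that records a few PROV-O statements as plain tuples and walks prov:wasDerivedFrom links to list where an entity ultimately came from. The property names come from the PROV-O namespace; the ex: identifiers, the sample graph and the sources helper are hypothetical, and a real system would normally use an RDF store rather than a Python list.

```python
PROV = "http://www.w3.org/ns/prov#"

# Hypothetical provenance record: a chart generated by a plotting activity
# that used cleaned data, which was itself derived from raw data.
graph = [
    ("ex:chart", PROV + "wasGeneratedBy", "ex:plotting"),
    ("ex:plotting", PROV + "used", "ex:cleaned-data"),
    ("ex:plotting", PROV + "wasAssociatedWith", "ex:alice"),
    ("ex:chart", PROV + "wasDerivedFrom", "ex:cleaned-data"),
    ("ex:cleaned-data", PROV + "wasDerivedFrom", "ex:raw-data"),
]


def sources(entity, triples):
    """Return every entity reachable via prov:wasDerivedFrom, nearest first."""
    found, queue = [], [entity]
    while queue:
        current = queue.pop(0)
        for s, p, o in triples:
            if s == current and p == PROV + "wasDerivedFrom" and o not in found:
                found.append(o)
                queue.append(o)
    return found


print(sources("ex:chart", graph))
```

The same traversal is what a trust or reproducibility check would start from: once the upstream entities are known, their generating activities and responsible agents can be looked up in the same graph.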

BBC Things

Linked Data

  • Linked Data - using the Web to connect related data that wasn't previously linked, or using the Web to lower the barriers to linking data currently linked using other methods. More specifically, Wikipedia defines Linked Data as "a term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF." This site exists to provide a home for, or pointers to, resources from across the Linked Data community.

  • Linked Data - Design Issues - The Semantic Web isn't just about putting data on the web. It is about making links, so that a person or machine can explore the web of data. With linked data, when you have some of it, you can find other, related, data.
  • 5-star Open Data - Tim Berners-Lee, the inventor of the Web and Linked Data initiator, suggested a 5-star deployment scheme for Open Data. Here, we give examples for each step of the stars and explain costs and benefits that come along with it.

  • - The Linked Data Theatre is a platform for an optimal presentation of Linked Data

  • Linked Research - an initiative, a movement, and a manifesto. We believe that scholarly communication is stunted by current scholarly and scientific practices, and we aim to promote change for the greater good. This is not something hypothetical or a dream for the future, it is completely possible with today's technologies based on native Web design principles and standards. A cultural shift is needed, and Linked Research is here to bring together like-minded people who want to push this forwards. Linked Research is not biased towards any particular tool or technology; we understand that different researchers and disciplines have different needs and desires around scholarly communication. We encourage the use of any technologies that comply with the Linked Research principles; we tend to find Web technologies work really well for this, but we'd love to hear about how you do it!
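
The 5-star deployment scheme listed above is cumulative: each star assumes the previous ones. As commonly summarised, the steps are open licence, structured data, non-proprietary format, URIs for things, and links to other data. The sketch below rates a dataset description on that basis; the dictionary keys and the star_rating function are hypothetical names chosen for this example.

```python
# One entry per star, in order. The keys are made up for this sketch.
CRITERIA = [
    ("open_license", "available on the Web under an open licence"),
    ("structured", "published as machine-readable structured data"),
    ("open_format", "uses a non-proprietary format"),
    ("uses_uris", "uses URIs to denote things"),
    ("linked", "links to other data to provide context"),
]


def star_rating(dataset):
    """Count stars, stopping at the first criterion that is not met."""
    stars = 0
    for key, _description in CRITERIA:
        if not dataset.get(key, False):
            break
        stars += 1
    return stars


csv_dump = {"open_license": True, "structured": True, "open_format": True}
print(star_rating(csv_dump))
```

Stopping at the first unmet criterion reflects the cumulative nature of the scheme: a dataset that uses URIs but has no open licence still earns no stars.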

Linking Open Data / LOD

  • W3C: Linking Open Data - The Open Data Movement aims at making data freely available to everyone. There are already various interesting open data sets available on the Web. Examples include Wikipedia, Wikibooks, Geonames, MusicBrainz, WordNet, the DBLP bibliography and many more which are published under Creative Commons or Talis licenses. The goal of the W3C SWEO Linking Open Data community project is to extend the Web with a data commons by publishing various open data sets as RDF on the Web and by setting RDF links between data items from different data sources. RDF links enable you to navigate from a data item within one data source to related data items within other sources using a Semantic Web browser. RDF links can also be followed by the crawlers of Semantic Web search engines, which may provide sophisticated search and query capabilities over crawled data. As query results are structured data and not just links to HTML pages, they can be used within other applications.

  • LodLive project provides a demonstration of the use of Linked Data standards (RDF, SPARQL) to browse RDF resources. The application aims to spread linked data principles using a simple and friendly interface with reusable techniques.

  • - provides Linked Open Data (LOD) services for libraries, consisting of user interfaces (UIs) and application programming interfaces (APIs). lobid is run by the North Rhine-Westphalian Library Service Centre (hbz).

Linked Data Platform / LDP

  • Linked Data Platform Use Cases and Requirements - To foster the development of the Linked Data Platform specification, this document includes a set of user stories, use cases, scenarios and requirements that motivate a simple read-write Linked Data architecture, based on HTTP access to web resources that describe their state using RDF. The starting point for the development of these use cases is a collection of user stories that provide realistic examples describing how people may use read-write Linked Data. The use cases themselves are captured in a narrative style that describes a behavior, or set of behaviors based on, and using scenarios from, these user stories. The aim throughout has been to avoid details of protocol (specifically the HTTP protocol), and use of any specific vocabulary that might be introduced by the LDP specification.

  • - a linked data specification defining a set of integration patterns for building RESTful HTTP services that are capable of read/write of RDF data. The Linked Data Platform allows use of RESTful HTTP to consume, create, update and delete both RDF and non-RDF resources. In addition, it defines a set of "container" constructs – buckets into which documents can be added with a relationship between the bucket and the object similar to the relationship between a blog and its constituent blog posts.

  • LDP Implementations - community-maintained page listing planned and existing implementations of the Linked Data Platform (LDP). If you are associated with such an implementation (planned or complete), please make sure it is correctly listed on this page. See the LDP compliance reports for those implementations who have supplied testing results. Disclaimer: The contents of this page have not necessarily been reviewed by the LDP Working Group. Things listed on this page might not actually be LDP implementations. (But please do try to keep it limited to things which aim to be full or partial LDP implementations.)

  • Linked Data Platform 1.0 - defines a set of rules for HTTP operations on web resources, some based on RDF, to provide an architecture for read-write Linked Data on the web.

  • W3C Linked Data Platform Working Group Charter - "This group is based on the idea of combining two Web-related concepts to help solve some of the long-standing challenges involved in building and combining software: RDF, the Resource Description Framework, is a W3C Recommended general technique for conveying information. It has a handful of syntaxes, including RDF/XML, RDFa, and Turtle, any of which can be used to transmit RDF statements. The items about which information is expressed in RDF documents are identified with URIs (eg, but the existing RDF specifications do not cover dereferencing them. RDF is the basis for Linked Data and the Semantic Web. With RESTful APIs and RESTful Web Services, clients use basic HTTP verbs, with their simple and direct meaning, to obtain and alter the state of objects on the server. In these APIs, the remote information objects are identified with URIs which are dereferenced in every operation. RESTful APIs can be defined independent of the formats used for conveying the state of the objects; typically services use custom XML and/or JSON encodings of state information.

The combination of RDF and RESTful APIs is therefore natural, with RDF providing a standard way to serialize information about things identified by URIs and REST providing a way to obtain and alter the state of those things. This approach has been proposed and explored for some time, in academia and industry, as shown by the items listed in References. Within W3C, the SPARQL Working Group developed a RESTful protocol for accessing data in SPARQL data stores and discussed its wider applicability. The participants in the Linked Enterprise Data Patterns Workshop expressed general support for the creation of a Working Group to define a way to use RDF with RESTful APIs in support of application integration.

The basic technique here is to expose application data objects ("resources") on the Web, allowing authorized clients to see and modify object state using HTTP operations (GET, PUT, etc) with an RDF data format. This RESTful approach leverages existing Web technology, including caching, linking, and indexing, and the use of RDF facilitates integration of data across systems and applications. This approach dovetails with SPARQL and is positioned for developers who want more direct access to the application data.

The Linked Data Platform is envisioned as an enterprise-ready collection of standard techniques and services based on using RESTful APIs and the W3C Semantic Web stack. Simple LDP applications can be developed and deployed using only RDF and conventional HTTP infrastructure. More extensive LDP applications can be built using other elements of the stack, including RDFS, SPARQL, OWL, RIF, and the PROV provenance vocabulary. Although expertise in these specialized elements may be helpful, it is not necessary for participation in this group and should not be required for using the Linked Data Platform.
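
A minimal sketch of the read-write pattern described in the charter text above: creating a new RDF resource by POSTing Turtle to an LDP container. The request is only constructed, not sent. The Slug header and the rel="type" Link header follow the LDP 1.0 specification as far as I understand it; the container URI, the Turtle body and the ldp_create_request helper are hypothetical.

```python
from urllib.request import Request

LDP_RESOURCE = "http://www.w3.org/ns/ldp#Resource"


def ldp_create_request(container_uri, turtle_body, slug=None):
    """Build (but do not send) a POST that asks a container to create a resource."""
    headers = {
        "Content-Type": "text/turtle",
        # Declares the interaction model the client expects for the new resource.
        "Link": f'<{LDP_RESOURCE}>; rel="type"',
    }
    if slug:
        # A hint for the new resource's URI; the server is free to ignore it.
        headers["Slug"] = slug
    return Request(
        container_uri,
        data=turtle_body.encode("utf-8"),
        headers=headers,
        method="POST",
    )


body = '<> <http://purl.org/dc/terms/title> "Hello, LDP" .'
req = ldp_create_request("http://example.org/blog/", body, slug="hello-ldp")
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it; a conforming server is expected to answer with 201 Created and a Location header naming the new resource.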

  • Linked Data Basic Profile 1.0 - A set of best practices and simple approach for a read-write Linked Data architecture, based on HTTP access to web resources that describe their state using RDF.

  • EUCLID | EdUcational Curriculum for the usage of LInked Data - a European project facilitating professional training for data practitioners, who aim to use Linked Data in their daily work. EUCLID delivers a curriculum implemented as a combination of living learning materials and activities (eBook series, webinars, face‐to‐face training), validated by the user community through continuous feedback

Linked Data Notifications / LDN

  • - a W3C Recommendation that describes a communications protocol based on HTTP, URI, and RDF on how servers (receivers) can receive messages pushed to them by applications (senders), as well as how other applications (consumers) may retrieve those messages. Any web resource (like an HTML page) can advertise a receiving endpoint (inbox) for notification messages. Messages are expressed in RDF, and can contain arbitrary data.
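
The discovery step mentioned above can be sketched as follows: a sender reads the HTTP Link header of the target resource and looks for the ldp:inbox relation, then prepares a JSON-LD payload to POST to that inbox. The relation IRI is the one used by LDN; the simplified header parsing, the ActivityStreams-style payload, the example URLs and both function names are assumptions made for this illustration.

```python
import json
import re

LDP_INBOX = "http://www.w3.org/ns/ldp#inbox"


def find_inbox(link_header):
    """Return the target of the first Link value whose rel is ldp:inbox, else None.

    Simplified parsing: assumes no commas inside URIs or parameter values.
    """
    for target, params in re.findall(r"<([^>]*)>([^,]*)", link_header):
        rel = re.search(r'rel\s*=\s*"?([^";]+)"?', params)
        if rel and LDP_INBOX in rel.group(1).split():
            return target
    return None


def notification_body(actor, obj):
    """Serialize a small JSON-LD notification (to be sent as application/ld+json)."""
    return json.dumps({
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Announce",
        "actor": actor,
        "object": obj,
    })


header = (
    '<http://example.org/inbox/>; rel="http://www.w3.org/ns/ldp#inbox", '
    '<http://www.w3.org/ns/ldp#Resource>; rel="type"'
)
print(find_inbox(header))
```

An inbox can also be advertised in the RDF body of the resource itself, which this sketch does not cover.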



See also WebDev#API


  • RESTful Grounding by otaviofff - RESTful Grounding was the very first Web Ontology to allow the semantic description of RESTful Web Services. It was authored by two Brazilian researchers from the University of Sao Paulo (USP), Engineering School, as part of their master's thesis on RESTful Semantic Web Services: Otavio F. Ferreira Filho (@otaviofff) and Maria Alice G. V. Ferreira. Abstract: The proposal is to allow the development of semantic Web services according to an architectural style called REST. More specifically, it considers a REST implementation based on the HTTP protocol, resulting in RESTful Semantic Web Services. The development of semantic Web services has been the subject of various academic papers. However, the predominant effort considers Web services designed according to another architectural style named RPC, mainly through the SOAP protocol. The RPC approach, strongly stimulated by the enterprise software industry, aggregates unnecessary processing and definitions that make Web services more complex than desired. Therefore, services end up being not as scalable or fast as possible. In fact, REST services form the majority of Web services currently developed on the social Web, an environment focused on user-generated content and networking, clearly stronger and more predominant than others. The proposal presented here makes use of a specific selection of existing languages and protocols, reinforcing its feasibility. First, OWL-S is used as the base ontology for services. Second, WADL is used for syntactically describing them. Third, the HTTP protocol is used for transferring messages, defining the action to be executed, and also defining the execution scope. Finally, URI identifiers are responsible for specifying the service interface.


  • - a Virtual Knowledge Graph system. It exposes the content of arbitrary relational databases as knowledge graphs. These graphs are virtual, which means that data remains in the data sources instead of being moved to another database. Ontop translates SPARQL queries expressed over the knowledge graphs into SQL queries executed by the relational data sources. It relies on R2RML mappings and can take advantage of lightweight ontologies.
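
The idea of a virtual graph, where queries are rewritten to SQL instead of copying data into a triple store, can be shown with a deliberately tiny toy. This is not Ontop's algorithm and the mapping below is not R2RML: it is a hypothetical dictionary that maps one predicate to one table, and the rewrite function handles only the single triple pattern ?s <predicate> ?o. The table, column names and data are invented for the example.

```python
import sqlite3

FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

# Hypothetical mapping: predicate IRI -> (table, subject column, object column,
# subject IRI template). It stands in for a real mapping language such as R2RML.
MAPPING = {
    FOAF_NAME: ("person", "id", "name", "http://example.org/person/{}"),
}


def rewrite(predicate):
    """Rewrite the triple pattern `?s <predicate> ?o` into SQL over the source table.

    The SQL is assembled from the trusted mapping above, never from user input.
    """
    table, subj_col, obj_col, _template = MAPPING[predicate]
    return f"SELECT {subj_col}, {obj_col} FROM {table} ORDER BY {subj_col}"


def virtual_triples(conn, predicate):
    """Answer the pattern by running the rewritten SQL; no triples are stored."""
    template = MAPPING[predicate][3]
    for subj, obj in conn.execute(rewrite(predicate)):
        yield (template.format(subj), predicate, obj)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO person VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])

result = list(virtual_triples(conn, FOAF_NAME))
print(result)
```

The rows stay in the relational database throughout; the triples exist only as query answers, which is the sense in which the graph is virtual.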

Web tools


  • - or SPARQL/Update, was a declarative data manipulation language that extended the SPARQL 1.0 query language standard. SPARUL provided the ability to insert, delete and update RDF data held within a triple store or quad store. SPARUL was originally written by Hewlett-Packard and has been used as the foundation for the current W3C recommendation entitled SPARQL 1.1 Update. With the publication of SPARQL 1.1, SPARUL is superseded and should only be consulted as a source of inspiration for possible future refinements of SPARQL, but not for real-world applications.



  • Hydra W3C Community Group - Building Web APIs still seems more an art than a science. How can we build APIs such that generic clients can easily use them? And how do we build those clients? Current APIs heavily rely on out-of-band information such as human-readable documentation and API-specific SDKs. However, this only allows for very simple and brittle clients that are hardcoded against specific APIs. Hydra, in contrast, is a set of technologies that allow APIs to be designed in a different manner, in a way that enables smarter clients.

Semantic forms

Linked Data API

  • - This document defines a vocabulary and processing model for a configurable API layer intended to support the creation of simple RESTful APIs over RDF triple stores. The API layer is intended to be deployed as a proxy in front of a SPARQL endpoint to support: generation of documents (information resources) for the publishing of Linked Data; provision of sophisticated querying and data extraction features, without the need for end-users to write SPARQL queries; and delivery of multiple output formats from these APIs, including a simple serialisation of RDF in JSON syntax.


See also Open data

Wikidata / wikibase

  • - the software that enables MediaWiki to store structured data or access data that is stored in a structured data repository. Wikibase basically consists of two MediaWiki extensions, Wikibase Repository and Wikibase Client, that can be enabled individually or together for a certain MediaWiki installation to turn it into a structured data repository, a client of a structured data repository or both. For example, Wikidata is a Wikibase Repository as well as a Wikibase Client. You can find out more about the overall architecture of Wikibase and its components on
  • - This page describes the RDF dump and export format produced by Wikidata and used for export and indexing purposes. Note that while it is close to the format used by the Wikidata Toolkit, it is not the same code and not the same format. While we strive to keep divergence to a minimum, there may be differences and one should use documentation only for the format that is actually being consumed.


Semantic desktop

  • - a collective term for ideas related to changing a computer's user interface and data handling capabilities so that data are more easily shared between different applications or tasks and so that data that once could not be automatically processed by a computer could be. It also encompasses some ideas about being able to share information automatically between different people. This concept is very much related to the Semantic Web, but is distinct insofar as its main concern is the personal use of information.

  • - NEPOMUK (Networked Environment for Personal, Ontology-based Management of Unified Knowledge) is an open-source software specification that is concerned with the development of a social semantic desktop that enriches and interconnects data from different desktop applications using semantic metadata stored as RDF. Between 2006 and 2008 it was funded by a European Union research project of the same name that grouped together industrial and academic actors to develop various Semantic Desktop technologies.




  • S-RDF: A New RDF Serialization Format for Better Storage Without Losing Human Readability | SpringerLink - Nowadays, RDF data is becoming more and more popular on the Web due to the advances of the Semantic Web and the Linked Open Data initiatives. Several works are focused on transforming relational databases to RDF by storing related data in N-Triple serialization format. However, these approaches do not take into account the existing normalization of their databases since N-Triple format allows data redundancy and does not control any normalization by itself. Moreover, the mostly used and recommended serialization formats, such as RDF/XML, Turtle, and HDT, have either high human-readability but waste storage capacity, or focus further on storage capacities while providing low human-readability. To overcome these limitations, we propose here a new serialization format, called S-RDF. By considering the structure (graph) and values of the RDF data separately, S-RDF reduces the duplicity of values by using unique identifiers. Results show an important improvement over the existing serialization formats in terms of storage (up to 71.66% w.r.t. N-Triples) and human readability.

Web Observatory

  • Web Observatory Community Group - The sister organisation of W3C, the Web Science Trust, proposes to create a global "Web Observatory". The Open Data movement and the Transparency Agenda are successfully advocating the release of very large institutional and commercial data sets describing social phenomena, economic indicators and geographic trends. This proliferation of data represents great opportunity for researchers and industry but this data abundance also threatens to make it ever more difficult to locate, analyse, compare and interpret useful information in a consistent and reliable way; a situation which can only get worse unless we can help stakeholders perform useful analysis rather than drowning in a sea of data. The Web Observatory will offer an institutional framework to promote the use of W3C and other standards in the development of: Semantic Catalogues to globally locate existing data sets, Collection Systems to gather new global data sets, and Analytics Tools and methodologies to analyse these data sets. This community group seeks to articulate the business and technical requirements for the Web Observatory.




  • Shapes Constraint Language (SHACL) - This document defines the SHACL Shapes Constraint Language, a language for validating RDF graphs against a set of conditions. These conditions are provided as shapes and other constructs expressed in the form of an RDF graph. RDF graphs that are used in this manner are called "shapes graphs" in SHACL and the RDF graphs that are validated against a shapes graph are called "data graphs". As SHACL shape graphs are used to validate that data graphs satisfy a set of conditions they can also be viewed as a description of the data graphs that do satisfy these conditions. Such descriptions may be used for a variety of purposes beside validation, including user interface building, code generation and data integration.
  • SHACL Playground - A constraint validator for the Shapes Constraint Language, written in JavaScript. Work in Progress!
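
To give a feel for what validating a data graph against a shape means, the toy below checks one kind of condition only: every instance of a target class must have at least a minimum number of values for a given property, which corresponds loosely to sh:targetClass combined with sh:minCount. It is not a SHACL implementation (no subclass reasoning, no shapes graph parsing, no standard validation report), and the ex: terms and the function name are hypothetical.

```python
RDF_TYPE = "rdf:type"

# Hypothetical data graph as (subject, predicate, object) tuples.
data_graph = [
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", RDF_TYPE, "ex:Person"),
]


def validate_min_count(triples, target_class, path, min_count):
    """Return (focus node, path, actual count) for every node below min_count."""
    graph = set(triples)  # an RDF graph is a set, so duplicates do not count twice
    focus_nodes = sorted(s for s, p, o in graph if p == RDF_TYPE and o == target_class)
    violations = []
    for node in focus_nodes:
        count = sum(1 for s, p, _o in graph if s == node and p == path)
        if count < min_count:
            violations.append((node, path, count))
    return violations


print(validate_min_count(data_graph, "ex:Person", "ex:name", 1))
```

An empty result plays the role of a conforming report; each tuple in a non-empty result identifies the focus node and the property that fell short.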


  • ShEx - Shape Expressions - Validation, traversal and transformation of RDF graphs. Shape Expressions is a structural schema language for RDF graphs.
  • Shape Expressions Community Group - This group serves to promote and expand ShEx – Shape Expressions. ShEx is an alternative to SHACL which uses a syntactic representation to describe the shape of an RDF graph. The Shape Expressions (ShEx) language describes RDF nodes and graph structures. A node constraint describes an RDF node (IRI, blank node or literal) and a shape describes the triples touching nodes in RDF graphs. These descriptions identify predicates and their associated cardinalities and datatypes. ShEx shapes can be used to communicate data structures associated with some process or interface, generate or validate data, or drive user interfaces.


  • - was a search engine for Semantic Web ontologies, documents, terms and data published on the Web. Swoogle employed a system of crawlers to discover RDF documents and HTML documents with embedded RDF content. Swoogle reasoned about these documents and their constituent parts (e.g., terms and triples) and recorded and indexed meaningful metadata about them in its database. Swoogle provided services to human users through a browser interface and to software agents via RESTful web services. Several techniques were used to rank query results inspired by the PageRank algorithm developed at Google but adapted to the semantics and use patterns found in semantic web documents.


  • EYE - a reasoning engine supporting the Semantic Web layers. It performs semibackward reasoning and it supports Euler paths. Via N3 it is interoperable with Cwm.

  • EulerGUI - started as a GUI for the EulerSharp Euler/SEM/EYE reasoning engine, and still is. The sources can be N3 (Notation 3), RDF, OWL, UML, eCore, plain XML, XML Schema (as files or URLs), or SPARQL queries. Internally everything is expressed in the convergence language N3, which can express facts, classes and properties, and rules. It is becoming a lightweight IDE for Artificial Intelligence. It is released under the LGPL license. The EulerGUI Manual covers all aspects: install, usage, links, development, architecture. Latest User Manual from Subversion. The EulerGUI SourceForge project page. EulerGUI is effective for developing and testing projects composed of N3, OWL, RDF(S) and UML ontologies and databases, with rules in N3 logic. One can launch four reasoning engines (Euler proof engine, CWM, FuXi and Drools), show Graphviz graphs, and generate JavaScript+Java code.

  • Google Code Archive - Long-term storage for Google Code Project Hosting. - to leverage the efficiency of the RETE-UL algorithm and a logic programming meta-interpreter as the 'engine' for a Python-based open-source expert system for the semantic web, built on Python. It is inspired by its predecessors: cwm, pychinko - Rete-based RDF friendly rule engine, and euler - Euler proof mechanism.

  • Drools - Business Rules Management System (Java™, Open Source) - a Business Rules Management System (BRMS) solution. It provides a core Business Rules Engine (BRE), a web authoring and rules management application (Drools Workbench), full runtime support for Decision Model and Notation (DMN) models at Conformance level 3 and an Eclipse IDE plugin for core development. Drools is open source software, released under the Apache License 2.0. It is written in 100% pure Java™, runs on any JVM and is available in the Maven Central repository too.


Access control

See also Open social#WebID


  • S4AC Vocabulary Specification - (Social Semantic SPARQL Security for Access Control) is a lightweight vocabulary to create fine-grained access control policies for Linked Data. The vocabulary has the aim to design and share security information specifying the access control conditions under which the data is accessible. Implementations are free to extend S4AC to add further functionalities.


  • PROV-Overview - An Overview of the PROV Family of Documents, W3C Working Group Note 30 April 2013. Provenance is information about entities, activities, and people involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness. The PROV Family of Documents defines a model, corresponding serializations and other supporting definitions to enable the inter-operable interchange of provenance information in heterogeneous environments such as the Web. This document provides an overview of this family of documents.
  • PROV Model Primer - This document provides an intuitive introduction and guide to the PROV Data Model for provenance interchange on the web. PROV defines a core data model for provenance for building representations of the entities, people and processes involved in producing a piece of data or thing in the world. This primer explains the fundamental PROV concepts and provides examples of its use. The primer is intended as a starting point for those wishing to create or use PROV data.


  • - contains a collection of tool references that can help in developing Semantic Web applications. These include complete development environments, editors, libraries or modules for various programming languages, specialized browsers, etc.

  • RDF Translator - a multi-format conversion tool for structured markup. It provides translations between data formats ranging from RDF/XML to RDFa or Microdata. The service allows for conversions triggered either by URI or by direct text input. Furthermore it comes with a straightforward REST API for developers.

  • Simple javascript RDF Parser and query thingy - designed to run in a web-browser or SVG browser, allowing you to process RDF on the client. The parser isn't complete, there's no support for various bits of the spec, and isn't all that fast, especially with large XML/RDF files. I've found it quite useful though for simple querying.

  • RDFaCE - A Semantic content editor based on TinyMCE WYSIWYG editor. RDFaCE is created as a proof of concept for WYSIWYM (What You See Is What You Mean) concept. WYSIWYM aims to enable end-users to easily annotate their content using RDFa and Microdata markups. RDFaCE employs external NLP APIs to suggest namespaces, properties, URIs and to automatically annotate content.

  • W3C: ConverterToRdf - converts application data from an application-specific format into RDF for use with RDF tools and integration with other data. Converters may be part of a one-time migration effort, or part of a running system which provides a semantic web view of a given application.

  • Structured Data Linter - a tool aiding webmasters and web developers to verify the structured data present in their HTML pages. Search engines use structured data to understand webpages more accurately and to present enhanced search results. The Linter understands the microdata, JSON-LD and RDFa formats according to their latest specifications. Note however that it does not guarantee that all consumers (e.g. search engines) will make use of all the structured data available in your page. The linter does not currently support microformats (contributions welcome). In addition to providing snippet visualizations for, the linter performs limited vocabulary validations for, Dublin Core Metadata Terms, Friend of a Friend (FOAF), GoodRelations, Facebook's Open Graph Protocol, Semantically-Interlinked Online Communities (SIOC), Simple Knowledge Organization System (SKOS), and



  • protégé - a free, open-source platform that provides a growing user community with a suite of tools to construct domain models and knowledge-based applications with ontologies.
  • VOWL Plugin for Protégé - a Protégé plugin for the user-oriented visualization of ontologies. It implements the Visual Notation for OWL Ontologies (VOWL) by providing graphical depictions for elements of the Web Ontology Language (OWL) that are combined to a force-directed graph layout representing the ontology. ProtégéVOWL is based on VOWL 2, which focuses on the visualization of the ontology schema (i.e., the classes, properties and datatypes, also known as TBox).

RDF Gravity

  • RDF-Gravity - a tool for visualising RDF/OWL graphs and ontologies. Its main features are: Graph Visualization, Global and Local Filters (enabling specific views on a graph), Full text Search, Generating views from RDQL Queries, and Visualising multiple RDF files. RDF Gravity is implemented using the JUNG Graph API and the Jena semantic web toolkit.




  • Amaya - a Web editor, i.e. a tool used to create and update documents directly on the Web. Browsing features are seamlessly integrated with the editing and remote access features in a uniform environment. This follows the original vision of the Web as a space for collaboration and not just a one-way publishing medium. Work on Amaya started at W3C in 1996 to showcase Web technologies in a fully-featured Web client. The main motivation for developing Amaya was to provide a framework that can integrate as many W3C technologies as possible. It is used to demonstrate these technologies in action while taking advantage of their combination in a single, consistent environment. Amaya started as an HTML + CSS style sheets editor. Since that time it was extended to support XML and an increasing number of XML applications such as the XHTML family, MathML, and SVG. It allows all those vocabularies to be edited simultaneously in compound documents.




  • - a Python library that helps generate a vector space from very large hierarchies encoded in RDF. An obvious example application is to generate a vector space from a SKOS hierarchy or an RDFS subclass hierarchy.




  • - a library for RDF, SPARQL and Linked Data technologies in Scala. It can be used with existing libraries without any added cost. There is no wrapping involved: you directly manipulate the real objects. We currently support Jena, Sesame and Plantain, a pure Scala implementation.


  • Rhizomik - The ReDeFer project is a compendium of RDF-aware utilities organised in a set of packages: RDF2HTML+RDFa: render a piece of RDF/XML as HTML+RDFa. XSD2OWL: transform an XML Schema into an OWL Ontology. CS2OWL: transform an MPEG-7 Classification Scheme into an OWL Ontology. XML2RDF: transform a piece of XML into RDF. RDF2SVG: render a piece of RDF/XML as an SVG showing the corresponding graph.

Redland / Raptor

  • Redland - a set of free software C libraries that provide support for the Resource Description Framework (RDF).
    • Raptor is a free software / Open Source C library that provides a set of parsers and serializers that generate Resource Description Framework (RDF) triples by parsing syntaxes or serialize the triples into a syntax. The supported parsing syntaxes are RDF/XML, N-Quads, N-Triples, TRiG, Turtle, RDFa 1.0 and 1.1, RSS tag soup including all versions of RSS, Atom 1.0 and 0.3, GRDDL and microformats for HTML, XHTML and XML. The serializing syntaxes are RDF/XML (regular, and abbreviated), Atom 1.0, GraphViz, JSON, N-Quads, N-Triples, RSS 1.0 and XMP.


  • Serd - a lightweight C library for RDF syntax which supports reading and writing Turtle, TRiG, NTriples, and NQuads. Serd is suitable for performance-critical or resource-limited applications, such as serialising very large data sets, network protocols, or embedded systems that require minimal dependencies and lightweight deployment.


  • Sord - a lightweight C library for storing RDF data in memory.


  • ARC2 - a PHP 5.3 library for working with RDF. It also provides a MySQL-based triplestore with SPARQL support. Feature-wise, ARC2 is now in a stable state with no further feature additions planned. Issues are still being fixed and Pull Requests are welcome, though.
  • EasyRdf - A PHP library designed to make it easy to consume and produce RDF. Symfony.



  • rdflib - a pure Python package for working with RDF. RDFLib contains most things you need to work with RDF, including: parsers and serializers for RDF/XML, N3, NTriples, N-Quads, Turtle, TriX, RDFa and Microdata; a Graph interface which can be backed by any one of a number of Store implementations; store implementations for in-memory storage and persistent storage on top of the Berkeley DB; a SPARQL 1.1 implementation supporting SPARQL 1.1 Queries and Update statements.

  • - a small utility class that uses RDFLib to represent RDF as hierarchies. It is a subclass of dict with a few added methods that help parse and structure RDF. It turns itself into a dict of dicts and lists of dicts that represents a hierarchy according to nodes.
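The "dict of dicts and lists of dicts" idea can be sketched roughly as follows. This is a minimal illustration only, not the utility class's actual API: plain tuples stand in for RDFLib triples, and the function name `nest` and the `ex:` identifiers are hypothetical.

```python
from collections import defaultdict


def nest(triples, root):
    """Build a nested dict for `root` from (subject, predicate, object) tuples."""
    by_subject = defaultdict(list)
    for s, p, o in triples:
        by_subject[s].append((p, o))

    def build(node, seen):
        # Literals, leaf resources and already-visited nodes stay as bare values.
        if node not in by_subject or node in seen:
            return node
        seen = seen | {node}
        out = {}
        for p, o in by_subject[node]:
            out.setdefault(p, []).append(build(o, seen))
        return out

    return build(root, frozenset())


# Hypothetical sample data: a book whose creator is itself described.
triples = [
    ("ex:book", "dc:title", "Weaving the Web"),
    ("ex:book", "dc:creator", "ex:tim"),
    ("ex:tim", "foaf:name", "Tim Berners-Lee"),
]
tree = nest(triples, "ex:book")
```

Each predicate maps to a list because a subject may carry several values for the same predicate; the `seen` set keeps cyclic graphs from recursing forever.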


Ruby Distiller

  • Ruby Distiller - a Ruby Gem implementing RDF graphs, readers and writers to help integrate semantic technologies into Ruby projects. The distiller implements different commands, accepting different file inputs and options.


  • RDF/JS: Data model specification - This document provides a specification of a low level interface definition representing RDF data independent of a serialized format in a JavaScript environment. The task force which defines this interface was formed by RDF JavaScript library developers with the wish to make existing and future libraries interoperable. This definition strives to provide the minimal necessary interface to enable interoperability of libraries such as serializers, parsers and higher level accessors and manipulators.

  • Visual Data Web: Tools - Several tools have already been developed in the project that showcase the visual power of the Data Web. The following four tools are all implemented in the open source framework Adobe Flex. They are readily configured to access RDF data of the DBpedia and/or Linking Open Data (LOD) projects and only require a Flash Player to be executed (which is usually already installed in Web browsers).

  • Visual Browser - a Java application that can visualise data in an RDF scheme. The main principle of the visualisation is that the triple (resource, resource, resource) is represented by two nodes connected by an edge, while the triple (resource, resource, literal) is represented by a hint (a small window appearing on mouse over the subject node). Visual Browser uses the Jena framework to obtain the data, since the RDF scheme can be saved in different forms (a single XML file or a relational database).

  • - a web tool developed by the European Environment Agency which helps create interactive data visualizations easily through the web browser; no extra tools are necessary. It is free and open source. You can generate attractive and interactive charts and combine them in a dashboard with facets/filters which update the charts simultaneously. Data can be uploaded as CSV/TSV or you can specify SPARQL to query online Linked Open Data servers (aka SPARQL endpoints). Daviz is the first Semantic web data visualisation tool for Plone CMS, entirely web-based! At the moment Simile Exhibit and Google Charts visualizations are supported. The architecture allows Daviz to be extended with more visualisation libraries (visualisation plugins). Zope/Plone.


  • graphy.js - a collection of high-performance RDF libraries for JavaScript developers with a focus on usability.


  • - or RDF store is a purpose-built database for the storage and retrieval of triples through semantic queries. A triple is a data entity composed of subject-predicate-object, like "Bob is 35" or "Bob knows Fred". Much like a relational database, one stores information in a triplestore and retrieves it via a query language. Unlike a relational database, a triplestore is optimized for the storage and retrieval of triples. In addition to queries, triples can usually be imported/exported using Resource Description Framework (RDF) and other formats.
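To make the subject-predicate-object model concrete, here is a minimal in-memory sketch. The class name `TinyTripleStore` and its methods are hypothetical, and a real triplestore would answer SPARQL queries rather than the simple wildcard patterns used here.

```python
class TinyTripleStore:
    """Toy store: a set of (subject, predicate, object) tuples."""

    def __init__(self):
        self._triples = set()

    def add(self, s, p, o):
        self._triples.add((s, p, o))

    def match(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        hits = (
            t for t in self._triples
            if all(want is None or want == got
                   for want, got in zip((s, p, o), t))
        )
        # Sort by repr so the result order is stable even with mixed value types.
        return sorted(hits, key=repr)


store = TinyTripleStore()
store.add("Bob", "age", 35)          # "Bob is 35"
store.add("Bob", "knows", "Fred")    # "Bob knows Fred"
store.add("Fred", "knows", "Alice")  # extra fact, made up for the example

who_bob_knows = [o for _, _, o in store.match(s="Bob", p="knows")]
```

The pattern `match(s="Bob", p="knows")` plays the role of a query with one unbound variable; this sketch scans every triple, whereas production stores rely on indexes.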


  • Annotea - home page, archived

  • Annotea - a W3C LEAD (Live Early Adoption and Demonstration) project under Semantic Web Advanced Development (SWAD). Annotea enhances collaboration via shared metadata based Web annotations, bookmarks, and their combinations. By annotations we mean comments, notes, explanations, or other types of external remarks that can be attached to any Web document or a selected part of the document without actually needing to touch the document. When the user gets the document he or she can also load the annotations attached to it from a selected annotation server or several servers and see what his peer group thinks. Similarly shared bookmarks can be attached to Web documents to help organize them under different topics, to easily find them later, to help find related material and to collaboratively filter bookmarked material. Annotea is open; it uses and helps to advance W3C standards when possible. For instance, we use an RDF based annotation schema for describing annotations as metadata and XPointer for locating the annotations in the annotated document. Similarly a bookmark schema describes the bookmark and topic metadata. Annotea is part of the Semantic Web efforts. It provides an RDF metadata based extensible framework for rich communication about Web pages while offering a simple annotation and bookmark user interface. The annotation metadata can be stored locally or in one or more annotation servers and presented to the user by a client capable of understanding this metadata and capable of interacting with an annotation server with the HTTP service protocol.

  • - was designed as an RDF standard sponsored by the W3C to enhance document-based collaboration via shared document metadata based on tags, bookmarks, and other annotations.

In this case document metadata includes:

  • Keywords
  • Comments
  • Notes
  • Explanations
  • Errors
  • Corrections

In general, Annotea associates text strings with a web document or selected parts of a web document without actually needing to modify the original document. Users who access any web document can also load the metadata associated with it from a selected annotation server (or groups of servers) and see a peer group's comments on the document. Similarly, shared metadata tags can be attached to web documents to help in future retrieval.

Annotea is an extensible standard and is designed to work with other W3C standards when possible. For instance, Annotea uses an RDF Schema for describing annotations as metadata and XPointer for locating the annotations in the annotated document. Similarly a bookmark schema describes the bookmark and topic metadata. Annotea is part of the W3C Semantic Web efforts. An example implementation of Annotea is W3C's Amaya editor/browser. The current Amaya user interface for annotations is presented in the Amaya documentation. Other projects include plugins for Firefox/Mozilla and the Annotatio client, which interacts with most browsers via JavaScript. Active development of Annotea seems to have been discontinued after 2003. As a W3C standard for annotation of documents on the web, it is superseded by Web Annotation.



  • CumulusRDF - an RDF store on cloud-based architectures. CumulusRDF provides a REST-based API with CRUD operations to manage RDF data. The current version uses Apache Cassandra as storage backend. A previous version is built on Google's AppEngine. CumulusRDF is licensed under GNU Affero General Public License.




  • Hexastore - A fast, pure javascript triple store implementation, also useful as a graph database. Works in any browser, with browserify or webpack. Early development, API is subject to changes. It is a way to structure RDF data such that queries are really fast. However, as implemented here, it has a 6 fold increase in memory usage as compared to a naive implementation of a triple store.
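The sixfold memory cost follows from keeping one index per ordering of subject, predicate and object. A rough Python sketch of that indexing scheme follows. It illustrates the general hexastore technique, not this JavaScript library's API, and the class and method names are hypothetical.

```python
from itertools import permutations


class HexaIndex:
    """Six nested-dict indexes, one per ordering of (s, p, o)."""

    def __init__(self):
        # Keys are "spo", "sop", "pso", "pos", "osp", "ops".
        # For example, self.idx["pos"][p][o] is the set of matching subjects.
        self.idx = {"".join(order): {} for order in permutations("spo")}

    def add(self, s, p, o):
        term = {"s": s, "p": p, "o": o}
        for order, index in self.idx.items():
            a, b, c = (term[k] for k in order)
            index.setdefault(a, {}).setdefault(b, set()).add(c)

    def lookup(self, order, first, second):
        """Values of the third position, given the first two in `order`."""
        return self.idx[order].get(first, {}).get(second, set())


hexa = HexaIndex()
hexa.add("Bob", "knows", "Fred")
hexa.add("Alice", "knows", "Fred")

who_knows_fred = hexa.lookup("pos", "knows", "Fred")  # subjects
bob_to_fred = hexa.lookup("sop", "Bob", "Fred")       # predicates
```

Whichever two terms are bound, some index has them as its first two levels, so the lookup is two dictionary accesses instead of a scan. The price is that every triple is stored six times.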

ODS Briefcase

  • ODS Briefcase - a WebDAV-compliant platform that offers file-sharing functionality via a "Briefcase Data Space". Its standards-compliance enables the exploitation of File Server functionality via the following methods: Web Browser-based interactions; Web Services - direct use of the HTTP-based WebDAV protocol; Semantic Web's SPARQL Query Language - all WebDAV resources are exposed as SIOC Ontology instance data (RDF Data Sets).

Apache Marmotta

  • Apache Marmotta - The goal of Apache Marmotta is to provide an open implementation of a Linked Data Platform that can be used, extended and deployed easily by organizations who want to publish Linked Data or build custom applications on Linked Data.
  • - a linked data platform that comprises several components. In its most basic configuration it is a Linked Data server. Marmotta was one of the early reference implementations of the Linked Data Platform recommendation developed by W3C. It was contributed by Salzburg Research from the Linked Media Framework and continues its versioning, hence starting at version 3.0.0. Since April 2013, it has been listed among the Semantic Web tools by the W3C. In November 2020, it was retired to the Apache Attic, meaning that the project is no longer being developed.



  • - a wiki that has an underlying model of the knowledge described in its pages. Regular, or syntactic, wikis have structured text and untyped hyperlinks. Semantic wikis, on the other hand, provide the ability to capture or identify information about the data within pages, and the relationships between pages, in ways that can be queried or exported like a database through semantic queries. Semantic wikis were first proposed in the early 2000s, and began to be implemented seriously around 2005. As of 2013, the best-known semantic wiki software, and the only one with significant usage on public websites, is Semantic MediaWiki.

  • - a free, open-source extension to MediaWiki – the wiki software that powers Wikipedia – that lets you store and query data within the wiki's pages. Semantic MediaWiki is also a full-fledged framework, in conjunction with many spinoff extensions, that can turn a wiki into a powerful and flexible knowledge management system. All data created within Semantic MediaWiki can easily be exported or published via the Semantic Web, allowing other systems to use this data seamlessly.
  • - an extension to MediaWiki that allows for annotating semantic data within wiki pages, thus turning a wiki that incorporates the extension into a semantic wiki. Data that has been encoded can be used in semantic searches, used for aggregation of pages, displayed in formats like maps, calendars and graphs, and exported to the outside world via formats like RDF and CSV.

  • OntoWiki - a semantic application as well as a framework which acts as a hardened basement for your application in the Semantic Web context. One of its main purposes is to assist you managing your knowledge. Knowledge means here machine readable data organized as RDF/XML, Notation3, Turtle as well as Talis(JSON). You organize your knowledge using a feature-rich user interface managing classes, properties and resources.



  • Graphite - a PHP Library, built on top of ARC2, to make it easy to do stuff with RDF data really quickly, without having to naff around with databases. It is not intended to be scalable, or a way of authoring RDF data.

Q&D RDF Browser

  • Q&D RDF Browser is powered by Graphite and ARC2 and hosted by ECS at the University of Southampton.




  • OpenRefine - previously Google Refine, is a powerful tool for working with messy data: cleaning it; transforming it from one format into another; and extending it with web services and external data. OpenRefine always keeps your data private on your own computer until YOU want to share or collaborate. Your private data never leaves your computer unless you want it to. (It works by running a small server on your computer and you use your web browser to interact with it.)


Apache Jena



  • RWW.IO - a personal Linked Data store, intended to be used as a backend service for your Linked Data applications. It supports the latest standards and recommendations: RDF, JSON-LD, SPARQL 1.1 Update, WebID. All data stores (endpoints) interpret the HTTP request URI as the base URI for RDF operations and the default-graph URI for SPARQL operations. When using the service as a backend, you need to follow two basic rules: specify the media type of your request data with a Content-Type HTTP header, and specify your response type preference with an Accept HTTP header.
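The two header rules might look like this in practice. This is a sketch only: the store URL, the choice of PUT, and the Turtle payload are assumptions for illustration and are not taken from the RWW.IO documentation. The request is built but not sent, so the snippet runs offline.

```python
from urllib.request import Request

# Hypothetical Turtle payload describing a profile.
turtle = b'<#me> <http://xmlns.com/foaf/0.1/name> "Alice" .\n'

req = Request(
    "https://example.org/data/profile",  # hypothetical data store URL
    data=turtle,
    method="PUT",  # assumed write method, chosen for illustration
    headers={
        "Content-Type": "text/turtle",  # rule 1: media type of the request data
        "Accept": "text/turtle",        # rule 2: preferred response type
    },
)

# urllib.request.urlopen(req) would send the request; it is left out here.
```

Per the description above, the request URI would also serve as the base URI for resolving relative identifiers such as `<#me>`.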


  • - created in April 2009 from the merger of SOLINET and PALINET, two US based library networks. NELINET, the New England library network, also merged into LYRASIS in late 2009. In January 2011, the Bibliographical Center for Research, Denver, CO (BCR) phased out operations and joined LYRASIS.


See also MediaWiki#Semantic

  • PublishMyData is a Linked Data publishing platform. It lets you serve your 5-star Open Data on the web in a format that’s easy to understand, but it’s also machine readable so data experts can exploit it.

  • URIBurner - A data virtualization service that transforms data hosted in a variety of data spaces and formats into standards compliant Linked Data Objects for uniform access, integration and management. The underlying technology is Virtuoso's in-built Linked Data Middleware (aka Sponger) that uses URLs as data source names for its powerful data ingestion and transformation services that result in highly navigable Linked Data Object graphs. Post transformation, each Data Object is endowed with a dereferenceable identifier (Name) that resolves to its actual representation via its URL (Address). The Sponger then re-presents Data Object descriptions via HTML documents (the default behavior) or in a variety of raw data graph forms that include: CSV, N-Triples, Turtle, N3, RDF/XML, JSON, CXML, OData (Atom and JSON) etc.

  • CubicWeb Semantic Web Framework - a semantic web application framework, licensed under the LGPL, that empowers developers to efficiently build web applications by reusing components (called cubes) and following the well known object-oriented design principles. Its main features are: an engine driven by the explicit data model of the application, a query language named RQL similar to W3C’s SPARQL, a selection+view mechanism for semi-automatic XHTML/XML/JSON/text generation, a library of reusable components (data model and views) that fulfill common needs, the power and flexibility of the Python programming language, the reliability of SQL databases, LDAP directories, Subversion and Mercurial for storage backends.

  • MELD - Music Encoding and Linked Data

  • Technology – Mico Project - an environment for analysing “media in context” by orchestrating a set of different analysis components that can work in sequence on content, each adding its bit of additional information to the final result. Analysis components can e.g. be a “language detector” (identifying the language of text or an audio track), a “keyframe extractor” (identifying relevant images from a video), a “face detector” (identifying objects that could be faces), a “face recognizer” (assigning faces to concrete persons), an “entity linker” (assigning objects to concrete entities) or a “disambiguation component” (resolving possible alternatives by choosing the more likely given the context).


Life sciences

  • Google Code Archive - Long-term storage for Google Code Project Hosting. - This document provides concise information about topics related to RDF and OWL in the context of the biodiversity informatics community. It is intended as an introduction for persons who are not already familiar with RDF and OWL and as a reference for persons who are familiar but would like organized access to additional reference material.


  • - an ISO standard, originally created by Adobe Systems Inc., for the creation, processing and interchange of standardized and custom metadata for digital documents and data sets. XMP standardizes a data model, a serialization format and core properties for the definition and processing of extensible metadata. It also provides guidelines for embedding XMP information into popular image, video and document file formats, such as JPEG and PDF, without breaking their readability by applications that do not support XMP. Therefore, the non-XMP metadata have to be reconciled with the XMP properties. Although metadata can alternatively be stored in a sidecar file, embedding metadata avoids problems that occur when metadata is stored separately.