Open data

From Things and Stuff Wiki


General

data, noun

  • facts and statistics collected together for reference or analysis: there is very little data available
    • the quantities, characters, or symbols on which operations are performed by a computer, which may be stored and transmitted in the form of electrical signals and recorded on magnetic, optical, or mechanical recording media.
    • Philosophy: things known or assumed as facts, making the basis of reasoning or calculation.

See also Coding#Serialization_and_markup, Coding#Data_types_and_structures, Coding#Stats_and_big_data

Learning

Open Data

See also WebDev#API, Open

"In 2009 the American Government launched data.gov - 'a comprehensive catalogue of data provided by federal agencies and represents transparency and accountability in groups and officials... how the government spends tax dollars'" - (YAU 2011)

Data.gov.uk - Opening up Government. More than 9000 data sets.

"As of April 2010 the following UK Government departments and agencies have provided data sets to data.gov.uk: BusinessLink, the Cabinet Office, the Department for Business, Innovation and Skills, the Department for Children, Schools and Families, the Department for Communities and Local Government, the Department for Culture, Media and Sport, the Department for Environment, Food and Rural Affairs, the Department for International Development, the Department for Transport, the Department for Work and Pensions, the Department of Energy and Climate Change, the Department of Health, the Foreign and Commonwealth Office, the Home Office, Her Majesty's Treasury, Lichfield District Council, the Ministry of Defence, the Ministry of Justice, the Northern Ireland Office, the Ordnance Survey, and the Society of Information Technology Management.

"All data included in data.gov.uk is covered either by Crown Copyright, the Crown Database Right or have been licensed to the Crown. In turn, all data available on data.gov.uk is available under a worldwide, royalty-free, perpetual, non-exclusive license which permits use of the data under the following conditions: the copyright and the source of the data should be acknowledged by including an attribution statement specified by data.gov.uk, which is 'name of data provider' data © Crown copyright and database right. The inclusion of the same acknowledgement is required in sub-licensing of the data, and further sub-licenses should require the same. The data should not be used in a way that suggests that the data provider endorses the use of the data. And the data or its source should not be misrepresented"

Semantic Web

See also Open web#Semantic

  • Semantic Web - a “Web of data,” the sort of data you find in databases. The ultimate goal of the Web of data is to enable computers to do more useful work and to develop systems that can support trusted interactions over the network. The term “Semantic Web” refers to W3C’s vision of the Web of linked data. Semantic Web technologies enable people to create data stores on the Web, build vocabularies, and write rules for handling data. Linked data are empowered by technologies such as RDF, SPARQL, OWL, and SKOS.
  1. Use URIs to denote things.
  2. Use HTTP URIs so that these things can be referred to and looked up ("dereferenced") by people and user agents.
  3. Provide useful information about the thing when its URI is dereferenced, leveraging standards such as RDF, SPARQL.
  4. Include links to other related things (using their URIs) when publishing data on the Web.

or, put more informally:

  1. All kinds of conceptual things, they have names now that start with HTTP.
  2. I get important information back. I will get back some data in a standard format which is kind of useful data that somebody might like to know about that thing, about that event.
  3. I get back that information it's not just got somebody's height and weight and when they were born, it's got relationships. And when it has relationships, whenever it expresses a relationship then the other thing that it's related to is given one of those names that starts with HTTP.
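The principles above can be sketched in a few lines of Python. This is only a toy model, not a real client: the dictionary stands in for the Web, `dereference` stands in for an HTTP GET, and every URI and fact under `example.org` is made up for illustration.

```python
# A toy "web of data": each HTTP URI maps to the statements published about it.
# All URIs and facts below are hypothetical.
WEB = {
    "http://example.org/people/alice": [
        ("http://example.org/vocab/name", "Alice"),
        ("http://example.org/vocab/bornIn", "http://example.org/places/leith"),
    ],
    "http://example.org/places/leith": [
        ("http://example.org/vocab/name", "Leith"),
        ("http://example.org/vocab/partOf", "http://example.org/places/edinburgh"),
    ],
    "http://example.org/places/edinburgh": [
        ("http://example.org/vocab/name", "Edinburgh"),
    ],
}


def dereference(uri):
    """Stand-in for an HTTP GET: return the statements published at `uri`."""
    return WEB.get(uri, [])


def is_http_uri(value):
    """True when a value is itself a name that starts with HTTP."""
    return value.startswith(("http://", "https://"))


def crawl(start_uri):
    """Follow links from `start_uri`; return every URI reached, in visit order."""
    seen, queue = [], [start_uri]
    while queue:
        uri = queue.pop(0)
        if uri in seen:
            continue
        seen.append(uri)
        for _predicate, obj in dereference(uri):
            # A relationship points at another thing that has its own HTTP name.
            if is_http_uri(obj) and obj in WEB:
                queue.append(obj)
    return seen
```

Starting from the person, `crawl` reaches the two places purely by following the relationships in the returned data, which is the point of principle 4.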

On the Semantic Web, vocabularies define the concepts and relationships (also referred to as “terms”) used to describe and represent an area of concern. Vocabularies are used to classify the terms that can be used in a particular application, characterize possible relationships, and define possible constraints on using those terms. In practice, vocabularies can be very complex (with several thousands of terms) or very simple (describing one or two concepts only).

There is no clear division between what is referred to as “vocabularies” and “ontologies”. The trend is to use the word “ontology” for more complex, and possibly quite formal collection of terms, whereas “vocabulary” is used when such strict formalism is not necessarily used or only in a very loose sense. Vocabularies are the basic building blocks for inference techniques on the Semantic Web.
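As a rough illustration of how a vocabulary supports inference, the sketch below hard-codes a tiny class hierarchy and derives the types it implies. The `ex:` terms and the `SUBCLASS_OF` table are invented for this example; a real vocabulary would be published in RDF and processed by a reasoner rather than by hand-written Python.

```python
# Hypothetical mini-vocabulary: each term maps to its direct broader class.
SUBCLASS_OF = {
    "ex:Council": "ex:PublicBody",
    "ex:PublicBody": "ex:Organisation",
}


def superclasses(term):
    """Return every broader class of `term`, nearest first."""
    found = []
    while term in SUBCLASS_OF:
        term = SUBCLASS_OF[term]
        if term in found:  # guard against accidental cycles in the table
            break
        found.append(term)
    return found


def infer_types(facts):
    """Given (thing, class) facts, add the types implied by the vocabulary."""
    inferred = set(facts)
    for thing, cls in facts:
        for parent in superclasses(cls):
            inferred.add((thing, parent))
    return inferred
```

Stating only that something is an `ex:Council` is enough for `infer_types` to conclude it is also an `ex:PublicBody` and an `ex:Organisation`.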

  • Agile Knowledge Engineering and Semantic Web (AKSW) is hosted by the Chair of Business Information Systems (BIS) of the Institute of Computer Science (IfI) / University of Leipzig as well as the Institute for Applied Informatics (InfAI). Goals: Development of methods, tools and applications for adaptive Knowledge Engineering in the context of the Semantic Web. Research of underlying Semantic Web technologies and development of fundamental Semantic Web tools and applications. Maturation of strategies for fruitfully combining the Social Web paradigms with semantic knowledge representation techniques.
  • Sindice - Data Web Services. Millions of websites mark up their content using RDF, Microformats, Microdata, Schema.org, RDFa, Opengraph and more. Sindice helps you find, understand and integrate with their content.

RDF

Vocabularies

Tools

  • W3C: RDFImportersAndAdapters
  • W3C: ConverterToRdf converts application data from an application-specific format into RDF for use with RDF tools and integration with other data. Converters may be part of a one-time migration effort, or part of a running system which provides a semantic web view of a given application.
  • Redland is a set of free software C libraries that provide support for the Resource Description Framework (RDF).
    • Raptor is a free software / Open Source C library that provides a set of parsers and serializers that generate Resource Description Framework (RDF) triples by parsing syntaxes or serialize the triples into a syntax. The supported parsing syntaxes are RDF/XML, N-Quads, N-Triples, TRiG, Turtle, RDFa 1.0 and 1.1, RSS tag soup including all versions of RSS, Atom 1.0 and 0.3, GRDDL and microformats for HTML, XHTML and XML. The serializing syntaxes are RDF/XML (regular, and abbreviated), Atom 1.0, GraphViz, JSON, N-Quads, N-Triples, RSS 1.0 and XMP.
  • rdfstore-js is a pure JavaScript implementation of an RDF graph store with support for the SPARQL query and data manipulation language. It runs under Node.js.
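To make "serializing triples into a syntax" concrete, here is a minimal hand-written sketch that writes triples as N-Triples, one of the syntaxes listed above. It does not use Redland or Raptor; the function names are invented, and it assumes (as a simplification) that any object starting with `http://` or `https://` is a URI and everything else is a plain string literal.

```python
def _iri(value):
    """Format a URI reference as an N-Triples term."""
    return "<" + value + ">"


def _literal(value):
    """Format a plain string literal, escaping the characters N-Triples requires."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return '"' + escaped + '"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    lines = []
    for subject, predicate, obj in triples:
        # Simplifying assumption: objects that look like HTTP URIs are URIs.
        if obj.startswith(("http://", "https://")):
            obj_term = _iri(obj)
        else:
            obj_term = _literal(obj)
        lines.append("{} {} {} .\n".format(_iri(subject), _iri(predicate), obj_term))
    return "".join(lines)
```

Language tags, datatypes and blank nodes are left out; a library such as Raptor handles those cases.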

Other

Heterogeneous Information Sources on the Web

Linked Open Data

  • W3C: Linking Open Data - The Open Data Movement aims at making data freely available to everyone. There are already various interesting open data sets available on the Web. Examples include Wikipedia, Wikibooks, Geonames, MusicBrainz, WordNet, the DBLP bibliography and many more which are published under Creative Commons or Talis licenses. The goal of the W3C SWEO Linking Open Data community project is to extend the Web with a data commons by publishing various open data sets as RDF on the Web and by setting RDF links between data items from different data sources. RDF links enable you to navigate from a data item within one data source to related data items within other sources using a Semantic Web browser. RDF links can also be followed by the crawlers of Semantic Web search engines, which may provide sophisticated search and query capabilities over crawled data. As query results are structured data and not just links to HTML pages, they can be used within other applications.
  • Linked Data is about using the Web to connect related data that wasn't previously linked, or using the Web to lower the barriers to linking data currently linked using other methods. More specifically, Wikipedia defines Linked Data as "a term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF." This site exists to provide a home for, or pointers to, resources from across the Linked Data community.
  • LOD cloud diagram shows datasets that have been published in Linked Data format, by contributors to the Linking Open Data community project and other individuals and organisations. It is based on metadata collected and curated by contributors to the Data Hub. Clicking the image will take you to an image map, where each dataset is a hyperlink to its homepage.
  • LodLive project provides a demonstration of the use of Linked Data standards (RDF, SPARQL) to browse RDF resources. The application aims to spread linked data principles using a simple and friendly interface with reusable techniques.

JSON-LD

JSON-LD was created by people that have been directly involved in the Linked Data, lowercase semantic web, uppercase Semantic Web, Microformats, Microdata, and RDFa work. It has proven to be useful to them. There are a number of very large technology companies that have adopted JSON-LD, further underscoring its utility.

OWL

SPARQL

LOV

VOAF

Python

JavaScript

REST

See WebDev#API

Validator

Search

Other

See also MediaWiki#Semantic

Data sources

Hubs / platforms

  • CKAN is a fully-featured, mature, open source data portal and data management solution. CKAN provides a streamlined way to make your data discoverable and presentable. Each dataset is given its own page with a rich collection of metadata, making it a valuable and easily searchable resource.
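CKAN portals also expose their catalogue through a JSON API. The sketch below builds a dataset search URL and reads titles out of a response. It assumes the portal runs a CKAN release offering the version 3 Action API with its `package_search` action; the portal address and the sample response are made up, so check the portal's own API documentation before relying on the path.

```python
import json
from urllib.parse import urlencode


def package_search_url(portal, query, rows=5):
    """Build a dataset search URL for the CKAN portal at `portal`."""
    params = urlencode({"q": query, "rows": rows})
    return portal.rstrip("/") + "/api/3/action/package_search?" + params


def dataset_titles(response_text):
    """Extract dataset titles from a package_search JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        return []
    return [d.get("title", d.get("name")) for d in payload["result"]["results"]]


# Hypothetical response, shaped like an Action API search result.
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 1,
        "results": [{"name": "road-traffic", "title": "Road traffic counts"}],
    },
})
```

Fetching the URL (for example with `urllib.request.urlopen`) is left out so that the example runs without network access.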

UK

  • http://www.data-archive.ac.uk/find/hasset-thesaurus/skos-hasset This new resource is an outcome of the Jisc-funded SKOS-HASSET project, led by staff at the UK Data Archive at the University of Essex, which owns and manages HASSET. Like dictionaries, thesauri describe the changing world around them; this is why the UK Data Archive continues work to ensure HASSET is up to date. Simple Knowledge Organisation System (SKOS) makes the thesaurus machine-readable. It is the version of Resource Description Framework (RDF) specific to classification resources. It encodes these products in a standardised way to make their structures comparable and to facilitate interaction.

Government

  • Data.gov.uk is a key part of the Government's work on Transparency which is being led by the Transparency Board. Data.gov.uk implementation is being led by the Transparency and Open Data team in the Cabinet Office, working across government departments to ensure that data is released in a timely and accessible way. This work is being supported by Sir Tim Berners-Lee & Professor Nigel Shadbolt. There are a number of technical partners involved in the project to date. These include the Comprehensive Knowledge Archive Network (CKAN): CKAN runs the catalogue at data.gov.uk/data as well as a growing number of open data registries around the world. It is a project created by the Open Knowledge Foundation to make it easy to find, share and reuse open content and data. The CKAN software provides a web interface, programmer's API, feeds notifying of changes, and a browsable history of all changes. The API is documented here: http://data.gov.uk/data/api.
  • data.gov.uk: Who is doing what? - This page lists the domains which publish and maintain linked data and short term projects developing the government use of linked data. Most sectors have one or more SPARQL endpoints, which enable you to perform searches across the data; you can access these interactively on this site.
  • London Datastore has been created by the Greater London Authority (GLA) as an innovation towards freeing London’s data. We want citizens to be able to access the data that the GLA and other public sector organisations hold, and to use that data however they see fit – free of charge. The GLA is committed to influencing and cajoling other public sector organisations into releasing their data here too.
  • Open Data Communities - Open Access to Local Data. This site is the UK Department for Communities and Local Government's official Linked Open Data site. It provides a selection of statistics on a variety of themes including Local Government finance, housing and homelessness, wellbeing, deprivation, and the department's business plan as well as supporting geographical data. All of the data is available as fully browsable and queryable Linked Data, and the majority is free to re-use under the Open Government Licence.
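Several of the sites above offer SPARQL endpoints. As a hedged sketch of how such an endpoint is typically queried, the code below encodes a query into a GET URL (the `query` parameter comes from the SPARQL Protocol) and flattens a response in the SPARQL JSON results format. The endpoint address and the sample response are invented; real endpoints differ in location and in the formats they return.

```python
import json
from urllib.parse import urlencode

# Example query: up to ten things that have an rdfs:label.
QUERY = """
SELECT ?s ?label WHERE {
  ?s <http://www.w3.org/2000/01/rdf-schema#label> ?label .
} LIMIT 10
"""


def sparql_get_url(endpoint, query):
    """Encode `query` as the `query` parameter of a GET request URL."""
    return endpoint + "?" + urlencode({"query": query.strip()})


def rows(results_json):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    data = json.loads(results_json)
    names = data["head"]["vars"]
    return [
        {name: binding[name]["value"] for name in names if name in binding}
        for binding in data["results"]["bindings"]
    ]


# Hypothetical response, shaped like the SPARQL JSON results format.
SAMPLE_RESULTS = json.dumps({
    "head": {"vars": ["s", "label"]},
    "results": {"bindings": [
        {"s": {"type": "uri", "value": "http://example.org/a"},
         "label": {"type": "literal", "value": "A"}},
    ]},
})
```

When fetching the URL, requesting `application/sparql-results+json` in the `Accept` header is the usual way to ask for JSON results.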

Education

BBC

Other

Scotland

National

"Action 2.4 We will develop proposals with partners for releasing more government information and data for use by the public. Initial proposals to be developed and implementation to begin by end of July 2011. We invite suggestions for areas where the greater availability of public data could lead to new services or innovative applications" - March 3 2011

Health

  • ALISS stands for Access to Local Information to Support Self Management. It’s a wide-ranging project taking a number of approaches to making it easier to find local self management support.

Local

Ireland

Europe

USA

Global

Crowdsourced

Geo

Commercial

Development

JavaScript

Articles

APIs

Scraping

Tools