Open data

From Things and Stuff Wiki
Revision as of 20:54, 19 August 2024 by Milk (talk | contribs) (→‎Geo)


See also Free/open, Learning, Data, Semantic, Database, WebDev#API

  • - also data publication, is the act of releasing research data in published form for use by others. The practice consists of preparing data or data sets for public use so that everyone can use them as they wish. It is an integral part of the open science movement, and there is a large, multidisciplinary consensus on its benefits.

  • The Turing Way - handbook to reproducible, ethical and collaborative data science. The Turing Way project is open source, open collaboration, and community-driven. We involve and support a diverse community of contributors to make data science accessible, comprehensible and effective for everyone. Our goal is to provide all the information that researchers and data scientists in academia, industry and the public sector need to ensure that the projects they work on are easy to reproduce and reuse. Top tip: The Turing Way is not meant to be read from start to finish. Start with a concept, tool or method that you need now, in your current work. Browse the different guides that make up the book, or use the search box to search for whatever you would like to learn about first. All stakeholders, including researchers, software engineers, project leaders and funding teams, are encouraged to use The Turing Way to understand their roles and responsibilities for reproducibility in data science. You can inspect our resources on GitHub, contribute to the project as described in our contribution guidelines and re-use all materials.

  • The ODI - Open Data Institute, works with companies and governments to build an open, trustworthy data ecosystem, where people can make better decisions using data and manage any harmful impacts.
  • Open Data Institute - a non-profit private company limited by guarantee, based in the United Kingdom. Founded by Sir Tim Berners-Lee and Sir Nigel Shadbolt in 2012, the ODI's mission is to connect, equip and inspire people around the world to innovate with data.

  • ODI Learning - Improving the data literacy of our workforce is essential to help organisations evolve their data practices and get more value from data. Improving data literacy will help organisations build effective data focussed business models, create good data governance processes and practices, and become more trusted with data as a result.

  • ODI Open Data Certificate - We verify publisher best practice, so you can use data with confidence. It’s free and open.
    • - The source code for the ODI's Open Data Certificates app. The online assessment tool allows publishers to assess how good their open data release is across technical, social, legal and other areas. When published, a certificate (which can be Bronze, Silver, Gold or Platinum) shows data reusers how much they can trust and rely on the dataset.

  • Frictionless Data - a progressive open-source framework for building data infrastructure – data management, data integration, data flows, etc. It includes various data standards and provides software to work with data. The software is based on a suite of data standards that have been designed to make it easy to describe data structure and content so that data is more interoperable, easier to understand, and quicker to use. There are several aspects to the Frictionless software, including two high-level data frameworks (for Python and JavaScript), 10 low-level libraries for other languages, like R, and also visual interfaces and applications. You can read more about how to use the software (and find documentation) on the projects page.
  • frictionless-ci | Frictionless Repository - Continuous Data Validation: With Frictionless Repository you can ensure the quality of your data. This GitHub Action will report any problems with your data, such as a bad header or missing cells.
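
A rough illustration of the kind of problem mentioned above (a bad header, missing cells), written with the Python standard library only. This is a sketch, not the Frictionless implementation or its API: the function name find_table_problems and the problem labels are hypothetical, and a real validation run would go through the Frictionless tooling itself.

```python
import csv
import io


def find_table_problems(csv_text):
    """Return a list of (row_number, problem) tuples for a CSV string.

    Hypothetical helper for illustration only; not part of Frictionless.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return [(0, "empty table")]
    problems = []
    header = rows[0]
    # A header is treated as bad if any label is blank or repeated.
    seen = set()
    for label in header:
        name = label.strip()
        if not name:
            problems.append((1, "blank header label"))
        elif name in seen:
            problems.append((1, "duplicate header label: " + name))
        seen.add(name)
    # A data row has missing cells if it is shorter than the header.
    for number, row in enumerate(rows[1:], start=2):
        if len(row) < len(header):
            problems.append((number, "missing cells"))
    return problems


# Small made-up table: the header repeats "name" and the last row is short.
sample = "id,name,name\n1,alice,x\n2,bob\n"
for problem in find_table_problems(sample):
    print(problem)
```

Row numbers are 1-based and count the header as row 1, so a report points at the line a person would see in a text editor.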

  • EOSC Portal - Your unified access to the European hub of research data, tools and services for innovation and education

Data sources

  • The Our World in Data-Grapher - the open-source tool for storing and visualizing data, developed by the Our World in Data team. Like every other tool developed and used at Our World in Data, the Grapher is free to use on any other web publication. You can find all the code in the GitHub repository, published under the MIT license.

Hubs / platforms


  • CKAN is a fully-featured, mature, open source data portal and data management solution. CKAN provides a streamlined way to make your data discoverable and presentable. Each dataset is given its own page with a rich collection of metadata, making it a valuable and easily searchable resource.


  • Socrata - In support of its commitment to the open data community and to the proliferation of open data standards, Socrata is proud to bring you the "Socrata Open Data Server, Community Edition." Community Edition is a freely-available, open source product that shares the core of our open data platform. Read here about the motivations behind the Socrata Open Data Server, the architecture of the system we are building, and how to contribute.

  • Socrata - The Socrata Open Data API allows you to programmatically access a wealth of open data resources from governments, non-profits, and NGOs around the world. Click the link below and try a live example right now.

  • Socrata - a business-to-government software company that sells an "open data platform" whose goal was to help "civic developers build apps more efficiently." In July 2014, Socrata launched the Open Data Network, a machine learning-powered initiative aimed at promoting data-centered collaboration between the public and private sectors. This network provides governments with access to various types of data, including crime data, transit data, 311 service request data, and expenditure data. The San Francisco administration later incorporated the Open Data Network into its operations.
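
To give a sense of what programmatic access through the Socrata Open Data API can look like, here is a minimal sketch that only builds a query URL and sends no request. The domain data.example.org and the dataset identifier abcd-1234 are placeholders, and the helper name soda_query_url is hypothetical; the /resource/<id>.json path and the $select, $where and $limit parameters follow the SODA query conventions.

```python
from urllib.parse import urlencode


def soda_query_url(domain, dataset_id, select=None, where=None, limit=None):
    """Build a SODA query URL for one dataset; no request is sent."""
    params = {}
    if select:
        params["$select"] = select
    if where:
        params["$where"] = where
    if limit is not None:
        params["$limit"] = str(limit)
    url = "https://{}/resource/{}.json".format(domain, dataset_id)
    if params:
        # urlencode percent-escapes the "$" prefix and the filter expression.
        url += "?" + urlencode(params)
    return url


# Placeholder domain and dataset id; swap in a real portal to try it.
print(soda_query_url("data.example.org", "abcd-1234", where="year = 2014", limit=5))
```

The resulting URL can be opened in a browser or passed to any HTTP client; the response for a .json resource is a JSON array of rows.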

  • Swagger UI - Service for Data Access. This service implements the IVOA SODA-1.0 service specification. To use this service, the caller must use a dataset identifier found through some means (for example, querying the ALMA TAP ObsCore Service). The SODA service provides a drill-down mechanism to access the data files and associated resources.


  • SODA API - SODA Foundation - Provides the standardization for Data / Storage Management APIs. Currently we support block and file APIs for key features of data management (provisioning, migration, fileshare, etc). We are working to add the storage management APIs. This is the key external interface to platforms, which can integrate seamlessly with heterogeneous storage backends. Users can develop SODA North-Bound Plugins (SODA NBP) under the SODA NBP project to connect any platform or application solution to the SODA API from the north for all storage/data requirements. We envision this to be the reference implementation of the SODA Data Standard API Specification, which we plan to work on with our industry partners and standards bodies. At that stage, this layer will be upgraded to support Block, File and Object APIs across the Edge, Core and Cloud.
  • SODA API Specification :: Documentation for SODA Project - Standards for Data and Storage is an umbrella API standard comprising a collection of data and storage API specifications released by the SODA Foundation (under the Linux Foundation). It provides unified RESTful API interfaces with standardized data models for data and storage across the edge, core (on-prem), and cloud. It will consolidate, update, or develop API definitions to provide unified, extensible, and open industry standards, collaborating with partners, vendors, and standards associations. It is meant to apply universally, with customization as needed for the country, region and other legal requirements where it is used or deployed. Overall scope: SODA API Standards for Data and Storage aim to put together a set of specifications that would be: Unified | Open | Vendor-neutral | Platform agnostic | Environment aware | Extensible. This document provides the latest versions of all API specifications under SODA API Standards for Data and Storage. Audience: the main audience members are (not limited to) the SODA API Standards Team, API specification software implementors, and platforms and vendors who want to use the SODA API for their solutions.

  • SODA Foundation - an open source project under Linux Foundation that aims to establish an open, unified, and autonomous data management framework for data mobility from the edge, to core, to cloud. SODA brings together industry leaders to collaborate on building a common framework to promote standardization and best practices for data storage, data protection, data governance, data analytics, etc. to support IoT, big data, machine learning, and other applications. We are fostering collaboration and innovation across vendors, system integrators, cloud service providers, standards organizations, and consortiums across different industries, to provide quality end-to-end solutions to end users.
    • SODA Foundation Documentation - an open source project under Linux Foundation that aims to foster an ecosystem of open source data management and storage software for data autonomy. SODA Foundation offers a neutral forum for cross-projects collaboration and integration and provides end users quality end-to-end solutions.

  • - the SODA Infrastructure Manager project is an open source project to provide unified, intelligent and scalable resource management, alerting and performance monitoring. It will cover the resource management of all the storage backends and other infrastructures under SODA deployment. It will also provide alert management and metric data (performance/health) for monitoring and further analysis. It will provide a scalable framework where more and more backends as well as client exporters can be added. This makes it possible to add more storage and infrastructure backends and to support different management clients for monitoring and health prediction. It provides unified APIs to access, export and connect with clients, as well as a set of interfaces for adding drivers.


  • This new resource is an outcome of the Jisc-funded SKOS-HASSET project, led by staff at the UK Data Archive at the University of Essex, which owns and manages HASSET. Like dictionaries, thesauri describe the changing world around them; this is why the UK Data Archive continues work to ensure HASSET is up to date. Simple Knowledge Organisation System (SKOS) makes the thesaurus machine-readable. It is the version of Resource Description Framework (RDF) specific to classification resources. It encodes these products in a standardised way to make their structures comparable and to facilitate interaction.

Government / UK

  • is a key part of the Government's work on Transparency, which is being led by the Transparency Board. implementation is being led by the Transparency and Open Data team in the Cabinet Office, working across government departments to ensure that data is released in a timely and accessible way. This work is being supported by Sir Tim Berners-Lee and Professor Nigel Shadbolt. There are a number of technical partners involved in the project to date. These include the Comprehensive Knowledge Archive Network (CKAN): CKAN runs the catalogue at as well as a growing number of open data registries around the world. It is a project created by the Open Knowledge Foundation to make it easy to find, share and reuse open content and data. The CKAN software provides a web interface, programmer's API, feeds notifying of changes, and a browsable history of all changes. The API is documented here:
  • - Collaboration space for discussing and exploring technical and data standards
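
The programmer's API that CKAN provides can be exercised with plain HTTP. The sketch below assumes the CKAN Action API's package_search call and its usual JSON envelope (a success flag plus a result object holding results); the portal address ckan.example.org, the sample response and both helper names are made up for illustration, and nothing here contacts a real catalogue.

```python
import json
from urllib.parse import urlencode


def package_search_url(portal, query, rows=10):
    """Build a CKAN Action API package_search URL; no request is sent."""
    base = portal.rstrip("/") + "/api/3/action/package_search"
    return base + "?" + urlencode({"q": query, "rows": rows})


def dataset_titles(response_text):
    """Pull dataset titles out of a package_search JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported a failed request")
    return [dataset["title"] for dataset in payload["result"]["results"]]


# Hand-written sample in the shape of a package_search response.
sample = json.dumps({
    "success": True,
    "result": {
        "count": 1,
        "results": [{"name": "road-safety", "title": "Road Safety Data"}],
    },
})

print(package_search_url("https://ckan.example.org/", "road safety", rows=5))
print(dataset_titles(sample))
```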

  • Who is doing what? - This page lists the domains which publish and maintain linked data and short term projects developing the government use of linked data. Most sectors have one or more SPARQL endpoints, which enable you to perform searches across the data; you can access these interactively on this site.
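
Searching a SPARQL endpoint from code, rather than interactively, usually means sending the query as a `query` parameter and reading back the standard SPARQL JSON results format. The following is a sketch under those assumptions: the endpoint address sparql.example.org, the query, the sample results and the helper names are all invented for illustration, and no request is actually sent.

```python
import json
from urllib.parse import urlencode

# Illustrative query: any three subjects that carry an rdfs:label.
QUERY = (
    "SELECT ?s ?label WHERE { "
    "?s <http://www.w3.org/2000/01/rdf-schema#label> ?label } LIMIT 3"
)


def sparql_get_url(endpoint, query):
    """Build a SPARQL protocol GET URL; no request is sent."""
    return endpoint + "?" + urlencode({"query": query})


def rows_from_results(results_text):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    data = json.loads(results_text)
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in data["results"]["bindings"]
    ]


# Hand-written sample in the SPARQL JSON results layout.
sample_results = json.dumps({
    "head": {"vars": ["s", "label"]},
    "results": {"bindings": [
        {"s": {"type": "uri", "value": "http://example.org/id/1"},
         "label": {"type": "literal", "value": "Example district"}},
    ]},
})

print(sparql_get_url("https://sparql.example.org/query", QUERY))
print(rows_from_results(sample_results))
```

When sending the request for real, an Accept header of application/sparql-results+json asks the endpoint for this JSON layout.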

  • London Datastore has been created by the Greater London Authority (GLA) as an innovation towards freeing London’s data. We want citizens to be able to access the data that the GLA and other public sector organisations hold, and to use that data however they see fit – free of charge. The GLA is committed to influencing and cajoling other public sector organisations into releasing their data here too.

  • Open Data Communities - Open Access to Local Data. This site is the UK Department for Communities and Local Government's official Linked Open Data site. It provides a selection of statistics on a variety of themes including Local Government finance, housing and homelessness, wellbeing, deprivation, and the department's business plan as well as supporting geographical data. All of the data is available as fully browsable and queryable Linked Data, and the majority is free to re-use under the Open Government Licence.






"Action 2.4 We will develop proposals with partners for releasing more government information and data for use by the public. Initial proposals to be developed and implementation to begin by end of July 2011. We invite suggestions for areas where the greater availability of public data could lead to new services or innovative applications" - March 3 2011


  • ALISS stands for Access to Local Information to Support Self Management. It’s a wide-ranging project taking a number of approaches to making it easier to find local self management support.




  • EERAdata project - Towards a FAIR and open data ecosystem in the low-carbon energy research community

  • EERAdata Community Platform - Sharing data among energy system stakeholders is key for the low-carbon energy transition. This platform provides easy access to a wide range of energy data and offers services and tools for implementing the Findability, Accessibility, Interoperability, Re-usability (FAIR) data principles.
  • Data-driven decision-support to increase energy efficiency through renovation in European building stock. | EERAdata | Project | Fact sheet | H2020 | CORDIS | European Commission - To address climate change, the EU is pursuing an ambitious initiative to improve energy efficiency (EE) – a fundamental goal of the EU’s energy policy. EE helps reduce public and private costs and diminishes environmental damage. Local administrations are particularly affected, but limited coordination among the stakeholders involved, including municipalities, obstructs its wide and effective application. The EU-funded EERAdata project will test a software application that will support local administrations in policymaking. It will collect data from a wide range of sources to describe and estimate the impact of EE in different types of buildings, closing the knowledge gap caused by the lack of coordination.
  • Community News: EERAdata Says Goodbye: WHY H2020 Project - After almost 3 years of great collaboration and hard work, the EERAdata project came to an end last month. During their final events, the partners presented the EERAdata Decision Support Tool, a socio-economic and life-cycle assessment software for energy efficiency interventions on the public building stock. If you missed these, catch up by reading the key takeaways from the conference and watching the presentation from the training here.


  • Bulk Data | GovInfo - select collections available in bulk in a machine-readable format (i.e. XML) via our Bulk Data Repository. The top level directory for select collections, such as the Federal Register and Code of Federal Regulations, also includes a Resources directory that contains the XML schema, XSL stylesheet, and user guide.


  • YAGO2s - a huge semantic knowledge base, derived from Wikipedia, WordNet and GeoNames. Currently, YAGO2s has knowledge of more than 10 million entities (like persons, organizations, cities, etc.) and contains more than 120 million facts about these entities.








  • - a set of file format specifications intended to facilitate electronic data transmission in the legal industry. The phrase is abbreviated LEDES and is usually pronounced as "leeds". The LEDES specifications are maintained by the LEDES Oversight Committee (LOC), which started informally as an industry-wide project led by the Law Firm and Law Department Services Group within PricewaterhouseCoopers in 1995. In 2001, the LEDES Oversight Committee was incorporated as a California mutual-benefit nonprofit corporation and is now led by a seven-member Board of Directors.

The LOC maintains four types of data exchange standards for legal electronic billing (ebilling); budgeting; timekeeper attributes; and intellectual property matter management.

The LOC also maintains five types of data elements in the LEDES data exchange standards: Uniform Task-Based Management System codes, which classify the work performed by type of legal matter; activity codes, which classify the actual work performed; expense codes, which classify the type of expense incurred; timekeeper classification codes; and error codes, which assist law firms with understanding invoice validation errors.

The LOC has also created an API that allows for system-to-system transmission of legal invoices from law firms and other legal vendors required by their clients to ebill, to the third-party ebilling systems. Other functionality is also supported in this very complex standard, which is intended to ease the burden at the law firm for managing client-required ebilling.


internet of things;


  • Wikxhibit - Author interactive applications of Wikidata and other sources of data on the web



  • Observable - Discover insights faster and communicate more effectively with interactive notebooks for data analysis, visualization, and exploration.



See WebDev#API

  • Swagger Specification - an API description format for REST APIs. An OpenAPI file allows you to describe your entire API.

  • Connexion - a framework that automagically handles HTTP requests based on the OpenAPI Specification (formerly known as Swagger Spec) of your API, described in YAML format. Connexion allows you to write an OpenAPI specification, then maps the endpoints to your Python functions; this makes it unique, as many tools generate the specification based on your Python code. You can describe your REST API in as much detail as you want; then Connexion guarantees that it will work as you specified.
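
The spec-first idea described above (write the specification, then map its endpoints to Python functions) can be sketched without any framework. The code below is not Connexion and does not use its API: the specification is a small OpenAPI-style dict rather than a YAML file, and the names SPEC, HANDLERS, get_greeting and build_routes are hypothetical. It only shows how an operationId in the specification can select the function that serves an endpoint.

```python
# A tiny OpenAPI-style document written as a Python dict (normally YAML).
SPEC = {
    "openapi": "3.0.0",
    "info": {"title": "Greeting API", "version": "1.0"},
    "paths": {
        "/greeting/{name}": {
            "get": {
                "operationId": "get_greeting",
                "responses": {"200": {"description": "A greeting"}},
            },
        },
    },
}


def get_greeting(name):
    """Handler named by the operationId in SPEC."""
    return "Hello " + name


# Registry of available handler functions, keyed by operationId.
HANDLERS = {"get_greeting": get_greeting}


def build_routes(spec, handlers):
    """Map (METHOD, path) pairs to handler functions via operationId."""
    routes = {}
    for path, operations in spec["paths"].items():
        for method, operation in operations.items():
            routes[(method.upper(), path)] = handlers[operation["operationId"]]
    return routes


routes = build_routes(SPEC, HANDLERS)
print(routes[("GET", "/greeting/{name}")]("Ada"))
```

Because the routing table is derived from the specification, an endpoint whose operationId has no matching function fails at startup with a KeyError instead of at request time.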