Open Data

Defining ‘open’

Open Data is data published in accordance with the six principles of the Open Data Charter. The principles speak for themselves, though in the private sector we would replace the word ‘Citizen’ with ‘Community’:

  1. Open by Default
  2. Timely and Comprehensive
  3. Accessible and Usable
  4. Comparable and Interoperable
  5. For Improved Governance and Citizen Engagement
  6. For Inclusive Development and Innovation

It’s important to understand just how ‘open’ this is.  It means no sign-ups, no logins, no contracts and no restrictions on further sharing.  Truly open.

The case for Open Data

Organisations publish open data for more than the strategic benefits it brings. They also take this path because it aligns with their mission of serving a community, or simply because it is the right thing to do. In that context, the overall benefits of open data can be listed as:

  • Informs the community and helps them make informed decisions.
  • Deepens the engagement of your existing community.
  • Creates an ecosystem of independent data users who increase the utility and value of the data.
  • Enables that ecosystem to generate valuable insights, help identify hidden issues and explore the viability of potential new products.

One surprising outcome, which every organisation that publishes open data soon discovers, is that internal staff stop getting their data from internal systems and instead prefer to get it from the open data platform, as if they were any other member of the community.

Four levels of publication

Open data is still maturing, and different vendors and data publishers have taken different approaches to open data publication.  The four levels below represent the full set (as we understand it today) but it is rare to see any vendor or data publisher fully addressing all four.

1. Catalogue

Many organisations choose to publish a comprehensive list of datasets as it is a key part of the implementation of data governance and the creation of a data asset inventory.  Some organisations do not publish a comprehensive list, but only list those they are able to publish as open data.  In very large organisations, such as governments, it is sometimes only possible to publish the catalogue centrally with pointers to the datasets that are held within different divisions.  

Done correctly, a catalogue still has a major technical component: both the catalogue itself and the metadata for each dataset should be available in machine-readable form.
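
As a rough illustration of what 'machine-readable' can mean here, the Python sketch below turns a list of dataset records into a JSON catalogue. The field names are loosely modelled on the W3C DCAT vocabulary but are illustrative assumptions, as are the function name, the dataset records and the example.org URLs. Real platforms define their own schemas.

```python
import json


def build_catalogue(datasets):
    """Return a machine-readable catalogue (a JSON string) from dataset records."""
    entries = []
    for ds in datasets:
        entries.append({
            "identifier": ds["identifier"],
            "title": ds["title"],
            "description": ds["description"],
            "publisher": ds["publisher"],
            # An empty list means the dataset is catalogued but not published as open data.
            "distributions": [
                {"format": fmt, "downloadURL": url}
                for fmt, url in sorted(ds.get("downloads", {}).items())
            ],
        })
    return json.dumps({"datasets": entries}, indent=2)


# Hypothetical records: one open dataset and one that is only listed in the inventory.
example_datasets = [
    {
        "identifier": "library-visits",
        "title": "Library visits by branch",
        "description": "Monthly visitor counts for each library branch.",
        "publisher": "Community Services",
        "downloads": {
            "json": "https://data.example.org/library-visits.json",
            "csv": "https://data.example.org/library-visits.csv",
        },
    },
    {
        "identifier": "staff-payroll",
        "title": "Staff payroll",
        "description": "Held internally; not available as open data.",
        "publisher": "Finance",
    },
]

catalogue_json = build_catalogue(example_datasets)
print(catalogue_json)
```

Listing the second dataset with no distributions reflects the approach described above, where the catalogue is comprehensive even though only some datasets are open.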

2. Data

This is publishing the data in an entirely open way.  Just point your browser, Excel or other tool at a URL and download.  Nothing else required.

Almost all open data publication platforms will make every dataset they hold available in common data formats and through standard APIs (CSV, JSON, OData, SODA and more), enabling the data consumer to use whatever tool and format is best for them.
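
For a data consumer working in code, 'point at a URL and download' might look like the Python sketch below, which uses only the standard library. The URL, the function names and the sample rows are hypothetical. The download call is left commented out so that the example runs without a network connection.

```python
import csv
import io
import json
from urllib.request import urlopen


def fetch_text(url):
    """Download a resource with no login, API key or sign-up step."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def rows_from_csv(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def rows_from_json(text):
    """Parse JSON text that holds a list of row objects."""
    return json.loads(text)


# Against a real platform (hypothetical URL shown) this would be:
# rows = rows_from_csv(fetch_text("https://data.example.org/library-visits.csv"))

# Small inline samples standing in for the two downloads of the same dataset.
sample_csv = "branch,visits\nCentral,120\nNorth,45\n"
sample_json = '[{"branch": "Central", "visits": "120"}, {"branch": "North", "visits": "45"}]'

csv_rows = rows_from_csv(sample_csv)
json_rows = rows_from_json(sample_json)
print(csv_rows == json_rows)
```

The point of the comparison is that the same rows arrive whichever format the consumer chooses.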

Publishing data can take a lot of work for an organisation if it means building multiple complex pipelines to extract, aggregate, anonymise and then push data to an open data platform.  Some platforms aim to lessen this work by providing a wide range of harvesting tools, but those tools need access to internal systems and so bring new security risks.
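
One small piece of such a pipeline, the aggregate and anonymise step, might look like the Python sketch below. It counts records per group and then applies small-cell suppression, hiding any count below a threshold. The threshold of 5, the field names and the records are assumptions made for illustration, and suppression on its own does not guarantee anonymity.

```python
from collections import Counter


def aggregate_and_suppress(records, key, min_count=5):
    """Count records per value of `key`, hiding counts below `min_count`."""
    counts = Counter(record[key] for record in records)
    return [
        # A suppressed count is published as None rather than as a small number.
        {key: value, "count": count if count >= min_count else None}
        for value, count in sorted(counts.items())
    ]


# Hypothetical person-level records extracted from an internal system.
raw_records = (
    [{"person_id": i, "suburb": "Eastside"} for i in range(7)]
    + [{"person_id": 100 + i, "suburb": "Westside"} for i in range(2)]
)

# Only the aggregated, suppressed table would be pushed to the open data platform.
published = aggregate_and_suppress(raw_records, key="suburb")
print(published)
```

Note that the person-level identifiers never leave the function: only group counts are returned.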

3. Interactive exploration

Data consumers have different skill levels, ranging from those who can build an interactive data visualisation in code to those who find Excel too complex.  To ensure the widest possible use of the data, many open data platforms have built-in interactive data exploration tools that enable the user to sort, filter, aggregate and chart the data. These charts can then be saved, shared and embedded in other web pages.

It’s easy to see how these features can build a community with people exploring the data, finding an insight, preserving it as a chart or table, and then sharing it for others to comment on and reply with their own observations.

4. Data stories

The final level of publication is using the data, often from multiple datasets, to tell a story through text and charts in a long-form publication.  Because the data is open, those reading the story can download the data themselves and verify any claimed insights, or they can write their own story around the data.

With the embedding feature of interactive exploration, anyone can do this on their own site, but those stories are then isolated and hard to find.  Open data platforms are now adding their own tools to support story creation and sharing in a central location alongside the data, thereby making them easier to find.

Open data platforms

Most organisations choose to use an off-the-shelf platform to publish their open data rather than write their own, as that quickly enables the various levels of publication and all of the various features.  The platforms range from fully open source, to open source with proprietary add-ons, to entirely proprietary.  All are available as cloud platforms and the open source platforms can also be installed in-house.

The market in these platforms is developing rapidly and they all have some significant differences from each other, which means that the choice of platform normally requires an assessment of the market against formal requirements or even a tender rather than simply picking one of the leaders.

How we can help

We can help you in multiple ways:

  • Manage a full open data implementation programme.
  • Advise on the best open data platform to meet your needs or conduct an RFP to test the market.
  • Generate a comprehensive data asset inventory and associated standards.
  • Introduce a full or partial data governance framework to support open data.
  • Develop staff support resources and deliver the cultural change needed to embed open data.
  • Specify measurements and metrics to demonstrate the ongoing value of open data.