How To Choose The Right Open Data Platform For You

Publishing Open Data

David Tarrant @davetaz

Aim

Provide an overview of current open data publishing practices.

Outcomes


Understand the difference between data on the web and the web of data.
Evaluate a number of different approaches for publishing open data.
Develop a strategy for publishing data applicable to a specific domain.

Publication phases
Phase 1: Get the data online, in some form. This helps with trust, transparency and community building.

Phase 2: Increase the usability of the data, potentially by publishing it differently and by keeping it up to date.

Data ON the web


Government data
Private sector data
Google advanced
Aggregators and portals
Scraping

data.gov.XX

Government

Government / Private

Suppliers

BP: "You may not frame this site nor link to a page other than the home page without our express permission."
To find this, Google "BP statistical review".

Suppliers

http://manufacturingmap.nikeinc.com/#
"You agree not to change or delete any ownership notices from materials downloaded or printed from the Platform. You agree not to
modify, copy, translate, broadcast, perform, display, distribute, frame, reproduce, republish, download, display, post, transmit or sell
any Intellectual Property or Content appearing on the Platform."

Aggregators and portals


Collect together data from across the web into one place.

enigma.io

transportAPI

Data IN the web

The developer's secret

Linked data

Amazing, but hard to publish and use.
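
To make "linked data" a little more concrete, here is a minimal sketch of one common way to describe a dataset in a linked form: a JSON-LD record using the schema.org vocabulary. This is an illustration under assumptions, not the approach the slides prescribe. The function name, dataset name, URL and publisher are all hypothetical.

```python
import json


def dataset_as_jsonld(name, url, publisher):
    """Describe a dataset as a JSON-LD record using schema.org terms."""
    return {
        "@context": "https://schema.org",  # vocabulary the keys below refer to
        "@type": "Dataset",
        "@id": url,  # the URL doubles as the dataset's identifier
        "name": name,
        "publisher": {"@type": "Organization", "name": publisher},
    }


# Hypothetical example values, chosen only for illustration.
record = dataset_as_jsonld(
    "Bus stops",
    "http://example.org/data/bus-stops",
    "Example Council",
)

if __name__ == "__main__":
    print(json.dumps(record, indent=2))
```

The point of the sketch is that every thing (the dataset, its publisher) is typed and identified by a URL, so other data on the web can link to it.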

Approaches to publishing data

ON the web

IN the web

Exercise

List 3 datasets that have been published ON the web (& where)?

List 1 that has been published IN the web (& where)?
5 minutes

Open Data Platforms

http://www.flickr.com/photos/wwarby

Types
Specialist Solution
+ Easy to set up and maintain
+ Open Data focused
+ Clear workflows for publishing open data
+ Visualisation tools
+ Data mashing tools
+ Best for transactional data

Integrated Solution
+ No new platform to learn
+ Data is provided in parallel to web pages
+ No separation from authoritative data
+ Easy discovery of data
+ Best for reference data
+ Best for Linked Open Data

Key characteristics of a specialist solution
1. Separate from your main org website
2. Designed to publish open data, not to fulfill other organisation goals

Key characteristics of an integrated solution
1. It is your main website
2. Publishes data alongside everything else that the organisation does

Merging specialist and integrated


Method 1: Build the functionality of your current website into a new open data platform.

Method 2: Hide the specialist solution behind your main website and use it as a loosely coupled CMS.

The sliding scale of specialist solutions


1. Catalogue: Point to data (leave it at source)
2. Re-present: Provide data services (leave it at source)
3. Host the data: Be the source
4. Control the data: Be the authority
5. Be the hub: Host the data and processor

Specialist Solutions

[Figure: platforms positioned on the 1 to 5 scale above (1 to 2, 2 to 3, 4 to 5). Photo credit: http://www.flickr.com/photos/okfn]

CKAN (scale levels 1 to 2)

Open Knowledge Foundation supported
Data catalogue
Open source
Feels like a record manager
Simple API and search
Lots of community tools
http://demo.ckan.org/
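
As a rough illustration of the "simple API and search" point, the sketch below builds a dataset search request against CKAN's Action API (`package_search`) and reads titles out of a response. It assumes the portal exposes the standard `/api/3/action/` path. The helper names and the sample response are made up for illustration, and no network call is made.

```python
import json
from urllib.parse import urlencode

# Assumption: the portal serves the standard CKAN Action API under /api/3/action/.
CKAN_BASE = "http://demo.ckan.org"


def package_search_url(base_url, query, rows=5):
    """Build a URL for CKAN's package_search action (dataset search)."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response_text):
    """Extract dataset titles from a package_search JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    return [d.get("title", d.get("name", "")) for d in payload["result"]["results"]]


# Hand-written sample in the shape CKAN returns; it is not real portal data.
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {"count": 1, "results": [{"name": "bus-stops", "title": "Bus stops"}]},
})

if __name__ == "__main__":
    print(package_search_url(CKAN_BASE, "transport"))
    print(dataset_titles(SAMPLE_RESPONSE))
```

To try it against a live portal, fetch the generated URL with any HTTP client and pass the response body to `dataset_titles`.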

Evolution of CKAN

Updated July 2014

Early: Dataset catalogue (data.gov.uk); no data hosted or searched.
Mid: Data and dataset catalogue; no data hosted, but it is searchable.
Now: Integrated data-driven web site; the data platform is integrated with data, search and content.

Features
Publish, Store and Manage Data and Metadata
Visual and Geospatial
Social
Full Stored History
Federate Your Data With Other Organizations
Rich RESTful JSON API for Developers

Open Data Soft (scale levels 2 to 3)

Data as a Service (DaaS)
Closed source (main product)
Hosted enterprise solution
EU based
Rich interface
Query based API (3-Star)

Socrata (scale levels 4 to 5)

Data as a Service (DaaS)
Hosted enterprise solution
Allows user created content
Full linked API (5-Star)

https://opendata.socrata.com/

Features
Data Publishing, Optimized for Business Users
Flexible Metadata Management

Federate Your Data With Other Organizations


Metrics of the Success of Your Initiative in Real-time
Anyone Can Create Maps and Charts
Data Becomes Social
Developers Are Supported Every Step of the Way

Socrata (scale levels 4 to 5)

Data as a Service (DaaS)
Closed source (main product)
Hosted solution
US based
Clean interface
Powerful API (SODA)
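
To give a feel for what a SODA request looks like, here is a minimal sketch that builds a query URL with the `$limit` and `$where` parameters. It assumes the usual SODA layout of `/resource/<dataset-id>.json`. The dataset identifier `abcd-1234` and the `year` column are hypothetical, and nothing is fetched.

```python
from urllib.parse import urlencode

# Assumption: datasets are served at /resource/<dataset-id>.json, the usual SODA layout.
PORTAL = "https://opendata.socrata.com"
DATASET_ID = "abcd-1234"  # made-up identifier, not a real dataset


def soda_query_url(portal, dataset_id, limit=5, where=None):
    """Build a SODA query URL with optional row filtering."""
    params = {"$limit": limit}
    if where:
        params["$where"] = where  # filter expression over the dataset's columns
    # Keep "$" readable in parameter names; everything else is percent-encoded.
    query = urlencode(params, safe="$")
    return f"{portal.rstrip('/')}/resource/{dataset_id}.json?{query}"


if __name__ == "__main__":
    # "year" is a hypothetical column name used only for illustration.
    print(soda_query_url(PORTAL, DATASET_ID, limit=5, where="year > 2010"))
```

The query lives entirely in the URL, which is part of why such APIs are easy to try from a browser or a script.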

Integrated solutions
Integrated solutions expose data using the current infrastructure (web pages).

Data driven web site

Best for reference and live data

Recap
Specialist Solution
+ Easy to set up and maintain
+ Open Data focused
+ Clear workflows for publishing open data
+ Visualisation tools
+ Data mashing tools
+ Best for transactional data

Integrated Solution
+ No new platform to learn
+ Data is provided in parallel to web pages
+ No separation from authoritative data
+ Easy discovery of data
+ Best for reference data
+ Best for Linked Open Data

Both are great for open data.

Integrated solutions are more suited to building a web of linked data.

Exercise
Take a look at the following portals and
list 3 things you like and 3 things you
would improve about each:

CKAN (http://data.gov.uk)
Open Data Soft (http://public.opendatasoft.com/)
Socrata (http://data.cityofchicago.org/)

Outcomes


Understand the difference between data on the web and the web of data.
Evaluate a number of different approaches for publishing open data.
Develop a strategy for publishing data applicable to a specific domain.

Exercise

Which strategy do you feel best suits your domain and why?

Thank you

David Tarrant @davetaz
