The Gurstein Open Data update: end-user requirements for effective usage

Michael Gurstein has updated and expanded his much-talked-about earlier intervention on open data, which we also reblogged here.

Excerpts:

“In the following I want to extend the argument to include some discussion and analysis that was left out of the original post, and also to integrate some of the comments that folks have made into the overall argument.

The first is to point to Tim Davies’ extremely useful notion of Open Data “supply” as compared to Open Data “demand” (“use” in my lingo). By “supply” Tim is referring to the various background conditions and design features in how the data which is being made available (open to the user) is structured, configured, and otherwise pre-processed prior to it being provided to the end user community. I would refer readers to Tim’s Master’s thesis where, among other things, he goes into considerable and useful detail on these issues and particularly on how they affect the overall capacity of end users to make use of the data.

A second commentator on the blog, Writeruns, makes a somewhat similar argument but relates the issues of the supply side of data more directly to the broader social and cultural context from which the data has been gathered (processed), and further links this to how the now decontextualized data is recontextualized in a new form. Both of these processes (decontextualizing and recontextualizing) have significant impacts on the semantic content of the data and thus on how the data gains meaning for the end user. One example that Writeruns refers to is that of crime statistics, whose meaning (and thus whose use) may be very much a function of, for example, the geographical divisions by means of which the data is formatted and made available (i.e. where geographical boundaries are built into the data descriptions).

This leads to a third and perhaps even more important point raised by a number of commentators, which is that in looking at “open data” it is necessary to include a three-step process: “access”, “interpretation” and “use”. In my original blogpost I only referred to the “access” and “use” elements and made some non-articulated assumptions about matters of “interpretation/meaning (or sense) making” and so on.

The point that is made here is that the process of interpreting or understanding “open data” is a separate process from making (effective) use of the data and that any critical analysis of “open data” use has to include how and under what conditions the data that is being made available is contextualized and given meaning. Thus for example among the cases I discussed in the earlier post, in the case drawn from Tim Berners-Lee’s presentation, the “interpretation” (sense making) of the data was the contribution of the consulting firm and was presumably based on their experience and expertise in on-going work with geographically based information and advocacy.”

Michael also gives more specific suggestions:

“In the following I will itemize what I think are the various elements that are required to be in place on the end user side for effective use of open data to take place. Some of these are more essential than others, but to my mind some component of each needs to be in place, or large numbers of those who might otherwise make use of Open Data to improve their lives, and particularly the poor and marginalized, will be excluded from making “effective use” of open data.

These include:

1. Internet access – having an available telecommunications/Internet access service infrastructure sufficient to support making the data available to all. Issues here would include:

a. the affordability of Internet access – a major issue for many particularly in the Developing world.

b. the availability of sufficient bandwidth for the range of uses to which the data might be effectively put, e.g. whether the data access has been designed on the basis that broadband is necessary for the use of the data being made available.

c. the accessibility of the network e.g. where access to the network or to connectivity is restricted for political or other reasons.

d. physical accessibility/usability of access sites, for example for the physically disabled.

2. Computers and software – having access to machines/computers/software to access and process the available data and machines that are sufficiently powerful to do various analyses; having sufficient time on the equipment to do the analyses (many people need to share computers); knowledge of how to operate the equipment sufficient to access and analyse the data and so on. For example, does the use of the data require more powerful (and expensive) computers or software than might be generally available?

3. Computer/software skills – having sufficient knowledge/skill to use the software required for the analyses/making the mashups/doing the crosstabs etc. Techies know how to do the visualization stuff, and university and professional types know how to use the analytical software, but ordinary community people might not know how to do either, and getting that expertise/support might be either difficult or expensive or both.

4. Content and formatting – having the data available in a format (language, coding for display, appropriate geo-coding, and so on) such as to allow for effective use at a variety of levels of linguistic and computer literacy. What are the language, computer literacy, and data analytic literacy levels that are required for an effective use of the “open data”? Does the use of the data presume that it is being used by a professional, and are there means through which those professionals might be available to those who can’t afford expensive fees?

5. Interpretation/Sense making – sufficient knowledge and skill to see what data uses make sense (and which don’t) and to add local value (interpretation and sense making); being able to identify the worthwhile information and to figure out how to put the data into the right format or context so that what might otherwise be numbers on a page becomes something that can change people’s lives.

6. Advocacy – having supportive individual or community resources sufficient for translating data into activities for local benefit. Availability of skills and local resources, community infrastructures, training, the means for advocacy and representation all are required to enable effective local interventions based on the open or other data.

7. Governance – the financing and the legal, regulatory or policy regime required to enable the use to which the data would be put.”
