Raymond Joseph Census 2011: StatsSA failed the open data test

The way South Africa and the UK released their census data recently showed how far SA has to go to embrace Open Data, says Ray Joseph.

The release last week of the census findings in England and Wales has been an excellent opportunity to compare how it was handled with the launch of the South African Census 2011 in October.

And, compared with the extent of the content and the format in which the UK Office for National Statistics (ONS) released its data, it is clear that Statistics South Africa (StatsSA) has a long way to go to get it right.

While ONS released extensive data sets both as PDFs and in user-friendly Excel spreadsheets, allowing researchers and data journalists to independently interrogate the data, in SA it was a different story.

In SA the census release was accompanied by expensive countrywide road shows and much self-congratulatory back-slapping by StatsSA for a job well done. But because the data released to South African journalists was locked inside PDFs, it was difficult to extract without hours of mind-numbing work to scrape and clean it, a problem with most of the official data available.

StatsSA also came up with several different tools for interrogating its data, some relatively simple to use and others quite complicated, along with Excel and some CSV files, but each had some indicators not available in the others.

So, for example, Google Public Data Explorer had some indicators available nowhere else, but since it is merely a visualisation tool, the underlying data cannot be extracted. Shockingly, data accompanying some of these tools differed from that supplied in others, but so far queries to StatsSA about these discrepancies have gone unanswered.

The quality of the reportage in the UK and in South Africa was also worlds apart: while data journalists on the serious papers in the UK dug deep into the data and delivered some excellent reportage, including interactive visualisations, in South Africa most media merely took what StatsSA released and regurgitated it (including colourful graphs and charts) without any serious attempt at interrogation, interpretation or analysis.

South African newsrooms are light years behind better-resourced British and American newsrooms (like The Guardian and The New York Times), but there has been a concerted attempt in 2012 to introduce data-driven journalism to local newsrooms via local chapters of HacksHackers, a worldwide movement that facilitates collaboration between journalists and coders.

Census 2011 would have been a golden opportunity to boost data journalism in our newsrooms, but the shoddy and unstructured way in which the data has been released so far means that opportunity has been lost.

Raymond Joseph is a freelance journalist and journalism trainer and media consultant. He is also voluntary convenor of the Cape Town chapter of HacksHackers. Contact him on rayjoe@iafrica.com or via Twitter on @rayjoe

© Copyright Africa Check 2019. Read our republishing guidelines. You may reproduce this piece or content from it for the purpose of reporting and/or discussing news and current events. This is subject to: Crediting Africa Check in the byline, keeping all hyperlinks to the sources used and adding this sentence at the end of your publication: “This report was written by Africa Check, a non-partisan fact-checking organisation. View the original piece on their website”, with a link back to this page.

Comments 5
  1. By Raymond Joseph

    We put up the shape files and other data that is available on http://africaopendata.org/group/south-africa
    But we have tried to get additional, essential data (as well as the files that align the 2001 census to 2011) out of StatsSA. After I wrote this blog I was contacted by the Statistician General asking how he could help: we told him and nothing has happened. I’m not sure if StatsSA is just incompetent or if there is a problem with the data we have requested (quite a lot of indicators are either not up or only available with limited info within PDFs) and they are avoiding giving it to journos who will interrogate it. I think it’s a bit of both…

  2. By Stefaan Swarts

    I believe all the data that is available can be obtained from the Census 2011 Community Profiles in SuperCROSS, http://interactive.statssa.gov.za/superweb/login.do and saved to .csv files. Down to a stratum/ward level.

    It’s just impractical for StatsSA to produce multiple PDFs or spreadsheets for all the different types of requests that people might make. There is just too much data. That’s most probably why they have this interactive web interface, so that anyone can get the specific data they want.

    • By Africa Check

      Thanks Stefaan. A very useful link indeed and one we use regularly at Africa Check. It is important to note that much of the data that is now available was not available when this article was written. The blogger’s argument is that if the UK’s Office for National Statistics can release extensive data sets, why can’t Stats SA do the same? It is not impractical for them, so why should it be impractical for SA?

