Evan Hensleigh of The Economist writes about how the publication will publish the data behind its stories online.
Hensleigh writes, “Years ago, ‘data’ generally meant a table in Excel, or possibly even a line or bar chart to trace in a graphics program. Today, data often take the form of large CSV files, and we frequently do analysis, transformation, and plotting in R or Python to produce our stories. We assemble more data ourselves, by compiling publicly available datasets or scraping data from websites, than we used to. We are also making more use of statistical modelling. All this means we have a lot more data that we can share — and a lot more data worth sharing.
“We decided to make the Big Mac index our first open data project. It is an ideal fit: the index is based on publicly available data, and it involves some work on our part to turn those data into a final product. Also, the index is intended as an easily-digestible introduction to how relative currency valuation works, so exposing its inner workings is a natural step. Although we have done that in words before, publishing code gives readers a more concrete way of seeing how the calculations work. The Big Mac index has often been imitated — for example, there is a Billy Bookcase index and a Spotify subscription index. Releasing our code will make it easier for people to remix the index.
“We started calculating the Big Mac index in 1986, and until this year it has been compiled and calculated manually. There are still a lot of hands-on parts to the process — in particular, compiling the list of prices — but we have now converted much of the calculation into code.”
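For readers curious about what such a calculation looks like, here is a minimal Python sketch of the raw index as it is commonly described: the implied purchasing-power-parity rate is the local burger price divided by the price in the base country, and the valuation is how far that rate sits from the actual exchange rate. This is not The Economist's published code. The function names and the prices and exchange rate below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def implied_ppp_rate(local_price: float, base_price: float) -> float:
    """Exchange rate at which the burger would cost the same in both countries."""
    return local_price / base_price


def raw_big_mac_index(local_price: float, base_price: float, exchange_rate: float) -> float:
    """Over- or under-valuation of the local currency against the base currency.

    exchange_rate is local currency units per one unit of base currency.
    A negative result suggests the local currency is undervalued.
    """
    return implied_ppp_rate(local_price, base_price) / exchange_rate - 1.0


# Hypothetical figures, not real prices or rates.
local_price = 20.0    # burger price in local currency
base_price = 5.0      # burger price in the base currency
exchange_rate = 5.0   # local currency units per one unit of base currency

valuation = raw_big_mac_index(local_price, base_price, exchange_rate)
print(f"Implied PPP rate: {implied_ppp_rate(local_price, base_price):.2f}")
print(f"Valuation against base currency: {valuation:+.1%}")
```

With these made-up numbers the implied rate is 4 against an actual rate of 5, so the local currency comes out about 20% undervalued.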
Read more here.