The Open Knowledge Foundation’s Edinburgh meetup group gathered at Napier University’s Merchiston Campus on Thursday to discuss Scotland’s open census, hack days, data protection, integrating datasets, data journalism, and interactive accountability.
After hearing about these guys at FOSS4G, I’m very glad to finally meet the community. The group was friendly and the talks were fascinating. Eagerly awaiting the next one!
Here are my highlights from an enlightening evening.
About 25% of census responses came through the new online form. The estimated response rate was about 94%, and the statisticians imputed the missing data to bring that up to 100% for analysis. I learned that ‘imputation’ is a fancy word statisticians use to mean ‘making up missing data’.
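To make ‘imputation’ a bit more concrete, here is a minimal sketch of about the simplest form I can think of: filling each gap with the median of the values that were actually observed. This is not the method the census team described (I don’t know the details of theirs), and the function name and sample numbers are made up for illustration.

```python
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed values.

    Illustrative only: real census imputation is far more sophisticated.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Made-up household sizes; None marks a household that did not respond.
household_sizes = [2, None, 4, 1, None, 3]
completed = impute_median(household_sizes)
print(completed)
```

Every gap gets the same value here, which keeps the sketch short but flattens the variation in the data. That is one reason real statistical offices use more careful techniques.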
All census products are available under the Open Government Licence. Some tables are still to be published, so watch for updates throughout 2014.
If you want to compare regions visually, check out the area profiles map. The speaker demoed a neat lasso tool for selecting multiple regions, but I can’t find it yet.
Visit the data warehouse if you want to grab the data in bulk for your own analyses.
Sally Kerr from Edinburgh Council talked about the council’s continuing efforts to find and free our tax-funded digital assets.
Lots of useful data is still buried in an Excel spreadsheet somewhere!
Edinburgh Apps, launched in Leith late last year, was the council’s first ‘public data hackathon’, an attempt to rally the Edinburgh tech community around solving civic problems.
Volunteers exchanged ideas and prototypes for the modest prize of local fame and business advice. The event was so enthusiastically received that the council plans at least one repeat this year, with even greater incentives.
Glasgow Council is hosting four of these events. We’re not at all bitter!
Tim Musson, now self-employed after a long lecturing career at Napier, talked about his role in advising companies and lawmakers on computers, data, and privacy.
There are many social and legal challenges in implementing effective rules at government and company level. ‘Trusted’ third parties might pose a threat to vulnerable people when handling their valuable data.
For example, Ed Davey, the Energy Secretary, proposes that energy companies encode energy consumption on bills as a QR code. How many people even know what a QR code is, never mind how to read it? What stops a third party from capturing this?
Wilbert wanted to join disparate data sets on educational institutions and courses to create a richer set of data for analysis.
He discussed the challenges of dirty data and of resolving different keying conventions, and practical ways to build a concordance between each data set and all the others.
His most practical option is to key everything against Freebase, the most stable and comprehensive set of reference IDs for his topic.
Unusually for a Google product, the data in Freebase is under an open CC-BY license.
Wilbert prefers Freebase to Wikipedia-derived DBpedia because Freebase uses stable numbers as keys, whereas DBpedia uses names which can change over time.
Without getting bogged down in technical jargon, Wilbert was basically singing the praises of surrogate keys, an important part of an effective data warehouse.
Hey, Wilbert, have you ever considered a career in Business Intelligence? 🙂
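For anyone who hasn’t met surrogate keys before, here is a rough sketch of the idea as I understood it: map every source’s own key onto one stable ID, then join on that ID instead of on names. All of the data, the field names and the `inst-001` style IDs are invented placeholders (they are not real Freebase identifiers), and this is my own illustration, not Wilbert’s code.

```python
# Hypothetical concordance: (source, source's own key) -> stable surrogate ID.
# The IDs are invented placeholders, not real Freebase identifiers.
concordance = {
    ("courses", "Napier Univ."): "inst-001",
    ("funding", "Edinburgh Napier University"): "inst-001",
    ("courses", "Univ. of Edinburgh"): "inst-002",
    ("funding", "The University of Edinburgh"): "inst-002",
}

# Two made-up data sets that name the same institutions differently.
courses = [
    {"institution": "Napier Univ.", "course": "Computing"},
    {"institution": "Univ. of Edinburgh", "course": "Informatics"},
]
funding = [
    {"name": "Edinburgh Napier University", "grant": 100},
    {"name": "The University of Edinburgh", "grant": 250},
]


def join_on_surrogate(courses, funding, concordance):
    """Join the two data sets on the surrogate ID rather than on names."""
    grants = {}
    for row in funding:
        sid = concordance.get(("funding", row["name"]))
        if sid is not None:
            grants[sid] = row["grant"]

    joined = []
    for row in courses:
        sid = concordance.get(("courses", row["institution"]))
        if sid in grants:  # rows with no concordance entry are skipped
            joined.append({"id": sid, "course": row["course"], "grant": grants[sid]})
    return joined


joined = join_on_surrogate(courses, funding, concordance)
print(joined)
```

If an institution is renamed, only the concordance needs a new entry. The surrogate ID, and every join that depends on it, stays the same.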
Ally Tibbitt works at STV and is a budding data journalist.
He calls for contributors to help him build Placemakr, a website to make available the results of FOI requests in Scotland and analyses using such data.
It sounds like his long-term plan is to build a more socially-minded ScraperWiki, to help those who can’t afford such services. Watch this space for more on that 🙂
What’s the gayest neighborhood in Edinburgh? How long will you have to wait for an allotment? What is the most car-clogged section of the city? What’s the noisiest town in Midlothian? It’s all on Placemakr.
Bruce Ryan demoed an in-progress interactive map of Scottish community council information sourced from various government APIs.
The map was implemented using GeoJSON for data interchange and leaflet.js for presentation, with the markercluster plugin to neatly split and group nearby councils at different zoom levels. Looking forward to the first public release!
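I haven’t seen Bruce’s code, but the GeoJSON side of a map like this can be sketched in a few lines. The example below turns a list of councils into a GeoJSON FeatureCollection of points, the kind of document a Leaflet GeoJSON layer reads. The council names and coordinates are placeholders, and the function and field names are my own assumptions.

```python
import json


def to_feature_collection(councils):
    """Build a GeoJSON FeatureCollection of Point features.

    Note that GeoJSON orders coordinates as [longitude, latitude].
    """
    features = []
    for council in councils:
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [council["lon"], council["lat"]],
            },
            "properties": {"name": council["name"]},
        })
    return {"type": "FeatureCollection", "features": features}


# Placeholder councils with rough, made-up coordinates.
councils = [
    {"name": "Example Community Council A", "lat": 55.95, "lon": -3.19},
    {"name": "Example Community Council B", "lat": 55.86, "lon": -4.25},
]

feature_collection = to_feature_collection(councils)
print(json.dumps(feature_collection, indent=2))
```

The longitude-first ordering is the classic gotcha: get it backwards and your Scottish councils end up somewhere off the coast of Somalia.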
Finally we heard from Daniel Duma, a PhD student at Edinburgh Uni, after his team’s sleepless three-day toil to win Smart Data Hack 2014. I would call the result an experiment in “interactive accountability”.
The MSP Involvement Map is the result of a small team’s three-day sprint to win a hack day competition. The app periodically parses the Holyrood transcripts to generate a tag cloud for what each MSP discusses in parliament.
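The core of a tag cloud is just word counting per speaker. Here is a minimal sketch of that step, assuming the transcript has already been parsed into (speaker, text) pairs. The stopword list, the sample speeches and the function name are all invented for illustration, and the team’s real pipeline may well work differently.

```python
import re
from collections import Counter

# A tiny, made-up stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "we", "for", "on"}


def tag_weights(transcript, top_n=2):
    """Count non-stopword terms per speaker and keep the top_n for each.

    transcript: iterable of (speaker, text) pairs.
    """
    counts = {}
    for speaker, text in transcript:
        words = re.findall(r"[a-z']+", text.lower())
        counts.setdefault(speaker, Counter()).update(
            word for word in words if word not in STOPWORDS
        )
    return {speaker: c.most_common(top_n) for speaker, c in counts.items()}


# Invented speeches, not real Holyrood transcripts.
transcript = [
    ("Member A", "Housing and transport: housing is the priority."),
    ("Member B", "We need funding for schools and schools need teachers."),
    ("Member A", "Transport links to housing matter."),
]

weights = tag_weights(transcript)
print(weights)
```

The counts would then drive the font size of each word in the cloud.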
Does your electee represent your interests? Are average word count and intervention count really effective measures? Could the “MSP tag cloud” become another metric for politicians to game?
Questions like these generate a lot of excitement among parliamentarians, with Duma’s team meeting members this week. Where will it go?
The code’s on GitHub, so you can take it anywhere you want! 🙂