What if I told you a brand new public library is coming to your town? It's going to be really well stocked with great books - but you can't open them, you can only look at the covers. That is the current state of our public data. Here's why you should care.
I recently argued that we all need to level up the ways we share data, particularly public agencies. PDFs seem to be the default for how public data are made available, which just doesn't cut it. The source of my frustration is not that workarounds don't exist. They do, or else nothing would ever get done. It's that nobody should have to be a 'halfway decent programmer' in order to liberate data from the dreaded PDF. What good is that for the vast majority of people who can't program? This is what I call the 'let them eat cake' perspective, and it's dreadfully out of style.
The broader point is that the data to which I am referring - those administered by taxpayer-funded institutions and mandated by law to be released - are not a privilege. Data about the health, wealth and safety of our communities are a civic right. They exist precisely because taxpayers paid for them to exist, for the exact purpose of using them to learn how to better our lives. That makes public data a public good, not a state secret. They are already being made available, so why not go one step further to actually make them useful?
Did you know that 13 million people in the United States lack access to a safe municipal water source? Me neither, until I downloaded and analyzed the data. (For reference, there are 1.2 million people living with HIV in the US.) Knowing that kind of information empowers people to elect legislators, produce and reproduce science, plan communities, and set goals. Releasing data for public consumption is better for government, better for science, and better for democracy.
That's why we need to demand that the data that describe our lives be made accessible. Not "available" - accessible. Widely disseminated, and easy to find, download, and analyze. PDFs are necessary but not sufficient for this task. I see no reason why data can't be released in multiple formats; it's not like they need to be re-entered by hand each time.
Luckily, things are changing. The Obama Administration, the National Science Foundation and the European Union are all working towards opening their data. We're not there yet, though!