Data is collected everywhere—in spreadsheets, databases, shared documents. But how does collected data become open data? And why should you bother? A guest article by Johannes Filter from the TimeTiles project.
People are constantly collecting data. Not just government agencies or research institutions, but journalists, activists, associations, neighborhood initiatives, and passionate individuals. Anyone who keeps a list – of incidents in their neighborhood, political events, price changes, or anything else that changes over time – is collecting data.
The tool of choice is almost always the same: a spreadsheet. Microsoft Excel and Google Sheets are the unsung heroes of data collection. They are low-threshold, flexible, and require no programming knowledge. You open a blank sheet, create columns—date, location, description, source—and get started. For most data collections, this is the natural starting point. Spreadsheets are universally understandable, they allow for collaboration, and they enforce just enough structure to keep data reasonably consistent.
Many of the most important civil society data collections in Germany started this way. Chronicles of right-wing violence, documentation of police incidents, collections of court decisions, or environmental data—before they ended up on a website or in a database, they were a shared spreadsheet.
Why collect data yourself?
One might ask: Isn't that the job of government agencies, statistical offices, or research institutions? In part, yes. But for many topics no official data exists, or the data that does exist is incomplete, outdated, or difficult to access.
One example: In Germany, there are no central, freely accessible statistics on police gunfire. Anyone who wants to know how often the police use firearms has to rely on civil society research that compiles information from press reports, parliamentary inquiries, and other sources.
Such gaps exist in many areas. And filling them is not just the job of large institutions. Anyone who systematically collects information can potentially create an open data set – even without an official mandate. The crucial question is: What happens to this data afterwards?
From spreadsheet to open data set
This is where the real work begins. Collecting data in a spreadsheet is relatively easy. Preparing it in such a way that others can understand and reuse it is much more time-consuming.
Location information such as “near Stuttgart” must be converted into coordinates. Time specifications require a uniform format. Categories that are self-evident to those collecting the data must be documented. And errors or inconsistencies that are hardly noticeable in an internal table become problematic as soon as third parties start working with the data.
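These cleanup steps can be sketched in a few lines of Python. This is a minimal, illustrative pass, not TimeTiles code: the column names, date formats, and the hand-maintained coordinate lookup are all assumptions (a real project would use a geocoder, for example one backed by OpenStreetMap data).

```python
from datetime import datetime

# Hypothetical raw rows as they might appear in a shared spreadsheet export.
rows = [
    {"date": "03.05.2023", "location": "near Stuttgart"},
    {"date": "2023-05-04", "location": "Berlin"},
]

def normalize_date(value: str) -> str:
    """Dates arrive in mixed formats; normalize everything to ISO 8601."""
    for fmt in ("%d.%m.%Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unparseable date: {value!r}")

# Vague place names need coordinates. Here a hand-maintained lookup
# stands in for a real geocoding step.
COORDINATES = {
    "near Stuttgart": (48.78, 9.18),
    "Berlin": (52.52, 13.40),
}

cleaned = [
    {
        "date": normalize_date(row["date"]),
        "lat": COORDINATES[row["location"]][0],
        "lon": COORDINATES[row["location"]][1],
    }
    for row in rows
]
```

Even this toy version shows where the invisible work hides: every exception the cleanup raises is an inconsistency someone has to resolve by hand.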
This work is tedious and often invisible. But this is precisely what determines whether a private collection becomes a publicly usable data set.
The limits of spreadsheets
As good as spreadsheets are as a starting tool, they eventually reach their limits. Anyone who wants to display thousands of entries on a map or build a searchable chronicle will quickly realize that a spreadsheet alone is no longer enough.
Excel and Google Sheets are tools for collecting and sorting, not for publishing and visualizing. The step from spreadsheet to interactive map or filterable overview requires additional infrastructure. Those without programming skills often end up on proprietary platforms, which may cost money or host the data on third-party servers. Both can be problematic if independence and data sovereignty matter.
What is missing are bridges between the world of spreadsheets and the world of publishing. Tools that take a CSV file or a shared spreadsheet and turn it into something useful: a map, a timeline, a searchable overview – as open source software that you can run yourself.
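One such bridge is simple to sketch: a cleaned CSV with latitude and longitude columns can be converted into GeoJSON, a format that open map libraries such as MapLibre render directly. The snippet below is an illustration using only the Python standard library; the column names are assumptions, not a fixed schema.

```python
import csv
import io
import json

# Hypothetical CSV export from a shared spreadsheet.
raw = """date,lat,lon,description
2023-05-03,48.78,9.18,Example incident
2023-05-04,52.52,13.40,Another entry
"""

def csv_to_geojson(text: str) -> dict:
    """Turn rows with lat/lon columns into a GeoJSON FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(text)):
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON orders coordinates as [longitude, latitude].
                "coordinates": [float(row["lon"]), float(row["lat"])],
            },
            "properties": {
                "date": row["date"],
                "description": row["description"],
            },
        })
    return {"type": "FeatureCollection", "features": features}

geojson = csv_to_geojson(raw)
print(json.dumps(geojson, indent=2))
```

The point is not this particular script but the shape of the bridge: spreadsheet in, open standard format out, ready for any open mapping tool.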
Open data needs open tools
The open data discourse often focuses on licenses and formats. Less discussion is devoted to the question of tools. If the data is open but the software for using it is not, one dependency is simply replaced by another.
Yet powerful open infrastructure exists. OpenStreetMap provides freely usable geodata. MapLibre enables interactive maps without ties to commercial providers. And there are numerous open source projects in the field of data visualization.
What is often missing is the integration layer: tools that combine these building blocks in such a way that they can be used even without in-depth technical knowledge. This is exactly where I come in with the TimeTiles project, which I am developing as part of my Prototype Fund grant. The goal is to create modular, self-hostable software that simplifies the process of turning a table into an interactive chronicle with location and time references—based on open infrastructure.
It starts with a table
Regardless of the specific tools used, the most important step remains the first one: starting to collect data systematically. With clear columns, consistent formats, and a brief description of what the data means.
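In CSV form, such a table might look like the following. The columns mirror the example from the beginning of the article; they are a starting point, not a fixed schema.

```
date,location,description,source
2023-05-03,Stuttgart,Short factual description of the incident,https://example.org/article
2023-05-04,Berlin,Another entry in the same format,https://example.org/report
```

Consistent column names and one row per entry are already enough structure for most tools to build on later.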
It doesn't have to be perfect. It doesn't have to be a complex database. A neatly maintained table is a start. As the collection grows and is published—as a CSV download, an embedded map, or a searchable chronicle—it becomes a contribution to open knowledge.
Open data does not have an impact simply because a data set exists. The impact comes from the work that makes the data understandable, usable, and accessible. Often, this work begins with nothing more than a table, and the decision to share knowledge. And that's what it's all about.
Johannes Filter (he/him)
Johannes Filter is a freelance full-stack developer and data journalist in Berlin. In class 01 of the Prototype Fund, he is developing TimeTiles, a modular software library and application for interactive chronicles with location and time references. In addition to his work on open source tools, he runs transparency projects such as polizeischuesse.cilip.de and verfassungsschutzberichte.de.