The year 2020 wasn’t just dominated by the pandemic. It was also a year of open data.
Many health-related organizations published daily and real-time updates about the spread of the virus around the world, circulating an unprecedented volume of numbers and figures. The challenge for journalists has been to analyze this information accurately and to communicate their findings to the public effectively.
It’s imperative that journalists first understand the data they’re working with. While there is often a rush to publish in today’s non-stop news cycle, doing so inaccurately does more harm than good. During a crisis like COVID-19, data can help raise critical awareness among the public. But if mishandled, it can place people at greater risk.
Always analyze numbers with healthy skepticism. As journalists, we should investigate when and where the data we use originated. We should determine who originally collected and published the numbers, as well as who funded the work.
Journalists must also fix illogical or missing values, and clean up mislabeled figures. These errors may occur during the data entry process, whether done manually or automatically.
The Jordanian Ministry of Health, for example, used to manually enter some COVID-19 test results that didn’t automatically get uploaded into the government database. As the number of daily cases increased, results were lost and names were matched to the wrong samples, former Jordanian Health Minister Saad Jaber told local media.
Keep in mind, too, that even when using reliable software like Microsoft Excel, human error can sneak through. Take, for instance, this incident that occurred in the U.K. last year: 16,000 records of COVID-19 patients were accidentally deleted from an official database, resulting in the spread of inaccurate data, which hindered efforts to combat the virus, such as contact tracing.
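The kind of check described above, looking for missing or illogical values before publishing, can be roughed out in a few lines of Python. This is only a minimal sketch under stated assumptions: the sample data, the column names `date` and `new_cases`, and the function name `flag_suspect_rows` are all invented for illustration, and a real dataset would need rules that fit its own fields.

```python
import csv
import io

# Hypothetical sample: daily case counts as they might arrive from a source.
# The figures are made up for illustration only.
SAMPLE_CSV = """date,new_cases
2020-10-01,120
2020-10-02,
2020-10-03,-5
2020-10-04,abc
2020-10-05,140
"""

def flag_suspect_rows(csv_text, column="new_cases"):
    """Return (line_number, date, reason) for each row that needs a manual check."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Line 1 of the file is the header, so data rows start at line 2.
    for line_number, row in enumerate(reader, start=2):
        raw = (row.get(column) or "").strip()
        if raw == "":
            problems.append((line_number, row["date"], "missing value"))
            continue
        try:
            value = int(raw)
        except ValueError:
            problems.append((line_number, row["date"], "not a whole number"))
            continue
        if value < 0:
            # A daily case count cannot logically be negative.
            problems.append((line_number, row["date"], "negative count"))
    return problems

for problem in flag_suspect_rows(SAMPLE_CSV):
    print(problem)
```

A script like this only points to rows that deserve a second look. Deciding whether a flagged value is a typo, a late correction, or a genuine gap still means going back to the source.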
To avoid publishing inaccurate data, rely on credible sources and verify the numbers. Here’s a checklist to help:
Seek out resources that are transparent about how they compile and document data. This includes the technology and algorithms they use during the process. The more transparent data providers are, the greater the potential for accuracy.
To this end, make sure you understand how data is being collected by the source you’re referencing. This will enable you to best analyze and verify numbers before you include them in your own reporting.
Don’t publish a dataset without attaching the corresponding metadata file, which helps explain how the data was collected. It can also include information about sample size, error margin and missing values, as well as a glossary of terms and abbreviations. Without these details, you’re like a person who has discovered a treasure chest full of gold, but doesn’t have the keys to open it.
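As a rough illustration of what such a metadata file might hold, here is a minimal Python sketch that builds one as JSON and lists the fields still left blank. Nothing here is a standard: the field names, the file name `daily_cases.csv`, and the helper `unfilled_fields` are assumptions made for this example, and the values are placeholders rather than real data.

```python
import json

# Fields this sketch treats as required; the list is an assumption,
# loosely based on the details named in the paragraph above.
REQUIRED_FIELDS = (
    "source",
    "collection_method",
    "sample_size",
    "margin_of_error",
    "missing_values",
    "glossary",
)

# Every value below is a placeholder invented for illustration.
metadata = {
    "dataset": "daily_cases.csv",
    "source": "Name of the publishing organization",
    "collection_method": "How the figures were gathered",
    "sample_size": None,      # fill in if the data comes from a sample
    "margin_of_error": None,  # fill in if the source reports one
    "missing_values": "How gaps are marked, for example an empty cell",
    "glossary": {"new_cases": "Confirmed cases reported on that date"},
}

def unfilled_fields(meta, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or still empty."""
    return [name for name in required if not meta.get(name)]

# The JSON text that would be saved alongside the dataset.
metadata_json = json.dumps(metadata, indent=2, ensure_ascii=False)
print(metadata_json)
print("Still to fill in:", unfilled_fields(metadata))
```

Running a check like this before publication is one way to notice that the sample size or error margin was never recorded.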
by Amr Eleraqi, International Journalists’ Network
Photo by Anne Nygård on Unsplash