Data Journalism
ONLINE JOURNALISM
INTRODUCTION
► Data Journalism is a flourishing field of journalism that
actively looks for, analyses and interprets various forms of
data for storytelling.
► More than half of all news organizations in the US and
Europe now have at least one dedicated data journalist
working in their newsrooms.
Telling Untold Stories
► Data Journalism is the intersection between journalism and data analysis - often
through the use of technology.
[Diagram: Journalism and Data Analysis overlap to produce Untold Stories]
What is data journalism?
► When people hear the phrase “data journalism”, most automatically
think of charts and infographics. However, data journalism is a
larger field. It’s the entire process of deriving meaning from data to
develop a story, not only the visual output.
► A written story that relies on data analysis and interpretation is a
better example of data journalism than an infographic with dozens of
meaningless numbers. The key ingredient is asking questions of our
data, just as if we were interviewing it.
► Your output can take the form of a map, video, chart, written article or
even a social media post. This allows you to be very creative with your
output and not be constrained by a specific medium. This cross-platform
approach is a very important part of digital content creation.
► Data-led stories have the power to reach and engage new
audiences by making sense of the data-rich world that we live in. It is
important to remember that data journalism is not about using shiny new
technologies; rather, it is about using technology to help extract contextual
information for your readers.
5 myths about data journalism
Myth: “Data journalism isn’t personal.”
Reality: More than anything, data is about stories that play a direct role in people’s lives. Many data stories have the ability to not only tell individual stories but also contextualize a story by placing a person in his or her neighbourhood or country.
Myth: “Data journalists are not real journalists, they’re only interested in numbers, not telling stories.”
Reality: Data journalists do more than just sift through mountains of data and identify trends. They can often provide a creative perspective on a story, which allows them to engage with new audiences.
Myth: “Data journalism is for programmers and designers.”
Reality: While it’s true that data-driven stories can benefit from people with technical and design skills, most of the work stems from an editorial understanding of a subject. As long as you have an eye for a story and are willing to collaborate with others, you can become a good data journalist.
Myth: “Data journalism is all about making charts and infographics.”
Reality: An infographic or chart without an underlying story is not data journalism. A data journalism project should involve the uncovering of a story from a dataset.
Myth: “Data journalism is expensive and time-consuming.”
Reality: Within Al Jazeera we’ve produced award-winning data journalism projects with a mobile phone, camera and computer (as will be demonstrated in our case study below). While longer-term investigative projects may take time and resources to develop, there are many daily stories that newsrooms are producing that involve analysing and presenting data.
Searching for facts
► The spread of misinformation online has created a huge problem for news
consumers. Building your audience’s trust in your data stories requires that you
treat your data sources like every other source of information. Your data must be
verified for accuracy and truth.
► Tips for sourcing reliable data:
1. Use trustworthy sources
2. Cross-reference datasets
3. Watch out for missing or outdated values
4. Understand the data collection methodology
5. Consider the consequences of getting it wrong
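Tips 2 and 3 can be partly automated. The sketch below is only an illustration in Python, not a prescribed workflow: the two sources, the district names, the figures and the 5% tolerance are all hypothetical assumptions made for the example.

```python
# Hypothetical figures from two imaginary sources; not real data.
source_a = {"District 1": 120, "District 2": 85, "District 3": 40}
source_b = {"District 1": 118, "District 2": 60, "District 4": 15}


def cross_reference(a, b, tolerance=0.05):
    """Compare two {name: value} datasets and report where they disagree."""
    report = {"missing_in_a": [], "missing_in_b": [], "mismatch": []}
    for key in sorted(set(a) | set(b)):
        if key not in a:
            report["missing_in_a"].append(key)
        elif key not in b:
            report["missing_in_b"].append(key)
        else:
            # Relative difference, measured against the larger of the two values.
            largest = max(abs(a[key]), abs(b[key]))
            if largest and abs(a[key] - b[key]) / largest > tolerance:
                report["mismatch"].append(key)
    return report


report = cross_reference(source_a, source_b)
for problem, names in report.items():
    print(problem, names)
```

Anything the report flags is a prompt for a follow-up question to the source, not proof that either dataset is wrong.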
THE DATA JOURNALISM PROCESS
► Now that you have a basic understanding of what data journalism is,
let’s have a look at how it is done. In our experience, the
best data-driven stories start their lives as a series of questions.
► For example: “How many people are affected by ...?”, “Where are the most
cases of ...?” or “Is this a pattern?”
► By focusing on first asking questions and then looking for data to
find answers, your story is more likely to have a real impact on
people and make them care.
The 4-step data journalism process
STEP 1: Storyboarding
► The main ingredient to a successful data story is creativity. Data by itself is
not a story. It requires you to think creatively about what’s relevant to your
audience and what is not. On the flip side, a great story idea without data is
also not a data-driven story. Often, finding the right balance between what
story you want to tell vs. what data you have requires some trial and error.
► A mistake a lot of inexperienced data journalists make is thinking that they
need to analyse big datasets to tell a story. A better approach is to start off
with smaller datasets and develop them over time. This will help develop
data-fluency and ensure more effort is placed on extracting the story’s
meaning.
What makes a good data-driven story?
► 1. Contextual and explainer stories
► 2. Dense or complex stories
► 3. Exploratory or interactive stories
► 4. Investigative stories
STEP 2: Get data
► Sourcing good data is often cited as the biggest challenge that data
journalists around the world face today.
► While this problem is not unique to Asian journalists, it does create
an additional barrier to adopting data-driven reporting within
newsrooms.
How can I find the right data for my story?
STEP 3: Clean & analyse
► Once you have your data you can begin the process of cleaning and
analysing it. Cleaning data starts with converting it into a format that
you can make sense of, for example by extracting tables from a PDF
document into a spreadsheet.
► The next step is to check for incorrect, missing or duplicate values.
Spending additional time thoroughly cleaning a dataset can
significantly reduce the chance of drawing the wrong conclusions
during your analysis.
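The checks for incorrect, missing or duplicate values can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the column names, the sample rows and the rule that a count must be a non-negative whole number are assumptions made for the example, not part of any real dataset.

```python
import csv
import io

# Hypothetical raw export, for illustration only; not real data.
RAW = """region,year,cases
North,2016,12
North,2016,12
South,2016,
East,2016,-3
West,2016,7
"""


def clean(rows):
    """Split rows into clean records and a list of (row, reason) problems."""
    seen = set()
    good, problems = [], []
    for row in rows:
        key = tuple(row.values())
        value = row["cases"].strip()
        if key in seen:
            problems.append((row, "duplicate"))
        elif value == "":
            problems.append((row, "missing value"))
        elif not value.isdecimal():
            # Rejects negatives and text: a count must be a non-negative integer.
            problems.append((row, "invalid value"))
        else:
            good.append({**row, "cases": int(value)})
        seen.add(key)
    return good, problems


rows = list(csv.DictReader(io.StringIO(RAW)))
good, problems = clean(rows)
print(len(good), "clean rows")
for row, reason in problems:
    print(reason, row)
```

Keeping the rejected rows alongside the reason, instead of silently dropping them, makes it easier to go back to the source and ask why a value is missing or implausible.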
How to analyse or “interview” your data?
► 1. Get to know your data - very well
► 2. Ask critical questions
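As a rough illustration of “interviewing” a dataset, the sketch below turns three typical questions (how many, where, is there a pattern) into small calculations. The records are hypothetical and exist only to make the example runnable; a spreadsheet or one of the tools listed later would work equally well.

```python
from collections import defaultdict

# Hypothetical records (region, year, cases); not real data.
records = [
    ("North", 2015, 10), ("North", 2016, 14),
    ("South", 2015, 30), ("South", 2016, 21),
    ("East", 2015, 5), ("East", 2016, 9),
]

# Question 1: "How many are affected?" -> total per year.
total_by_year = defaultdict(int)
for region, year, cases in records:
    total_by_year[year] += cases

first, latest = min(total_by_year), max(total_by_year)

# Question 2: "Where are the most cases?" -> top region in the latest year.
latest_rows = [r for r in records if r[1] == latest]
top_region = max(latest_rows, key=lambda r: r[2])[0]

# Question 3: "Is this a pattern?" -> change per region between the two years.
by_region = defaultdict(dict)
for region, year, cases in records:
    by_region[region][year] = cases
change = {region: years[latest] - years[first] for region, years in by_region.items()}

print(dict(total_by_year))
print(top_region)
print(change)
```

Each answer should raise a further critical question, for example whether a fall in one region reflects a real change or a change in how the data was collected.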
STEP 4: Deliver your story
► The final step in the process is to deliver
your story. Remember not all data-driven
projects need to be visual.
► Choosing your delivery mechanism will
depend on what type of data you’d like to
present and what skill-sets you have
available in your team.
EXAMPLE: Al Jazeera’s Broken Homes
► Al Jazeera’s data and interactive journalism unit is known as @AJLabs. Formed in 2011 during the
height of the Arab Spring, the team, which is based in the Doha headquarters, focuses on telling human
stories behind data. One of the team’s most widely circulated projects is Broken Homes, published in
English, Arabic and Bosnian. Broken Homes is the most comprehensive project to date tracking home
demolitions in Jerusalem, the eastern portion of which has been occupied militarily by Israel for over 50
years.
► Working closely with the United Nations, Al Jazeera tracked every single home demolition in East
Jerusalem in 2016. It turned out to be a record year, with 190 structures destroyed and more than 1,200
Palestinians displaced or affected. This project contextualizes this data by revealing the human impact
these demolitions have on the people living there.
► 360-degree photos and video testimonies were gathered from some of the major sites to allow readers to
witness the remains of a demolished home. Our reporter on the ground travelled throughout East
Jerusalem over the course of the year to speak with many of the affected families.
► We decided to tackle this project after witnessing an escalation in violence between Israelis
and Palestinians in late 2015. The goal was twofold: to see how Israel’s home demolitions
policy would be affected by the increased tensions, and to convey to readers that
demolitions data is about more than just numbers. Each number represents a family, and each
number tells a story.
► To provide geographical context to the story we decided to use a map to pinpoint the
locations of each of the destroyed homes. At the end of each month we wrote a short
commentary and produced an infographic to provide additional context.
► Read the story here: http://aljazeera.com/brokenhomes
HOW TO GET STARTED
► Starting your first data project will require assembling the right
combination of journalists and technologists. This often involves
breaking down organizational silos and working across departments.
► The first step will involve obtaining management’s buy-in to the
project. From our own experience, the best way to achieve this is to
speak with other teams about the kind of work that they do. Your
goal should be to bridge the gap between editorial and technology.
a) Bridging the gap between editorial and technology
► Here are the risks involved in focusing too much on a particular approach:
► Technology/Design dominated
► Editorial dominated
► Ideal scenario
b) Developing a successful team
► Bahia Halawi - Co-founder & Data Engineer at Data Aurora
► Currently, data-driven journalism is still looked at as a luxury by the majority of newsrooms and
media outlets in the Arab world. Most managers are more concerned with their online presence
and content curation, ignoring the fact that riding the digital transformation will optimize these
processes. I believe that this can be changed by working first on the data mindset and nurturing a
tolerant environment to transform the way people work. It requires communicating with team leaders
and hosting data evangelists to show different employees the power of becoming data-driven in
all aspects. Also, we need more collaborative initiatives to break the ice between media
practitioners and technologists. At the same time, journalists should tap into the open-source
community to help develop the right Arabic language tools and platforms.
► Alia Chughtai (Online Producer)
► Diversity is the key ingredient to make a successful data journalism team.
Everyone needs to realise that combining different skill sets creates unique and
interesting work. Data journalism can’t function in silos; it needs conversation,
arguments, complete and utter failure and experimentation.
Typical team roles
Job Role: Data Journalist
Responsibility: The data journalist is responsible for crafting the story. This person is typically trained as a journalist or is a subject specialist who is passionate about telling a story.
Job Role: Editorial Designer
Responsibility: The designer’s role is to produce the functional and aesthetic design to best deliver the story to the audience.
Job Role: Developer/Data Analyst/Data Scientist
Responsibility: The developer, data analyst or data scientist is responsible for transforming and analyzing the data so that it can be understood by the rest of the team.
Job Role: Data Editor
Responsibility: The data editor makes sense of the complete story and finds the best way possible to deliver the story to the audience.
Tools and datasets
Tool: Link
Data Extraction
► Tabula (extract data from PDFs): https://tabula.technology/
► Document Cloud: https://www.documentcloud.org/
Data Cleaning and Analysis
► Google Spreadsheets: docs.google.com/spreadsheets
► Open Refine: http://openrefine.org/
Data Visualization
► Datawrapper: https://www.datawrapper.de/
► Infogram: https://infogram.com/
► Flourish: https://flourish.studio/
Data journalism frameworks
► Workbench (no coding required): http://workbenchdata.com/
► R - Tidyverse suite (for programmers): https://www.tidyverse.org/
► Python - Pandas (for programmers): https://pandas.pydata.org/