I would like to pay my respects to the Bedegal and Wangal peoples, the traditional custodians of the land on which I
work and live, whose sovereignty was never ceded to the British Crown. I acknowledge the unbroken connection to
place and pay my respects to the Elders past and present. To acknowledge country is to pay attention to the violence
– historic and ongoing – that occurs in the name of settlement. It is also to acknowledge the long history of
resistance, teaching, learning, sharing, and creativity. Always was, and always will be, Aboriginal land.
Who am I?
• Andrew Brooks
• teacher, researcher, critic, artist, writer;
• my work focuses on race and antiracism; the
relationship between infrastructure and
inequality (including of data infrastructures);
and historical materialism (which involves the
study of social transformations across
history);
• more specifically, my work looks at political
struggles; riots and uprisings; forms of
communication (from platforms to poems);
policing and abolition; colonisation and
capitalism; communism.
• one half of the critical art collective Snack
Syndicate, who make publications, events, and
objects with a focus on study and the
histories/futures of political struggle;
• editor for Rosa Press publishing collective;
• the course convenor + lecturer! which means I
am happy to answer any questions you have!
‘But the student has a habit, a bad habit. She studies. She studies but she does not learn.
If she learned they could measure her progress, establish her attributes, give her credit.
But the student keeps studying, keeps planning to study, keeps running to study, keeps
studying a plan, keeps elaborating a debt. The student does not intend to pay.’
‘[S]tudy is what you do with other people. It’s talking and walking around with other
people, working, dancing, suffering, some irreducible convergence of all three, held under
the name of speculative practice.’
— Fred Moten and Stefano Harney
Data deluge
What exactly do we mean by massive amounts of digital data?
Roughly how much digital data is produced every day?
A simple bar graph showing the volume of data created and consumed each year from 2010 to 2025.
Source: Statista (the later years are, of course, projections)
Visualisation of how much digital data was generated every minute in 2011 (left), 2020 (middle), and 2021 (right)
😱😱😱 That’s a lot of data!!! 😱😱😱
Conceptualising data, contextualising data
What is data?
Data is a concept with varied definitions and meanings.
1. Data are sensory stimuli that we perceive through our senses.
Common sequence or hierarchy: data–information–knowledge. Data precedes information, which precedes knowledge. Data are sensory stimuli that we perceive through our senses. Information is data that has been processed into a form that is meaningful to the recipient. Knowledge is something understood and evaluated by the knower.
2. Data are the ‘raw material’ derived from observations. Here data is often presented in quantitative terms – as numbers, figures, and other abstracted representations. (We will return to this idea of ‘raw’ data in order to complicate it later.)
3. Data can be qualitative (i.e. non-numeric), such as texts, images, sensory stimuli, video, art, sounds, smells, and so on.
4. In computational terms, data can be defined as a type of information object made up of units of binary code. These units of code can be stored, processed, organised, and transmitted by computational processes and systems. While it is important to understand this computational definition, we will think in more expansive ways about what data is and what it does!
Given, taken, partial, whole?
Data, noun [mass noun]
from the Latin verb dare, which means ‘to give’.
Data are elements that can be abstracted from
(given by) phenomena – measured and recorded in
various ways. But it might be more accurate to say
that data is something that has been captured (i.e.
given by phenomena and taken by us).
Why is the etymology of the word important?
Understanding that units of data are captured or
harvested is to understand that data is inherently
partial. That is, the data we encounter has been
selected from the sum of all potential data.
‘Data harvested through measurement are always a selection from the total
sum of all possible data available – what we have chosen to take from all
that could potentially be given. As such, data are inherently partial, selective
and representative, and the distinguishing criteria used in their capture has
consequence.’
— Rob Kitchin
What gets counted (and how it gets counted)
counts! We must look not only at the data but at the
criteria used to collect it and the context of its
collection.
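A minimal sketch of this point, using entirely hypothetical observations and criteria: the same phenomena yield different data depending on the rule used to capture them. The names `capture`, `motor_traffic` and `all_movement` are invented for illustration.

```python
# Hypothetical observations of one street over one hour.
# Every value here is invented for illustration.
observations = [
    {"kind": "car", "moving": True},
    {"kind": "car", "moving": False},  # a parked car
    {"kind": "bicycle", "moving": True},
    {"kind": "pedestrian", "moving": True},
    {"kind": "bus", "moving": True},
]


def capture(observations, criterion):
    """Keep only the observations that the criterion chooses to count."""
    return [obs for obs in observations if criterion(obs)]


def motor_traffic(obs):
    """Criterion A: 'traffic' means motor vehicles that are moving."""
    return obs["kind"] in {"car", "bus"} and obs["moving"]


def all_movement(obs):
    """Criterion B: 'traffic' means anything that is moving."""
    return obs["moving"]


count_a = len(capture(observations, motor_traffic))
count_b = len(capture(observations, all_movement))
print(count_a, count_b)
```

Both counts describe the same street, but they are different data sets because the criteria differ. Neither is the ‘raw’ street itself.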
René Magritte, The Treachery of Images (1929).
The text translated reads: This is not a pipe
Context matters
• Data is not neutral and objective
• Data are not facts and they do not
appear out of thin air (as if by magic)
Rather:
• Data is the product of human cognitive
labour; data is generated and involves
criteria.
• ‘Data produce and are produced by the
operations of knowledge production’
(Gitelman and Jackson, 2013, p. 3)
Lisa Gitelman and Virginia Jackson (2013, pp. 2-3) write:
At first glance data are apparently before the fact: they are the
starting point for what we know, who we are, and how we
communicate. This shared sense of starting with data often leads
to an unnoticed assumption that data are transparent, that
information is self-evident, the fundamental stuff of truth itself. If
we’re not careful, in other words, our zeal for more and more
data can become a faith in their neutrality and autonomy, their
objectivity. Think of the ways people talk and write about data.
Data are familiarly “collected,” “entered,” “compiled,” “stored,”
“processed,” “mined,” and “interpreted.” Less obvious are the
ways in which the final term in this sequence—interpretation—
haunts its predecessors. At a certain level the collection and
management of data may be said to presuppose interpretation.
“Data [do] not just exist,” Lev Manovich explains, they have to be
“generated.” Data need to be imagined as data to exist and
function as such, and the imagination of data entails an
interpretive base.
The Coloniality of Data + Data Sovereignty
Petroglyphs in the Burrup Peninsula. These carvings are estimated
to be up to 30,000 years old, making them some of the oldest artworks in the
world. If data are abstractions of phenomena in the world
(representations, recordings of observations, measurements) then
these carvings can be understood as data renderings. They show the
presence of animals such as the Tasmanian Devil (bottom right)
and show the continuity of First Nations culture, knowledge, and
sovereignty.
Gail Mabo, Tagai, 2018
Tagai is the traditional name of the star constellation
Torres Strait Islanders consult for safe navigation through
the Torres Strait from island to island. Its position dictates
the timing of seasonal rituals for planting, harvesting and
hunting. Mabo uses bamboo, shells and twine to render
these maps. Currently showing at the Museum of
Contemporary Art as part of the Biennale of Sydney.
Colonialism
‘Invasion is a structure, not an event.’
— Patrick Wolfe, Traces of History (2016)
Colonisation describes the invasion, occupation, and
exploitation of one territory by another. Colonisation
is not a singular event but refers to the structures,
institutions and relations of power that are
produced as a result of this occupation. That is,
colonisation applies not only to the physical and the
economic but also to knowledge, culture, and being.
Settler colonialism is motivated by access to land,
which is crucial for settlers both as a place to live
and as a source of capital. As distinct from other
kinds of colonisation, in settler colonialism settlers
come with the intention of staying. The making of a
new home requires the imposition of settler
sovereignty (on top of already existing Indigenous
sovereignty).
Vernon Ah Kee, theendofliving (2009)
The white possessive
In Australia, colonisation is driven by what Distinguished
Professor Aileen Moreton-Robinson calls the ‘possessive logics
of patriarchal white sovereignty’. The process of colonisation is
about claiming Australia as a white possession.
‘I use the concept of ‘possessive logics’ to denote a mode
of rationalization, rather than a set of positions that
produce a more or less inevitable answer, that is
underpinned by an excessive desire to invest in reproducing
and reaffirming the nation-state’s ownership, control, and
domination. As such, white possessive logics are
operationalized within discourses to circulate sets of
meanings about ownership of the nation, as part of
commonsense knowledge, decision making, and socially
produced conventions… Subjects embody white possessive
logics… The courts operationalise a patriarchal white
possessive logic through the way in which they rationalise
the nonexistence of race while simultaneously deploying it
through their racial signifiers… Race indelibly marks the
law’s possessiveness.’
— Aileen Moreton-Robinson, The White Possessive (2015)
A Tourism Australia campaign created in 2006 under the direction of former-PM Scott Morrison who was, at the time, the
managing director of Tourism Australia.
The Coloniality of Data
Captain Cook’s journal entry recording the first sighting of the Australian coast. Journals,
fieldwork notebooks, and logbooks are all forms of data, and they often become the basis
for claims to knowledge and are used to legitimate acts of possession and dispossession.
Racism
Racist and fascist imagery concerned with reproduction of Australia as a white nation: from the ‘White
Australia Policy’ to Pauline Hanson to racist depictions in children’s books to the rise of the alt-right to racist
newspaper cartoons.
The Coloniality of Data
L: An illustration by J Redaway titled ‘Aborigines of Australia: heads and implements’
R: An illustration of Aboriginal people by T R Browne in J Skottowe, ‘Select Specimens
from Nature’ (1813)
Numbers are not neutral
‘Numbers are not neutral entities. Statistics are human artefacts and in colonizing
nation states such numbers applied to Indigenous peoples have a raced reality (Walter,
2010; Walter & Andersen, 2013). Their reality emerges not from mathematically
supported analytical techniques but the social, racial and cultural standpoint of their
creators who make assumptive determinations to collect some data and not others, to
interrogate some objects over others, and to investigate some variable relationships
over others.’
— Maggie Walter and Michele Suina, ‘Indigenous data, indigenous methodologies and
indigenous data sovereignty’, International Journal of Social Research Methodology, 22:3
(2019), p. 236.
‘Across first world colonizing settler nation states, Indigenous data largely conform to
what Walter (2016, 2018) describes as 5D data. That is, mainstream Indigenous
statistics focus almost exclusively on items related to Indigenous difference, disparity,
disadvantage, dysfunction and deprivation. Magnifying the impact of this discursive
frame, 5D data are produced within a set of research practices that tend to the
aggregate, are decontextualised from their social and cultural context and simplistically
analyzed with the problematic Indigene compared pejoratively to the non-Indigenous
norm.’
— Maggie Walter and Michele Suina, ‘Indigenous data, indigenous methodologies and
indigenous data sovereignty’, International Journal of Social Research Methodology, 22:3
(2019), p. 235.
Archie Moore, Graph of Perennial Disadvantage, 2020, acrylic paint on
handmade paper made from pages of Hansard Parliament of Australia.
Moore’s artwork is suggestive of a bar graph but abstracts away from
the measures. The work is painted on an unusual medium: paper made from
pulped pages of Hansard, the official record of Australian parliamentary
proceedings. The pages invoke the 1901 Australian Constitution, a
document that mandated that ‘Aboriginal natives shall not be counted’
among the nation’s people, a provision not repealed until 1967. The black,
red, and yellow stripes that suggest the bar graph use the colours of the
Aboriginal Flag in a gesture that critiques and challenges the reduction
of Indigenous life to what Walter and Suina call 5D data – that is, data
related to Indigenous difference, disparity, disadvantage, dysfunction
and deprivation.
This Australian Government graph produced by the Australian Law
Reform Commission shows the percentage of Aboriginal and Torres
Strait Islander people incarcerated in each state and territory against
the percentage of First Nations people in the general population. But
what does this type of data fail to show?
Data Sovereignty
‘Indigenous self-determination relies on data self-
determination.’
— Maggie Walter and Michele Suina, ‘Indigenous data,
indigenous methodologies and indigenous data
sovereignty’, International Journal of Social Research
Methodology, 22:3 (2019), p. 236.
‘Indigenous Data Sovereignty centres on Indigenous
collective rights to data about our peoples, territories,
lifeways and natural resources and is supported by
Indigenous peoples’ inherent rights of self-
determination and governance over their peoples,
country and resources as described in the United
Nations Declaration on the Rights of Indigenous
Peoples (UNDRIP). The concept is defined as the right
of Indigenous peoples to determine the means of
collection, access, analysis, interpretation,
management, dissemination and reuse of data
pertaining to the Indigenous peoples from whom it has
been derived, or to whom it relates.’
— Walter and Suina (2019), pp. 236–237.
Principles of Data Sovereignty
A workshop developed by participants of the United Nations Permanent Forum on
Indigenous Issues asserted the following in relation to data:
‘[A] wide range of sources and types of data were desirable in building a complete
profile of a people and also noted was the desirability of having trained Indigenous
peoples engaged in the full range of work concerning data collection, such as
planning, collecting, analysing and report writing.’ — UN Report of the Workshop
on Data Collection and Disaggregation for Indigenous Peoples (2004).
Megan Davis reports that outcomes of this expert group meeting included the
following questions relevant to data sovereignty:
• For whom are we collecting data?
• How do we collect the data?
• What should be measured?
• Who should control information?
• What are the data for?
• Why do Indigenous peoples in resource-rich areas experience poor social conditions
and a lack of social services?
• To what degree is remoteness responsible?
Assessment 1
Due dates: Weeks 2–5
Length: 2,000 words +/- 10%
Details: Each week (from week 2 to week 5) you are required to produce a short (500-word) critical response to the
weekly readings. You must engage with a concept or idea in the reading and make a connection to an
example, text, data rendering, or issue that you have come across in your own independent research. This
might include: analysis of a data rendering or visualisation, a consideration of how data relates to a current
event, reflection on your assessment 2 portfolio. You must upload your response to your class blog 24
hours before your tutorial.
You must provide 4 entries of approx. 500 words each. Please do not give your responses in bullet point
form but rather as critical writing that reflects on the readings and your own research. The journal will be in
the form of a blog accessed through the assignment section on Moodle. Your critical journal will be marked
at the end of week 5 and you will receive short feedback.
Assessment 1
Option 1 ('Anatomy of an AI'): Discuss why a situated approach to the study of data is important. You can
draw on Crawford and Joler’s analysis of the Amazon Echo to explain the idea of situated analysis.
Alternatively, you might choose to focus on an object (a technology, an app, a social media platform, a
bureaucratic system, a shipping label on a package from Amazon, etc.) that involves data collection and
consider how that data is collected (what processes and materials might you need to consider to analyse
the object?).
Option 2 ('Deaths Inside Database'): Consider where the data for the Deaths Inside Database comes from
(the Coroner's Court) and how this relates to the ongoing process of colonisation. How does this database
situate and contextualise the data? How does it move beyond a statistical or quantifiable representation of
deaths inside? Does the database provide a way of accounting for (or holding to account) the role the state
plays in Indigenous deaths in custody? How?
Data and data sets
Kinds of data
A rough taxonomy of kinds of data includes:
• form (quantitative or qualitative)
• structure (structured, semi-structured, or unstructured)
• source (captured, derived, exhaust, transient)
• producer (primary, secondary, or tertiary)
• type (indexical, attribute, metadata)
Characteristics of data
In relation to large data sets, good quality data:
1. are discrete and intelligible (each piece of data is individual,
separate and separable, and clearly defined);
2. are aggregative (can be built into sets);
3. have associated metadata (data about data); and
4. can be linked to other datasets to provide insights not
available from a single dataset.
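A minimal sketch of these four characteristics, assuming two small hypothetical data sets (every name and value below is invented for illustration). Each record is discrete, the records aggregate into a set, each set carries metadata, and a shared key (`suburb`) lets the two sets be linked.

```python
# Hypothetical data sets; all names and values are invented for illustration.
rainfall = {
    "metadata": {"unit": "mm", "collected_by": "hypothetical weather station"},
    "records": [
        {"suburb": "A", "rainfall": 12.0},
        {"suburb": "B", "rainfall": 30.5},
    ],
}
population = {
    "metadata": {"unit": "people", "collected_by": "hypothetical census"},
    "records": [
        {"suburb": "A", "population": 1000},
        {"suburb": "B", "population": 2500},
    ],
}


def link(left, right, key):
    """Join records from two data sets that share the same key value."""
    right_by_key = {rec[key]: rec for rec in right["records"]}
    linked = []
    for rec in left["records"]:
        match = right_by_key.get(rec[key])
        if match is not None:
            # Merge the two discrete records into one combined record.
            linked.append({**rec, **match})
    return linked


linked = link(rainfall, population, "suburb")
for rec in linked:
    print(rec)
```

Note that the metadata (who collected each set, and in what unit) does not survive the merge in this sketch. Linking data sets can strip away exactly the context needed to interpret them.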
Data sets
A 'data set' is an aggregation of
data that has been organised
and structured.
The Iris data set published by
R.A. Fisher in 1936. This data
set is one of the first scientific
data sets and is often used as a
test case for many statistical
classification techniques and in
machine learning.
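A minimal sketch of what ‘organised and structured’ means in practice. The column names follow the usual description of the Iris data set (four measurements plus a species label), but the rows below are illustrative stand-ins, not Fisher’s measurements, and the helper `mean_by_species` is a hypothetical name.

```python
from collections import defaultdict
from statistics import mean

# Column names follow the usual description of the Iris data set.
# The rows are illustrative stand-ins, NOT Fisher's actual measurements.
FIELDS = ("sepal_length", "sepal_width", "petal_length", "petal_width", "species")
rows = [
    (5.0, 3.5, 1.5, 0.2, "setosa"),
    (4.8, 3.0, 1.5, 0.2, "setosa"),
    (6.5, 3.0, 4.5, 1.5, "versicolor"),
    (6.0, 2.8, 4.5, 1.5, "versicolor"),
]

# Structure: every record has the same named fields in the same order.
data_set = [dict(zip(FIELDS, row)) for row in rows]


def mean_by_species(records, field):
    """Average one measurement within each species group."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["species"]].append(rec[field])
    return {species: mean(values) for species, values in groups.items()}


petal_means = mean_by_species(data_set, "petal_length")
print(petal_means)
```

Because every record shares the same structure, the set can be grouped, summarised, and compared. The species labels that make this possible are themselves the product of someone’s act of classification.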
Anna Ridler, Myriad (Tulips), 2018
Myriad (Tulips) is an installation of thousands of hand-labeled photographs of
tulips that were taken by the artist. The collection of images is a type of data set
that has been laboriously categorised by the artist. The work shows us the skill, labour
and time that goes into generating a data set. It exposes the human labour and
decision making involved in the production of data and its classification.
Methodologies + orientations
Seeing data like an Atlas
‘An atlas is an unusual type of book. It
is a collection of disparate parts, with
maps that vary in resolution from a
satellite view of the planet to a
zoomed-in detail of an archipelago.
When you open an atlas, you may be
seeking specific information about a
particular place—or perhaps you are
wandering, following your curiosity, and
finding unexpected pathways and new
perspectives.’ (Crawford 2021, 9-10)
Seeing data like an Atlas
‘An atlas presents you with a particular
viewpoint of the world, with the imprimatur of
science—scales and ratios, latitudes and
longitudes—and a sense of form and
consistency. Yet an atlas is as much an act of
creativity—a subjective, political, and aesthetic
intervention—as it is a scientific collection.’
‘Maps, at their best, offer us a compendium of
open pathways—shared ways of knowing—that
can be mixed and combined to make new
interconnections. But there are also maps of
domination, those national maps where territory
is carved along the fault lines of power: from
the direct interventions of drawing borders
across contested spaces to revealing the
colonial paths of empires.’ (Crawford 2021, 10)
Two maps of the same space that tell two different stories about
place, belonging, and sovereignty
Data is like matryoshka
dolls
Or networks of
entangled relations
and interests.
Mark Lombardi, Oliver North, Lake Resources of Panama, and the Iran-
Contra Operation, ca. 1984-86 (fourth version), 1999.
Looking under data
We will both look into data and under data rather than simply looking at data sets,
numbers, and figures. This approach will lead us toward a consideration of the
ways that data intersects with social, political, economic, and environmental
issues.
An excerpt from Kate Crawford and Vladan Joler’s Anatomy of an AI System schematic, 2018.
Toward a critical data studies
• What is the environmental impact of a Netflix binge? What is the
environmental cost of powering the data centres that store the
ever-increasing amounts of data we produce?
• How has racial bias been imported into automated systems, from
the systems that sort images to those that attempt to predict
crime?
• Who does the invisible (and often manual) labour that is required
in order for data and automated systems to exist? Who scans the
books that appear on Google Books? Who sorts the images that
are used to train machine learning systems?