In the data wasteland
HOW TO
Everything is data: from accurate figures to impressionistic narratives, through to rumors and plain lies, which carry information on the people articulating them. The issue with data is that, precisely because it is ubiquitous, it forms an incoherent slush until organized into something recognizable.
Its analytical value will depend on two variables in
particular: volume and structure. Discrete information may yield indications on
where else to look for more, but will fail to offer broader insight into social,
economic or political trends. Large quantities of data will prove impossible to
mine unless organized in such a way as to make it comprehensible. The goal,
therefore, must be a critical mass of structured data.
The availability of different sorts of data varies
widely from one part of the world to another, depending on how bureaucratic or
informal their systems of governance are, how mature their civil society tends
to be, and so on. Societies can also render themselves transparent or opaque
to varying degrees. The Arab world, for example, is a case study in data
paucity: dysfunctional authoritarian regimes produce little reliable
information and share even less, while citizens conceal much about themselves,
notably on social media where multiple accounts, shifting aliases, and cryptic
forms of expression, such as sarcasm, are the norm.
In such a data wasteland, even conventional wisdom
becomes suspect and hard to fact-check. It will nonetheless reverberate, impose
itself and often gain validation from supposedly legitimate and trusted
sources. In Jordan and Syria, in the late 2000s, the UN embraced wild
government estimates on the number of Iraqi refugees long before any
institutional measures had been taken to register them. In Iraq, serious media
outlets consistently describe Mosul as the “second biggest city” in a country
that hasn’t had a proper census in decades, and even though Basra shows many
signs of being larger. In Lebanon, a host of international bodies adopt
economic figures that are rendered unverifiable by the government’s years-old
refusal to divulge essential financial data or even a draft budget.
Groundbreaking research on social issues will often unearth, at first glance, a wealth of rich but not necessarily reliable narratives and a dearth of hard data. Academics and pollsters use various techniques to overcome this problem: notably, they build questionnaires and code the answers given by respondents. The outcomes can help produce more clarity, which can be deceptive too, because respondents’ answers are shaped by predetermined questions.
The truth is that data collection, despite its façade
of objectivity, is a handicraft more than a science. Success depends far more
on common sense, creativity, trial and error, and flexibility than it does on
any formalistic methodology. A farmer doesn’t need complex technology or
methodology to develop a sophisticated database of his crop yields, factoring
in diverse soils, climate patterns, past experiments with seeds or fertilizers,
annual variations, comparisons with neighbors, etc.
Researchers can cultivate their field of investigation
in much the same fashion. In fieldwork-based research, data collection consists
of parsing information from a variety of sources and weaving it anew into
thematic threads. All dates will go into a chronology (or several timelines
covering different aspects of a topic). Information on individuals and their
relationships can build up into a biographical data-set, which may be conducive
to a genealogical tree, an organizational chart or a visualization of networks.
Geographic information will naturally feed into maps. Descriptive “building
blocks” will also emerge from scattered information, gradually adding up to a
history of a particular institution, a memo on a specific legal issue, an
infographic or the like.
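The weaving described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Note` structure, the `weave` function, the thread names and the sample notes are all invented for the example, and a spreadsheet would serve just as well.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Note:
    """One piece of information gathered from any source (hypothetical structure)."""
    text: str
    source: str
    when: Optional[date] = None    # feeds the chronology
    person: Optional[str] = None   # feeds the biographical data-set
    place: Optional[str] = None    # feeds the map


def weave(notes):
    """Route each note into every thematic thread it can feed."""
    threads = defaultdict(list)
    for note in notes:
        if note.when is not None:
            threads["chronology"].append(note)
        if note.person is not None:
            threads["biographies"].append(note)
        if note.place is not None:
            threads["map"].append(note)
        if note.when is None and note.person is None and note.place is None:
            # Undated, unlocated material still counts as a descriptive building block.
            threads["descriptive"].append(note)
    threads["chronology"].sort(key=lambda n: n.when)
    return dict(threads)


# Invented placeholder notes, for illustration only.
notes = [
    Note("Council founded", "interview 1", when=date(2011, 3, 1), place="Town X"),
    Note("Person A joins the council", "press clipping", when=date(2010, 1, 5), person="Person A"),
    Note("Council mediates water disputes", "NGO report"),
]
threads = weave(notes)
for name, items in threads.items():
    print(name, [n.text for n in items])
```

A single note can land in several threads at once, which is the point: the same scattered information gets rewoven into a timeline, a set of biographies and a map.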
The data itself will likely emanate from a mix of
sources. Most topics will reveal themselves through existing “literature” or
expertise; documents containing raw material; media mentions over a period of
time; and interviews conducted with those concerned. As a rule, much more
information is available than we are initially tempted to believe—simply
because it’s convenient to save ourselves the trouble of digging deep into
archives and narratives, which is indeed time-consuming. Assuming the opposite,
i.e. that a data treasure trove is out there just waiting to be discovered,
will in fact save you time: more often than not, you’ll come across people who
have already done much of what you could do. “Mapping the mappers” is,
therefore, always a good place to start.
Shuffling data is tedious, for sure. But it is also an essential component of the analytic process. Our eye “sees” because it organizes things into categories—colors, textures, movements, distances—that may be irrelevant to other living creatures whose senses are wired differently. Their reality—that is, their understanding of the world—will inevitably be distinct from ours, since the information they collect and synthesize is itself different. Making sense of anything boils down, consciously, conscientiously or intuitively, to categorizing and reorganizing data.
This sorting mechanism adds layers of meaning to
something initially nondescript and perhaps chaotic. You could be looking at a
pile of blocks of different shapes and colors. If you leave it as such, that’s
about all you can say about it. However, if you manipulate the blocks and
separate them into groupings, many more things can then be said: how many they
are in all; what exactly their shapes and colors are; what may be missing or
lost; whether they are heavy or not; what material they are made of; what underlying
logic they may conceal, etc.
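The same thought experiment can be run as a minimal Python sketch. The pile below is invented, and describing each block by shape and color alone is an assumption made for brevity; weight or material could be added as further attributes.

```python
from collections import Counter
from itertools import product

# Invented pile of blocks, each described as (shape, color).
pile = [
    ("cube", "red"), ("cube", "blue"), ("cylinder", "red"),
    ("cube", "red"), ("pyramid", "blue"), ("cylinder", "blue"),
]

# Grouping: how many blocks of each kind are there, and how many in all?
counts = Counter(pile)
total = sum(counts.values())

# What exactly are the shapes and colors present in the pile?
shapes = sorted({shape for shape, _ in pile})
colors = sorted({color for _, color in pile})

# What may be missing: shape and color combinations that never occur.
missing = [combo for combo in product(shapes, colors) if combo not in counts]

print("total:", total)
print("counts:", dict(counts))
print("missing:", missing)
```

Left in bulk, the pile says nothing; once grouped, it shows both what is there and which combination is absent.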
The act of compiling data will likewise reveal trends,
inconsistencies, ambiguities, voids, distortions and so on, all of which are
precious analytical material that was not visible when data remained in bulk.
Even when incomplete, a chronology, a biographical data-set, a map and a
collection of figures will all help make sense of various complex and competing
narratives.
Besides, information collected systematically will
bolster your credibility and boost your value-added—especially in an
environment where it is a rarity. Information is power, they say. And since you
chose research, it may well be the only means you’ve got to get a little taste
of it. It would be a pity not to indulge!
27 February 2017
Illustration credit: Silhouettes in the rocky wasteland by Unsplash via Wikipedia / licensed by Unsplash.