“I don’t side-sell. I only sell maize to the private crop aggregator company you mentioned,” said Sam (name changed to protect privacy). But later in the day, as we became more familiar with each other, Sam introduced me to two gentlemen visiting his village in Eastern Rwanda. They were private buyers, and Sam and his fellow villagers sold their crops to them. Some villagers sold regularly to these buyers, while others sold to them when they needed cash for a family emergency.
“But you said you didn’t side-sell,” I feigned surprise. “I lied,” Sam replied with an impish grin.
We have witnessed this theatre play out across contexts for nearly two decades now - the untruth that seems close to the truth, the incentives that dictate answers, and the biases of social desirability and acquiescence that influence respondent behaviour.
As each year comes to a close, we take time to pause and reflect. This year, we’ve decided to delve into our experiences of collecting data. For almost two decades, we have been gathering data on WASH, agriculture, climate, and energy across the world. While we’ve shared some of our data collection experiences before - from remote data collection to collecting farmer data at scale - we felt a broader overview, paired with actionable recommendations, would complement those more focused lessons.
We’ve divided this blog into six sections. The first uncovers the incentives to lie, while the second explores the power of conversation. The third highlights the importance of narratives beyond numbers. The fourth examines the potential of snap surveys over single snapshots in time, and the fifth discusses using a mix of technologies and techniques to conduct surveys. Each section contains conclusions and recommendations, but in the final section, we summarise these insights and make the recommendations more explicit and actionable. We hope readers and practitioners in the development sector will find them useful.
How we collect data, who we talk to, and who is asking the questions all influence data quality. A respondent hoping to access a government scheme meant for people living below the poverty line (especially if you are surveying on behalf of the government) may have an incentive to underreport their income. However, if social status comes into play, the same respondent might report a higher income. That said, income is one of the hardest indicators to measure via a questionnaire - something we’ve been optimising with IDH for the last four years.
Similarly, questions like “Do you use the toilet?” or “Who makes financial decisions in the household?” almost always yield socially acceptable answers. People want to be seen doing the right thing. We’ve learned to be mindful of the biases that emerge when asking questions. Phrasing questions appropriately, using the right proxies (e.g. consumption to understand income), remaining observant of surroundings (sights, sounds, and smells), and empathising with respondents have all helped us navigate these untruths - because these inaccuracies only magnify when aggregated into statistics.
Surveys often fail to provide reliable information because respondents don’t always trust the interviewers. Respondents dislike long Q&A sessions and can suffer from survey fatigue, particularly when they see little benefit in participating. “I don’t know what these survey agencies/NGOs do with all the information they collect from me,” a fisherman in Bangladesh once lamented. Even when respondents consent and know they can stop at any time, they sometimes continue out of politeness, providing answers just to satisfy the interviewer—undermining data quality.
Over the years, my colleague Francis Warui taught me the power of small talk. For the first five minutes of a survey, I learned to discuss the weather, check on the crops (even when surveying toilet use), and talk about anything except the survey. Understanding the culture that drives conversation is key - what’s appropriate in Indonesia might not be in Uganda. Once the conversation flows naturally, you can introduce questions (not necessarily in the order of the survey) and listen carefully, as answers often hide in the subtext.
Field-testing questionnaires, rehearsing with trained enumerators (like those in Akvo’s network, who share our values and ensure data quality), and practising extensively before collecting actual data all help make conversations more organic. The more natural the dialogue, the closer we get to the truth.
In a focus group discussion (FGD) in Western India, we engaged with a group of men and women. During the discussion, 90% of respondents claimed that women played a key role in deciding what crops to grow and where to sell. Meanwhile, most men sat on chairs, while the women sat on the floor with us. In a separate FGD with women, we learned that they hardly played any role in these decisions.
Getting to the bottom of an issue and capturing its nuance takes time, effort, and cost. Numbers are useful, but without a context-sensitive narrative, they can hide more than they reveal.
The development sector is a graveyard of baseline and endline surveys. While these snapshots in time certainly serve an important purpose, it is perhaps time to rethink traditional surveys. While we acknowledge the power of immersive field investigations, combining narratives with numbers and presence in the field, we want to highlight the importance of breaking down large surveys into smaller chunks and administering remote surveys. Establishing a regular and effective system for user/beneficiary monitoring is often more valuable than surveying a larger number of users once or twice.
For example, we broke down user satisfaction with WASH services (in Kenya) and farm-level business models (in Zambia) into smaller closed-ended questionnaires that were administered via USSD, WhatsApp, and phone calls. Insights from these frequent “snap surveys” both complemented and contradicted findings from large-scale field investigations. In cases of conflicting results, we either validated the findings in subsequent surveys or applied pre-defined criteria to determine which data to prioritise.
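The idea of splitting one long questionnaire into rolling snap-survey waves can be sketched in a few lines. This is an illustrative example only: the question bank, themes, and wave size below are hypothetical, not the actual instruments used in Kenya or Zambia.

```python
import random

# Hypothetical question bank: each closed-ended question tagged with its theme.
QUESTION_BANK = [
    ("satisfaction", "How satisfied are you with your water point? (1-5)"),
    ("satisfaction", "Was the water point working on your last visit? (yes/no)"),
    ("payment", "Did you pay a user fee this month? (yes/no)"),
    ("payment", "Was the fee affordable? (1-5)"),
    ("reliability", "How many days was water unavailable last week? (0-7)"),
    ("reliability", "Did you use an alternative source last week? (yes/no)"),
]

def build_waves(bank, questions_per_wave=2, seed=42):
    """Split a long questionnaire into short 'snap survey' waves.

    Shuffles the bank (so each wave mixes themes) and chunks it into
    fixed-size waves that can be sent out on a rolling schedule via
    USSD, WhatsApp, or phone calls.
    """
    rng = random.Random(seed)
    shuffled = bank[:]
    rng.shuffle(shuffled)
    return [shuffled[i:i + questions_per_wave]
            for i in range(0, len(shuffled), questions_per_wave)]

waves = build_waves(QUESTION_BANK)
for week, wave in enumerate(waves, start=1):
    print(f"Week {week}:", [q for _, q in wave])
```

Each respondent then answers two questions a week instead of six in one sitting, which keeps individual sessions short while the panel as a whole still covers the full instrument.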
[Above: A GIF illustrating the difference between in-person data collection and remote data collection.]
Advances in AI and machine learning have added interesting possibilities to the technology mix. In the past, we have administered USSD, WhatsApp, web-based, and phone surveys in addition to mobile app-based field data collection. AI and machine learning have now made qualitative inquiry more tractable: it is easier to record, code, and analyse open-ended responses.
Sampling techniques are equally crucial. We have carried out multi-stage cluster sampling in countries with proper sampling frames, and used snowball sampling to build a sampling frame before administering the actual survey. No matter the country or the technologies and techniques selected, we have learnt to remain adaptive and flexible in the field. We oversampled in Uganda to avoid conflicts between residents and refugees, built sampling frames together with farmers in Kenya, crossed flooding rivers in Liberia, and coped with displaced populations and days without internet.
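A two-stage cluster sample (first draw villages, then households within each selected village) can be sketched as follows. The frame, cluster counts, and equal-probability draw are illustrative assumptions; real designs often draw clusters with probability proportional to size and oversample to absorb refusals, as we did in Uganda.

```python
import random

def two_stage_cluster_sample(frame, n_clusters, n_per_cluster, seed=7):
    """Two-stage cluster sample.

    Stage 1: draw `n_clusters` clusters (e.g. villages) at random.
    Stage 2: draw up to `n_per_cluster` respondents within each one.
    `frame` maps cluster name -> list of household IDs.
    """
    rng = random.Random(seed)
    clusters = rng.sample(sorted(frame), n_clusters)
    return {c: rng.sample(frame[c], min(n_per_cluster, len(frame[c])))
            for c in clusters}

# Hypothetical sampling frame: 10 villages with 30 household IDs each.
frame = {f"village_{v}": [f"v{v}_hh{h}" for h in range(30)] for v in range(10)}
sample = two_stage_cluster_sample(frame, n_clusters=3, n_per_cluster=5)
for village, households in sample.items():
    print(village, households)
```

Fixing the seed makes the draw reproducible, which matters when a sampling plan has to be documented and defended after the fact.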
It is difficult to capture the richness and diversity of nearly two decades of experience in data collection, both in the field and remotely. Summarising our conclusions and making recommendations is even harder, as I am aware I may have missed things that matter and occasions that deserved highlighting. Yet, one feels the need to conclude, recommend, and at least provide a sense of direction to the reader. So, here goes the list:
Collecting data is an exercise in truth-seeking. However, it requires investments of money, time, and effort. While quick data collection methods and linear narratives of social impact may seem attractive at first glance, we should reject simplistic notions of change and strive for the truth. The truth is messy but magical. And it is this magic that empowers us to make powerful decisions that improve people's lives.