This blog was written with Jildemarie Brouwer, Akvo's agriculture and data expert.
Of all farmers worldwide, 84% are smallholders. Although their plots are small, they contribute substantially to global food chains. Paradoxically, these farmers are also among the poorest groups in the world. To ensure global food stability and improve social justice, it is important to identify what causes their low income levels and to leverage the drivers of income for smallholders worldwide.
Smallholder farmers are also a diverse group that differs in several ways: the country and region they live in, the crops they produce, their household composition, access to finance, gender, farm size, etc. It is essential to capture this diversity when determining which interventions can raise the income of a group of farmers. The process of capturing this diversity by dividing the farmers into meaningful groups is called segmentation. In practice, this may mean you do not automatically fall back on the common approach of dividing farmers by a single variable such as gender, region, or farm size. Instead, you divide farmers into groups based on multiple variables that are carefully selected depending on the context. Proper segmentation will help you identify several meaningful groups in your datasets that contain farmers with different characteristics and needs, so that you can provide tailored recommendations for each group.
Applying segmentation to your collected data is a challenge: knowing which characteristics actually distinguish one group from another is not easy and requires good knowledge of the situation and context. At Akvo, we explored how to make this decision more data-driven through the application of cluster analysis. A handful of pilot cases gave some promising insights.
What is cluster analysis?
Cluster analysis is an exploratory technique for discovering patterns in data that are not apparent to the human eye. It detects the sets of data points that are most similar to each other and groups them together in clusters. For example, the outcome of a cluster analysis could be a segmentation of farmers in terms of farm size, crop diversification, and age.
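To make the idea concrete, here is a minimal sketch of one common clustering algorithm, k-means, applied to a handful of made-up farmers described by farm size, number of crops, and age. The data, the function names, and the choice of k-means with a simple farthest-first starting point are all assumptions for illustration; this post does not state which algorithm was used in the pilots.

```python
from statistics import mean, pstdev

def standardise(rows):
    """Rescale every column to mean 0 and standard deviation 1,
    so that no variable dominates just because of its units."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # avoid dividing by zero
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centres):
    """Index of the centre closest to the point."""
    return min(range(len(centres)), key=lambda c: sq_dist(point, centres[c]))

def kmeans(points, k, iterations=50):
    # Farthest-first start: begin with the first point, then repeatedly
    # add the point that lies farthest from the centres chosen so far.
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centres)))
    for _ in range(iterations):
        labels = [nearest(p, centres) for p in points]
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centre if a cluster is empty
                centres[c] = [mean(col) for col in zip(*members)]
    return [nearest(p, centres) for p in points]

# Hypothetical farmers: (farm size in ha, number of crops, age in years)
farmers = [
    (0.5, 1, 58), (0.7, 1, 61), (0.6, 2, 55),
    (3.0, 4, 34), (3.4, 5, 31), (2.8, 4, 36),
]
labels = kmeans(standardise(farmers), k=2)
print(labels)
```

In this toy data the first three farmers (small farms, few crops, older) end up in one cluster and the last three in the other. Real survey data is rarely this clean, which is where the expert input described in the next section comes in.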
Using cluster analysis for segmentation
The challenge of applying cluster analysis is that there are many different algorithms, and they produce different outcomes. To overcome this and make the procedure more robust, Akvo uses a mixed approach in which advanced clustering is combined with relevant domain expert knowledge. Over the past months, we’ve applied this approach in several pilot cases in collaboration with our partners.
In our pilot cases, different stakeholders (analysts, agronomists, local experts) were consulted to share which farm and farmer characteristics are key in explaining differences in income from a specific crop and in overall farmer income levels. We learned that it is beneficial to consult multiple experts, since their views often differ. An analyst could highlight the use of mechanised equipment as a determining factor based on findings from previous analyses, while a local expert could contradict this, having observed that this specific group of farmers does not use mechanised equipment at all. The diverse opinions can be combined into a general expert opinion that serves as input for data-driven clustering, as visualised in the image below. Here, we can see that two variables are indicated by all three experts. It’s important to consult the experts in the design phase of your project, since the interviews might reveal farmer characteristics you were not even planning to measure.
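One simple way to combine the expert views is to count how many experts named each variable. The sketch below assumes three experts and uses hypothetical variable names; the voting rule is our own illustration, not a description of how the pilots weighed the opinions.

```python
from collections import Counter

# Hypothetical interview results: the variables each expert considers key.
expert_variables = {
    "analyst": {"farm_size", "mechanised_equipment", "cooperative_member"},
    "agronomist": {"farm_size", "crop_diversification", "cooperative_member"},
    "local_expert": {"farm_size", "cooperative_member", "access_to_finance"},
}

def tally_votes(opinions):
    """Count how many experts named each variable."""
    votes = Counter()
    for variables in opinions.values():
        votes.update(variables)
    return votes

def shortlist(opinions, min_votes):
    """Variables named by at least `min_votes` experts, in alphabetical order."""
    votes = tally_votes(opinions)
    return sorted(name for name, count in votes.items() if count >= min_votes)

print(shortlist(expert_variables, min_votes=3))
print(shortlist(expert_variables, min_votes=1))
```

With `min_votes=3` only the variables all three experts agree on remain. Lowering the threshold to 1 lists every variable anyone mentioned, which is a useful checklist of what to measure when designing the survey.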
The outcome of this approach is a proposed segmentation of farmers based on several variables. The final selection of variables is made using both the expert opinions and the variance present in the data. The resulting segmentation can serve to debunk or validate the individual expert opinions on the one hand, and add value to your farmer insights so you can make more data-informed decisions on the other.
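As a rough sketch of how expert opinion and variance in the data could be combined, the example below keeps a variable only if enough experts named it and it actually varies across the surveyed farmers. The survey values, vote counts, thresholds, and the min-max rescaling are all assumptions chosen for illustration.

```python
from statistics import pvariance

def scaled_variance(values):
    """Variance after rescaling to the 0-1 range, so variables
    measured in different units can be compared."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # every farmer has the same value
    return pvariance([(v - lo) / (hi - lo) for v in values])

def select_variables(data, expert_votes, min_votes=2, min_variance=0.01):
    """Keep variables backed by enough experts AND by variation in the data."""
    return [
        name
        for name, values in data.items()
        if expert_votes.get(name, 0) >= min_votes
        and scaled_variance(values) >= min_variance
    ]

# Hypothetical survey answers for six farmers.
survey = {
    "farm_size_ha": [0.5, 0.7, 0.6, 3.0, 3.4, 2.8],
    "uses_mechanised_equipment": [0, 0, 0, 0, 0, 0],
    "cooperative_member": [1, 0, 1, 1, 0, 1],
    "household_size": [4, 6, 5, 3, 7, 5],
}
# Hypothetical number of experts who named each variable.
votes = {
    "farm_size_ha": 3,
    "uses_mechanised_equipment": 2,
    "cooperative_member": 3,
    "household_size": 1,
}

print(select_variables(survey, votes))
```

Here the mechanised equipment variable is dropped even though two experts named it, because none of the surveyed farmers use such equipment and it therefore cannot separate them into groups. Household size is dropped for the opposite reason: it varies, but only one expert considered it relevant.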
Analysing the data
Once the groups are defined, it is time to start the analysis. As a first step, we study the groups in more detail to learn what other factors make them different. For example, we may have grouped the farmers based on their farm size, age, and cooperative membership, but it is also of interest to understand other characteristics of the groups, such as their education level, household size, or access to finance. Studying this helps you identify possible drivers of income. The second step is to assess which drivers of income are within your ability to leverage. Once this is clear, you can model how leveraging these drivers affects the income of the different groups. We learned that an intervention that helps increase the income of one group may not be relevant for another group, because the likely income increase there is minimal. The last step is to discuss the findings with the different experts. While the data analysis may help you identify interventions to increase the income of the different groups, we know that it also leads to more questions: questions about why the differences between groups are the way they are, and about the feasibility of the interventions you identified. It is key to discuss these questions with the experts, and preferably with farmer representatives who know the context, to come to an informed decision on the next steps. A decision that is not data-driven but data-informed.
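The first two analysis steps can be sketched in a few lines. The example below profiles each segment on additional characteristics and then runs a what-if on a single driver, crop yield. The farmer records, the segment labels, and the very simple income formula (area times yield times price, minus costs) are hypothetical and only meant to show the mechanics.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical farmers who have already been assigned to a segment.
farmers = [
    {"segment": "A", "area_ha": 0.5, "yield_kg_ha": 800, "price": 1.0, "costs": 150, "household_size": 6},
    {"segment": "A", "area_ha": 0.5, "yield_kg_ha": 800, "price": 1.0, "costs": 100, "household_size": 4},
    {"segment": "B", "area_ha": 3.0, "yield_kg_ha": 1000, "price": 1.0, "costs": 900, "household_size": 5},
    {"segment": "B", "area_ha": 3.0, "yield_kg_ha": 1000, "price": 1.0, "costs": 700, "household_size": 3},
]

def by_segment(farmers):
    """Group the farmer records by their segment label."""
    grouped = defaultdict(list)
    for farmer in farmers:
        grouped[farmer["segment"]].append(farmer)
    return grouped

def profile(farmers, fields):
    """Step 1: average of each extra characteristic per segment."""
    return {
        segment: {field: mean(m[field] for m in members) for field in fields}
        for segment, members in by_segment(farmers).items()
    }

def income(farmer, yield_factor=1.0):
    """Assumed income model: crop revenue minus production costs."""
    revenue = farmer["area_ha"] * farmer["yield_kg_ha"] * yield_factor * farmer["price"]
    return revenue - farmer["costs"]

def mean_income_gain(farmers, yield_factor):
    """Step 2: average income change per segment if yield changes."""
    return {
        segment: mean(income(m, yield_factor) - income(m) for m in members)
        for segment, members in by_segment(farmers).items()
    }

print(profile(farmers, ["household_size", "costs"]))
print(mean_income_gain(farmers, yield_factor=1.2))
```

In this made-up data, the same 20% yield increase adds far more income for segment B than for segment A, simply because segment B farms more land. That is the kind of difference worth taking back to the experts before choosing an intervention.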
Combining expert input with a data-driven technique in this way is very promising. A better segmentation means a better understanding of the farmer group in a given case. This is the first step towards developing well-targeted interventions to improve the farmers’ livelihoods.
Do you have in-house data on a target group that you'd like to understand better? Reach out to us. Akvo can help to customise this technique for your organisation’s activities. Let’s maximise impact and make your organisation’s operations and policy more efficient and farmer-oriented.