Above: Lal Singh (right) from Seva Mandir interviews a respondent on food security in Udaipur, India, 2013.
Data standardisation is an essential but often overlooked part of the data journey. By building data standardisation into your data processes right from the start, you can make sure that your data provides valuable and comparable insights both now and in the future. Data standardisation can be interpreted in a number of ways. Here, we mean both data standards, which are standardised approaches developed with a sector or thematic focus, and the standardisation of the data process, which focuses on how you process your data to ensure consistency across the board. Incorporating these two aspects into your work means that your data will be comparable not only within your own programme, but also across time and space. Follow these six steps and you're ready to go.
Here are some key tips to keep handy:
- Ensure that standardisation is built into every part of your data process, from the design stage all the way to acting on the results.
- Review and evaluate whether specific data or survey standards are relevant to your data collection. If so, look at how you can incorporate them into your work.
- Test run and pilot the full process before implementing - you may find you have overlooked a key question or piece of information.
- Ensure that all the actors in the process, from data collectors to analysts, understand the standards you have set in place and their importance.
- Document your logic, and any variations to the standards you have made.
Step one: Research and review pre-existing data and survey standards
Increasingly, organisations and programmes are seeing the value in having standardised data definitions, tools, and processes. Regardless of which sector you’re working in, there’s likely to already be a standardised survey or certification standard available. For instance, in water, sanitation, and hygiene (WASH), the Joint Monitoring Programme (JMP) between UNICEF and WHO has produced several standard definitions and questionnaires for measuring the level of access to WASH in households, schools, and medical centres. Likewise in agriculture, certification systems exist for just about every commodity on the market. Some, such as the Sustainable Rice Platform (SRP), are set to have a standardised series of questions that are asked over the course of several years to demonstrate a farmer’s progress towards sustainably produced rice. There are even standardised question series and full questionnaires for screening for trauma and for measuring progress towards empowerment.
There are two key areas to consider when reviewing data standards:
- Definitions - these are the defined terms and parameters within a data standard. This covers both what is meant by a specific term, for instance “pit latrine,” and the criteria that must be met for something to fall under a definition, for instance what counts as a “safe water source.”
- Standardised questionnaire - Often, a data standard will come with either a full questionnaire or a series of standardised questions that incorporate the set definitions.
Start by researching the available standards and tools in your area of work, and compare these to your own needs. It may be that the data standards and definitions only partially match your needs. In step two, we will look at what you can do if this is the case.
Step two: Incorporate existing standards into your data process
Once you have identified the relevant standard (or standards, depending on your needs and topic), you will need to decide whether the standard matches your needs as it is or whether you only need to incorporate part of it. Both of these options have their pros and cons. If you take the whole standard, you’ll need to double check that you don’t need any information beyond what the standard asks for. Alternatively, if you incorporate only part of a standardised questionnaire, you’ll need to make sure that it fits well into the flow of your overall survey. In either case, make sure you aren’t asking for the same information multiple times in different ways.
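As a rough illustration of checking for repeated questions when you add part of a standard to your own survey, here is a minimal Python sketch. The question names, the "concept" labels, and the rule of keeping the standard's version are all assumptions made for this example; they do not come from any real standard.

```python
# Hypothetical question lists: each question has a variable name and the
# concept it measures. None of these come from a real standard.
own_survey = [
    {"name": "hh_size", "concept": "household size"},
    {"name": "water_src", "concept": "main drinking water source"},
    {"name": "income_src", "concept": "main income source"},
]
standard_module = [
    {"name": "W1", "concept": "main drinking water source"},
    {"name": "W2", "concept": "time to collect water"},
]


def find_overlaps(survey, module):
    """Return (survey_name, module_name, concept) for concepts asked twice."""
    by_concept = {q["concept"].strip().lower(): q["name"] for q in survey}
    overlaps = []
    for q in module:
        key = q["concept"].strip().lower()
        if key in by_concept:
            overlaps.append((by_concept[key], q["name"], key))
    return overlaps


def merge(survey, module):
    """Keep the standard's version of overlapping questions, drop our duplicate."""
    duplicated = {own for own, _, _ in find_overlaps(survey, module)}
    kept = [q for q in survey if q["name"] not in duplicated]
    return kept + module


overlaps = find_overlaps(own_survey, standard_module)
merged = merge(own_survey, standard_module)
print(overlaps)
print([q["name"] for q in merged])
```

Matching on a hand-written concept label is a deliberately simple choice. In practice you would still read the merged questionnaire from start to finish to check that the order of questions makes sense to a respondent.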
If you're using an international standard, or one that has been used in a different country or context, you should be very careful when incorporating it to ensure that it is relevant to your situation.
Step three: Map out the data process according to your standard
Once you have chosen a data standard, you can map out your data processes around it. This involves defining and documenting what you plan to do at each step of your data process to ensure it is consistent and replicable. Often, a data standard will have a scoring pattern in place in order to measure compliance with, or adherence to, the practices the data standard is promoting. You will need to be aware of these calculations when incorporating the data standard into your questionnaire.
Above: Scoring pattern to evaluate farm management pre-harvest according to the Sustainable Rice Platform (SRP) standard.
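To show how a scoring pattern can be written down once and applied in the same way to every response, here is a minimal Python sketch. The questions, answer options, and point values are invented for illustration and are not the actual SRP scoring; a real standard's documentation defines its own questions and points.

```python
# Hypothetical scoring pattern: points awarded per answer option.
# These questions and values are NOT taken from the SRP standard.
SCORING = {
    "land_levelling": {"yes": 3, "partly": 1, "no": 0},
    "crop_calendar": {"yes": 2, "no": 0},
    "clean_seed": {"certified": 3, "own_selected": 2, "unknown": 0},
}


def score_response(response, scoring=SCORING):
    """Return (points, max_points, percentage) for one respondent.

    Raises ValueError for a missing answer or an answer outside the
    agreed options, so that non-standard answers are caught rather
    than silently scored as zero.
    """
    points = 0
    max_points = 0
    for question, options in scoring.items():
        answer = response.get(question)
        if answer not in options:
            raise ValueError(f"{question}: unexpected answer {answer!r}")
        points += options[answer]
        max_points += max(options.values())
    return points, max_points, round(100 * points / max_points, 1)


farmer = {
    "land_levelling": "partly",
    "crop_calendar": "yes",
    "clean_seed": "own_selected",
}
result = score_response(farmer)
print(result)  # (5, 8, 62.5)
```

Keeping the points in one table, separate from the calculation, means that the documented scoring pattern and the code that applies it can be checked against each other line by line.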
Step four: Train data collectors and field test the data collection tools
Data collectors need a common understanding of the data collection tools, and in particular of the standard (or partial standard) you have incorporated. Without building this into your training process, you could encounter issues with data quality. During the data collector training, field test the survey to ensure that the questions and standards are relevant and appropriate to the local context of your work. This is particularly important when using an international standard. Some data standards allow adjustments to the questions. Any changes you make to the standardised questions should be documented for the data analysts and for future use. Once data collection is under way, continue to monitor the incoming data for outliers and inconsistencies.
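Monitoring incoming data can be partly automated. The Python sketch below shows two simple checks: one for answers that fall outside the agreed answer codes, and one for numeric values far from the rest. The field names, the answer codes, and the cut-off of three median absolute deviations are assumptions chosen for this example, not part of any standard.

```python
from statistics import median

# Hypothetical answer codes agreed during training.
ALLOWED = {"water_source": {"piped", "borehole", "surface"}}


def check_codes(rows, allowed=ALLOWED):
    """Return (row_index, field, value) for answers outside the agreed codes."""
    problems = []
    for i, row in enumerate(rows):
        for field, codes in allowed.items():
            if row.get(field) not in codes:
                problems.append((i, field, row.get(field)))
    return problems


def flag_outliers(values, k=3.0):
    """Return indices of values more than k median absolute deviations
    from the median. Returns an empty list if the deviation is zero."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - med) / mad > k]


rows = [
    {"water_source": "piped", "hh_size": 4},
    {"water_source": "borehole", "hh_size": 5},
    {"water_source": "Bore hole", "hh_size": 6},
    {"water_source": "surface", "hh_size": 3},
    {"water_source": "piped", "hh_size": 45},
]

bad_codes = check_codes(rows)
outlier_rows = flag_outliers([r["hh_size"] for r in rows])
print(bad_codes)     # [(2, 'water_source', 'Bore hole')]
print(outlier_rows)  # [4]
```

A flagged value is not necessarily wrong. A household of 45 may be a typing error or a genuine answer, so flags like these are best treated as prompts to check with the data collector.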
Step five: Set up a standard approach for data cleaning, transformations and visualisations
As mentioned in step three, you will have already noted the scoring pattern and built it into your data collection. The data standard is likely to have an in-built scoring system as a way of calculating compliance with the practices it promotes. By using a data standard, you will be able to set up standard data processes, such as standard transformations and calculations in your analysis and visualisations. Ideally, this will reduce the amount of data cleaning you need to do, as the answers will already be standardised. However, there will usually be a few answers that still need cleaning, for instance a qualitative answer that needs to be thematically coded.
Data standards allow a structured approach to data cleaning and recoding. This should include backing up the raw data, keeping a copy of any cleaned or recoded data, and keeping an overview of which answers (if any) have been recoded and how. By having these in place, you can create standardised visualisations for comparison across projects and locations.
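The backup, cleaned copy, and recoding overview described above could be produced along these lines. This is a minimal Python sketch using only the standard library; the file names, the `water_source` column, and the recode map are hypothetical and would be replaced by your own.

```python
import csv
import shutil
import tempfile
from pathlib import Path

# Hypothetical recode map: free-text variants -> standard code.
RECODE = {"bore hole": "borehole", "bore-hole": "borehole", "tap": "piped"}


def clean_file(raw_path, out_dir, field, recode=RECODE):
    """Back up the raw file, write a cleaned copy, and log every recoded answer."""
    raw_path, out_dir = Path(raw_path), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    backup = out_dir / f"{raw_path.stem}_raw_backup.csv"
    cleaned = out_dir / f"{raw_path.stem}_cleaned.csv"
    log = out_dir / f"{raw_path.stem}_recode_log.csv"

    shutil.copy2(raw_path, backup)  # the raw file itself is never modified

    with open(raw_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = list(reader)

    changes = []
    for i, row in enumerate(rows):
        original = row[field]
        key = original.strip().lower()
        new = recode.get(key, key)
        if new != original:
            changes.append(
                {"row": i, "field": field, "original": original, "recoded": new}
            )
            row[field] = new

    with open(cleaned, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    with open(log, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["row", "field", "original", "recoded"])
        writer.writeheader()
        writer.writerows(changes)

    return backup, cleaned, log, changes


# Small demonstration in a temporary folder with made-up data.
with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "survey.csv"
    raw.write_text("id,water_source\n1,piped\n2,Bore hole\n3,Tap\n", encoding="utf-8")
    backup, cleaned, log, changes = clean_file(raw, Path(tmp) / "out", "water_source")
    cleaned_text = cleaned.read_text(encoding="utf-8")
    raw_unchanged = backup.read_text(encoding="utf-8") == raw.read_text(encoding="utf-8")

print(changes)
```

The recode log records the original and recoded value for each changed row, which gives analysts the overview of recoded answers without having to compare the raw and cleaned files by hand.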
Step six: Reuse the data process and your data
Following a data standard and/or a standardised data process will allow you to replicate the process easily. It also gives you a baseline from which to evaluate and improve your pre-existing systems and procedures. Actively use your data to make comparisons across projects, locations, and time, and ensure that you use your data for decision making. Once you’ve standardised your data process, sharing, merging, comparing, and ultimately using data all become easier.