Collecting and using large amounts of geo-referenced data, all part of so-called ‘big data’, is all the rage these days. In TAMASA we have a spatial sampling frame for collecting household and farmer yield data, which on a map looks very neat and tidy, and eminently ‘doable’. For example, in Tanzania this frame covers 650 households within 25 randomly located 10×10 km grids spread across the country’s major maize-growing areas in the Southern Highlands and Northern zone. Within each grid, surveyed households are drawn from a further randomly selected set of three 1×1 km grid cells.
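To make the nested design concrete, here is a minimal sketch of how a two-stage frame of this shape could be drawn. It is not the TAMASA procedure itself: the function name `draw_sampling_frame`, the rectangular study area in projected kilometres, and the lattice-aligned (non-overlapping) grids are all assumptions made for illustration, and no restriction to maize-growing areas is modelled.

```python
import random


def draw_sampling_frame(bounds, n_grids=25, cells_per_grid=3, grid_km=10, seed=0):
    """Draw a nested random frame: grids first, then 1x1 km cells within each grid.

    bounds is (xmin, ymin, xmax, ymax) in whole kilometres of a projected
    coordinate system (hypothetical; a real frame would use a maize-area mask).
    """
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = bounds

    # Stage 1: candidate grid origins on a regular lattice, so grids never overlap.
    origins = [
        (x, y)
        for x in range(xmin, xmax - grid_km + 1, grid_km)
        for y in range(ymin, ymax - grid_km + 1, grid_km)
    ]

    frame = []
    for gx, gy in rng.sample(origins, n_grids):
        # Stage 2: every 1x1 km cell inside this grid, identified by its lower-left corner.
        cells = [(gx + i, gy + j) for i in range(grid_km) for j in range(grid_km)]
        frame.append(
            {"grid_origin": (gx, gy), "cells": rng.sample(cells, cells_per_grid)}
        )
    return frame


# Example: a hypothetical 500 x 300 km study area.
frame = draw_sampling_frame((0, 0, 500, 300))
```

With the defaults this yields 25 grids and 75 cells in total; households would then be listed and sampled within each selected cell, a step not shown here.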
There is increasing interest in, and use of, spatial and temporal estimates of on-farm or farmer yield. These data are most obviously needed for food security assessments and national or regional planning (i.e. identifying where need and opportunity for investment are greatest). Sustainable intensification and closing the yield gap are very much in vogue, and farmer yield is the baseline that defines the potential opportunity. Furthermore, many applications and models need this baseline to make, for example, nutrient recommendations. One could argue, and I certainly would, that we know technically how to close the yield gap in any given location. So, can we estimate farmer yields with any degree of confidence spatially and temporally, especially in sub-Saharan Africa?