Feed the Future
This project is part of the U.S. Government's global hunger and food security initiative.

Measuring economic well-being from space

This post was written by Marshall Burke and David Lobell, with Stanford’s School of Earth, Energy & Environmental Sciences.

Our first post looked at how we can use newly available satellite imagery to dramatically improve the measurement of agricultural productivity on smallholder fields in developing countries.  New research is also showing that these satellites can be used to measure other key components of household livelihoods, such as asset wealth and consumption levels.  As in agriculture, the hope is that these satellite-based approaches can strengthen traditional ground-based approaches to livelihoods measurement by greatly expanding both the scale and the timeliness with which these measurements can be made.  In doing so, they should allow practitioners to better target interventions, monitor progress, and understand programmatic impact.

How might satellites be able to shed light on local-level livelihoods?  A simple rule of thumb is that if humans can detect things in imagery that might be predictive of local-level livelihoods, we can train a computer to detect them as well.  We’re now very accustomed to looking at high-resolution daytime satellite imagery (e.g. the sort viewable on Google Maps), and many objects that are visible in these images are plausibly related to the economic well-being of households: the size of houses, the density of buildings, the existence of paved roads and other infrastructure, and the frequency of specific assets like cars.

But other types of imagery, and other features of this imagery, are also useful in distinguishing local-level differences in economic development.  For instance, researchers have long noted that cities and economically active areas show up brighter in nighttime images.  Even daytime imagery that is too coarse to identify individual objects (e.g. cars) can still pick up on clues that a given area is wealthier than another, for instance by identifying the density of urbanization or the productivity of agriculture.

In fact, modern “deep learning” approaches to extracting information from imagery do not prescribe what the computer should look for in an image.  Instead, when given enough training data, deep learning models figure out what patterns or features in each image are useful for solving whatever task they’re given (e.g. distinguishing dogs from cats, or rich villages from poor villages).

As in agriculture, the constraint on making these models work is rarely the imagery inputs but instead the availability of high-quality, georeferenced ground data that can be used to train these data-hungry models.  And unfortunately, in most development settings, we have orders of magnitude less data than is typically available in deep learning applications – e.g. a few thousand villages where we know wealth levels, as opposed to many millions of images where we know “dog” vs “cat”.

Figuring out how to exploit these limited training data is the key challenge for anyone trying to use imagery to predict local-level economic outcomes.  In our research – including work with multiple excellent students and postdocs – we’ve tried a number of approaches: models that use both coarse daytime and nighttime images as inputs or intermediate training data, models trained to look for specific objects such as cars and buildings in high-resolution images, and models that simply try to group similar-looking images together.

All approaches seem to have something to add.  In Figure 1 below, we show results for two different settings.  The top two panels show the performance of a model that’s trained to find specific objects such as cars and buildings in high-resolution imagery, and that then uses counts of these objects to predict village-level consumption expenditure.  In ongoing work in Uganda with students and postdocs here at Stanford, we show that this approach can explain about half of the village-level variation in consumption expenditure, which is above previously published benchmarks.  The advantage of this approach is that it’s fairly interpretable and performs well; a disadvantage is that high-resolution imagery can be expensive and is not available at regular intervals going back in time.
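To make the "counts of objects" idea concrete, here is a minimal sketch of the second stage only: regressing village-level (log) consumption on per-village object counts.  Everything in it is an assumption for illustration: the data are synthetic, the object types, effect sizes and noise level are made up, and ordinary least squares stands in for whatever model the research actually uses.  It does not reproduce the Uganda results.

```python
import numpy as np

rng = np.random.default_rng(1)
n_villages = 500

# Hypothetical per-village object counts, as an object detector might produce.
object_types = ["cars", "buildings"]
counts = rng.poisson(lam=[3.0, 40.0], size=(n_villages, 2)).astype(float)

# Synthetic "ground truth": log consumption rises with each count (assumed effects).
assumed_effects = np.array([0.08, 0.01])
log_consumption = 1.0 + counts @ assumed_effects + rng.normal(0.0, 0.2, n_villages)

# Ordinary least squares with an intercept column.
design = np.column_stack([np.ones(n_villages), counts])
coef, *_ = np.linalg.lstsq(design, log_consumption, rcond=None)

# Share of village-level variation explained by the counts (in-sample R^2).
fitted = design @ coef
ss_res = np.sum((log_consumption - fitted) ** 2)
ss_tot = np.sum((log_consumption - log_consumption.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

for name, value in zip(object_types, coef[1:]):
    print(f"estimated effect of one more {name[:-1]}: {value:.3f}")
print(f"in-sample R^2: {r2:.2f}")
```

Part of the interpretability mentioned above comes from the fact that each coefficient can be read directly as the change in predicted log consumption associated with one more detected object of that type.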

Figure 1. Top panels: predicting consumption expenditure in Ugandan villages using an object identification model and high-resolution imagery.  Bottom panels: predicting asset wealth across African villages using coarse-resolution images.

The bottom two panels in Figure 1 show a model trained on coarser daytime imagery that is available going back many decades.  In ongoing work, we train a deep learning model to use this imagery to predict village-level asset wealth, and can explain over 70% of variation in asset wealth across our set of study villages.  Here we are evaluating the model on held-out villages in countries that the model has not seen – i.e., we train a model on half the countries in our dataset and evaluate it in the remaining countries.  This is a hard test, but it resembles a real-world setting in which we might want to deploy this approach in a country with no training data at all.  The fact that imagery alone can explain almost three-quarters of village-level variation in wealth is remarkable.
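The held-out-country evaluation described above can be sketched as follows.  This is an illustration under stated assumptions, not our actual pipeline: the features are synthetic stand-ins for whatever a deep learning model extracts from imagery, a simple ridge regression replaces the deep model, and the helper names (split_by_country, fit_ridge, r_squared) are hypothetical.  The point is only that the split is made over countries, not over villages, so no test village comes from a country seen in training.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Share of variation in y_true explained by y_pred."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def split_by_country(countries, rng):
    """Put half of the countries (not half of the villages) in the training set."""
    shuffled = rng.permutation(np.unique(countries))
    train_countries = shuffled[: len(shuffled) // 2]
    train_mask = np.isin(countries, train_countries)
    return train_mask, ~train_mask

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return coef, y_mean - x_mean @ coef

# Synthetic data: 10 countries, 200 villages each, 5 image-derived features.
rng = np.random.default_rng(0)
countries = np.repeat(np.arange(10), 200)
X = rng.normal(size=(countries.size, 5))
assumed_coef = np.array([1.0, -0.5, 0.8, 0.3, -1.2])  # made-up relationship
wealth = X @ assumed_coef + rng.normal(scale=0.5, size=countries.size)

train, test = split_by_country(countries, rng)
coef, intercept = fit_ridge(X[train], wealth[train])
held_out_r2 = r_squared(wealth[test], X[test] @ coef + intercept)
print(f"R^2 in countries the model has not seen: {held_out_r2:.2f}")
```

A random village-level split would typically give a more optimistic number, because villages from the same country would then appear on both sides of the split.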

Asset wealth (which is constructed from survey responses to a small number of questions about asset ownership) is typically well-measured in ground surveys and changes slowly over time, which is probably a main reason our asset wealth predictions are more accurate than our consumption expenditure predictions.  And, as in our agricultural setting, these numbers would probably be even higher were it not for noise in the ground data, including noise from sampling variability (village-level means are often constructed from only 10-15 households) as well as noise that has been purposefully added to the latitude-longitude information to protect households’ privacy.
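A small simulation illustrates why sampling variability alone caps the measured fit.  It assumes a between-village standard deviation of 1 and a within-village standard deviation of 1.5 (both made-up numbers), surveys 12 households per village, and then scores a hypothetical predictor that knows each village's true mean exactly.  The location jitter mentioned above is not modeled here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_villages = 5000
households_sampled = 12   # within the 10-15 range mentioned above
within_sd = 1.5           # assumed household-level spread inside a village

# True village means (between-village standard deviation of 1, assumed).
true_wealth = rng.normal(0.0, 1.0, n_villages)

# The survey observes only a handful of households per village.
households = true_wealth[:, None] + rng.normal(
    0.0, within_sd, (n_villages, households_sampled)
)
survey_mean = households.mean(axis=1)

# Score a perfect predictor of the true mean against the noisy survey mean,
# using the squared correlation as the measure of fit.
r2 = np.corrcoef(true_wealth, survey_mean)[0, 1] ** 2

# Theoretical ceiling: signal variance / (signal variance + sampling variance).
expected = 1.0 / (1.0 + within_sd**2 / households_sampled)

print(f"measured R^2 of a perfect predictor: {r2:.3f}")
print(f"theoretical ceiling:                 {expected:.3f}")
```

Under these assumptions even a perfect model falls well short of an R^2 of 1, so the reported fits understate how well the predictions track true village-level wealth.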

Because imagery is available everywhere, we can again use these approaches to generate estimates of local-level economic well-being across wide geographies.  In our view, academia is not the right place to continually serve and update these types of datasets or to help them get incorporated into the operations of the many organizations that see value in them.  And so, as noted in our earlier post, we have founded AtlasAI (http://www.atlasai.co) to generate and serve these estimates, and are beginning to release continent-wide maps of asset wealth that will be updated over time.  Stay tuned for more, and please get in touch if these sorts of estimates can be useful in your work.

This figure shows where the research in today's post contributes to the Feed the Future Results Framework