By Sarchil Qader, Nicci Potts, Giulia del Panta and Paddy Brock
We don’t have time for business as usual. Global forced displacement is on the rise. To respond effectively, governments and their partners need data that is timely, high-quality, and accessible. But collecting this takes time and is expensive. Given the urgency and scale of the need, what can we do to accelerate and enhance the generation of socioeconomic data on forced displacement? One answer is to responsibly harness open data and innovative methods. To do this well, we need to work together, exchanging knowledge and tools, so that we make progress as quickly as possible as a community of practice. This is why we took part in a session at the UK Royal Statistical Society (RSS) International Conference, convened by the World Bank-UNHCR Joint Data Center on Forced Displacement (JDC) with the RSS International Development Section.
Our session took the audience from the national to the global, and from internally displaced people (IDPs) living in shelters in East Africa to people seeking asylum in Europe. It resulted in a rich discussion on the opportunities and challenges in this space.
– The starting point was existing socioeconomic microdata on those affected by forced displacement. At the JDC, we have collated datasets that include a representative sample of refugees or IDPs in an explorer, along with a methods paper and blog. Through our new strategy, we are exploring how innovation can help to fill data gaps, particularly by including forcibly displaced people in national statistics. You can read more about our work here.
– At the UK Office for National Statistics (ONS) and the Foreign, Commonwealth and Development Office (FCDO), working with government and international partners, we have developed a machine learning model that detects tents and shelters in IDP camps from satellite imagery. This model, with its automated data processing pipeline, can produce accurate tent and shelter footprints from satellite imagery in minutes, a task that would otherwise take months for a human to perform. The outputs from the models are being used to plan census enumeration where conflict, drought and flooding have displaced millions of people, ensuring that the most vulnerable are included in national statistics. You can read more about our project here.
– At the WorldPop research group at the University of Southampton, we have worked on another way that innovative data work can enable the inclusion of the forcibly displaced in national statistics. We used UNHCR refugee registration data in combination with high-resolution population maps and data derived from satellite imagery as inputs to machine learning models to map refugee numbers in 100 m x 100 m grid cells. Together with a data processing pipeline and a merging algorithm for creating pre-enumeration areas, which support National Statistical Offices as they prepare for a census, this allows national household survey sampling approaches to take displacement dynamics into account. You can read more about this project here.
– At UNHCR, we have enhanced the timeliness of refugee and asylum-seeker population statistics. We collated data from a wide array of sources, including UNHCR’s registration system, government websites, and global broadcast, print and web news in over 100 languages (through the GDELT project). These data are used as inputs to a combination of different forecasting approaches, including machine learning models, to accurately nowcast refugee and asylum-seeker statistics for country pairs. As a result, the frequency with which UNHCR publishes these figures, which are essential for planning, has increased from twice a year to monthly. You can read more about this project here.
The audience gave positive and constructive feedback. There were technical queries on the methodologies presented, especially on model validation and performance and on which models and outputs are publicly available, as well as questions on data responsibility and on how partners are connecting on forced displacement data. On data responsibility, there was good back-and-forth on how we balance the value and risk of making models and outputs publicly available, for example by publishing low-resolution data in sensitive areas, providing high-resolution data through data sharing agreements, and publishing nowcasts aggregated over appropriate periods of time. In response to questions on where to find the work discussed, we highlighted the Refugee Data Finder and Global Trends, both of which include the UNHCR nowcasts, and the many datasets WorldPop make publicly available, and noted that ONS continue to invest in making their shelter identification model and outputs publicly accessible.
Together, these initiatives underscore the transformative potential of data innovation to improve the global landscape of socioeconomic data on forced displacement. The session catalysed new discussions on how best to take forced displacement dynamics into account in population estimates and in official and national statistics, and strengthened links between partners working towards complementary objectives.