A shock to the economy such as the one caused by the COVID-19 pandemic can lead many people to start receiving income support from the government. Some individuals may end up relying on government transfers for protracted periods of time. Such welfare dependency can have demoralizing effects on recipients and create large outlays for governments.
A new IZA discussion paper by Dario Sansone and Anna Zhu shows how machine learning algorithms can be used to predict which individuals are most at risk of becoming long-term welfare recipients. The fusion of high-quality big data and machine learning methods allows these researchers to provide better predictions than commonly used benchmark models. Specifically, they can predict the proportion of time individuals are on income support in the subsequent four years with at least 22% greater accuracy than standard early warning systems.
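To make this kind of comparison concrete, the sketch below shows one way a relative accuracy gain over a benchmark could be computed on held-out data. The article does not say which accuracy metric lies behind the 22% figure, so the use of out-of-sample R-squared here is an assumption. The function names and the toy numbers are likewise invented purely for illustration and are not taken from the paper.

```python
def r_squared(actual, predicted):
    """Out-of-sample R^2: 1 - SSE / SST, with SST measured around the mean of `actual`."""
    mean_actual = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - sse / sst


def relative_gain(benchmark_score, model_score):
    """Improvement of the model over the benchmark, as a fraction of the benchmark score."""
    return (model_score - benchmark_score) / benchmark_score


# Hypothetical held-out outcomes: share of the next four years spent on income support.
actual = [0.0, 0.1, 0.5, 0.9, 1.0]
# Hypothetical predictions from a simple benchmark and from a machine learning model.
benchmark_pred = [0.2, 0.3, 0.5, 0.6, 0.7]
ml_pred = [0.1, 0.1, 0.5, 0.8, 0.9]

benchmark_r2 = r_squared(actual, benchmark_pred)
ml_r2 = r_squared(actual, ml_pred)
print(f"benchmark R^2: {benchmark_r2:.3f}")
print(f"ML R^2:        {ml_r2:.3f}")
print(f"relative gain: {relative_gain(benchmark_r2, ml_r2):.1%}")
```

The important design point is that both models are scored on individuals that neither model saw during fitting. Otherwise a flexible algorithm would look better simply because it memorized the training data.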
Machine learning can be successfully applied to large administrative data
Governments increasingly use machine learning to tackle social problems and to make resource-allocation decisions. For example, it has been used to help judges to improve bail-granting decisions, schools to identify students at risk of dropping out, and surgeons to screen patients for hip-replacement surgery. In this study, the authors use data on the full population of Social Security enrollees in Australia.
These data include daily information on the income support receipt patterns of millions of individuals and their household members, as well as other demographic and socio-economic information. The size and richness of the dataset make it ideal for a machine learning application, allowing the algorithms to achieve high performance by detecting subtle patterns in the data and by identifying powerful new predictors.
The authors’ approach is intended to complement existing early intervention programs that target long-term welfare receipt. Before governments can implement these programs, they need to know which individuals are most at risk, a role that machine learning algorithms can ably fulfill. Additionally, these improved predictions may reduce the conscious and unconscious biases common in human decision-making. Importantly, the approach is also relatively low-cost to implement, since it exploits administrative data already available to practitioners.
Remaining challenges and next steps
Despite the growing popularity of such automated systems, there is still considerable skepticism about the impact of adopting them, driven by concerns about accuracy and the reinforcement of bias. This is why the authors do not believe their algorithms should replace human expertise but rather complement it. For example, caseworkers could focus their time and attention on providing personalized service and targeting appropriate support to the individuals whom the algorithm identifies as most at risk.
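As a rough illustration of how predictions might be handed over to caseworkers rather than acted on automatically, the sketch below ranks individuals by predicted risk and flags only the top share for human review. The function name, the `top_fraction` parameter, the identifiers and the scores are all hypothetical. The article does not describe how such a shortlist would be built in practice.

```python
import math


def flag_for_caseworkers(predicted_share, top_fraction=0.1):
    """Return the IDs with the highest predicted share of time on income support.

    `predicted_share` maps a person ID to a model prediction between 0 and 1.
    The output is a shortlist for human review, not an automated decision.
    """
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    n_flagged = math.ceil(len(predicted_share) * top_fraction)
    # Sort IDs from highest to lowest predicted risk.
    ranked = sorted(predicted_share, key=predicted_share.get, reverse=True)
    return ranked[:n_flagged]


# Hypothetical predictions for five people.
scores = {"A": 0.15, "B": 0.82, "C": 0.40, "D": 0.91, "E": 0.05}
print(flag_for_caseworkers(scores, top_fraction=0.4))  # ['D', 'B']
```

Flagging a fixed share instead of applying a score cutoff keeps the shortlist matched to caseworker capacity, which is one plausible design choice among several.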
Similarly, identifying individuals who are at risk is only the first step. Policymakers who want to help suitable individuals leave the welfare system promptly also need to know which interventions work and for whom they are most likely to work. Prediction alone cannot answer these questions: causal estimates such as those obtained from randomized controlled trials are required here, ideally combined with machine learning to identify the sub-population of interest.
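The sketch below illustrates, under simplifying assumptions, how a prediction and a randomized trial could be combined: the risk score picks out the sub-population, and the trial supplies the causal comparison within it. The record fields, the threshold and the data are hypothetical. The estimator is a plain difference in means, chosen for brevity and not taken from the paper.

```python
from statistics import mean


def subgroup_effect(records, risk_threshold):
    """Treated-minus-control difference in mean outcomes among high-risk participants.

    Each record needs `predicted_risk` (computed before randomization),
    `treated` (True or False, assigned at random) and `outcome`.
    """
    high_risk = [r for r in records if r["predicted_risk"] >= risk_threshold]
    treated = [r["outcome"] for r in high_risk if r["treated"]]
    control = [r["outcome"] for r in high_risk if not r["treated"]]
    if not treated or not control:
        raise ValueError("need both treated and control participants in the subgroup")
    return mean(treated) - mean(control)


# Hypothetical trial data; outcome = share of the following year on income support.
trial = [
    {"predicted_risk": 0.9, "treated": True, "outcome": 0.5},
    {"predicted_risk": 0.8, "treated": True, "outcome": 0.7},
    {"predicted_risk": 0.9, "treated": False, "outcome": 0.8},
    {"predicted_risk": 0.7, "treated": False, "outcome": 1.0},
    {"predicted_risk": 0.2, "treated": True, "outcome": 0.1},   # low risk, excluded
    {"predicted_risk": 0.1, "treated": False, "outcome": 0.1},  # low risk, excluded
]

effect = subgroup_effect(trial, risk_threshold=0.5)
print(f"estimated effect in high-risk group: {effect:+.2f}")
```

A negative value here would mean that treated high-risk participants spent less time on income support than comparable controls. A real evaluation would also report uncertainty, for example a confidence interval, which this sketch omits.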