Reality (TV) Check: The Value of Replication Studies

When you ask an applied economist what distinguishes economics from other social sciences, the likely answer will include economists’ use of rigorous quantitative methodologies borrowed from the natural sciences – where empirical strategies are judged against the gold standard of randomized experimental designs.

Despite this methodical application of research criteria that are not always easy to implement in the analysis of social interactions, economists put far less emphasis on a related scientific practice that is standard in other disciplines like medicine or psychology: the replication of earlier results. This is especially worrisome given the strong influence of economic findings on policy making. Much of the lack of interest in replication studies can be traced to weak career incentives, e.g., the unwillingness of editors to publish research on the (lack of) replicability of previous results.

A noteworthy exception is a new study by David A. Jaeger (CUNY Graduate Center, U Cologne, IZA, NBER), Ted Joyce (Baruch College, CUNY Graduate Center, NBER) and Robert Kaestner (UC Riverside, NBER), available as IZA Discussion Paper No. 10317.

The authors address a recent article in the American Economic Review (a flagship journal of the economic profession) that had been published after a seemingly thorough peer-review process in 2015. In the paper, Media Influences on Social Outcomes: The Impact of MTV’s 16 and Pregnant on Teen Childbearing, researchers Melissa Kearney and Philip Levine conclude on the basis of their statistical analysis that the MTV reality shows 16 and Pregnant, Teen Mom, and Teen Mom 2 caused a 4.3 percent drop in teen birth rates between July 2009 and December 2010 by dramatizing the challenges of pregnancy and child rearing. The paper garnered immediate and widespread media attention in print and on TV.

Despite this influence on public debate and the apparent high quality of the publication (signaled through the AER’s reputation), Jaeger, Joyce and Kaestner now conclude from a reassessment of Kearney and Levine’s results and research design that causal conclusions about the impact of 16 and Pregnant on teen births are unwarranted.

The original approach utilizes the fact that the attributes of MTV’s viewership prior to the beginning of 16 and Pregnant broadcasting in June 2009 had been heterogeneous across US regions (designated market areas or DMAs). This would allow the researchers to uncover differences in the intensity of the impact of the reality show.

Birthrates were already declining in some regions

Jaeger, Joyce, and Kaestner argue that because 16 and Pregnant began broadcasting everywhere in the U.S. at the same time, there is no clear way to identify teens who were not exposed to the show; in other words, there was no group that could serve as a comparison group. Such “control” groups are critical for eliminating the possibility that other changes might have affected outcomes in addition to or instead of the availability of television or the specific programming.

The authors of the rebuttal argue that other unobserved factors that coincidentally happened in the same time window as the broadcasting of 16 and Pregnant – such as locally deteriorating labor market conditions after the beginning of the Great Recession – could have also influenced the outcomes found in the original study.


Figure 5 in Kearney and Levine (2015)

If this claim is valid, then the question arises: Are the regions in which MTV was watched more frequently by young people prior to the beginning of 16 and Pregnant different from regions in which they watched much less MTV? And if so, could teen birthrates have already been falling faster in the regions with high MTV viewership relative to regions with low MTV viewership before the release of 16 and Pregnant?


Figure 4 in Jaeger, Joyce and Kaestner (2016)

To answer whether these regions were in fact different, the authors begin by replicating the statistical methodology of the original study exactly, but extend the observation window by several years. Extending the time window by 3 years reveals a noticeable downward trend in birth rates even before the broadcasting of 16 and Pregnant, which, according to Jaeger, Joyce and Kaestner, invalidates the original research design. In addition, they find little evidence of a discontinuity at the point when 16 and Pregnant was released.

Artificially changing the broadcast dates challenges the show’s effect

Similar to clinical studies, the authors use what is known as a placebo test to support their findings. If 16 and Pregnant actually reduced teen births, no effect should appear when the original analysis is replicated with the “broadcasting” of 16 and Pregnant artificially assigned to placebo periods prior to the actual premiere in 2009. When the release date is changed to 2005, 2006, and 2007, the placebo tests confirm that pre-trends in regions with high MTV viewership have indeed confounded the original results. Regardless of the fictitious broadcast date chosen, significant negative effects on fertility appear where none should be (see table below).


Excerpted from Table 1 in Jaeger, Joyce and Kaestner (2016)
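
The logic of such a placebo test can be illustrated with a small simulation. The sketch below is not the authors' specification and uses no real data: it generates an artificial panel of regional birth rates in which there is no show effect at all, but in which birth rates fall faster in regions with higher prior MTV viewership. It then runs a simple two-way fixed-effects estimate (region and period effects) with a fictitious launch date. All names (simulate_panel, did_estimate, trend_gap) and all numbers are assumptions chosen for illustration.

```python
import random


def simulate_panel(n_regions=50, n_periods=16, trend_gap=0.5, seed=0):
    """Simulated balanced panel of birth rates with NO true show effect.

    Each region has a viewership share v in [0, 1]. Birth rates decline
    over time, and decline faster where v is high (slope 0.3 + trend_gap * v).
    All values are made up for illustration.
    """
    rng = random.Random(seed)
    viewership = [rng.random() for _ in range(n_regions)]
    panel = []
    for v in viewership:
        base = 50 + rng.gauss(0, 5)  # region-specific level
        panel.append([
            base - (0.3 + trend_gap * v) * t + rng.gauss(0, 0.5)
            for t in range(n_periods)
        ])
    return viewership, panel


def two_way_demean(matrix):
    """Remove region (row) and period (column) means from a balanced panel."""
    n_r, n_t = len(matrix), len(matrix[0])
    row_means = [sum(row) / n_t for row in matrix]
    col_means = [sum(matrix[r][t] for r in range(n_r)) / n_r for t in range(n_t)]
    grand = sum(row_means) / n_r
    return [
        [matrix[r][t] - row_means[r] - col_means[t] + grand for t in range(n_t)]
        for r in range(n_r)
    ]


def did_estimate(viewership, panel, launch_period):
    """Coefficient on viewership x post(launch_period) with two-way fixed effects."""
    n_t = len(panel[0])
    x = [[v * (1.0 if t >= launch_period else 0.0) for t in range(n_t)]
         for v in viewership]
    x_dm, y_dm = two_way_demean(x), two_way_demean(panel)
    num = sum(xd * yd for xr, yr in zip(x_dm, y_dm) for xd, yd in zip(xr, yr))
    den = sum(xd * xd for xr in x_dm for xd in xr)
    return num / den


# Every launch date here is fictitious: the simulated data contain no show.
viewership, panel = simulate_panel()
for fake_launch in (4, 8, 12):
    estimate = did_estimate(viewership, panel, fake_launch)
    print(f"placebo launch at period {fake_launch}: estimate = {estimate:.2f}")
```

In this simulated setting the placebo estimates come out negative even though no show was ever "broadcast", because the estimator picks up the differential pre-trend. Setting trend_gap=0.0 removes that trend, and the estimates then stay close to zero.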

What are the lessons from this reassessment? Most importantly, Jaeger, Joyce and Kaestner’s revisiting of the original results adds to the growing evidence in economics and other social sciences that replication is important and necessary. Without this replication, the problems in the original analysis would not have come to light, and there would be no opportunity to correct the record on the effect of reality television on teen reproductive activity.

Beyond the purely scientific correction, because the original study also attracted extensive media coverage, policymakers may believe that “nudges” like those represented by 16 and Pregnant are effective when, at least in this case, no causal link has been proven. Getting the answer right, which depends on both revisiting the analysis with the original data and replications of the “experiment” in different contexts, should have a higher priority in economics and social sciences journals.

Read the whole paper (IZA DP No. 10317):

Update (Nov. 2, 2016): Kearney/Levine have posted a response (IZA DP No. 10318).

Providing incentives for replications

The rebuttal by Jaeger, Joyce and Kaestner highlights the important role replications play in building a robust base of empirical evidence. A previous IZA Newsroom post discussed the lack of replications as a classic “tragedy of the commons”: There is wide agreement that replications are useful, but most people count on others to conduct them. New incentives, e.g., better publication possibilities or dedicated funding for this type of research, have to be provided to raise the value of replications, especially for early career researchers.
