Data Malpractice

Behavioral economics research plays a critical role in today’s public policy decisions, and those decisions carry spillover effects: lives are affected, and money is spent implementing policies. It is well documented that research scientists sometimes fake data in order to get papers published (Kristal et al.). Political scientists should be wary of research-driven policy because of perverse research incentives, data manipulation, replication failures, and data dredging. These problems can be addressed through structural changes and greater data transparency.

To understand how data malpractice occurs, it is important to first look at the cultural incentives that underlie these transgressions. Perverse research incentives are the driving force behind unethical research practices. “There are other concerns about the culture of modern academia, as reflected by studies showing that the attractiveness of academic research careers decreases over the course of students’ PhD program…reflecting the overemphasis on quantitative metrics, competition for limited funding, and difficulties pursuing science as a public good” (Edwards and Roy). This overemphasis on quantitative metrics fetishizes numerical results over best practice. When researchers become fixated on hitting numerical goals, biases creep into their work, or they turn to questionable statistical methods to ensure publication and bolster their résumés, disregarding the true purpose of research: making new discoveries about the human experience (Catalog of Bias).

Beneath the cultural issue of perverse research incentives lies manipulation of the data itself. Dan Ariely’s dishonesty study is one such example. Investigating the claim that people are more honest when they sign at the top, rather than the bottom, of an insurance form, a team of independent researchers found that the data were visually indistinguishable from a synthetically generated uniform distribution (Simonsohn et al.). In their independent investigation, Data Colada found credible evidence that data had been manipulated for the purpose of getting a paper published. Research presented as sound can in fact be fabricated. This matters because when policymakers make monetary decisions based on research studies, they assume the data are verifiable and accurate. The paper, published in 2012, stood for nearly a decade before it was retracted, meaning that policymakers relied on a flawed study for years. Although this case is relatively benign in nature, one can imagine far heavier implications for a study of greater magnitude.
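To see why a uniform distribution is such a red flag, consider a minimal sketch in Python. The numbers below are made up for illustration and are not from the study itself: real-world quantities like miles driven cluster around typical values, while fabricated numbers drawn from a random-number generator spread evenly across the allowed range, and a one-sample Kolmogorov-Smirnov test against the uniform distribution makes the difference stark.

```python
# Illustrative sketch (made-up numbers, not the actual study data):
# fabricated values drawn uniformly at random look very different
# from realistic, clustered data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=0)
n = 10_000

# Plausible real-world data: values cluster around a typical mean.
realistic = rng.normal(loc=12_000, scale=4_000, size=n).clip(0, 50_000)

# Fabricated data: uniform draws across the entire allowed range.
fabricated = rng.uniform(low=0, high=50_000, size=n)

for label, sample in [("realistic", realistic), ("fabricated", fabricated)]:
    # One-sample KS test of H0: the sample is uniform on [0, 50_000].
    result = stats.kstest(sample, stats.uniform(loc=0, scale=50_000).cdf)
    print(f"{label}: KS statistic = {result.statistic:.3f}, p = {result.pvalue:.3g}")

# The realistic sample rejects uniformity decisively (tiny p-value);
# the fabricated sample is consistent with uniformity (large p-value).
```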

In addition to data manipulation, the replication problem is another salient issue with data-driven research, and one at the forefront of behavioral economics. Although replication and data-sharing efforts have increased in recent years, the incentives to replicate remain weak. Replicators are commonly doubted, treated poorly, or questioned because they “[reflect] a lack of trust in another scientist’s integrity and ability” (Duvendack et al.). Replication is neither built into the structure of research nor incentivized, which must change if political scientists are to formulate the most just policies. Compounding the lack of incentive, there are significant practical barriers. Many researchers rely on proprietary statistical packages that prohibit copying the program; without subsidized access to the same computer hardware and software used in the original study, replication is impossible for the individual researcher (Dewald et al.). In essence, replication efforts will continue to fail without proper funding. If researchers cannot readily access the same tools the original researchers used, replication rates will remain low and the lack of data transparency in behavioral economics research will remain a significant problem.

Another form of data malpractice is data dredging. In some epidemiological studies, “one in 20 of the associations examined [is] ‘statistically significant’” (Smith and Ebrahim). The arithmetic is simple: at the conventional 5% significance threshold, roughly one in every twenty truly null associations will appear significant by chance, so researchers who run enough hypothesis tests across a correlation matrix are bound to find “significant” results somewhere. Because we live in a highly interconnected world, where social variables correlate with one another, such spurious findings are easy to produce and easy to dress up as discoveries. Issues that stem from systemic oppression show up in statistics, and the reverse is true as well: statistics has the power to create oppression (Clayton). Research scientists therefore hold the key to promoting equity and furthering human progress, rather than halting it and reinforcing age-old narratives.
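A short simulation makes this concrete. The sketch below, using purely illustrative Python, generates noise variables with no true relationships, tests every pairwise correlation at a 0.05 threshold, and finds that about one in twenty comparisons comes back “significant” anyway.

```python
# Illustrative sketch of data dredging: with pure noise and enough
# hypothesis tests, roughly 5% of correlations look "significant"
# at alpha = 0.05 by chance alone.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=0)
n_subjects, n_variables = 200, 40

# Pure noise: no variable is truly related to any other.
data = rng.normal(size=(n_subjects, n_variables))

alpha = 0.05
significant = total = 0
for i in range(n_variables):
    for j in range(i + 1, n_variables):
        _, p = stats.pearsonr(data[:, i], data[:, j])
        total += 1
        significant += p < alpha

print(f"{significant} of {total} correlations 'significant' "
      f"({significant / total:.1%}); expected about {alpha:.0%} by chance")
```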

The solution to these issues with research-based policy lies in changing procedural incentives: “…the conditions of daily practice must elevate the importance of the more abstract, longer-term goals in comparison to the persisting importance of the concrete, shorter-term goals” (Nosek et al.). Rather than fixating on instant results, we should cultivate patience: let researchers gather data, question their results, seek outside help, and ponder their findings. Relaxing these constraints and allowing researchers to consider all possibilities would foster a more ethical and productive research environment. In addition, “It may be easier to conduct replications by crowd sourcing them with multiple contributors” (Nosek et al.). A communal approach to replication would speed the vetting of research papers by putting more people and more diverse perspectives on the process from beginning to end. It would also be transformational in promoting equity and a more egalitarian power dynamic within the scientific community. These are the structural changes that need to be implemented to fix the research process at its beginning and avoid mistakes along the way.

In conclusion, perverse research incentives and a toxic research culture combine to form a pathway to data fabrication and research missteps. The consequence is that policymakers waste money when they make decisions based on flawed behavioral economics research. We need a paradigm shift in research culture, one that restructures rewards and incentives. Whereas we have previously rewarded publication counts, emphasizing quantitative metrics, it is more important to strengthen the communal culture of research, allow researchers to take their time, and follow processes strictly in order to ensure ethical research practice.

Works Cited

Clayton, Aubrey. “How Eugenics Shaped Statistics.” Nautilus, Science Connected, 14 Sept. 2021, https://nautil.us/how-eugenics-shaped-statistics-9365/.

“Data-Dredging Bias.” Catalog of Bias, 27 Nov. 2020, https://catalogofbias.org/biases/data-dredging-bias/.

Dewald, William G., et al. “Replication in Empirical Economics: The Journal of Money, Credit and Banking Project.” The American Economic Review, vol. 76, no. 4, 1986, pp. 587–603, http://www.jstor.org/stable/1806061. Accessed 28 Apr. 2022.

Duvendack, Maren, et al. “Replications in Economics: A Progress Report.” Econ Journal Watch, 1 May 2015, https://econjwatch.org/articles/replications-in-economics-a-progress-report. Accessed 18 Mar. 2022.

Edwards, Marc A., and Siddhartha Roy. “Academic Research in the 21st Century: Maintaining Scientific Integrity in a Climate of Perverse Incentives and Hypercompetition.” Environmental Engineering Science, vol. 34, no. 1, 2017, pp. 51–61., doi:10.1089/ees.2016.0223.

Kristal, Ariella, et al. “When We’re Wrong, It’s Our Responsibility as Scientists to Say So.” Scientific American, 21 Mar. 2020, https://blogs.scientificamerican.com/observations/when-were-wrong-its-our-responsibility-as-scientists-to-say-so/.

Nosek, Brian A., et al. “Scientific Utopia: II. Restructuring Incentives and Practices to Promote Truth Over Publishability.” Perspectives on Psychological Science, vol. 7, no. 6, Nov. 2012, pp. 615–631., doi:10.1177/1745691612459058.

Nosek, Brian A., et al. “Transparency and Openness Promotion (TOP) Guidelines.” OSF Preprints, 5 Oct. 2016.

Simonsohn, Uri, et al. “[98] Evidence of Fraud in an Influential Field Experiment about Dishonesty.” Data Colada, 25 Aug. 2021, https://datacolada.org/98.

Smith, George Davey, and Shah Ebrahim. “Data Dredging, Bias, or Confounding.” BMJ, 21 Dec. 2002, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1124898/.
