Trial by p-values: Preliminary thoughts on the Jens Förster report

I’ve just had a quick look at the report (available at Retraction Watch) that led to the investigation of Jens Förster for possible data manipulation. It makes the case that the data in three of Förster’s papers are statistically highly improbable, largely because the means for the levels of various three-level factors all tend to fall in a straight line. The report also claims that the data are far too consistent across independent studies, that the effect sizes are implausibly large, and that the samples are demographically implausible.

It is dry, depressing reading.

From the comments I’ve seen on Retraction Watch and Twitter, some people are already convinced. For my part, I’m reserving judgment until the psychological/statistical community has time to complete its “post-publication peer review” of the report. To stimulate discussion, here are some thoughts I had after a first read:

1. The report analyzes just 3 papers. Förster is a highly productive researcher with 50+ papers. How were these 3 chosen? Were there other papers by Förster which did not show any questionable patterns?

2. The F-test for linearity assumes continuous DVs, but some of the DVs are discrete (from rating scales). The simulations at the end of the report suggest that the test might be robust against violations of this assumption, but are the simulations themselves valid and based on reasonable assumptions?

3. Control studies were selected based on a search for single-factor, 3-level designs. Do the control studies involve the same type of data? Did the selection process for the control studies mimic the selection process that led to identification of the 3 questionable papers?

4. Could p-hacking give rise to a linear pattern? (This idea is from @bahnik.)
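Question 2 is the kind of thing anyone can poke at with a small simulation. What follows is a rough sketch of my own, not the procedure used in the report: it draws three groups whose population means really do lie on a straight line, computes an F statistic for departure from linearity using the quadratic contrast (1, -2, 1), and then repeats the calculation after rounding the same scores onto a rating scale. Everything specific here is an assumption made for illustration: the function names, the 7-point scale, 20 participants per group, means of 3, 4 and 5, and a standard deviation of 1.5.

```python
import random
from statistics import mean


def nonlinearity_f(g1, g2, g3):
    """F statistic (1 numerator df) for deviation from a straight line across
    three equally spaced, equal-sized groups, using the contrast (1, -2, 1)."""
    n = len(g1)
    groups = (g1, g2, g3)
    means = [mean(g) for g in groups]
    contrast = means[0] - 2 * means[1] + means[2]
    ss_contrast = n * contrast ** 2 / 6  # 6 = 1^2 + (-2)^2 + 1^2
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_within = ss_within / (3 * n - 3)  # pooled within-group variance
    return ss_contrast / ms_within


def to_rating(x, low=1, high=7):
    """Round a continuous score to the nearest point on a bounded rating scale."""
    return min(high, max(low, round(x)))


def simulate(n_sims=2000, n=20, mus=(3.0, 4.0, 5.0), sd=1.5, seed=1):
    """Return F statistics for continuous data and for the same data discretized.
    All parameter defaults are illustrative assumptions, not values from the report."""
    rng = random.Random(seed)
    f_cont, f_disc = [], []
    for _ in range(n_sims):
        cont = [[rng.gauss(mu, sd) for _ in range(n)] for mu in mus]
        disc = [[to_rating(x) for x in g] for g in cont]
        f_cont.append(nonlinearity_f(*cont))
        f_disc.append(nonlinearity_f(*disc))
    return f_cont, f_disc


def lower_tail_rate(f_cont, f_disc, q=0.05):
    """Share of discretized F values below the q-th quantile of the continuous ones.
    A value near q would suggest the 'too linear' tail is not distorted much."""
    threshold = sorted(f_cont)[int(q * len(f_cont))]
    return sum(f < threshold for f in f_disc) / len(f_disc)


if __name__ == "__main__":
    f_cont, f_disc = simulate()
    print("lower-tail rate for rating-scale data:", lower_tail_rate(f_cont, f_disc))
```

The lower tail is the interesting one, because that is where "suspiciously linear" results live. With integer ratings, the group means can only take values on a grid, so exact or near-exact linearity can occur by chance more often than a continuous model would predict. How much this matters will depend on the sample size and the scale, which is exactly why the assumptions behind any such simulation deserve scrutiny.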

If this is going to be a trial by p-values, I hope that we can make sure that Jens Förster gets a fair hearing!


One thought on “Trial by p-values: Preliminary thoughts on the Jens Förster report”

  1. These are all good questions. The best is #1. As a personality psychologist I believe in cross-situational consistency, and indeed judging from past cases it seems like nobody ever fakes data just once. If anomalous data patterns were found in more than a few other papers by the same author, that would be very disturbing. On the other hand, if others look statistically “normal” I would be inclined to look more critically at the claims being made about this one.

    By the way, at the NSF meeting I attended a couple of months ago, officials signaled an eagerness to fund research on, among other things, fraud detection. The conversations about this case seem to indicate that more needs to be known about the techniques that are being used to draw conclusions about seemingly anomalous data. Who wants some government money?
