Friday, November 23, 2012

How noisy is economics/finance peer review?

My paper’s first main finding is a quantitative estimate of the objective component in the economics/finance refereeing process. Consider a scale defined by a single parameter, lambda (λ), that measures referee accuracy. Lambda ranges from 0 to 1 and measures the fraction of a referee report that reflects an objectively agreed-upon paper quality. ... λ = 1 means that every referee reports the paper’s objective aspect. λ = 0 means that every referee reports noise. ...

The observed consensus estimates among referees were λ ≈ 0.30 for the Journal of Finance (JF) and the Review of Financial Studies (RFS); λ ≈ 0.35 for Econometrica (ECMTA), the Quarterly Journal of Economics (QJE), and the SFS Cavalcade; and λ ≈ 0.40 for the International Economic Review (IER), the Journal of Economic Theory (JET), the Journal of the European Economic Association (JEEA), and the Rand Journal of Economics (Rand).

Roughly, referee reports were one part signal, two parts noise. ...

For economics journals, when two referees are consulted, the top-10p [percentile] paper receives two rejects with probability 14%, one reject and one non-reject with probability 47%, and two non-rejects with probability 40%. With three referees, the top-10p paper receives a majority of reject recommendations with 30% probability and a majority of non-reject recommendations with 70% probability.
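The arithmetic behind these figures can be roughly checked with a minimal sketch. It assumes, as a simplification that the excerpt does not state, that each referee independently recommends rejection with the same probability p, and it backs p out of the quoted 14% two-reject figure. The function and variable names are illustrative only.

```python
from math import sqrt

# Assumption (not from the paper): referees decide independently, each
# recommending rejection with the same probability p for a top-10p paper.
P_TWO_REJECTS = 0.14  # two-reject probability quoted in the text

# Under independence, P(two rejects) = p**2, so p is the square root.
p = sqrt(P_TWO_REJECTS)


def two_referee_outcomes(p_reject: float) -> dict:
    """Outcome probabilities when two independent referees are consulted."""
    q = 1.0 - p_reject
    return {
        "two rejects": p_reject ** 2,
        "one reject, one non-reject": 2 * p_reject * q,
        "two non-rejects": q ** 2,
    }


def three_referee_majority_reject(p_reject: float) -> float:
    """Probability that at least two of three independent referees reject."""
    q = 1.0 - p_reject
    return 3 * p_reject ** 2 * q + p_reject ** 3


outcomes = two_referee_outcomes(p)
majority3 = three_referee_majority_reject(p)

print(f"implied per-referee reject probability: {p:.3f}")
for label, prob in outcomes.items():
    print(f"{label}: {prob:.3f}")
print(f"majority reject with three referees: {majority3:.3f}")
```

Under this independence assumption the implied p is about 0.37, the split and two-non-reject probabilities come out near 0.47 and 0.39, and the three-referee majority-reject probability comes out near 0.32. These are close to, but not identical with, the quoted 47%, 40%, and 30%, which is what one would expect if the paper's model is richer than this sketch.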

For finance journals, with their lower lambdas and higher rejection probabilities, the higher than 50% reject probability for the top-10p paper results in a strange situation: The more referees are consulted, the more likely it is that the referees will agree that the top-10p paper is bad. For this top-10p paper, with one referee, the probability that the majority of referees recommends rejection is 38%; with three referees, it is almost 70%. (This also obviates the idea of using a tie-breaker referee when two referees disagree.) In fact, only the top-2p papers have a conditional probability of rejection that is less than 50%, resulting in a majority rejection probability that does not increase with the number of referees.
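The "more referees, more rejection" effect is a property of majority voting whenever each vote is more likely than not to be a reject. Here is a minimal sketch, assuming independent referees with a common reject probability. The values 0.6 and 0.4 are hypothetical, chosen only to sit on either side of 50%; they are not estimates from the paper.

```python
from math import comb


def majority_reject_probability(p_reject: float, n_referees: int) -> float:
    """P(more than half of n independent referees recommend rejection).

    Assumes every referee rejects independently with probability p_reject.
    """
    if n_referees % 2 == 0:
        raise ValueError("use an odd number of referees to avoid ties")
    needed = n_referees // 2 + 1  # smallest number of rejects forming a majority
    return sum(
        comb(n_referees, k) * p_reject ** k * (1 - p_reject) ** (n_referees - k)
        for k in range(needed, n_referees + 1)
    )


# Hypothetical per-referee reject probabilities, one above and one below 50%.
for p in (0.6, 0.4):
    probs = [majority_reject_probability(p, n) for n in (1, 3, 5)]
    print(p, [round(x, 3) for x in probs])
# 0.6 [0.6, 0.648, 0.683]
# 0.4 [0.4, 0.352, 0.317]
```

With a per-referee reject probability above 50%, adding referees pushes the majority verdict further toward rejection. Below 50%, adding referees pushes it toward acceptance. That is why only papers whose conditional rejection probability is under 50% benefit from a larger panel.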