Sunday, February 22, 2009

A mild defense of bibliometrics

In an ideal world, bibliometrics would be a character from Asterix (the village librarian, presumably). In real life, they are one of the dark clouds that hang over the academic world, and give it cause for concern. They are going to be used in Britain's Research Excellence Framework, which may be enough to antagonize a few people. And yet, I'd like to propose that they have some merit; to make a half-hearted case in their favour. I don't have a particularly good bibliometric standing myself, so this defense is not motivated by self-interest.

Here are two places where I have recently seen bibliometrics coming under fire. This post at Luca Aceto's blog has a go at them, and links to a statement by the editorial board of a journal that criticises them. More recently, I have been following a debate begun by a letter to the Times Higher a week ago criticising the emphasis on the economic benefit of research proposals in funding decisions; one place this debate has been continued is here at Steven Hill's blog. Hill commends bibliometrics as evidence that Britain's scientific prowess is undiminished; his opponents challenge this evidence. Hill's status as a funding chief marks him out as the villain of this particular battle, and so bibliometrics, as a measure of a person's scientific contribution, are further tainted.

So what do I like about them, then? Actually, as a measure of scientific contribution, they are indeed inaccurate. What I like is not so much the way they measure a researcher as the way they incentivise one. Let's accept as a given that one's scientific output must, from time to time, be assessed for quality. Since the measurement is announced ex ante, the act of measuring will inevitably affect behavior, so measurement-as-incentive is at least as important as measurement for its own sake. Now, to boost your citation count, what must you do? The obvious answer is: write stuff that other people find interesting. This seems positively virtuous. (An alternative way to measure a person's research output, not unfamiliar to most of the scientific community, is to compute his research grant income. Grant income may indeed correlate with research quality, but it seems clear that the pursuit of grant income is by no means as socially virtuous as the pursuit of citations.)

To develop this observation about measurement-as-incentive in more detail, consider the h-index, for which I will now include a definition for the sake of completeness. A person's h-index is the largest value of N such that he/she has written at least N papers, each of which has been cited at least N times. Again, as a measure there are problems with this: if my h-index is 10 and I write a new paper that picks up fewer than 11 citations, it cannot improve my h-index. But surely that paper should make a positive contribution to my research output? Yes, but the h-index encourages a researcher to "raise his game" as it increases; the better you do, the more ambitious your next effort should be, and that seems like a good thing to encourage.
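For concreteness, here is a minimal sketch (in Python; the function name and the sample citation counts are my own, purely illustrative) of how an h-index is computed from a list of per-paper citation counts, and of the point above that a paper with fewer than 11 citations cannot lift an h-index of 10.

    def h_index(citation_counts):
        # The h-index is the largest N such that at least N papers
        # have at least N citations each.
        counts = sorted(citation_counts, reverse=True)
        h = 0
        for rank, cites in enumerate(counts, start=1):
            if cites >= rank:
                h = rank
            else:
                break
        return h

    # A researcher with these per-paper citation counts has h-index 10 ...
    papers = [25, 18, 15, 14, 13, 12, 12, 11, 11, 10]
    print(h_index(papers))         # 10
    # ... and a new paper picking up only 10 citations cannot raise it.
    print(h_index(papers + [10]))  # still 10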

Now, let me turn to the most obvious fault of bibliometrics, which is their weak correlation with research quality. My hope, as a technocrat, is that technology will in due course alleviate this problem. Let me give one example: their current failure to distinguish between a high-quality and a low-quality citation. By way of example (I exaggerate to make the distinction obvious) a paper may be cited in a context like "...finally, the papers [4,5,6,11,12,13,17,24] also study minor variations of this model, thus providing useful evidence that this is a 'hot topic'", or it may be cited in a context like "We make repeated use of a fundamental and masterful theorem of Haskins [13], without which the present paper could never have been written." Both of these sentences would result in [13] picking up a single citation; clearly, one would like the second to carry more weight. In the future, better computer analysis of citation contexts may well allow this distinction to be made. One may also hope that better citation analysis will be able to detect other undesirable artifacts, such as a flurry of mediocre papers that all cite the previous ones but have little or no outside influence. Another idea I have is a simple one: take the age of the cited paper into account. It should be a more valuable citation when you cite a 10-year-old paper than when you cite a 1-year-old one, as the toy sketch below illustrates.
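To make the age-weighting idea concrete, here is a toy sketch in Python; the function, the (cited_year, citing_year) input format, and the particular weighting formula are all invented for illustration, not a proposal for an actual metric.

    def age_weighted_citations(citations):
        # 'citations' is a list of (cited_year, citing_year) pairs.
        # Each citation contributes a weight that grows with the age of the
        # cited paper at the time it was cited, so citing a 10-year-old paper
        # counts for more than citing a 1-year-old one.
        total = 0.0
        for cited_year, citing_year in citations:
            age = max(citing_year - cited_year, 0)
            total += 1.0 + age / 10.0   # arbitrary weighting, for illustration only
        return total

    print(age_weighted_citations([(1999, 2009)]))  # 2.0 for a 10-year-old paper
    print(age_weighted_citations([(2008, 2009)]))  # 1.1 for a 1-year-old paper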

Finally, a possible strength of bibliometrics, acknowledged in Oded Goldreich's (otherwise critical) essay On the Evaluation of Scientific Work, is that the errors they make are the work of many, not the work of a few, as is the case for expert evaluation. So, can "the wisdom of crowds" be harnessed more effectively? Perhaps. Indeed, making it work could turn out to be quite a technically interesting problem.