Statisticide III: Nurse Lucia
This is the second follow-up in the Statisticide thread. Here I discuss the trial of nurse Lucia de Berk.
At a children’s hospital in the Netherlands in 2001, an unexpected infant death prompted a review of past incidents. It was discovered that the shifts of Lucia, a paediatric nurse, coincided with many of these incidents. Subsequently these deaths, until then thought unremarkable, were marked suspicious, and Lucia was charged with the murder of her patients. The mainstay of the prosecution was a statistical argument: the probability of a nurse’s shifts coinciding with so many incidents by mere chance was minute, just 1 in 342 million! Lucia was sentenced to life imprisonment for 4 murders and 3 attempted murders.
A note of caution here: in two of the cases against Lucia, medical evidence was cited as the basis of the guilty verdict. This evidence has since been contested by medical experts. The point of this article is not to convince you that Lucia was definitely innocent. I only point out the incongruity in the statistical argument, the miscalculation involved in obtaining 1 in 342 million, and the questionable relevance of this probability figure.
Based on a preliminary analysis following the death of an infant during Lucia’s shift in September 2001, Lucia was suspected of being involved in as many as 30 incidents across the hospitals she had worked in. Many of these, however, could not be linked to Lucia because she was absent when they occurred, and those incidents were then no longer regarded as suspicious. This is the first and most fundamental of the statistical errors in this case. The list of suspicious incidents should have included all (and only) those deaths and near-deaths for which definite natural causes could not be assigned. For instance, if there were 50 incidents in total, of which 20 were unexplained, with Lucia present at (say) 10 of them, then the list should include all of these 20 incidents and nothing else; not just the 10 Lucia was present at. Lucia’s presence should not itself be a criterion for classification. This is a form of confirmation bias: when testing a hypothesis, selecting the relevant data on the basis of that same hypothesis. This obviously stacks the odds against Lucia!
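To get a feel for how much this selection matters, here is a rough sketch in Python using the made-up numbers from the example above (20 unexplained incidents, Lucia present at 10). It additionally assumes, purely for illustration, that the nurse is on duty for a quarter of all shifts and that incidents fall on shifts independently; neither figure comes from the case files.

```python
from math import comb


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical assumption: the nurse is on duty for a quarter of all shifts.
SHIFT_FRACTION = 0.25

# Proper list: all 20 unexplained incidents, 10 of them on her shifts.
unbiased = binom_tail(10, 20, SHIFT_FRACTION)

# Biased list: incidents she was absent from were dropped,
# so every one of the 10 remaining incidents coincides with her shifts.
biased = binom_tail(10, 10, SHIFT_FRACTION)

print(f"all 20 unexplained incidents: {unbiased:.4f}")
print(f"only the 10 she attended:     {biased:.2e}")
```

Under these assumptions, the chance of such a coincidence is on the order of one in a hundred with the full list, and on the order of one in a million once the incidents she missed are thrown away. The nurse and the data are the same; only the selection differs.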
This problem was compounded during data collection. Initially, many of the ‘suspect’ incidents had been classified as ‘natural’, including the last infant death. When inquiries were made in order to reclassify incidents as suspicious or natural, the people being questioned knew very well that they were being asked in order to ascertain Lucia’s involvement. Considering that she was already being reviled in the media, and that the 342 million figure was so widely known, their responses had very little chance of being unbiased.
Selective assessment of the data raises another major issue. A very relevant question that needs to be asked is: what was the trend in ‘suspicious’ deaths before nurse Lucia arrived on the scene? Lucia’s hospital unit reported 6 unexplained deaths during her two years there. However, the same unit also reported 7 unexplained deaths over about the same length of time before she arrived at her ‘hunting ground’!
Another issue is the choice of statistical model for determining the probability value. In real-life problems like this one, various statistical methods can be applied, depending on the information available and the assumptions made. Using different models, estimates of the same probability that the prosecution computed as 1 in 342 million have come out as high as 1 in 48 and even 1 in 10 (after also correcting for the other issues with this figure). I am not detailing here the more technical issues regarding the computation of the 1 in 342 million number (such as multiplying p-values instead of combining them with Fisher’s method, or deliberations on which kind of model to use: Bayesian, epidemiological, etc.), but you are welcome to see the references for more.
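For the curious, here is a minimal sketch of why multiplying p-values is a problem. The three p-values below are invented for illustration and are not the ones used in court. Fisher’s method treats minus twice the sum of the log p-values as a chi-squared statistic with 2k degrees of freedom (for k independent tests); for an even number of degrees of freedom the tail probability has a closed form, so the standard library is enough.

```python
from math import exp, factorial, log, prod


def naive_product(p_values):
    """Simply multiply the p-values (the flawed shortcut)."""
    return prod(p_values)


def fisher_combined(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) is chi-squared with 2k degrees of
    freedom under the null hypothesis. For even degrees of freedom,
    P(chi2_2k > x) = exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    k = len(p_values)
    half_stat = -sum(log(p) for p in p_values)  # this is X / 2
    return exp(-half_stat) * sum(half_stat**i / factorial(i) for i in range(k))


# Hypothetical p-values from three separate analyses (not the real ones).
p_values = [0.05, 0.2, 0.5]

print(f"naive product:   {naive_product(p_values):.4f}")
print(f"Fisher combined: {fisher_combined(p_values):.4f}")
```

The product of several numbers between 0 and 1 is always small, even when each one is unremarkable, so it cannot itself be read as a p-value. Fisher’s method corrects for this, and with these made-up inputs the combined figure is roughly twenty times larger than the raw product.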
Not only is the figure of 1 in 342 million highly suspect; the question also arises: is this number relevant at all? Here, again, we meet the prosecutor’s fallacy. Mark Buchanan states in his article in Nature:
“The court needs to weigh up two different explanations: murder or coincidence. The argument that the deaths were unlikely to have occurred by chance (whether 1 in 48 or 1 in 342 million) is not that meaningful on its own — for instance, the probability that ten murders would occur in the same hospital might be even more unlikely. What matters is the relative likelihood of the two explanations. However, the court was given an estimate for only the first scenario.”
What matters is not the probability that an innocent nurse’s shifts coincide with the incidents, but the probability that a nurse whose shifts coincide with the incidents is innocent. (I pointed this out earlier too, in the first article in this thread.)
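The difference between the two probabilities can be made concrete with Bayes’ rule. The sketch below is only an illustration: the prior (1 nurse in 100,000 being a serial killer) and the assumption that a guilty nurse would certainly produce such a coincidence are numbers I have made up; only the 1 in 342 million and 1 in 48 figures come from the discussion above.

```python
def prob_innocent_given_evidence(p_evidence_if_innocent,
                                 p_evidence_if_guilty,
                                 prior_guilty):
    """Bayes' rule: P(innocent | evidence)."""
    innocent = p_evidence_if_innocent * (1 - prior_guilty)
    guilty = p_evidence_if_guilty * prior_guilty
    return innocent / (innocent + guilty)


# Hypothetical assumptions, chosen only to illustrate the point.
PRIOR_GUILTY = 1e-5          # assumed share of nurses who are murderers
P_EVIDENCE_IF_GUILTY = 1.0   # assumed; as favourable to the prosecution as possible

for label, p_coincidence in [("1 in 342 million", 1 / 342e6),
                             ("1 in 48", 1 / 48)]:
    posterior = prob_innocent_given_evidence(
        p_coincidence, P_EVIDENCE_IF_GUILTY, PRIOR_GUILTY)
    print(f"P(coincidence | innocent) = {label}: "
          f"P(innocent | coincidence) = {posterior:.6f}")
```

With these assumptions, even the prosecution’s own figure gives a probability of innocence many thousands of times larger than 1 in 342 million, and with the corrected 1 in 48 figure the nurse is almost certainly innocent. The exact values depend entirely on the assumed prior, which is precisely why the coincidence probability alone settles nothing.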
Following protests by statisticians across the world, a Dutch government committee was set up to deliberate on whether or not to reopen Lucia’s case. Other evidence was also called into question recently, following the emergence of new facts. The case was subsequently reopened in October 2008. It may turn out that the culprit is statisticide, and not Lucia.
- Mark Buchanan’s article in Nature
- Oxford Journal publication detailing statistical issues in the case
- Wikipedia page on Lucia de Berk