There is magic in large numbers. Most often we scientists, regardless of whether we are field scientists or lab rats, struggle to acquire sufficiently large sample sizes for the statistical tests we have set out to do. There are ways to deal with sparse data, but nothing beats a good-looking huge dataset if you want to test your hypothesis with confidence. Moreover, every biological system we measure has a degree of uncertainty, so-called noise, which means that if we are to find small effects we need to collect a lot of data.
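To put a rough number on "a lot of data", here is a minimal sketch of a textbook sample-size calculation for comparing the means of two groups, using the normal approximation. It is not taken from any particular study; the function name `samples_per_group` and the default significance level and power are my own assumptions, chosen only for illustration.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate samples needed per group for a two-sided, two-group comparison.

    effect_size is the difference between the group means divided by the
    standard deviation of the noise (a standardized effect size).
    The defaults for alpha and power are conventional choices, assumed here.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # guards against false positives
    z_beta = standard_normal.inv_cdf(power)           # guards against missed effects
    # Normal-approximation formula: n = 2 * ((z_alpha + z_beta) / d) ** 2
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


if __name__ == "__main__":
    for d in (0.8, 0.5, 0.2, 0.1):
        print(f"effect size {d}: about {samples_per_group(d)} samples per group")
```

Because the effect size is squared in the denominator, halving the effect you are looking for roughly quadruples the number of samples you need. That is why small effects in noisy systems demand big datasets.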
Earlier this year, I co-authored a publication on Campylobacter epidemiology that really took advantage of large numbers. In this study, Cody et al. investigated whether people get campylobacters from wild birds. This has long been suspected, given the huge impact of domestic poultry (the single largest source of human campylobacteriosis), but never really proven. Over the years, the lab in Oxford has collected an enormous dataset on the occurrence of Campylobacter jejuni in patients in Oxfordshire, UK. Not only is there a lot of data; each and every clinical case is also associated with a genotyped bacterial isolate. That is an awesome treasure trove to investigate.
In this study, 5628 genotyped clinical isolates from Oxfordshire were run in a STRUCTURE analysis to try to associate each isolate with a putative source. The rationale here is that there are distinct sets of C. jejuni genotypes in different types of animals, especially in different species of birds. And as campylobacteriosis is a zoonotic infection with little to no human-to-human transmission, such an analysis can indicate how relevant the different sources are for human epidemiology.
Did that sound awfully advanced? Perhaps. It really is quite simple. Imagine that you make a row of bins. Each bin gets a name, such as ‘chicken’, ‘cattle’, ‘goose’, ‘blackbird’ etc. Then you take each bacterial isolate in your hand, scrutinize it and put it in the bin that you think it fits best. A little bit like a sorting box for children. Star-shaped objects go into the star-shaped hole, square objects in the square hole, etc. Except that in this case it is the degree of resemblance at the genetic level that decides whether an isolate should be grouped with a particular source. The other difference is that you let the computer rerun this procedure over and over again until you get a probabilistic assignment to each bin.
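For the code-minded, the bin-sorting idea can be sketched in a few lines of Python. Note that this is not what STRUCTURE actually does (it runs a far more elaborate Bayesian clustering model); it is a deliberately simplified stand-in that scores each isolate against each bin and turns the scores into probabilities. The allele frequencies, the three loci, and the function names are all invented for illustration.

```python
from math import prod

# Hypothetical allele frequencies per source at three made-up loci.
# These numbers are invented for illustration, not taken from the paper.
SOURCE_ALLELE_FREQS = {
    "chicken":   [{"a": 0.7, "b": 0.3}, {"a": 0.6, "b": 0.4}, {"a": 0.8, "b": 0.2}],
    "cattle":    [{"a": 0.4, "b": 0.6}, {"a": 0.5, "b": 0.5}, {"a": 0.3, "b": 0.7}],
    "blackbird": [{"a": 0.1, "b": 0.9}, {"a": 0.2, "b": 0.8}, {"a": 0.1, "b": 0.9}],
}


def assign_probabilities(isolate, freqs=SOURCE_ALLELE_FREQS):
    """Return the probability that an isolate belongs to each source bin.

    isolate is a tuple with one allele per locus, e.g. ("b", "b", "b").
    All bins are assumed equally likely before looking at the genotype.
    """
    # How well does the genotype fit each bin? Multiply the per-locus
    # frequencies of the alleles the isolate carries.
    likelihoods = {
        source: prod(locus[allele] for locus, allele in zip(loci, isolate))
        for source, loci in freqs.items()
    }
    total = sum(likelihoods.values())
    # Normalize so the probabilities over all bins sum to one.
    return {source: lik / total for source, lik in likelihoods.items()}


def attributed_proportions(isolates, freqs=SOURCE_ALLELE_FREQS):
    """Average the per-isolate probabilities to get each source's share."""
    sums = {source: 0.0 for source in freqs}
    for isolate in isolates:
        for source, p in assign_probabilities(isolate, freqs).items():
            sums[source] += p
    return {source: s / len(isolates) for source, s in sums.items()}


if __name__ == "__main__":
    clinical_isolates = [("a", "a", "a"), ("a", "b", "a"), ("b", "b", "b")]
    for isolate in clinical_isolates:
        print(isolate, assign_probabilities(isolate))
    print(attributed_proportions(clinical_isolates))
```

The point of the sketch is that no isolate is forced into a single bin. Each one gets a share in every bin, and the proportion attributed to a source is simply the average of those shares across all clinical isolates.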
In this paper, it was shown that the proportion of clinical isolates from Oxfordshire attributed to wild birds was 2.1%-3.5% each year. That is way lower than the values for chicken products, but given the very high incidence of campylobacteriosis in the human population it still means a large number of actual infections caused by bacteria that are normally found in wild birds. Which wild birds, you may ask? Primarily thrushes, at least in Oxfordshire. The blackbird and the song thrush are two common garden birds that like to live close to us humans. Looking at the seasonal variation, the analysis showed that wild bird-associated campylobacteriosis cases were more common during the warmer months of the year. This makes sense, as summer is when we loiter around in our gardens and in nature, eating fruits and vegetables potentially contaminated with bird feces.
There is magic in large numbers, for sure.
Link to the paper:
Cody, A.J., McCarthy, N.D., Bray, J.E., Wimalarathna, H.M.L., Colles, F.C., Jansen van Rensburg, M.J., Dingle, K.E., Waldenström, J. & Maiden, M.C.J. 2015. Wild bird-associated Campylobacter jejuni isolates are a consistent source of human disease, in Oxfordshire, United Kingdom. Environmental Microbiology Reports 7: 782-788.