Grinding the crack – contemplating rejection times

Our minimum NRT at the time of writing.

Science is slow and tedious. And that is how it should be. There is a beauty to behold in the accumulated data points in supplementary files and appendices. A lifetime’s worth of work, poured into tables. Neatly ordered references with DOI numbers. Ordnung.

However, at times science is also dead fast. Try the last months of your PhD, for instance – that time is just surreal. But the fastest of all is NRT – the average timespan between the click on the button on Nature’s submission form and the rejection letter landing in your inbox. Most senior scientists I know have tried at least once to publish in Nature or Science, the two most famous journals across disciplines.

Most of them without success.

This makes for great conversation, and it is common to hear people telling stories of their NRT. Our record is 2 hours. TWO HOURS! Damn, that’s fast. A friend of a friend apparently was rejected within 22 minutes. That’s impressive!

At present we are up at 27 hours, 11 minutes and 31 seconds. Pretty good – at least we joke about it in the little cohort of authors. This likely means it will be read properly by an editor, and hopefully make it out for review.

And if it doesn’t cut it for Nature, I am sure it will find a good home elsewhere – it is a very nice piece of work that has taken us 3 years to analyze.

Meanwhile, enjoy Jeb Corliss in his Grinding the crack video. It summarizes the process of publishing in top-notch journals better than any words could.

Just click it.

For you, Jo.



The final NRT for our manuscript was 49:30:21,92. Something to beat next time. Now this little baby needs to find another journal home.

Influenza A virus history rewritten, once more

By Jonas Waldenström

Pale horse, pale rider – the Apocalypse foretold in the Book of Revelation. (Gustave Doré [Public domain], via Wikimedia Commons)

It is time for yet another tale about pestilence – the scourges of mankind. Once again it is about flu, but today I will not present my own research. Instead I would like to write about a new article that was published a little while ago in Nature. It is a story about the speed of virus evolution, and how new analytic methods have shed a bright light on some of the most influential disease events of the last 200 years. It is time to rewrite influenza virus evolution, once more.

The paper in question is both a very simple and a very complicated paper. Or rather, the idea is simple, but the science to test the idea is complicated and novel.

If we start with the idea: does the rate of virus evolution differ depending on which host species the virus infects? Remember that the influenza A virus (the correct name for flu) has a remarkably broad host range – there are a lot of viruses in wild waterfowl (those that we study), a more limited set in domestic fowl (which we don’t study), and a handful of viruses in mammalian hosts, including viruses adapted to pigs, dogs, horses, and, of course, humans. It is reasonable to assume that all ducks are fairly similar from a flu virus perspective. But is a duck really similar to a pig, or a horse, for an influenza virus? And if not, what consequences would among-host species differences have for influenza evolution?

As it turns out, a lot.

If we press fast forward and jump directly to the results, they come in the form of a tree. Or rather eight little mini-trees, one for each of the virus’ 8 RNA segments. Using fancy new phylogenetic methods, allowing the evolutionary rates to vary in the different host species, the authors ended up with trees that were much better dated than those from earlier analyses. This means that we now can say with good precision when the forks in the trees occurred and use that to put epidemiological data in a historical perspective. Firstly, look at the trees for hemagglutinin (HA) and neuraminidase (NA). As evident from the long branch lengths, these gene variants go back a long, long time. The estimated time since the most recent common ancestor of all circulating HA subtypes and all NA subtypes is roughly a thousand years. This is a bit further back than previously believed, but not dramatically different.

The evolutionary trees of the influenza A virus segmented genome. Note that the H7N7 virus from horses is a sister group to all present day internal virus genes. Original paper can be found here.

Now focus instead on the remaining six segments. Look at each of them in detail and you’ll soon see that they all look similar. Surprisingly similar, actually. It seems that all viruses circulating today (except those in bats) have genes for the internal proteins that stem from a limited set of variants that existed sometime around 1870.

Thus, around the mid-1800s something happened to the pool of influenza viruses. Backtracking through the history of disease events, the authors found that the timing matches a panzootic of flu in horses – a worldwide epidemic in our four-legged friends. This panzootic somehow purged the world of all other internal gene variants, like a big broom on a dusty floor. Left were only the variants whose descendants we see today. This is truly remarkable: a global sweep in the evolution of influenza! Why horses, you may ask. Aren’t they just decorative animals? Well, if we go back to the pre-automobile era, horses must have been everywhere. They were used in agriculture, in mines, in cities – wherever something heavy was to be transported you could bet there was a horse around (as well as a few oxen, mules, donkeys). I think it may be hard for our phlegmatic selves to fathom what the world looked like before the advent of fossil-based engines. A big panzootic in horses would have had the potential to reach everywhere humans were and, as the data suggest, beyond, as even the viruses in wild ducks today have internal genes dating back to the 1870s.

Horse pulling a tram, late 1800s. (From BBC - A history of trams)

Another important finding in this study is that the Spanish flu of 1918 actually seems to have originated in the Western hemisphere, and perhaps wasn’t overly Spanish after all. Reading this paper is just overwhelming for a flu biologist. Full of trinkets and goodies – I will return to it many times, for sure.

OK, let’s rewind the tape all the way to the beginning and look at the rationale a bit deeper (for those of you who are interested).

What evidence did we have of different evolutionary rates in influenza viruses before this study? What would we expect? To start with, different species have different morphology and physiology. The receptors that the virus binds to vary in their molecular structures, as do the distributions of cells with different receptor types in different organs. A duck and a human are different both on the outside and the inside, so to speak. Also, the routes of virus transmission are different, being mainly fecal-oral in birds, and mainly respiratory in mammals. You release influenza virions when you sneeze, while the ducks poo them out with their feces.

These are well-known facts. We also knew that a host shift could make a big splash, both for the virus that is introduced, and for the viruses that were there before. For instance, when H5N1 crossed from wild birds into poultry a decade and a half ago it embarked on a rapid evolutionary journey. It quickly acquired a score of new mutations as it adapted to the poultry niche and, with time, geographically separated virus clades emerged – each with its own evolutionary trajectory. This initial evolution was fast and furious, and much faster than the normal pace seen in wild ducks.

Another example is pandemic flu in humans: when a new virus is introduced in the human population it will rapidly increase in frequency and is likely to replace previous strains, especially if they are of a subtype similar to those that dominate the existing human virus population. The new virus sweeps the old one out, just like the broom analogy above. Thus, the H1N1 Spanish flu from 1918 was replaced by the H2N2 Asian flu in 1957, which was in turn replaced by the H3N2 Hong Kong flu in 1968.

Thus, there were several good reasons to assume that the speed of virus evolution depends on which host the virus happens to be in. But how can we estimate these different rates of evolution, and how can we use that knowledge for predictions? This is really the big problem – a big technical problem.

Imagine you want to draw an evolutionary tree. The tips of the tree represent the taxa we have included in the analysis, and the branches represent the evolutionary pathways backwards in time all the way to the most recent common ancestor, represented by the trunk. Drawing this tree is very simple if the tips are few. Consider that we want to make a tree representing the evolutionary history of the large primates gorilla, chimpanzee, bonobo and us humans. There is just a limited set of possible trees that can be drawn, and with good input data we are likely to get a tree that encapsulates the true evolutionary history of these apes. But if we want to make a phylogeny of all mammals, or all plants of the Solanaceae family, or all contemporary influenza viruses, the number of possible trees quickly becomes massive, a big Amazon jungle of potential trees. Finding the ‘right tree’ in this forest of possible trees will rely on a multitude of factors.
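
To give a feeling for how quickly that forest grows: for n labelled tips, the number of distinct rooted, strictly bifurcating trees is the double factorial (2n − 3)!!, a standard result in phylogenetics. Here is a minimal Python sketch (the function name is only for illustration):

```python
def count_rooted_trees(n_tips: int) -> int:
    """Number of rooted, bifurcating, labelled trees for n tips: (2n - 3)!!"""
    if n_tips < 1:
        raise ValueError("need at least one tip")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 3
    for k in range(3, 2 * n_tips - 2, 2):
        count *= k
    return count


if __name__ == "__main__":
    for n in (4, 10, 50):
        print(n, "tips:", count_rooted_trees(n), "possible rooted trees")
```

For the four apes that is 15 trees. For ten tips it is already 34,459,425, and for the thousands of sequences in a flu dataset, checking every tree is out of the question.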

If we then also want the tree nodes to correspond to real time events – dating the tree – it becomes very computationally intensive. And very sensitive to model assumptions. To date the tree we need some estimate of time, a clock. There are a number of ways to set that clock. One is to use specimens that have been dated – for the phylogeny of apes above, that could be fossils from strata with known ages, and for viruses, specimens collected at different time points. But on most occasions we will have to make assumptions about the clock given the variation in genetic sequences in contemporary samples.
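
For viruses sampled in different years, one simple way to get a first feel for the clock is a root-to-tip regression: plot each sequence's genetic distance from the root of the tree against its sampling year and fit a straight line. The slope is a rough substitution rate and the point where the line crosses zero is a rough date for the root. The Python sketch below uses numbers invented purely for illustration, and it is a far simpler approach than the models used in the paper discussed here.

```python
def root_to_tip_regression(years, distances):
    """Least-squares fit of distance = rate * (year - root_year).

    Returns (rate, root_year).
    """
    n = len(years)
    if n < 2 or n != len(distances):
        raise ValueError("need two or more paired observations")
    mean_x = sum(years) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, distances))
    if sxx == 0 or sxy == 0:
        raise ValueError("no temporal signal: cannot estimate a rate")
    rate = sxy / sxx                    # substitutions per site per year
    intercept = mean_y - rate * mean_x  # distance at year 0
    root_year = -intercept / rate       # year at which distance is zero
    return rate, root_year


if __name__ == "__main__":
    # Invented toy data: sampling years and root-to-tip distances
    years = [1980, 1990, 2000, 2010]
    distances = [0.20, 0.22, 0.24, 0.26]
    rate, root_year = root_to_tip_regression(years, distances)
    print(f"rate ~ {rate:.4f} subs/site/year, root ~ {root_year:.0f}")
```

With these toy numbers the line has a slope of about 0.002 substitutions per site per year and hits zero around 1880. Real data are much noisier, which is one reason why the choice of clock model matters so much.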

For flu, it has been conventional to use a model of time called a ‘relaxed clock’. Using that model it has been possible to estimate the divergence times of different virus subtypes, making a timetable of flu history. In the new paper, the authors first developed a novel evolutionary model that they termed ‘host specific local clock’. This model allowed for different rates of evolution in different host species. They tested the performance of this model together with other models on a simulated dataset of influenza viruses. Prompted by the good fit of the model to the ‘true’ simulated data, they then applied it to a large collection of influenza sequences and dated the different virus divergence events: the forks in the trees. And the end product is the little figure I shared above.
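
Why host-specific rates matter for dating can be shown with plain arithmetic: the expected genetic change along a branch is rate times time, so if a lineage spent part of its history in a host where it evolves at a different pace, one pooled rate will misjudge how old the branch really is. The Python sketch below only illustrates that intuition. The rates, hosts and durations are made up and are not estimates from the paper, and the actual model is fitted statistically rather than computed like this.

```python
# Hypothetical substitution rates (substitutions per site per year) per host
HOST_RATES = {"duck": 0.002, "horse": 0.001}


def expected_distance(stretches, rates=HOST_RATES):
    """Sum rate * years over the (host, years) stretches of one lineage."""
    return sum(rates[host] * years for host, years in stretches)


def naive_age(distance, single_rate):
    """Age implied when one rate is assumed to hold in every host."""
    return distance / single_rate


if __name__ == "__main__":
    # Invented lineage: 60 years in horses, then 40 years in ducks
    lineage = [("horse", 60), ("duck", 40)]
    true_age = sum(years for _, years in lineage)
    distance = expected_distance(lineage)
    print("true age:", true_age, "years")
    print("age assuming the duck rate throughout:",
          round(naive_age(distance, HOST_RATES["duck"]), 1), "years")
```

In this toy case a lineage that is really 100 years old looks only about 70 years old if the duck rate is applied throughout. Letting the rate depend on the host removes that kind of bias.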

Science is about testing hypotheses, about challenging established truths over and over again to see if they hold. This is particularly true for the influenza A virus research field, where the whole narrative has been rewritten over and over again. When I started to become interested in this virus some 10-12 years ago the field virtually exploded with activity. Some of the old knowledge still holds, but it is inspiring to see how much progress the field has made in recent times. A big collective forward movement. But, fortunately for us scientists, it seems that this virus still has many secrets yet to reveal.

Link to the paper:

Michael Worobey, Guan-Zhu Han & Andrew Rambaut. 2014. A synchronized global sweep of the internal genes of modern avian influenza virus. Nature. doi:10.1038/nature13016


If you enjoyed this post, or other posts on this blog, why not follow the blog via email, Feedly or get updates via Twitter by following @DrSnygg?

The duck genome – and why it is important

It is Friday night, the kids are in bed and my wife is out with her friends. What do you do? Go to bed with a Sci-Fi book? No. Watch TV? No. Sort the laundry? NO!

The answer: I open a beer and read an article on duck genomes!

A happy Pekin duck - the domestic variant of the Mallard - and the most recent bird to have had all its genes sequenced. (From Wikipedia, Marin Winter)

The article was published this week in Nature Genetics, and I know at least two of the 51 authors (by the way, it is amazing how many authors there are on genome papers – more people than base pairs sometimes…). I have been waiting for this particular article for a long time and have known that it was on the way. In the pipeline, as they say.

Why so eager? Well, the duck – or more properly termed the Mallard, Anas platyrhynchos – is the main study organism in my lab. The most common duck in Europe, the most widespread duck in the world, the reservoir host of so many influenza A viruses, the most beautiful…. Eh, hmpf, perhaps not the most beautiful bird, but you get the picture – it is an important bird to me. And the duck genome is a treasure trove for us duck researchers; in essence the blueprints of what makes a duck a duck. Some of the base pairs in the genetic code might be coding for that particular trait you are interested in, be it plumage, migration directions, or ability to withstand infection. And that’s when you need the blueprint.

The last couple of years, in the aftermath of highly pathogenic H5N1, you often hear the words Mallard and flu together. And it is right: Mallards are an important reservoir host for influenza A viruses. Meaning that they sustain perpetuation of virus subtypes in nature and are important for influenza A virus evolution. And, as you know by now, flu in humans and influenza A virus in birds are linked – thus flu concerns both ducks and men.

The paper of Huang and her 50 academic friends presents the overall genetic architecture of the Mallard genome and puts it in relation to earlier bird genomes (chicken, turkey and zebra finch) and genomes from fish and mammals. It tells a tale of events that have occurred on really long time-scales, for instance the rate of gene duplication and gene loss over the last 100 million years. However, for me the most interesting is the second part of the article, where they infect ducks with highly pathogenic H5N1 viruses and do what is called transcriptomics to investigate which genes are affected by infection.

A transcriptome is the collection of mRNA transcripts, the transcribed genes on their way to the ribosomes to become proteins, here captured by deep sequencing. By sequencing the RNA in your treated animals (in this case ducks infected with virus) and comparing the number of copies of particular gene mRNAs to untreated animals (in this case ducks not infected with virus) you can make a crude measurement of which genes are up- or down-regulated upon infection. This can then help you to understand and pinpoint particular genes with certain functions that may be important for immune processes and pathogenicity.
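
The comparison itself boils down to a ratio, usually expressed as a log2 fold change. Here is a minimal Python sketch, assuming the read counts have already been normalised for sequencing depth. The gene names, counts and the cut-off are invented for illustration and are not results from the paper. A pseudocount of 1 is added so that genes with zero reads do not cause a division by zero.

```python
import math


def log2_fold_change(infected, control, pseudocount=1.0):
    """log2 of the infected/control ratio, with a pseudocount on both sides."""
    return math.log2((infected + pseudocount) / (control + pseudocount))


def classify(counts, threshold=1.0):
    """Label each gene 'up', 'down' or 'unchanged'.

    counts maps gene name -> (infected_count, control_count);
    threshold is in log2 units, so 1.0 means a two-fold change.
    """
    labels = {}
    for gene, (infected, control) in counts.items():
        lfc = log2_fold_change(infected, control)
        if lfc >= threshold:
            labels[gene] = "up"
        elif lfc <= -threshold:
            labels[gene] = "down"
        else:
            labels[gene] = "unchanged"
    return labels


if __name__ == "__main__":
    # Invented toy counts: (infected ducks, uninfected ducks)
    toy_counts = {"geneA": (799, 99), "geneB": (49, 199), "geneC": (100, 110)}
    print(classify(toy_counts))
```

In this toy example geneA comes out as up-regulated, geneB as down-regulated and geneC as unchanged. A real analysis would also use replicate animals and a statistical test, which is part of why the measurement above is called crude.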

The wild Mallard - the home of influenza A viruses in their billions. (From Wikipedia, Richard Bartz)

It is a great piece of work. And what I like is that it is the entry point for new studies; it’s like the opening of a highway where we other duck researchers can drive our cars. For my own part, I am extremely interested in the duck immune genes and the list of 150 cytokines, the Toll-like receptors, the defensins and the MHCs will be scrutinized in detail. We are already working on some of those, but now it becomes much easier to make progress.

Having said all these nice things, I do have some objections too. The strongest is how well the infection, and the subsequent transcriptomes, reflect the natural situation. Experimental intranasal infection with a high titer of virus is not the natural route of infection, and hence may evoke biased responses, either because of the wrong dosage, or because the virus ends up in the wrong tissue. It is also important that the same things are measured in controls and experimental animals, and in the right tissues. Some additional experiments, involving more animals and natural infections, are warranted. But overall it is a great achievement, with a staggering amount of work poured into this paper. Hats off to you – all 51 of you!

The rest of us, we roll up our sleeves and get to work with the blueprint of the duck! Interesting times ahead!

Jonas Waldenström

Full link to the article: