This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
The hemagglutinin subtypes of Influenza A can be divided into distinct lineages. This is important for tracing the evolutionary history of the gene, and it allows regional lineages to be identified and studied. Lineage identification depends on phylogenetic analysis to identify the distinct clades within the data. Identifying lineages within the influenza internal genes would simplify the analysis of reassortment, in which these genes are transferred between subtypes. In this paper we show that a rapid clustering method can be used to assign lineages to the internal gene segments without the need for a full phylogenetic analysis.
This paper shows a rapid way of identifying sequences in the influenza internal genes that share a close common ancestry. These groups can be referred to as lineages or clades. There has been no previous attempt at this sort of classification, and the method presented here carries it out rigorously, objectively and mathematically.
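As an illustration only, the general idea of assigning lineage labels by clustering rather than by building a full tree can be sketched as follows. This is not the method of this paper, which is not specified in this section. The sketch assumes pre-aligned sequences, uses the proportion of differing sites (p-distance) as the distance measure, and joins any two sequences that fall within a cutoff (single-linkage clustering). The function names, the toy sequences and the cutoff of 0.1 are all hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap ('-') are ignored.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0  # no comparable sites: treat as maximally distant
    return sum(x != y for x, y in pairs) / len(pairs)


def cluster_sequences(seqs, threshold=0.1):
    """Single-linkage clustering of aligned sequences.

    seqs maps a sequence name to its aligned sequence. Any two sequences
    whose p-distance is at or below the threshold end up in the same
    cluster. Returns a dict mapping each name to an integer cluster label.
    The threshold is an illustrative assumption, not a recommended value.
    """
    names = list(seqs)
    parent = {n: n for n in names}  # union-find forest

    def find(n):
        # Follow parent links to the root, shortening the path on the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    # Number the clusters in order of first appearance.
    roots = {}
    return {n: roots.setdefault(find(n), len(roots) + 1) for n in names}


# Hypothetical toy alignment: two pairs of closely related sequences.
toy = {
    "A1": "ATGGCTAGCTAGGATCCATG",
    "A2": "ATGGCTAGCTAGGATCCATA",
    "B1": "ATGACTTGCAAGGTTCCTTG",
    "B2": "ATGACTTGCAAGGTTCCTTA",
}
labels = cluster_sequences(toy, threshold=0.1)
```

With this toy input, A1 and A2 receive one label and B1 and B2 another. The pairwise comparison is quadratic in the number of sequences, so a real data set would need a faster distance step; the sketch only shows the labelling logic.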
This paper is also placed here to claim priority, as it was rejected in peer review by Virus Gene. Part of the problem is that the raw data from IRDB is plagued with errors. Another version with a more detailed mathematical analysis is currently being prepared.
This work is so significant that it seems to raise considerable resistance within the influenza phylogenetics community, especially as it shows that reassortment is much more extensive than previously considered. This undermines almost all of the current phylogenetic analysis, which is very seriously flawed.