Building on previous efforts that produced high-quality de novo genome assemblies of the deadly 2011 E. coli O104:H4 outbreak strain (http://www.genomics.cn/en/news_show.php?type=show&id=651), BGI and its collaborators at the University Medical Centre Hamburg-Eppendorf have now released the first complete map of the genome and plasmids without any assembly gaps. The genome is publicly available at ftp://ftp.genomics.org.cn/pub/Ecoli_TY-2482/Escherichia_coli_TY-2482.chromosome.20110616.fa.gz and the plasmids at ftp://ftp.genomics.org.cn/pub/Ecoli_TY-2482/Escherichia_coli_TY-2482.plasmid.20110616.fa.gz.
This final draft of the genome shows that the disease strain has a circular chromosome 5,278 kbp in length and three additional plasmids of 88 kbp, 75 kbp and 1.5 kbp. The chromosome contains around 5,000 predicted coding sequences (CDSs), covering 87.09% of the genome. The biggest plasmid is highly homologous to a previously sequenced plasmid isolated from a horse and carrying additional multi-drug resistance genes. The smallest is a so-called "selfish plasmid" carrying only two genes, one of which encodes a DNA replication protein. The remaining plasmid carries the aggregative adherence fimbria I (AAF/I) gene cluster, which is associated with E. coli aggregation ability and virulence and is likely to play a role in the persistence of the disease.
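For readers who want to check these figures against the released files, the following is a minimal Python sketch, not BGI's own pipeline. It assumes the FASTA files named above have been downloaded locally; the function names (`fasta_lengths`, `fasta_lengths_from_gz`, `coding_span_kbp`) are hypothetical helpers introduced here for illustration. It tallies the length of each sequence record and estimates the coding span implied by the reported 87.09% coverage.

```python
import gzip
import io
from typing import Dict, Iterable


def fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {record_id: length_in_bp} for FASTA-formatted lines."""
    lengths: Dict[str, int] = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            tokens = line[1:].split()
            current = tokens[0] if tokens else "unnamed"
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def fasta_lengths_from_gz(path: str) -> Dict[str, int]:
    """Same as fasta_lengths, but reads a gzip-compressed FASTA file."""
    with gzip.open(path, "rt") as handle:
        return fasta_lengths(handle)


def coding_span_kbp(genome_kbp: float, coding_percent: float) -> float:
    """Length (in kbp) covered by coding sequences, given a percentage."""
    return genome_kbp * coding_percent / 100.0


if __name__ == "__main__":
    # Tiny in-memory example; real input would be the downloaded .fa.gz files.
    demo = ">chr demo record\nACGT\nAC\n>plasmid\nGG\n"
    print(fasta_lengths(io.StringIO(demo)))
    # Figures reported in the article: 5,278 kbp chromosome, 87.09% coding.
    print(round(coding_span_kbp(5278, 87.09)), "kbp of predicted coding sequence")
```

With the figures reported above, the coding span works out to roughly 4,597 kbp of the 5,278 kbp chromosome. Calling `fasta_lengths_from_gz` on a locally saved copy of the chromosome or plasmid file (the local path is up to the reader) would give the per-record lengths to compare against the sizes quoted in the text.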
BGI researchers found that the Shiga-toxin-encoding genes, responsible for most of the pathogenicity of the disease, were likely encoded by a viral prophage that integrated into the bacterial chromosome. Several insertion hotspots, including one nested locus associated with multi-antibiotic resistance, were also identified. This indicates that horizontal gene transfer events may have played important roles in the evolution of virulence and drug resistance in this strain.
The results of these and previous phylogenetic and comparative genomic analyses now give us the confidence to confirm that the outbreak strain belongs to an EAEC (enteroaggregative E. coli) lineage but acquired the ability to produce Shiga toxin through the integration of a phage genome. This explains the initial confusion as to why the EAEC-lineage bacteria harbored some characteristics of EHEC (enterohaemorrhagic E. coli) strains. The deadly German E. coli strain is therefore not a completely new bacterium and can be considered a "hybrid" strain, now temporarily termed Shiga toxin-producing enteroaggregative Escherichia coli (STpEAEC).
BGI has a new portal collecting most of the genomic evidence, and due to the urgency in fighting the outbreak, BGI's data has been released under a public domain license. To aid the growing crowdsourcing effort, the data is accessible for the first time via a data DOI, a digital object identifier that allows citation of a dataset before formal publication. These novel ways of freely and publicly releasing information have allowed unprecedentedly rapid progress in understanding the genomics of this infection, and a growing team of researchers has now confirmed that they will also release their work in the same manner (see: http://bacpathgenomics.wordpress.com/2011/06/13/e-coli-data-released-under-creative-commons-0-license/). Please see the GitHub repository for the latest open research by the E. coli community and our portal page: