Complete Genomics publishes in Science on low-cost sequencing of 3 human genomes

The manuscript, titled "Human Genome Sequencing Using Unchained Base Reads on Self‐assembling DNA Nanoarrays," describes the methodology used to sequence cell lines derived from two individuals previously characterized by the International HapMap Project: a Caucasian male of European descent (NA07022) and a Yoruban female (NA19240). In addition, researchers sequenced lymphoblast DNA from a Caucasian male sample (NA20431) obtained from the Personal Genome Project (PersonalGenomes.org).

Complete Genomics' proprietary platform enables efficient imaging with low reagent consumption through its combinatorial probe-anchor ligation (cPAL™) chemistry and its use of patterned genomic DNA nanoarrays. With this approach, Complete Genomics' scientists generated high-quality diploid base calls in up to 95 percent of the genomes sequenced, identifying 3.2 million to 4.5 million sequence variants per genome processed.

Complete Genomics makes DNA nanoarrays for use in its sequencing process by introducing a solution of millions of DNA nanoballs (DNBs) to a patterned surface. The DNBs stick to activated, "sticky" spots while avoiding the field between the spots. Once a DNB has adhered to a spot, it repels other DNBs, which results in only one DNB per spot. In this time-lapse footage, a high-density DNA nanoarray is efficiently self-assembled from DNBs in solution. This method eliminates one of the most costly aspects of producing traditional patterned DNA arrays.

(Photo Credit: Complete Genomics, Inc.)
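The one-DNB-per-spot behavior described in the caption above can be illustrated with a toy simulation. This is a minimal sketch, not Complete Genomics' actual process model: the function name `assemble`, the uniform random landing, and the spot and arrival counts are all assumptions made for illustration.

```python
import random

def assemble(num_spots, num_arrivals, seed=0):
    """Toy model (hypothetical): each arriving DNB lands on a random sticky spot.

    A DNB binds only if the spot is empty; an occupied spot repels it,
    so no spot ever holds more than one DNB.
    """
    rng = random.Random(seed)
    occupied = [False] * num_spots
    filled = 0
    filled_over_time = []
    for _ in range(num_arrivals):
        spot = rng.randrange(num_spots)
        if not occupied[spot]:      # empty spot: the DNB adheres
            occupied[spot] = True
            filled += 1
        # occupied spot: the DNB is repelled and nothing changes
        filled_over_time.append(filled / num_spots)
    return occupied, filled_over_time

# Illustrative numbers only; real arrays have billions of spots.
occupied, history = assemble(num_spots=1000, num_arrivals=5000)
print(f"Fraction of spots filled: {history[-1]:.3f}")
```

In this simplified picture the array fills quickly at first and then saturates, because late-arriving DNBs increasingly encounter spots that are already taken.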

Detailed validation of one genome dataset demonstrates a sequence accuracy of just one false variant per 100 kilobases, a remarkably low error rate, particularly for such an affordable technology.
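To put the quoted accuracy in concrete terms, the short calculation below converts "one false variant per 100 kilobases" into an expected count of false variants. It is a back-of-the-envelope sketch: the function name and the number of called bases passed to it are hypothetical and are not figures from the study.

```python
# Rate stated in the text: one false variant per 100 kilobases.
BASES_PER_FALSE_VARIANT = 100_000

def expected_false_variants(called_bases):
    """Expected false variants if the stated rate applies uniformly (an assumption)."""
    return called_bases / BASES_PER_FALSE_VARIANT

# Hypothetical input chosen only to illustrate the arithmetic.
hypothetical_called_bases = 1_000_000_000
print(expected_false_variants(hypothetical_called_bases))
```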

Patterned genomic DNA nanoarrays and 70-base, unchained sequence reads are unique technical achievements. The company's new patterned genomic DNA nanoarrays, which achieve a record high density of 2.85 billion spots per slide at 0.7 micron pitch, will enable Complete Genomics to sequence 10,000 human genomes in 2010.
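The density figures above can be sanity-checked with simple arithmetic. The sketch below assumes the spots sit on a square grid, which the text does not state, and the function names are illustrative; a different lattice would give a somewhat different area.

```python
SPOTS_PER_SLIDE = 2.85e9   # from the text
PITCH_MICRONS = 0.7        # center-to-center spacing, from the text

def patterned_area_mm2(spots, pitch_um):
    """Area covered by the spots, assuming a square grid (an assumption)."""
    area_um2 = spots * pitch_um ** 2   # each spot occupies pitch x pitch
    return area_um2 / 1e6              # 1 mm^2 = 1e6 square microns

def spots_per_mm2(pitch_um):
    """Spot density implied by the pitch on a square grid."""
    return 1e6 / pitch_um ** 2

print(f"Patterned area: {patterned_area_mm2(SPOTS_PER_SLIDE, PITCH_MICRONS):.1f} mm^2")
print(f"Density: {spots_per_mm2(PITCH_MICRONS):,.0f} spots per mm^2")
```

Under the square-grid assumption, 2.85 billion spots at 0.7 micron pitch occupy roughly 1,400 square millimeters, or about 14 square centimeters, which fits within a standard microscope-slide footprint.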

Complete Genomics' sequencing process includes four distinct steps: 1) sample preparation and library construction; 2) self-assembling DNA nanoarrays; 3) imaging, assembly and analysis; 4) combinatorial probe-anchor ligation (cPAL).

(Photo Credit: Complete Genomics, Inc.)

This summary table identifies variations with respect to the National Center for Biotechnology Information (NCBI) version 36 human genome reference assembly. Novel variations were ascertained by comparison to dbSNP (JDW, release 126; NA18507 [Bentley], release 128; all other genomes, release 129). NA18507 and NA19240 are Yoruban HapMap samples, which may explain the number of SNPs and novelty rates. In partially called regions of the genome, only one allele could be called confidently. The high call rate in NA19240 reflects reduced library bias with a modified sample preparation protocol.

(Photo Credit: Complete Genomics, Inc.)

Source: Complete Genomics