Analysis of little-explored regions of genome reveals dozens of potential cancer triggers

A massive data analysis of natural genetic variants in humans and variants in cancer tumors has implicated dozens of mutations in the development of breast and prostate cancer, a Yale-led team has found.

The newly discovered mutations are in regions of DNA that do not code for proteins but instead influence the activity of other genes. These areas represent an unexplored world that will allow researchers and doctors to gain new insight into the causes and treatment of cancer, said the scientists.

"This allows us to take a systematic approach to cancer genomics," said Mark Gerstein, the Albert L. Williams Professor of Biomedical Informatics and co-senior author of the paper, which appears in the Oct. 4 issue of Science. "Now we do not need to limit ourselves to the roughly 1% of the genome that codes for proteins but can explore the rest of our DNA."

The analysis, led by Yale researchers and scientists from the Wellcome Trust Sanger Institute, as well as Weill Cornell Medical College and other institutions, is a statistical marriage of separate mammoth research projects, each providing groundbreaking insights into our genome, the genetic blueprint of life.

The 1000 Genomes project is compiling the personal genomes of many individuals. The data help pinpoint regions of DNA that vary little within the population and thus are of crucial importance to human health. The Encyclopedia of DNA Elements (ENCODE) project is working toward cataloguing the function of each location in the human genome.

The team took non-coding DNA elements from the ENCODE project and looked for those that are highly conserved in the 1000 Genomes data. They then contrasted the data with mutations in tumor samples from about 90 patients with breast or prostate cancer. They found dozens of mutations in areas of DNA that vary little and therefore are likely to drive tumor progression. They also looked for additional features of the cancer mutations, such as their proximity to regulatory-network hubs, which also indicates they may be particularly damaging.
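
For readers who think in code, the filtering logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the team's actual pipeline: the `Element` and `Mutation` types, the `variant_density` field, the `0.01` cutoff, and all coordinates are hypothetical toy values, and the sketch leaves out the additional features the researchers examined, such as proximity to regulatory-network hubs.

```python
from typing import NamedTuple


class Element(NamedTuple):
    """A non-coding element (hypothetical stand-in for an ENCODE annotation)."""
    chrom: str
    start: int              # 0-based, inclusive
    end: int                # exclusive
    variant_density: float  # assumed measure: population variants per base


class Mutation(NamedTuple):
    """A single-base tumor mutation (toy representation)."""
    chrom: str
    pos: int


def conserved_elements(elements, max_density):
    """Keep elements that vary little in the population."""
    return [e for e in elements if e.variant_density <= max_density]


def mutations_in_elements(mutations, elements):
    """Pair each mutation with the first element that contains it, if any."""
    hits = []
    for m in mutations:
        for e in elements:
            if m.chrom == e.chrom and e.start <= m.pos < e.end:
                hits.append((m, e))
                break
    return hits


# Toy data, invented for illustration only.
elements = [
    Element("chr1", 100, 200, 0.001),
    Element("chr1", 500, 600, 0.05),
    Element("chr2", 50, 80, 0.002),
]
mutations = [Mutation("chr1", 150), Mutation("chr1", 550), Mutation("chr2", 90)]

# The 0.01 threshold is an arbitrary assumption.
conserved = conserved_elements(elements, max_density=0.01)
candidates = mutations_in_elements(mutations, conserved)
```

In this toy run, only the mutation that lands inside a low-variation element is kept as a candidate; a mutation in a highly variable element, or outside any element, is set aside.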

While the research focused on variants of single base pairs, many of the conclusions also apply to other, larger forms of genetic variation, the authors say.

The great diversity of variants found proves that massive data projects have direct relevance to cancer in individuals, the authors said.

"Our approach can be directly used in the context of precision medicine," says Ekta Khurana, an associate research scientist in Gerstein's lab and a first author of the study.

Source: Yale University