CHASM: Computational biology goes after top genetic cancer suspects

Johns Hopkins engineers have devised computer software that can sift through hundreds of genetic mutations and highlight the DNA changes that are most likely to promote cancer. The goal is to provide critical help to researchers who are poring over numerous newly discovered gene mutations, many of which are harmless or have no connection to cancer. According to its inventors, the new software will enable these scientists to focus more of their attention on the mutations most likely to trigger tumors.

A description of the method and details of a test using it on brain cancer DNA were published in the Aug. 15 issue of the journal Cancer Research.

The new process focuses on missense mutations, meaning DNA changes that each alter a single amino acid in a protein's normal sequence. A small percentage of these genetic errors can reduce the activity of proteins that usually suppress tumors, or hyperactivate proteins that make it easier for tumors to grow, thereby allowing cancer to develop and spread. But finding these genetic offenders can be difficult.

"It's very expensive and time-consuming to test a huge number of gene mutations, trying to find the few that have a solid link to cancer," said Rachel Karchin, an assistant professor of biomedical engineering who supervised the development of the computational sorting approach. "Our new screening system should dramatically speed up efforts to identify genetic cancer risk factors and help find new targets for cancer-fighting medications."

The new computational method is called CHASM, short for Cancer-specific High-throughput Annotation of Somatic Mutations.

Photo caption: Rachel Karchin, right, an assistant professor of biomedical engineering, and doctoral student Hannah Carter led a Johns Hopkins team that developed software to narrow the search for mutations linked to cancer. (Photo credit: Will Kirk/JHU)

Developing this system required a partnership of researchers from diverse disciplines. Karchin and doctoral student Hannah Carter drew on their skills as members of the university's Institute for Computational Medicine, which uses powerful information management and computing technologies to address important health problems, and collaborated with leading Johns Hopkins cancer and biostatistics experts from the university's School of Medicine, its Bloomberg School of Public Health and the Johns Hopkins Kimmel Cancer Center.

The team first narrowed the field of about 600 potential brain cancer culprits using a computational method that would sort these mutations into "drivers" and "passengers." Driver mutations are those that initiate or promote the growth of tumors. Passenger mutations are those that are present when a tumor forms but appear to play no role in its formation or growth. In other words, the passenger mutations are only along for the ride.

To prepare for the sorting, the researchers used a machine-learning technique in which about 50 characteristics or properties associated with cancer-causing mutations were given numerical values and programmed into the system. Karchin and Carter then employed a method called a Random Forest classifier to help separate and rank the drivers and the passengers. In this step, 500 computational "decision trees" considered each mutation to decide whether it possessed the key characteristics associated with promoting cancer. Eventually, each "tree" cast a vote: Was the mutation a driver or a passenger?
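The voting step can be illustrated with a rough sketch. This is not the CHASM code: each "tree" below is a one-question stump trained on a random resample of the data and a single random feature, a deliberately simplified stand-in for a full Random Forest. The synthetic features, the function names, and the assumption that drivers score higher are all hypothetical; only the counts of 50 features and 500 trees come from the description above.

```python
import random
from statistics import mean

N_FEATURES = 50  # the article cites about 50 numerical characteristics
N_TREES = 500    # and 500 decision trees


def make_mutation(is_driver, rng):
    """Synthetic feature vector; shifting drivers upward is purely an assumption."""
    shift = 1.0 if is_driver else 0.0
    return [rng.gauss(shift, 1.0) for _ in range(N_FEATURES)]


def train_stump(samples, labels, rng):
    """Fit a one-question tree on a bootstrap resample using one random feature."""
    while True:
        picks = [rng.randrange(len(samples)) for _ in samples]
        feature = rng.randrange(N_FEATURES)
        drivers = [samples[i][feature] for i in picks if labels[i]]
        passengers = [samples[i][feature] for i in picks if not labels[i]]
        if drivers and passengers:  # retry if a resample lacks one class
            break
    # The "question" is whether the feature lies above the midpoint of the class means.
    threshold = (mean(drivers) + mean(passengers)) / 2
    driver_is_high = mean(drivers) >= mean(passengers)
    return feature, threshold, driver_is_high


def votes_driver(stump, mutation):
    """Return True if this stump votes 'driver' for the given mutation."""
    feature, threshold, driver_is_high = stump
    return (mutation[feature] > threshold) == driver_is_high


def count_driver_votes(forest, mutation):
    """Count how many trees in the ensemble vote 'driver'."""
    return sum(votes_driver(stump, mutation) for stump in forest)


rng = random.Random(0)
labels = [True] * 100 + [False] * 100  # toy training set: 100 drivers, 100 passengers
samples = [make_mutation(label, rng) for label in labels]
forest = [train_stump(samples, labels, rng) for _ in range(N_TREES)]

driver_like = [2.0] * N_FEATURES      # hypothetical mutation with driver-like features
passenger_like = [-1.0] * N_FEATURES  # hypothetical mutation with passenger-like features
driver_like_votes = count_driver_votes(forest, driver_like)
passenger_like_votes = count_driver_votes(forest, passenger_like)
print(f"driver-like: {driver_like_votes} of {N_TREES} driver votes")
print(f"passenger-like: {passenger_like_votes} of {N_TREES} driver votes")
```

A real Random Forest grows deeper trees that ask several questions in sequence, but the principle of many independently trained trees each casting one vote is the same.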

"It's a little like the children's game of 'Guess Who,' where you ask a series of yes or no questions to eliminate certain people until you narrow it down to a few remaining suspects," said Carter, who earned her undergraduate and master's degrees at the University of Louisville and served as lead author of the Cancer Research paper. "In this case, the decision trees asked questions to figure out which mutations were most likely to be implicated in cancer."

The election results, meaning how many driver votes each mutation received, were used to produce a ranking. The genetic errors that collected the most driver votes wound up at the top of the list. The ones with the most passenger votes were placed near the bottom. With a list like this in hand, the software developers said, cancer researchers can direct more of their time and energy to the mutations at the top of the rankings.
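As a minimal sketch of this ranking step, the vote tallies can be sorted so that the mutations with the most driver votes come first. The mutation labels and vote counts below are invented for illustration and are not results from the study; the function name is likewise hypothetical.

```python
N_TREES = 500  # number of voting trees described in the article


def rank_by_driver_votes(driver_votes, n_trees=N_TREES):
    """Return (mutation, driver_vote_fraction) pairs, most driver-like first."""
    ranked = sorted(driver_votes.items(), key=lambda item: item[1], reverse=True)
    return [(name, votes / n_trees) for name, votes in ranked]


# Hypothetical tallies: driver votes received by each mutation out of 500.
votes = {"mut_A": 120, "mut_B": 470, "mut_C": 35, "mut_D": 310}

ranking = rank_by_driver_votes(votes)
for name, fraction in ranking:
    print(f"{name}: {fraction:.0%} driver votes")
```

Expressing each tally as a fraction of the trees makes the scores comparable if the number of trees changes.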

Karchin and Carter plan to post their system on the Web and will allow researchers worldwide to use it freely to prioritize their studies. Because different genetic characteristics are associated with different types of cancers, they said the method can easily be adapted to rank the mutations that may be linked to different forms of the disease, such as breast cancer or lung cancer.

Source: Johns Hopkins University