Finding cancer culprits' fingerprints

11 Jan 2013

Signatures of five mutational processes were extracted from the mutational catalogues of 21 breast cancer genomes using the cancer genome project's computational framework. The analysis covered single base substitutions, kataegis, dinucleotide substitutions, and indels at microhomologies and mono- or polynucleotide repeats.

Researchers from the Wellcome Trust Sanger Institute's cancer genome project have developed a computer model to identify the fingerprints of DNA-damaging processes that drive cancer development. Armed with these signatures, scientists will be able to search for the chemicals, biological pathways and environmental agents responsible.

The computer model will help to overcome a fundamental problem in studying cancer genomes: the DNA contains not only the mutations that have contributed to cancer development, but also a lifetime's worth of other acquired mutations. These mutations are layered on top of each other, and unpicking the individual mutations, when they appeared and the processes that caused them is a daunting task.

"The problem we have solved can be compared to the well-known cocktail party problem," explains Ludmil Alexandrov, first author of the paper, from the Sanger Institute. "At a party there are lots of people talking simultaneously and, if you place microphones all over the room, each one will record a mixture of all the conversations. To understand what is going on you need to be able to separate out the individual discussions.

"The same is true in cancer genomics. We have catalogues of mutations from cancer genomes and each catalogue contains the signatures of all the mutational processes that have acted on that patient's genome since birth. Our model allows us to identify the signatures produced by different mutation-causing processes within these catalogues."
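The separation described above can be illustrated with a small sketch. This article does not name the algorithm behind the model, so the example below assumes non-negative matrix factorisation (NMF), a standard technique for cocktail-party-style problems, and uses invented toy data rather than real cancer catalogues. The names `nmf`, `true_signatures` and `true_exposures` are hypothetical, and six mutation types stand in for the much larger set a real analysis would use.

```python
import numpy as np


def nmf(V, k, n_iter=2000, seed=0):
    """Factor a non-negative matrix V (mutation types x genomes) into
    W (types x k signatures) and H (k signatures x genomes).

    Uses Lee-Seung multiplicative updates for the squared error.
    This is an illustrative assumption, not the published method.
    """
    rng = np.random.default_rng(seed)
    n_types, n_genomes = V.shape
    W = rng.random((n_types, k)) + 1e-3
    H = rng.random((k, n_genomes)) + 1e-3
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    # Rescale so each signature sums to 1; the scale moves into the exposures.
    scale = W.sum(axis=0)
    return W / scale, H * scale[:, None]


# Toy data (assumed): two made-up signatures over six mutation types.
true_signatures = np.array([
    [0.7, 0.0],
    [0.2, 0.1],
    [0.1, 0.0],
    [0.0, 0.2],
    [0.0, 0.3],
    [0.0, 0.4],
])
# Each of 21 hypothetical genomes carries a different amount of each process.
rng = np.random.default_rng(1)
true_exposures = rng.integers(50, 500, size=(2, 21)).astype(float)

# Each catalogue is a mixture of the signatures, like one microphone's recording.
catalogues = true_signatures @ true_exposures

signatures, exposures = nmf(catalogues, k=2)
rel_error = (np.linalg.norm(catalogues - signatures @ exposures)
             / np.linalg.norm(catalogues))
print("relative reconstruction error:", rel_error)
```

In this sketch each column of `signatures` plays the role of one speaker's voice, and each column of `exposures` says how loudly each speaker was heard in one genome. Choosing the number of signatures `k` is a separate problem that the sketch sidesteps by fixing it at two.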