Active and neutral mutations are distinguished by studying their neighbourhoods.
Researchers at IIT Madras have developed an AI tool called NBDriver (neighbourhood driver) for analysing cancer-causing mutations in cells. By looking at the neighbourhood, or context, of a mutation in the genome, it can identify harmful “driver” mutations and distinguish them from neutral “passenger” mutations.
This technique of looking at the genomic neighbourhood to make out the nature of the mutation is a novel and largely unexplored one. In a paper published in the journal Cancers, the researchers explain that the nature of a mutation depends on its neighbourhood, and describe how this tool may be used to draw the line between driver and passenger mutations.
B. Ravindran, head of the Robert Bosch Centre for Data Science and AI at IIT Madras and one of the corresponding authors, said in a press release that one of the major challenges faced by cancer researchers is differentiating between the relatively small number of “driver” mutations that enable the cancer cells to grow and the large number of “passenger” mutations that do not have any effect on the progression of the disease.
In previously published techniques, researchers typically analysed DNA sequences from large groups of cancer patients, compared sequences from cancer and normal cells, and determined whether a particular mutation occurred more often in cancer cells than would be expected at random, said Prof. Karthik Raman, from the biotechnology department of IIT Madras and another corresponding author. “However, this ‘frequentist’ approach often missed out on relatively rare driver mutations,” he noted, adding that some studies have also looked at the changes caused by the driver mutations in the production of essential biological products such as proteins.
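To make the limitation of the frequency-based approach concrete, here is a minimal sketch of such a test. It is an illustration of the general idea only, not the method of any published tool: the function names, the cohort size, the background mutation rate and the significance threshold are all assumptions chosen for the example.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Probability of seeing k or more hits in n trials, each with chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def is_recurrent(mutated: int, patients: int, background_rate: float,
                 alpha: float = 0.05) -> bool:
    """Flag a mutation if it is seen in more patients than chance would explain.

    background_rate is an assumed per-patient probability of the mutation
    arising at random; alpha is an assumed significance threshold.
    """
    return binomial_tail(mutated, patients, background_rate) < alpha


# Hypothetical cohort of 500 patients with a 1% background rate.
common = is_recurrent(30, 500, 0.01)  # seen far more often than expected
rare = is_recurrent(6, 500, 0.01)     # barely above the expected 5 patients
```

With these assumed numbers, the mutation seen in 30 patients is flagged while the one seen in 6 patients is not, even if it does drive the disease. That is the kind of rare driver a purely frequency-based test can miss.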
Statistical modelling
The method of distinguishing between driver and passenger mutations solely by looking at the neighbourhood is novel. “Through robust statistical modelling, we show that there is a significant difference in the pattern of sequences (or context) surrounding the driver and passenger mutations,” said Shayantan Banerjee, who is a master’s student in the Department of Biotechnology, IIT Madras, and the lead author of the paper.
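As a rough picture of what a mutation's neighbourhood looks like as data, the sketch below pulls out the bases on either side of a mutated position and counts the short patterns they contain. The article does not describe NBDriver's actual features, so the window size, the choice of two-letter patterns and the function names here are assumptions made purely for illustration.

```python
from collections import Counter


def neighbourhood(sequence: str, position: int, window: int) -> tuple[str, str]:
    """Return the bases up to `window` places before and after `position`.

    `position` is 0-based and the mutated base itself is left out.
    """
    start = max(0, position - window)
    upstream = sequence[start:position]
    downstream = sequence[position + 1:position + 1 + window]
    return upstream, downstream


def context_features(sequence: str, position: int,
                     window: int = 3, k: int = 2) -> Counter:
    """Count k-letter patterns in the flanks on each side of the mutation.

    The two flanks are counted separately so that no pattern spans the
    mutated position.
    """
    counts: Counter = Counter()
    for flank in neighbourhood(sequence, position, window):
        counts.update(flank[i:i + k] for i in range(len(flank) - k + 1))
    return counts


# Hypothetical reference sequence with a mutation at index 4.
features = context_features("ACGTACGTAC", 4)
```

Pattern counts of this kind, computed for many known driver and passenger mutations, are the sort of input on which a statistical model could learn to tell the two groups apart.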
Accuracy of tool
The researchers studied a dataset containing 5,265 mutations to derive the model. According to Prof. Raman, NBDriver had an overall accuracy of 89% and ranked second out of 11 prediction algorithms. In comparison, he said, the top-performing tool, FATHMM, achieved an accuracy of 91% on the same dataset.
For the future, the group aims to develop an easy-to-use drag-and-drop web interface that will enable cancer researchers with limited computational or programming skills to get predictions and extract genomic information on their preferred set of mutations. “We will also be pursuing further studies on the context [or neighbourhood] of these mutations, and how they impact the evolution of cancer. Why do we see differences in the context between the driver and passenger mutations in the first place?” said Prof. Raman.
The group also plans to make NBDriver part of a broader cancer genomic sequence analysis “pipeline” being developed at the centres.