The group of Søren Besenbacher focuses on using statistical and computational approaches to study questions in human genomics. One of the group's primary research interests is the human germline mutation process, where we want to understand the rate, pattern and effects of new mutations entering the human population. We are also interested in developing machine learning methods that can be used in precision medicine. In particular, we are interested in developing new methods to detect whether an individual has cancer based on sequence data from cell-free DNA (cfDNA).
MODELLING THE HUMAN MUTATION RATE
Mutation of the DNA molecule is a truly fundamental process in biology. It occurs in all species and is the ultimate source of all genetic variation.
It has been known for some time that the mutation rate varies across the genome, but previously it was hard to get an unbiased estimate of the human mutation rate and to study the causes of the rate variation. The advent of cheap Whole Genome Sequencing (WGS) has, however, alleviated this problem.
By sequencing nuclear families with high coverage we can directly observe new mutations that are present in a child but absent in the parents.
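As a rough illustration of this idea, the sketch below flags a site as a candidate de novo mutation when the child carries the alternative allele and neither parent shows any read supporting it. The `TrioSite` record, the function name and all thresholds are hypothetical choices made for this example; real pipelines apply considerably more filtering.

```python
from typing import NamedTuple


class TrioSite(NamedTuple):
    """Read counts at one genomic position for a child and both parents.

    Hypothetical record type used only for this illustration.
    """
    chrom: str
    pos: int
    child_alt_reads: int
    child_depth: int
    father_alt_reads: int
    father_depth: int
    mother_alt_reads: int
    mother_depth: int


def is_candidate_de_novo(site, min_depth=20, min_child_vaf=0.3, max_child_vaf=0.7):
    """Return True if the site looks like a heterozygous de novo variant.

    The thresholds are illustrative assumptions, not recommended values.
    """
    depths = (site.child_depth, site.father_depth, site.mother_depth)
    # Require enough coverage in all three individuals, so that the absence
    # of the allele in the parents is informative.
    if min(depths) < min_depth or site.child_depth == 0:
        return False
    # Any read supporting the alternative allele in a parent suggests
    # that the variant was inherited.
    if site.father_alt_reads > 0 or site.mother_alt_reads > 0:
        return False
    # A heterozygous variant in the child should have an allele
    # fraction of roughly one half.
    child_vaf = site.child_alt_reads / site.child_depth
    return min_child_vaf <= child_vaf <= max_child_vaf


site = TrioSite("chr1", 1000, 15, 30, 0, 30, 0, 30)
print(is_candidate_de_novo(site))
```

The coverage requirement is the reason high-coverage sequencing matters here: with few reads in a parent, a missing alternative allele is weak evidence that the variant is truly new.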
Using such data sets of directly observed de novo mutations, it is now possible to study the human germline mutation process without bias from selection and other confounding factors. We are involved in using such data sets to:
- Estimate the rate of mutations in humans and other primates.
- Find genomic factors that affect the mutation rate.
- Build predictive models that can estimate the probability that a certain kind of mutation happens at a specific site in the human genome.
- Examine the evolution of the mutation rate and spectrum across and within species.
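To make the first and third items above more concrete, here is a minimal sketch of two simple estimators. The first divides the number of de novo mutations by the number of haploid base pairs in which they could have been detected. The second computes a separate rate for each sequence context (for example each 3-mer). The function names, the notion of "callable" sites and the numbers in the usage example are assumptions made for illustration; they are not real data and not a description of any particular tool.

```python
from collections import Counter


def mutation_rate_per_bp(de_novo_counts, callable_sites):
    """Estimate the mutation rate per base pair per generation.

    de_novo_counts: de novo mutations observed in each child.
    callable_sites: diploid positions in each trio where a mutation
                    could have been detected (assumed to be known).
    """
    if len(de_novo_counts) != len(callable_sites):
        raise ValueError("need one callable-site count per trio")
    total_mutations = sum(de_novo_counts)
    # Each callable diploid position contributes two haploid copies,
    # one transmitted from each parent.
    total_haploid_bp = 2 * sum(callable_sites)
    if total_haploid_bp == 0:
        raise ValueError("no callable sites")
    return total_mutations / total_haploid_bp


def rate_by_context(mutated_contexts, context_opportunities):
    """Estimate a mutation rate for each sequence context.

    mutated_contexts: context (e.g. a 3-mer) of each observed mutation.
    context_opportunities: number of callable haploid sites per context.
    """
    observed = Counter(mutated_contexts)
    return {
        context: observed[context] / n_sites
        for context, n_sites in context_opportunities.items()
        if n_sites > 0
    }


# Made-up numbers, used only to show the calculation.
rate = mutation_rate_per_bp([60, 70], [2_500_000_000, 2_500_000_000])
print(f"{rate:.2e}")
```

Per-context rates of this kind are the simplest form of a predictive model for site-specific mutation probabilities. Richer models add more genomic features or pool contexts that behave similarly.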
DEVELOPING METHODS FOR ANALYZING cfDNA DATA
Cell-free DNA (cfDNA) consists of DNA fragments floating in the bloodstream outside of cells. Usually, these fragments come from dead blood cells, but there are interesting exceptions.
In pregnant women, a fraction of the cfDNA fragments will originate from the fetus and thus provide a non-invasive opportunity for early detection of genetic abnormalities. In cancer patients, the presence of circulating tumor DNA (ctDNA) among the cfDNA offers a cheap and non-invasive strategy to detect and monitor cancer.
We are currently working on several new methods to detect the presence of ctDNA. This includes methods that detect ctDNA based on the presence of tumor mutations as well as methods that use “fragmentomics” features to detect ctDNA. The idea of fragmentomics is that cfDNA is fragmented in vivo by enzymes that cut the DNA in positions not bound by nucleosomes, which means that the length and position of cfDNA fragments provide information about the nucleosome position and chromatin organization of the cells they come from. This information can, in turn, reveal what types of cells the fragments come from and which genes and transcription factors were active in those cells.
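As a small illustration of what a fragmentomics feature can look like, the sketch below summarizes the fragment length distribution of a sample. It assumes that fragment lengths have already been extracted from paired-end alignments. The function name and the size ranges used for "short" and "long" fragments are illustrative assumptions and do not describe the methods we are developing.

```python
from collections import Counter


def fragment_length_features(lengths, short_range=(100, 150), long_range=(151, 220)):
    """Summarize a list of cfDNA fragment lengths (in base pairs).

    The size ranges are illustrative assumptions; both bounds are inclusive.
    """
    if not lengths:
        raise ValueError("no fragments given")
    histogram = Counter(lengths)
    n_fragments = len(lengths)

    def count_in(size_range):
        low, high = size_range
        return sum(c for length, c in histogram.items() if low <= length <= high)

    n_short = count_in(short_range)
    n_long = count_in(long_range)
    return {
        "n_fragments": n_fragments,
        "short_fraction": n_short / n_fragments,
        "long_fraction": n_long / n_fragments,
        # Undefined when the sample has no fragments in the long range.
        "short_to_long_ratio": n_short / n_long if n_long else float("nan"),
        "modal_length": histogram.most_common(1)[0][0],
    }


features = fragment_length_features([120, 140, 167, 167, 180, 300])
print(features["modal_length"], round(features["short_fraction"], 3))
```

Summaries like these could be computed per genomic window and passed to a classifier together with positional features, which is one way that length and position information might be combined.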
kmerPaPa - Tool to calculate a "k-mer pattern partition" from position specific k-mer counts. https://github.com/BesenbacherLab/kmerPaPa
GeNovo - Identifying disease genes with de novo mutations. https://github.com/BesenbacherLab/genovo