We will receive funding from the Deutsche Forschungsgemeinschaft for our project “Algorithms for the Analysis of Approximate Gene Clusters”. One Postdoc/PhD position is available as part of this project.
The order of genes in genomes can be used to infer the function of unknown genes, as well as the phylogenetic history of the organisms. Given the ever-increasing speed of genome sequencing, a huge amount of data is available for such studies. On the algorithmic side, though, existing methods are often based on overly simplified genome models, use heuristics to solve optimization problems, or suffer from long running times.
Gene clusters are sets of genes that occur as single contiguous blocks in several genomes. Unfortunately, the requirement that gene clusters occur exactly turns out to be too strict for biological applications. In this project, we want to develop models and algorithms for the computation of approximate gene clusters that combine formal rigor with applicability to biological data. At the same time, our algorithms must be fast enough to handle the growing amount of genome data. We will combine methods from combinatorial optimization and algorithmic graph theory with a statistically sound evaluation.
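To make the notion of an exact gene cluster concrete, here is a minimal sketch in Python. It is an illustration only, not the project's method: the function names and the toy genomes are hypothetical, and it assumes that each genome is a list of gene identifiers in which every gene appears at most once.

```python
def occurs_contiguously(genes, genome):
    """Return True if the gene set forms one contiguous block in the genome.

    Simplifying assumption: each gene appears at most once per genome.
    """
    wanted = set(genes)
    if not wanted:
        return False  # an empty set is not treated as a cluster here
    positions = [i for i, g in enumerate(genome) if g in wanted]
    if len(positions) != len(wanted):
        return False  # at least one gene of the set is missing
    # Contiguous block: the span of positions equals the number of genes.
    return positions[-1] - positions[0] + 1 == len(wanted)


def is_exact_cluster(genes, genomes):
    """Return True if the gene set is contiguous in every given genome."""
    return all(occurs_contiguously(genes, genome) for genome in genomes)


# Hypothetical toy genomes, written as ordered lists of gene names.
genome_a = ["a", "b", "c", "d", "e"]
genome_b = ["e", "c", "b", "d", "a"]
genome_c = ["b", "x", "c", "d", "a"]  # gene "x" interrupts the block

cluster = {"b", "c", "d"}
exact_in_ab = is_exact_cluster(cluster, [genome_a, genome_b])
exact_in_abc = is_exact_cluster(cluster, [genome_a, genome_b, genome_c])
```

In this toy example the set {b, c, d} is an exact cluster of the first two genomes, but a single intervening gene in the third genome is enough to break the exact definition. This is the kind of strictness that approximate gene cluster models aim to relax.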
We will implement, train, and evaluate our methods to allow automated processing of gene order data. Finally, we want to apply our methods to biological data to derive new insights into gene function.