A step towards neural genome assembly

Abstract

De novo genome assembly focuses on finding connections between a vast number of short sequences in order to reconstruct the original genome. In the ideal case, the central problem of genome assembly would be finding a Hamiltonian path through a large directed graph. However, due to local structures in the graph and biological features, the problem can be simplified. Motivated by recent advances in graph representation learning and neural execution of algorithms, in this work we train a message passing neural network (MPNN) with a max aggregator to execute several graph simplification algorithms that are applied before searching for a Hamiltonian path. We show that the algorithms were learned successfully and can be scaled to graphs up to 100 times larger than those used in training.
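
To make the max aggregator concrete, the following is a minimal sketch of a single message-passing step on a directed graph, in which each node combines its incoming messages with an element-wise maximum. It is an illustration only, not the model trained in this work: the function name `mpnn_max_step`, the identity message function, the ReLU update, and the zero vector used for nodes without incoming edges are all assumptions chosen for brevity. In a trained MPNN the message and update functions would be learned networks.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def relu(v: Vector) -> Vector:
    """Element-wise ReLU."""
    return [max(0.0, x) for x in v]


def mpnn_max_step(
    features: Dict[int, Vector],
    edges: List[Tuple[int, int]],
) -> Dict[int, Vector]:
    """One message-passing step with element-wise max aggregation.

    `features` maps node id -> feature vector; `edges` lists directed
    (source, target) pairs. The message and update functions are fixed
    toy functions (hypothetical stand-ins for learned networks).
    """
    dim = len(next(iter(features.values())))

    # Collect the messages arriving at each node.
    inbox: Dict[int, List[Vector]] = {node: [] for node in features}
    for src, dst in edges:
        # Toy message: the sender's features, passed along unchanged.
        inbox[dst].append(features[src])

    updated: Dict[int, Vector] = {}
    for node, h in features.items():
        messages = inbox[node]
        if messages:
            # Max aggregator: keep the largest value per feature dimension.
            agg = [max(m[i] for m in messages) for i in range(dim)]
        else:
            # Assumption: nodes with no incoming edges get a zero aggregate.
            agg = [0.0] * dim
        # Toy update: ReLU of the node's own features plus the aggregate.
        updated[node] = relu([h[i] + agg[i] for i in range(dim)])
    return updated


if __name__ == "__main__":
    feats = {0: [1.0, 0.0], 1: [0.0, 2.0], 2: [0.5, 0.5]}
    print(mpnn_max_step(feats, [(0, 2), (1, 2)]))
```

Because the maximum picks out the strongest incoming signal rather than a sum or average, the result of the aggregation does not change when further neighbours with weaker messages are added.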
