Reconstruction of phylogenetic marker genes from short sequencing reads in metagenomes

MATAM is a pipeline for assembling full-length marker genes from metagenomic and metatranscriptomic data. It takes as input a file of reads (FASTA or FASTQ format) and a reference database containing the largest possible set of sequences from the target marker gene.
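
Because the reads file may be in either FASTA or FASTQ format, it can be useful to check which one you have before launching a run. The sketch below is not part of MATAM; the function names are hypothetical and it relies only on the standard convention that FASTA records start with ">" and FASTQ records start with "@". It also assumes an uncompressed text file.

```python
def detect_read_format(lines):
    """Guess the format of a reads file from its first non-blank line.

    `lines` is any iterable of text lines (an open file handle, a list, ...).
    Hypothetical helper for illustration; it is not a MATAM function.
    """
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # skip leading blank lines
        if stripped.startswith(">"):
            return "fasta"  # FASTA header lines begin with '>'
        if stripped.startswith("@"):
            return "fastq"  # FASTQ header lines begin with '@'
        raise ValueError("first record does not look like FASTA or FASTQ")
    raise ValueError("no records found")


def detect_read_format_from_path(path):
    """Same check, applied to an uncompressed text file on disk."""
    with open(path, "r") as handle:
        return detect_read_format(handle)
```

For example, calling detect_read_format_from_path on a reads file returns either "fasta" or "fastq", and raises ValueError when the first record matches neither convention.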

MATAM was validated on the 16S rRNA gene with Illumina data and can perform out-of-the-box assemblies using the provided default reference database. Optionally, assembled sequences can be taxonomically assigned using the RDP classifier and visualized with Krona.


MATAM is implemented in C++ and Python and is freely distributed under the GNU Affero General Public License v3.0 (AGPL). It is available through conda and Docker. The source code of the latest version can be found on GitHub. To get MATAM, please follow the instructions in the GitHub repository.

Bug Reports

Users are encouraged to submit bug reports or issues to the MATAM issue tracker on GitHub, or to send an e-mail to the authors.

Thanks to everyone for bug reports and feedback!