## What is `mreps`

`mreps` is flexible and efficient software for identifying serial repeats (usually called *tandem repeats*) in DNA sequences. It was developed at LORIA in the Adage group and is currently maintained at LIFL by the Sequoia team.

See the `mreps` mini-tutorial for a fuller explanation of what `mreps` looks for.

The following paper describes `mreps` 2.5 and gives some examples of its application to genomic studies. Please cite this paper when referring to `mreps`.

**[1]** R. Kolpakov, G. Bana, and G. Kucherov, `mreps`: efficient and flexible detection of tandem repeats in DNA, **Nucleic Acids Research**, 31 (13), July 1 2003, pp. 3672-3678.

The combinatorial algorithms implemented in `mreps` were presented in the following publications.

**[2]** R. Kolpakov, G. Kucherov, Finding maximal repetitions in a word in linear time, **1999 Symposium on Foundations of Computer Science (FOCS)**, New York (USA), pp. 596-604, IEEE Computer Society.

**[3]** R. Kolpakov, G. Kucherov, Finding approximate repetitions under Hamming distance, **Theoretical Computer Science**, 2003, vol. 303 (1), pp. 135-156. An extended abstract appeared in *the 9th European Symposium on Algorithms* (ESA 2001), Aarhus, Denmark, 2001.

## Current version

The current version is `mreps` 2.5 (binaries available for Linux, Windows, and Mac OS X).

An older distribution, `mreps` 2.1, is still available. That version can handle a general ASCII alphabet and may therefore still be useful.

## Some features of `mreps` 2.5

**Mixed combinatorial/heuristic approach**

`mreps` 2.5 is based on a mixed combinatorial/heuristic paradigm. The core of `mreps` consists of exhaustive combinatorial algorithms (described in [2,3]) that find **all** repeats satisfying certain mathematical properties. This ensures that the search is exhaustive. Those repeats then undergo a heuristic treatment in order to obtain a more biologically relevant representation of the repeats. A description of `mreps` 2.5 can be found in [1].
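To make the notion of an exhaustive combinatorial search concrete, here is a minimal sketch that enumerates exact maximal repetitions in a string by brute force. It is not the linear-time algorithm of [2] and is not taken from the `mreps` source; the function name `maximal_repetitions`, the `min_exponent` parameter, and the convention that a repeat must span at least two periods are assumptions made for illustration.

```python
def maximal_repetitions(seq, min_exponent=2.0):
    """Brute-force search for exact maximal repetitions (illustrative only).

    Returns sorted (start, end, period) tuples, with `end` exclusive.
    A repetition is reported when seq[k] == seq[k + period] holds over a
    stretch that cannot be extended left or right, and the stretch covers
    at least `min_exponent` periods.
    """
    n = len(seq)
    found = {}  # (start, end) -> smallest period seen for that interval
    for period in range(1, n // 2 + 1):  # periods in increasing order
        i = 0
        while i + period < n:
            if seq[i] != seq[i + period]:
                i += 1
                continue
            # Extend to the right while the period-distance comparison holds.
            j = i
            while j + period < n and seq[j] == seq[j + period]:
                j += 1
            start, end = i, j + period
            if end - start >= min_exponent * period:
                # Keep only the smallest period for a given interval, so
                # "AAAA" is reported with period 1 and not also period 2.
                found.setdefault((start, end), period)
            i = j
    return sorted((s, e, p) for (s, e), p in found.items())


if __name__ == "__main__":
    text = "ACGTATATATGC"
    for start, end, period in maximal_repetitions(text):
        print(start, end, period, text[start:end])
```

This sketch runs in quadratic time, which is fine for short strings but far from the efficiency the algorithms in [2,3] are designed to achieve.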

**Identifying "fuzzy" repeats**

`mreps` 2.5 has a resolution parameter that makes it possible to compute *"fuzzy"* repeats. In metaphoric terms, this parameter plays the role of a *"magnifying glass"* that lets you *"zoom out"* of the genomic sequence in order to compute looser repeats.

**Efficiency**

`mreps` places no limit on the pattern size (the size of the repeated unit) of computed repeats: repeats of all possible pattern sizes can be computed within a single program run. Running time depends on the resolution parameter. For low resolution values, processing sequences of tens of millions of bases takes only several seconds on a regular PC.

**Limitations**

The `mreps` algorithm does not deal with indels (insertions/deletions of nucleotides), but only with substitutions. As a result, indels are treated in an indirect way, and certain repeats containing indels may be missed.
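The difference between substitutions and indels can be seen in a toy example. The sketch below counts how many positions disagree with the base one period further along, which is the kind of comparison a substitution-only (Hamming distance) model relies on. The helper `period_mismatches` and the sample sequences are hypothetical and are not part of `mreps`.

```python
def period_mismatches(seq, period):
    """Count positions k where seq[k] differs from seq[k + period]."""
    return sum(1 for a, b in zip(seq, seq[period:]) if a != b)


# A perfect tandem repeat of the unit "ACG" (period 3).
perfect = "ACG" * 6
# One substitution: the base at index 7 is replaced by "T".
substituted = perfect[:7] + "T" + perfect[8:]
# One insertion: an extra "T" is inserted at index 9.
inserted = perfect[:9] + "T" + perfect[9:]

if __name__ == "__main__":
    for name, s in [("perfect", perfect),
                    ("substituted", substituted),
                    ("inserted", inserted)]:
        print(name, period_mismatches(s, 3))
```

In this example the substitution disturbs two comparisons, while the single inserted base disturbs four, because every comparison that straddles the insertion point is shifted out of phase. With longer periods an indel disturbs correspondingly more comparisons, which is one way to picture why repeats containing indels may be split or missed.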

## Download and use `mreps`

`mreps` is distributed under the GNU General Public License. `mreps` runs on Linux/UNIX and Windows platforms. You can download the `mreps` archive now.

- Consult the howto page of `mreps`.
- Query `mreps`.

## Credits

The following people contributed to `mreps`: Ghizlane Bana, Mathieu Giraud, Liliana Ibanescu, Roman Kolpakov, Gregory Kucherov, Ralph Rabbat.