**DeNovoMoDe** is a tool for basic de novo motif discovery in a given set of arbitrary DNA sequences, using an inhomogeneous parsimonious Markov model (iPMM) as motif model.
The promoter model follows the OOPS (one occurrence per sequence) assumption with a single motif model and takes into account motif occurrences on both strands. If a search only on the forward strand is desired and/or if multiple motifs are to be learned, use **FlexibleMoDe** instead.
The learning algorithm is a stochastic search that uses the BIC score of the motif model as target function.
For further details see:
R. Eggeling, T. Roos, P. Myllymaki, I. Grosse. Inferring intra-motif dependencies of DNA binding sites from ChIP-seq data. *BMC Bioinformatics*. 16:365, 2015. 
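
As a rough illustration of what a BIC-type target function looks like, the following sketch computes the textbook BIC in the "higher is better" convention: the maximized log-likelihood minus a penalty that grows with the number of free parameters. The function name and arguments are hypothetical, and the exact score used for the iPMM is the one defined in the paper cited above, not necessarily this form.

```python
import math

def bic_score(log_likelihood, num_free_params, num_data_points):
    """Textbook BIC (higher is better): fit minus a complexity penalty.

    This is a generic formula for illustration only; it is an assumption
    that the tool's score has this shape.
    """
    return log_likelihood - 0.5 * num_free_params * math.log(num_data_points)
```

With equal likelihood, a model with more free parameters receives a lower score, which is what discourages unnecessarily complex dependency structures.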

If the content of the "Input data" file starts with '>', it is interpreted as a FastA file. Otherwise it is interpreted as plain text, where every line contains a single sequence.
The input may contain upper- and lowercase letters of the standard DNA alphabet {A,C,G,T}. If other symbols from the IUPAC code (such as N) are encountered, each is replaced by a random sample from the distribution of {A,C,G,T} in the data set.
The input sequences are allowed to differ in length.
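
The input handling described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the function names are hypothetical, and folding lowercase letters to uppercase as well as the uniform fallback are assumptions.

```python
import random
from collections import Counter

DNA = "ACGT"

def read_sequences(text):
    """Parse FastA if the content starts with '>', else one sequence per line."""
    seqs = []
    if text.startswith(">"):
        current = []
        for line in text.splitlines():
            line = line.strip()
            if line.startswith(">"):
                # A header line closes the previous record, if any.
                if current:
                    seqs.append("".join(current))
                current = []
            elif line:
                current.append(line)
        if current:
            seqs.append("".join(current))
    else:
        seqs = [line.strip() for line in text.splitlines() if line.strip()]
    # Assumption: lowercase input is treated like uppercase.
    return [s.upper() for s in seqs]

def replace_ambiguous(seqs, rng=None):
    """Replace every non-ACGT symbol by a draw from the empirical ACGT frequencies."""
    rng = rng or random.Random(0)
    counts = Counter(ch for s in seqs for ch in s if ch in DNA)
    weights = [counts[b] for b in DNA]
    if sum(weights) == 0:
        weights = [1, 1, 1, 1]  # assumption: uniform if no unambiguous base exists
    return [
        "".join(ch if ch in DNA else rng.choices(DNA, weights)[0] for ch in s)
        for s in seqs
    ]
```

Because the replacement is random, two runs on data containing ambiguous symbols may see slightly different sequences unless the random generator is seeded.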

The "Motif width" determines the length of the putative binding sites and must thus not exceed the length of the shortest input sequence.
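
The width constraint can be checked before starting a run. The sketch below (a hypothetical helper, not part of the tool) also counts the candidate motif positions per sequence, assuming that every start position on either strand is a candidate, which is one reading of the OOPS assumption with both strands.

```python
def candidate_positions(seq_lengths, width):
    """Return the number of candidate motif placements for each sequence.

    A motif of the given width fits at n - width + 1 start positions in a
    sequence of length n; the factor 2 accounts for both strands.
    """
    shortest = min(seq_lengths)
    if width > shortest:
        raise ValueError(
            f"motif width {width} exceeds shortest sequence length {shortest}"
        )
    return [2 * (n - width + 1) for n in seq_lengths]
```

For example, with sequence lengths 10 and 12 and a width of 8, the shorter sequence offers 3 start positions per strand and the longer one 5.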

While a high maximal "Motif order", i.e. the maximal order of the iPMM motif model, can be fruitful, its effect on runtime is more severe here than in **SimpleMoDe**.
Since structure learning is carried out in every step of the iterative stochastic search, runtime typically increases significantly when the "Motif order" exceeds 3.

The "Flanking order" pertains to the homogeneous Markov chain that models all parts of the sequence that are not covered by the motif.
The main purpose of the flanking model is to model mono-, di-, or trinucleotide repeats in order to prevent them from being erroneously identified as a motif.
As a consequence, a value larger than 3 is rarely justified.
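
To see why a low-order flanking model already absorbs simple repeats, consider the following minimal sketch of a homogeneous Markov chain estimated from counts. The function names and the pseudocount are assumptions made for illustration; the tool's own estimator may differ.

```python
import math
from collections import defaultdict

def train_markov_chain(seqs, order, pseudocount=1.0):
    """Count symbol occurrences per context of length `order` (position-independent)."""
    counts = defaultdict(lambda: dict.fromkeys("ACGT", pseudocount))
    for s in seqs:
        for i in range(order, len(s)):
            counts[s[i - order:i]][s[i]] += 1
    return counts

def log_prob(seq, counts, order, pseudocount=1.0):
    """Log-probability of seq, conditioning on its first `order` symbols."""
    total = 0.0
    for i in range(order, len(seq)):
        # Unseen contexts fall back to the pseudocount-only (uniform) distribution.
        ctx = counts.get(seq[i - order:i]) or dict.fromkeys("ACGT", pseudocount)
        total += math.log(ctx[seq[i]] / sum(ctx.values()))
    return total
```

Trained on data rich in AC repeats, an order-1 chain assigns a much higher probability to a stretch such as ACACACAC than an order-0 chain does, so the repeat is explained by the flanking model and is less likely to be picked up as the motif.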

The default values for "Initial iterations", "Additional iterations" and "Restarts" are relatively small, but in many cases they are sufficient for finding a motif (if one is present in the data).
Increasing the number of restarts is typically the most promising way to increase the probability of finding a hard-to-spot pattern.

If no "Name" is specified, it is set by default to "DeNovo(*a*,*b*,*c*)", where *a* is "Motif width", *b* is "Motif order" and *c* is "Flanking order".
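
The default naming scheme amounts to simple string formatting, as in this hypothetical helper (the parameter values in the example are arbitrary and not the tool's defaults):

```python
def default_name(motif_width, motif_order, flanking_order):
    """Build the default name from the three model parameters."""
    return f"DeNovo({motif_width},{motif_order},{flanking_order})"
```

For instance, a motif width of 20 with motif order 2 and flanking order 2 would yield the name `DeNovo(20,2,2)`.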

The tool returns
(i) a log file containing the scores of all iteration steps of the stochastic search, which can be used to evaluate whether the values chosen for "Initial iterations", "Additional iterations" and "Restarts" were sufficient, and
(ii) the iPMM motif model, in exactly the same output format as returned by **SimpleMoDe**.