**ScanApp** scans a given data set for high-scoring occurrences of a previously learned iPMM.
In order to assess which likelihood value is sufficient for declaring a hit, it determines a threshold based on a user-specified false positive rate (FPR) on a negative data set.
The negative data set can either be supplied by the user or be constructed as a randomized version of the positive data set.

Any successfully learned iPMM can be used as "Input model"; it does not matter which of the learning tools has produced it.

If the content of the "Input data" file starts with '>', it is interpreted as a FastA file. Otherwise it is interpreted as plain text, where every line contains a single sequence.
The input may contain upper- and lower-case letters of the standard DNA alphabet {A,C,G,T}. If other symbols from the IUPAC code (such as N) are encountered, each of them is replaced by a symbol sampled at random from the distribution of {A,C,G,T} in the data set.
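
The input handling described above can be sketched in Python as follows. This is only an illustration, not the tool's implementation: the function names `read_sequences` and `replace_ambiguous` are hypothetical, and converting all letters to upper case is an assumption made for simplicity.

```python
import random
from collections import Counter

def read_sequences(text):
    """Parse FastA if the content starts with '>', else one sequence per line."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not text.startswith(">"):
        return [ln.upper() for ln in lines]  # plain text: one sequence per line
    seqs, current = [], None
    for ln in lines:
        if ln.startswith(">"):          # header line starts a new record
            if current is not None:
                seqs.append("".join(current).upper())
            current = []
        else:                           # sequence lines of one record are joined
            current.append(ln)
    if current is not None:
        seqs.append("".join(current).upper())
    return seqs

def replace_ambiguous(seqs, seed=0):
    """Replace every non-ACGT symbol by a draw from the data set's ACGT frequencies."""
    rng = random.Random(seed)
    counts = Counter(c for s in seqs for c in s if c in "ACGT")
    weights = [counts[b] for b in "ACGT"]
    if sum(weights) == 0:               # assumption: fall back to uniform sampling
        weights = [1, 1, 1, 1]
    return [
        "".join(c if c in "ACGT" else rng.choices("ACGT", weights)[0] for c in s)
        for s in seqs
    ]
```

For example, `replace_ambiguous(read_sequences(">s1\nacgN\n"))` returns one sequence of length 4 whose last symbol is drawn from the frequencies of A, C, and G in the data.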

"Background" specificies how the background data set for threshold-selection is obtained. 
It is a selection parameter, which means that different parameters need to be specified depending on the selection made.
If the selection is *Generating*, a background data set is artificially constructed by
(i) learning a homogeneous Markov chain of user-specified "Order" from the "Input data" and (ii) generating a sample from that chain. 
The size of the background data set equals the size of the input data set multiplied by the value of "Size factor". 
Note that a larger background data set generally yields a more accurate estimate of the desired threshold, but computing and sorting all likelihood values can require a substantial amount of time and memory.
If the selection is *From file*, a background data set that is assumed not to contain instances of the motif being scanned for has to be provided as "Data file". The same format restrictions apply as for "Input data" above.
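
The *Generating* option can be illustrated by the following Python sketch. It rests on several assumptions that are not stated in this description: the "size" of a data set is taken to be its number of sequences, each generated sequence copies the length of an input sequence, "Size factor" is an integer, and a pseudocount of 1 smooths the transition counts. The function names are hypothetical.

```python
import random
from collections import defaultdict

def learn_markov_chain(seqs, order, pseudocount=1.0):
    """Count symbol occurrences after every context of length 0..order."""
    # Assumption: sequences contain only A, C, G, T (ambiguous symbols already replaced).
    counts = defaultdict(lambda: {b: pseudocount for b in "ACGT"})
    for s in seqs:
        for i, symbol in enumerate(s):
            # Shorter contexts are also counted so that sequence starts can be generated.
            for k in range(min(order, i) + 1):
                counts[s[i - k:i]][symbol] += 1
    return counts

def generate_background(seqs, order, size_factor, seed=0):
    """Sample size_factor * len(seqs) sequences from the learned chain."""
    rng = random.Random(seed)
    counts = learn_markov_chain(seqs, order)
    background = []
    for _ in range(size_factor):
        for s in seqs:
            generated = []
            for i in range(len(s)):
                # Context: the last `order` generated symbols (fewer at the start).
                context = "".join(generated[max(0, i - order):i])
                weights = [counts[context][b] for b in "ACGT"]
                generated.append(rng.choices("ACGT", weights)[0])
            background.append("".join(generated))
    return background
```

With two input sequences and a size factor of 3, `generate_background` returns six sequences.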

The tool then internally computes the likelihood threshold for scoring the positive data that yields the given "FPR" (false positive rate) on the background data.
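
One way to derive such a threshold from the sorted background scores is sketched below in Python. The sketch assumes that a hit is declared when a score strictly exceeds the threshold; how the tool treats ties is not specified here, and the function name `threshold_for_fpr` is hypothetical.

```python
import math

def threshold_for_fpr(background_scores, fpr):
    """Return a threshold t such that at most a fraction `fpr` of the
    background scores is strictly greater than t."""
    if not 0.0 <= fpr <= 1.0:
        raise ValueError("fpr must lie in [0, 1]")
    ranked = sorted(background_scores, reverse=True)  # best scores first
    k = int(math.floor(fpr * len(ranked)))            # number of tolerated false positives
    if k >= len(ranked):
        return -math.inf                              # every score counts as a hit
    return ranked[k]
```

With eight background scores and an FPR of 0.25, at most two background scores lie above the returned threshold.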

"Both strands" determines whether the scan should involve both strands or not. 

If no "Name" is specified, it is set by default to "SequenceScan(*a*)", where *a* is the FPR.

The tool returns
(i) Sequence ID, position, strand orientation, and likelihood-score of every hit in the "Input data"
(ii) The corresponding binding sites extracted from "Input data", aligned, and put into the same strand orientation.
Note: The tool predicts solely on the basis of the likelihood scores, irrespective of possible overlaps among binding sites. If overlapping sites are not desired, the user has to filter the result accordingly.
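
Such a filter is not part of the tool. One possible approach, sketched here in Python, is a greedy selection that keeps the highest-scoring hit first and discards every hit overlapping an already kept one. The hit layout `(sequence ID, position, strand, score)`, the fixed motif `width`, and the function name `filter_overlaps` are assumptions for illustration.

```python
def filter_overlaps(hits, width):
    """Greedily keep the best-scoring hits that do not overlap within a sequence."""
    kept = []
    # Visit hits from the highest to the lowest score.
    for hit in sorted(hits, key=lambda h: h[3], reverse=True):
        seq_id, pos = hit[0], hit[1]
        # Two sites of equal width overlap if their starts differ by less than width.
        if all(k[0] != seq_id or abs(k[1] - pos) >= width for k in kept):
            kept.append(hit)
    # Report the surviving hits ordered by sequence ID and position.
    return sorted(kept, key=lambda h: (h[0], h[1]))
```

Other strategies, such as keeping only the best hit per sequence, may be preferable depending on the application.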