Metadata-Version: 2.1
Name: multiPrime
Version: 2.3.4
License: MIT
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE

# multiPrime

multiPrime is an error-tolerant primer design tool for broad-spectrum pathogen detection.
It proposes a solution to the problem of minimum-degeneracy degenerate primer design with errors (MD-EDPD).

## 1. Install

> pip

```
pip3 install multiPrime
```

+ Requirements: `pip`, `python >=3.9`



## 2. Usage

```
$ multiPrime -h 
```
Subcommands:

| Subcommand    | Description                                                 |
|---------------|-------------------------------------------------------------|
| DPrime        | Degenerate primer design through MD-EDPD or MD-DPD.         |
| Ppair         | Primer pair selection from the result of multiPrime DPrime. |
| Perfect       | Extract primer-contained sequences with no mismatches.      |
| Errors        | Extract primer-contained sequences with errors.             |

```
multiPrime DPrime -i input -o output
           Options: { -l [18] -n [4] -d [10] -v [1] -g [0.2,0.7] -f [0.8] -c [4] -p [10] -a [4] }
```
Parameters:

| Parameters      | Description                                                                                                                                                                                                                                                                              |
|-----------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| -i/--input      | Input file: result of multiple sequence alignment (muscle, mafft, or others). |
| -l/--plen       | Length of primer. Default: 18                                                                                                                                                                                                                                                            |
| -n/--dnum       | Number of degenerate bases. Default: 4. |
| -v/--variation  | Max mismatch number of primer. Default: 1.                                                                                                                                                                                                                                               |
| -e/--entropy    | Entropy threshold of a primer-length window. Entropy measures disorder and is used to judge whether the window is conserved. Default: 3.6. |
| -g/--gc         | Filter primers by GC content. Default [0.2,0.7].                                                                                                                                                                                                                                         |
| -s/--size       | Number of degenerate. Default: 4.                                                                                                                                                                                                                                                        |
| -f/--fraction   | Filter primers by match fraction (Coverage with errors). Default: 0.8.                                                                                                                                                                                                                   |
| -c/--coordinate | A mismatch must not be located at the start or stop of the primer; otherwise the sequence is not regarded as mis-coverage. This parameter controls the index of Y-distance (number = variation, plus position of mismatch) when calculating coverage with errors. Default: 4. |
| -p/--proc       | Number of processes to launch. Default: 20. |
| -a/--away       | Filter hairpin structures by the minimal distance between paired bases. Default: 4. Example (number of X): AGCT[XXXX]AGCT. Primers should not contain self-complementary sequences (no 4 consecutive complementary bp); otherwise the primer folds into a hairpin structure. |
| -o/--out        | Output file: candidate primers. e.g.  [*].candidate.primers.out.                                                                                                                                                                                                                         |
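
The `-n`, `-g`, and `-a` filters in the table above can be illustrated with a short Python sketch. It is not taken from multiPrime's source: the function names are hypothetical, the GC fraction is computed on a non-degenerate sequence only, and `-a` is read here as the minimum number of bases between a 4 bp stem and its reverse complement, which is an assumption based on the `AGCT[XXXX]AGCT` example.

```python
from math import prod

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def degenerate_base_count(primer):
    """Number of degenerate positions in the primer (compare with -n/--dnum)."""
    return sum(len(IUPAC[base]) > 1 for base in primer.upper())


def degeneracy(primer):
    """Number of distinct plain sequences the degenerate primer expands to."""
    return prod(len(IUPAC[base]) for base in primer.upper())


def gc_fraction(seq):
    """GC fraction of a non-degenerate sequence (compare with -g/--gc)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_hairpin(seq, stem=4, away=4):
    """True if a `stem`-bp segment is followed, at least `away` bases later,
    by its reverse complement (assumed reading of -a/--away)."""
    seq = seq.upper()
    for i in range(len(seq) - stem + 1):
        segment = seq[i:i + stem]
        rev_comp = segment.translate(COMPLEMENT)[::-1]
        if seq.find(rev_comp, i + stem + away) != -1:
            return True
    return False
```

Under these assumptions, `has_hairpin("AGCTTTTTAGCT")` flags the table's `AGCT[XXXX]AGCT` pattern, because `AGCT` is its own reverse complement and the two copies are separated by four bases.
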
```
multiPrime Ppair -i input -r reference -o output
           Options: {-f [0.6] -m [500] -n [200] -e [4] -p [9] -s [250,500] -g [0.4,0.6] -d [4] -a ","}
```
Parameters:

| Parameters    | Description                                                                                                                                                                                                         |
|---------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| -i/--input    | Input file: output of multiPrime DPrime.                                                                                                                                                                            |
| -r/--ref      | Reference sequence file: all sequences in one FASTA file, for example: (Cluster_96_171.tfa). |
| -g/--gc       | Filter primers by GC content. Default [0.2,0.7].                                                                                                                                                                    |
| -f/--fraction | Filter primers by match fraction. Default: 0.6. Sometimes you need a small fraction to get output.                                                                                                                  |
| -e/--end      | Filter primers by degenerate base position. e.g. [-e 4] means that no degenerate base may appear in the last four bases during primer pre-filtering. Default: 4. |
| -s/--size     | Filter primers by PRODUCT size. Default [250,500].                                                                                                                                                                  |
| -d/--dist     | Hairpin filter: minimal distance between paired bases. Default: 4. Example (number of X): AGCT[XXXX]AGCT. |
| -t/--tm       | Difference in Tm between primer-F and primer-R. Default: 5. |
| -p/--proc     | Number of processes to launch. Default: 20. |
| -a/--adaptor  | Adaptor sequence, used for downstream NGS. Hairpin or dimer detection is performed on [adaptor--primer]. Example: TCTTTCCCTACACGACGCTCTTCCGATCT,TCTTTCCCTACACGACGCTCTTCCGATCT (default). If you don't want an adaptor, use [","]. |
| -m/--maxseq   | Limit on the number of sequences. Default: 0. If 0, all sequences are taken into account. This parameter should be consistent with [max_seq] in multi-alignment. |
| -o/--out      | Output file: candidate primer pairs. e.g.  [*].candidate.primers.txt.                                                                                                                                               |
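
To make the `-e`, `-s`, and `-t` filters above concrete, here is a minimal Python sketch of how a candidate pair might be screened. It is an illustration only, not multiPrime's code: all function names are hypothetical, "end" is assumed to mean the 3' end of the primer, and Tm is estimated with the simple Wallace rule because the README does not say which Tm model the tool uses.

```python
# IUPAC codes that stand for more than one base.
DEGENERATE_BASES = set("RYSWKMBDHVN")


def degenerate_near_end(primer, end=4):
    """True if any of the last `end` bases is degenerate (compare with -e/--end).

    Assumption: "end" is read here as the 3' end of the primer.
    """
    return any(base in DEGENERATE_BASES for base in primer.upper()[-end:])


def wallace_tm(seq):
    """Rough Tm in degrees C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def pair_passes(primer_f, primer_r, product_len,
                size=(250, 500), tm_diff=5, end=4):
    """Apply the end, product-size, and Tm-difference filters to one pair."""
    if degenerate_near_end(primer_f, end) or degenerate_near_end(primer_r, end):
        return False
    if not size[0] <= product_len <= size[1]:
        return False
    return abs(wallace_tm(primer_f) - wallace_tm(primer_r)) <= tm_diff
```

The defaults mirror the values listed in the table (`-e 4`, `-s 250,500`, `-t 5`); a pair is rejected as soon as any one filter fails.
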
```
multiPrime Perfect -i [input] -p [10] -f [format] -o [output] -s [Coverage.xls]
```
Parameters:

| Parameters   | Description                                                                                                                                                                                                    |
|--------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| -i/--input   | Input file: primer file. One of the following three types: final_maxprimers_set.xls (see the output of multiPrime on GitHub (https://github.com/joybio/multiPrime)); primer.fa (primer FASTA); or primer_F,primer_R. |
| -r/--ref     | Sequence file: all the input sequences in one FASTA file. |
| -f/--format  | Format of primer file: xls, fa, or seq. Default: xls, which indicates final_maxprimers_set.xls. xls: final_primer_set.xls; fa: FASTA format; seq: sequence format, comma-separated, e.g. primer_F,primer_R. |
| -p/--process | Number of processes to launch. Default: 20. |
| -o/--out     | Output_dir. default: PCR_product.                                                                                                                                                                              |
| -s/--stast   | Statistics: number of covered sequences and total number of sequences. Default: Coverage.xls. |
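
The idea of extracting primer-contained sequences with no mismatches can be sketched in Python by translating each degenerate primer into a regular expression. This is an illustrative sketch under stated assumptions, not the implementation used by `multiPrime Perfect`: the function names are hypothetical, sequences are passed as a plain dict of ID to sequence, and the reverse primer is assumed to be given 5'->3' so that its reverse complement is searched downstream of the forward primer.

```python
import re

# Regex character class for each IUPAC nucleotide code.
IUPAC_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also covers degenerate codes (e.g. R <-> Y).
IUPAC_COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def primer_to_regex(primer):
    """Compile a degenerate primer into a regex that matches it exactly."""
    return re.compile("".join(IUPAC_REGEX[base] for base in primer.upper()))


def reverse_complement(primer):
    """Reverse complement of a (possibly degenerate) primer."""
    return primer.upper().translate(IUPAC_COMPLEMENT)[::-1]


def covered_ids(records, primer_f, primer_r):
    """IDs of sequences that contain primer_f and, downstream of it,
    the reverse complement of primer_r, both without mismatches."""
    fwd = primer_to_regex(primer_f)
    rev = primer_to_regex(reverse_complement(primer_r))
    hits = []
    for seq_id, seq in records.items():
        seq = seq.upper()
        match = fwd.search(seq)
        if match and rev.search(seq, match.end()):
            hits.append(seq_id)
    return hits
```

Dividing `len(covered_ids(...))` by the number of input sequences gives a coverage figure of the kind the `-s` statistics file reports.
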
```
multiPrime Errors -i [input] -r [bowtie index] -l [150,2000] -p [10] -o [output]
```
Parameters:

| Parameters   | Description                                                                                                                |
|--------------|----------------------------------------------------------------------------------------------------------------------------|
| -i/--input   | Input file: primer.fa.                                                                                                     |
| -r/--ref     | Reference file: bowtie index.                                                                                              |
| -l/--len     | Length of primer, which is used for mapping. Default: 18                                                                   |
| -t/--term    | Mismatches are not allowed within this many bases of the 3' terminus of the primer. Default: 4. |
| -b/--bowtie  | bowtie or bowtie2 was employed for mapping. Default: bowtie2                                                               |
| -m/--seedmms | Bowtie: mismatches in seed (can be 0 - 3, default: -n 1). Bowtie2: gaps or mismatches in seed (can be 0 - 1, default: -n 1). |
| -p/--process | Number of processes to launch. Default: 20. |
| -o/--out     | Output file: PCR product with primers.                                                                                     |
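
The effect of `-t/--term` can be illustrated with a small Python sketch that compares a primer with a candidate binding site of the same length. This is only an illustration: `multiPrime Errors` relies on bowtie or bowtie2 for mapping, whereas the hypothetical functions below do a direct position-by-position comparison, and `max_mismatch` is an assumed parameter rather than a documented flag.

```python
# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatch_positions(primer, site):
    """0-based positions where the site base is not allowed by the primer."""
    return [
        i for i, (p, s) in enumerate(zip(primer.upper(), site.upper()))
        if s not in IUPAC[p]
    ]


def binds_with_errors(primer, site, max_mismatch=1, term=4):
    """True if the site matches the primer with at most `max_mismatch`
    mismatches, none of them within the last `term` bases (3' terminus)."""
    if len(primer) != len(site):
        return False
    positions = mismatch_positions(primer, site)
    if len(positions) > max_mismatch:
        return False
    protected_start = len(primer) - term
    return all(i < protected_start for i in positions)
```

With `term=4`, a single mismatch near the 5' end is tolerated, while the same mismatch in the last four bases rejects the site.
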


## 3. Results

multiPrime DPrime
+ `output`: Primer information.
+ `output.gap_seq_id_json`: Positions and non-contained sequences caused by errors (the number of errors is greater than the threshold).
+ `output.non_coverage_seq_id_json`: Positions and non-contained sequences.

multiPrime Ppair 
+ `output`: *.candidate.primers.txt

multiPrime Perfect 
+ `output`: PCR_product
+ `Coverage.xls`: Total coverage for all primers.

multiPrime Errors 
+ `output`: PCR product with primer pairs.
+ `output.pair.num`: Target amplicon number with primer pairs.
+ `others`: Temp files.

## 4. Test directory


multiPrime/example


## 5. Contact


Please send comments, suggestions, bug reports and bug fixes to 1806389316@pku.edu.cn
