## Introduction

**mskilab-org/nf-gOS** is a bioinformatics pipeline from [`mskilab-org`](https://www.mskilab.org/) for running [`JaBbA`](https://github.com/mskilab-org/JaBbA/), our algorithm for MIP based joint inference of copy number and rearrangement state in cancer whole genome sequence data. This pipeline runs all the pre-requisite tools (among others) and generates the necessary inputs for running JaBbA and loading into [case-reports](https://github.com/mskilab-org/case-report), our clinical front-end. It is designed to take paired tumor-normal samples or tumor-only samples as input.

## Workflow Summary:
1. Align to Reference Genome (currently supports `BWA-MEM`, `BWA-MEM2`, and GPU accelerated `fq2bam`).
2. Quality Control (using [`FastQC`](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/), [`Picard CollectWGSMetrics`](https://gatk.broadinstitute.org/hc/en-us/articles/360037269351-CollectWgsMetrics-Picard), [`Picard CollectMultipleMetrics`](https://gatk.broadinstitute.org/hc/en-us/articles/360037594031-CollectMultipleMetrics-Picard), and [`GATK4 EstimateLibraryComplexity`](https://gatk.broadinstitute.org/hc/en-us/articles/360037428891-EstimateLibraryComplexity-Picard))
4. Mark Duplicates (using [`GATK MarkDuplicates`](https://gatk.broadinstitute.org/hc/en-us/articles/360037052812-MarkDuplicates-Picard))
5. Base recalibration (using [`GATK BaseRecalibrator`](https://gatk.broadinstitute.org/hc/en-us/articles/360036898312-BaseRecalibrator))
6. Apply BQSR (using [`GATK ApplyBQSR`](https://gatk.broadinstitute.org/hc/en-us/articles/360037055712-ApplyBQSR))
7. Perform structural variant calling (using [`GRIDSS`](https://github.com/PapenfussLab/gridss))
8. Perform pileups (using [`AMBER`](https://github.com/hartwigmedical/hmftools/blob/master/amber/README.md))
9. Generate raw coverages and correct for GC & Mappability bias (using [`fragCounter`](https://github.com/mskilab-org/fragCounter))
10. Remove biological and technical noise from coverage data. (using [`Dryclean`](https://github.com/mskilab-org/dryclean))
11. Perform segmentation using tumor/normal ratios of corrected read counts, (using the `CBS` (circular binary segmentation) algorithm)
12. Purity & ploidy estimation (using [`PURPLE`](https://github.com/hartwigmedical/hmftools/blob/master/purple/README.md))
13. Junction copy number estimation and event calling (using [`JaBbA`](https://github.com/mskilab-org/JaBbA/)
14. Call SNVs and indels (using [`SAGE`](https://github.com/hartwigmedical/hmftools/blob/master/sage/README.md))
15. Annotate variants (using [`Snpeff`](https://pcingola.github.io/SnpEff/))
16. Assign mutational signatures (using [`SigProfiler`](https://github.com/AlexandrovLab/SigProfilerAssignment/))
17. Detect HRD (Homologous Recombination Deficiency) (using [`HRDetect`](https://github.com/Nik-Zainal-Group/signature.tools.lib) and our [`OnenessTwoness`](https://github.com/mskilab-org/onenesstwoness) classifier)


## Usage

### Setting up the ***samplesheet.csv*** file for input:

You need to create a samplesheet with information regarding the samples you
want to run the pipeline on. You need to specify the path of your
**samplesheet** using the `--input` flag to specify the location. Make sure the
input `samplesheet.csv` file is a *comma-separated* file and contains the
headers discussed below. *It is highly recommended to provide the **absolute
path** for inputs inside the samplesheet rather than relative paths.*

For paired tumor-normal samples, use the same `patient` ID, but different
`sample` names. Indicate their respective tumor/normal `status`, where **1** in
the `status` field indicates a tumor sample, and **0** indicates a normal
sample. You may pass multiple `sample` IDs per patient, `nf-gOS` will
consider them as separate samples belonging to the same patient and output the
results accordingly.

Specify the desired output root directory using the `--outdir` flag.
The outputs will be organized first by `tool` and then `sample`.

The input samplesheet should look like this:

```csv
patient,sex,status,sample,lane,fastq_1,fastq_2
TCXX49,XX,0,TCXX49_N,lane_1,/path/to/fastq_1.fq.gz,/path/to/fastq_2.gz
```

Each row represents a pair of fastq files (paired end) for a single sample (in
this case a normal sample, status: 0).

### Discussion of expected fields in input file and expected inputs for each `--step`

A typical sample sheet can populate with all or some of the column names as
shown below. The pipeline will use the information provided in the samplesheet
and the tools specified in the run to parsimoniously run the steps of the
pipeline to generate all remaining outputs.

**N.B You do not need to supply all the columns in the table below. The table represents all the possible inputs that can be passed. If you are starting from BAMs just pass `bam` and `bai` columns. If you are starting from FASTQs, pass `fastq1` (and `fastq2` for paired reads). If you have already generated other outputs, you may pass them as well to prevent the pipeline from running tools for which you already have outputs.**

| Column Name         | Description                                                                                                                                  |
|---------------------|----------------------------------------------------------------------------------------------------------------------------------------------|
| patient             | (required) Patient or Sample ID. This should differentiate each patient/sample. *Note*: Each patient can have multiple sample names.          |
| sample              | (required) Sample ID for each Patient. Should differentiate between tumor and normal (e.g `sample1_t` vs. `sample1_n`). Sample IDs should be unique to Patient IDs |
| lane                | If starting with FASTQ files, and if there are multiple lanes for each sample for each patient, mention lane name.                            |
| sex                 | If known, please provide the sex for the patient. For instance if **Male** type XY, else if **Female** type XX, otherwise put NA.             |
| status              | (required) This should indicate if your sample is **tumor** or **normal**. For **normal**, write 0, and for **tumor**, write 1.              |
| fastq_1             | Full path to FASTQ file read 1. The extension should be `.fastq.gz` or `.fq.gz`.                                                              |
| fastq_2             | Full path to FASTQ file read 2 (if paired reads). The extension should be `.fastq.gz` or `.fq.gz`.                                            |
| bam                 | Full path to BAM file. The extension should be `.bam`.                                                                                        |
| bai                 | Full path to BAM index file. The extension should be `.bam.bai`.                                                                              |
| hets                | Full path to sites.txt file.                                                                                                                  |
| amber_dir           | Full path to AMBER output directory.                                                                                                          |
| frag_cov            | Full path to the fragCounter coverage file.                                                                                                   |
| dryclean_cov        | Full path to the Dryclean corrected coverage file.                                                                                            |
| ploidy              | Ploidies for each sample.                                                                                                                     |
| seg                 | Full path to the CBS segmented file.                                                                                                          |
| nseg                | Full path to the CBS segmented file for normal samples.                                                                                       |
| vcf                 | Full path to the GRIDSS VCF file.                                                                                                             |
| vcf_tbi             | Full path to the GRIDSS VCF index file.                                                                                                       |
| jabba_rds           | Full path to the JaBbA RDS (`jabba.simple.rds`) file.                                                                                         |
| jabba_gg            | Full path to the JaBbA gGraph (`jabba.gg.rds`) file.                                                                                          |
| ni_balanced_gg      | Full path to the non-integer balanced gGraph (`non_integer.balanced.gg.rds`) file.                                                            |
| lp_phased_gg        | Full path to the LP phased gGraph (`lp_phased.balanced.gg.rds`) file.                                                                         |
| events              | Full path to the events file.                                                                                                                 |
| fusions             | Full path to the fusions file.                                                                                                                |
| snv_somatic_vcf     | Full path to the somatic SNV VCF file.                                                                                                        |
| snv_somatic_tbi     | Full path to the somatic SNV VCF index file.                                                                                                  |
| snv_germline_vcf    | Full path to the germline SNV VCF file.                                                                                                       |
| snv_germline_tbi    | Full path to the germline SNV VCF index file.                                                                                                 |
| variant_somatic_ann | Full path to the somatic SNV annotated VCF file.                                                                                              |
| variant_somatic_bcf | Full path to the somatic SNV BCF file.                                                                                                        |
| variant_germline_ann| Full path to the germline SNV annotated VCF file.                                                                                             |
| variant_germline_bcf| Full path to the germline SNV BCF file.                                                                                                       |
| snv_multiplicity    | Full path to the SNV multiplicity file.                                                                                                       |
| sbs_signatures      | Full path to the SBS signatures file.                                                                                                         |
| indel_signatures    | Full path to the indel signatures file.                                                                                                       |
| signatures_matrix   | Full path to the signatures matrix file.                                                                                                      |
| hrdetect            | Full path to the HRDetect file.                                                                                                               |

## Tumor-Only Samples

For tumor-only samples, simply add the flag `--tumor_only true` to params.json. The pipeline will then run in tumor-only mode.


## How to modify default parameters

The pipeline uses a `params.json` file to specify the parameters for the pipeline. You can modify the parameters in the `params.json` file to suit your needs. The parameters (files and values) are given below:

Values:
```
	postprocess_bams           = true
    tumor_only                 = false

	// References
	genome                     = 'GATK.GRCh37'
	igenomes_base              = 's3://ngi-igenomes/igenomes'
	igenomes_ignore            = false
	mski_base                  = 's3://mskilab-pipeline'
	save_reference             = false
	build_only_index           = false                            // Only build the reference indexes
	download_cache             = false                            // Do not download annotation cache

	// Options to consider
	// Main options
	no_intervals               = false                            // Intervals will be built from the fasta file
	nucleotides_per_second     = 200000                           // Default interval size
	tools                      = null                             // No default Variant_Calling or Annotation tools
	skip_tools                 = null                             // All tools (markduplicates + baserecalibrator + QC) are used by default
	split_fastq                = 0                                // FASTQ files will not be split by default by FASTP, set to 50000000 for 50M reads/fastq

	// Modify FASTQ files (trim/split) with FASTP
	trim_fastq                 = false                            // No trimming by default
	clip_r1                    = 0
	clip_r2                    = 0
	three_prime_clip_r1        = 0
	three_prime_clip_r2        = 0
	trim_nextseq               = 0
	save_trimmed               = false
	save_split_fastqs          = false

	// Alignment
	aligner                    = 'fq2bam'                       // Default is gpu-accelerated fq2bam; bwa-mem, bwa-mem2 and dragmap can be used too
    fq2bam_mark_duplicates     = true                            // Whether fq2bam should mark duplicates, set false if not using fq2bam
    fq2bam_low_memory          = false                           // Set to true if using fq2bam with gpus that have <24GB memory
    optical_duplicate_pixel_distance = 2500                      // For computing optical duplicates, 2500 for NovaSeqX+
	save_mapped                = false                            // Mapped BAMs are saved
	save_output_as_bam         = true                            // Output files from alignment are saved as bam by default and not as cram files
	seq_center                 = null                            // No sequencing center to be written in read group CN field by aligner
	seq_platform               = null                            // Default platform written in read group PL field by aligner, null by default.

	// Structural Variant Calling
	error_rate                 = 0.01                            // Default error_rate for Svaba

	//indel_mask                 = null                            // Must provide blacklist bed file for indels based on genome to run Svaba

	// SV filtering (tumor only)
	pad_junc_filter            = 1000                            // Default Padding for SV Junction filtering

    // AMBER options
    target_bed_amber = null   // pass a bed file with target regions for running AMBER on targeted sequencing sample

    // HetPileups options
    filter_hets      = "TRUE"
    max_depth   = 1000

    // SigprofilerAssignment options
    sigprofilerassignment_cosmic_version = 3.4

	// fragCounter options
	midpoint_frag              = "TRUE"                           // If TRUE only count midpoint if FALSE then count bin footprint of every fragment interval: Default=TRUE
	windowsize_frag            = 1000                            // Window / bin size : Default= 200 in program but dryclean uses 1000 binsize, hence set to 1000
	minmapq_frag               = 60                              // Minimal map quality : Default = 1
	paired_frag                = "TRUE"                           // Is the dataset paired : Default = TRUE
	exome_frag                 = "FALSE"	                        // Use exons as bins instead of fixed window : Default = FALSE

    // Dryclean options
    center_dryclean                    = "TRUE"
    cbs_dryclean                         = "FALSE"
    cnsignif_dryclean                    = 0.00001
    wholeGenome_dryclean                 = "TRUE"
    blacklist_dryclean                   = "FALSE"
    blacklist_path_dryclean              = "NA"
    germline_filter_dryclean             = "FALSE"
    germline_file_dryclean               = "NA"
    human_dryclean                       = "TRUE"
    field_dryclean                       = "reads"

	// ASCAT options
	field_ascat                         = "foreground"
	hets_thresh_ascat                   = 0.2
	penalty_ascat                       = 70
	gc_correct_ascat                    = "TRUE"
	rebin_width_ascat                   = 50000
	from_maf_ascat                      = "FALSE"

    // CBS options
    cnsignif_cbs                        = 0.00001
    field_cbs                           = "foreground"
    name_cbs                            = "tumor"

	// JaBbA options
    blacklist_junctions_jabba = "NULL" // Rearrangement junctions to be excluded from consideration
    geno_jabba = "FALSE" // Whether the junction has genotype information
    indel_jabba = "exclude" // Decision to 'exclude' or 'include' small isolated INDEL-like events
    tfield_jabba = "tier" // Tier confidence meta data field in ra
    iter_jabba = 2 // Number of extra re-iterations allowed to rescue lower confidence junctions
    rescue_window_jabba = 10000 // Window size in bp to look for lower confidence junctions
    rescue_all_jabba = "TRUE" // If TRUE, will put all tiers in the first round of iteration
    nudgebalanced_jabba = "TRUE" // Whether to add a small incentive for chains of quasi-reciprocal junctions
    edgenudge_jabba = 0.1 // Hyper-parameter of how much to nudge or reward aberrant junction incorporation
    strict_jabba = "FALSE" // If TRUE, will only include junctions that exactly overlap segments
    allin_jabba = "FALSE" // If TRUE, will put all tiers in the first round of iteration
    field_jabba = "foreground" // Name of the metadata column of coverage that contains the data
    maxna_jabba = 0.9 // Any node with more NA than this fraction will be ignored
    ploidy_jabba = "NA" // Ploidy guess, can be a length 2 range
    purity_jabba = "NA" // Purity guess, can be a length 2 range
    pp_method_jabba = "ppgrid" // Method to infer purity and ploidy if not both given
    cnsignif_jabba = 0.00001 // Alpha value for CBS
    slack_jabba = 20 // Slack penalty to apply per loose end
    linear_jabba = "FALSE" // If TRUE, will use L1 loose end penalty
    tilim_jabba = 7200 // Time limit for JaBbA MIP
    epgap_jabba = 0.000001 // Threshold for calling convergence
    fix_thres_jabba = -1 // Not explicitly defined in the help text
    lp_jabba = "TRUE" // Run as LP (linear program instead of quadratic program)
    ism_jabba = "TRUE" // Include infinite sites constraints
    filter_loose_jabba = "FALSE" // Not explicitly defined in the help text
    gurobi_jabba = "FALSE" // Use Gurobi for MIP optimization instead of CPLEX
    verbose_jabba = "TRUE"

    // Allelic CN (Non-Integer Balance)
	field_non_integer_balance = "foreground"
	hets_thresh_non_integer_balance = 0.2
	overwrite_non_integer_balance = "TRUE"
	lambda_non_integer_balance = 20
	allin_non_integer_balance = "TRUE"
	fix_thresh_non_integer_balance = 10
	nodebounds_non_integer_balance = "TRUE"
	ism_non_integer_balance = "FALSE"
	epgap_non_integer_balance = 0.000001
	tilim_non_integer_balance = 6000
	gurobi_non_integer_balance = "FALSE"
	pad_non_integer_balance = 101

    // Allelic CN (LP-Phased Balance)
	lambda_lp_phased_balance = 100
	cnloh_lp_phased_balance = "TRUE"
	major_lp_phased_balance = "TRUE"
	allin_lp_phased_balance = "TRUE"
	marginal_lp_phased_balance = "TRUE"
	from_maf_lp_phased_balance = "FALSE"
	ism_lp_phased_balance = "FALSE"
	epgap_lp_phased_balance = 0.001
	hets_thresh_lp_phased_balance = 0.2
	min_bins_lp_phased_balance = 3
	min_width_lp_phased_balance = 0
	trelim_lp_phased_balance = 32000
	reward_lp_phased_balance = 10
	nodefileind_lp_phased_balance = 3
	tilim_lp_phased_balance = 6000

    // HRDetect
    hrdetect_mask = null
```

Files:
```
params {
    // illumina iGenomes reference file paths
    genomes {
	'GATK.GRCh37' {
	    fasta                            = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Sequence/WholeGenomeFasta/human_g1k_v37_decoy.fasta"
	    fasta_fai                        = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Sequence/WholeGenomeFasta/human_g1k_v37_decoy.fasta.fai"
        msisensorpro_list                = "${params.mski_base}/msisensor/hg19/human_g1k_v37_decoy.list"
	    chr_dir                          = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Sequence/Chromosomes"
	    dict                             = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Sequence/WholeGenomeFasta/human_g1k_v37_decoy.dict"
	    bwa                              = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Sequence/BWAIndex/"
	    dbsnp                            = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/GATKBundle/dbsnp_138.b37.vcf.gz"
	    dbsnp_tbi                        = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/GATKBundle/dbsnp_138.b37.vcf.gz.tbi"
        dbsnp_vqsr                       = '--resource:dbsnp,known=false,training=true,truth=false,prior=2.0 dbsnp_138.b37.vcf.gz'
	    known_snps                       = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/GATKBundle/1000G_phase1.snps.high_confidence.b37.vcf.gz"
	    known_snps_tbi                   = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/GATKBundle/1000G_phase1.snps.high_confidence.b37.vcf.gz.tbi"
	    known_indels                     = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/GATKBundle/{1000G_phase1,Mills_and_1000G_gold_standard}.indels.b37.vcf.gz"
	    known_indels_tbi                 = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/GATKBundle/{1000G_phase1,Mills_and_1000G_gold_standard}.indels.b37.vcf.gz.tbi"
        germline_resource                = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/GATKBundle/af-only-gnomad.raw.sites.vcf.gz"
        germline_resource_tbi            = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/GATKBundle/af-only-gnomad.raw.sites.vcf.gz.tbi"
        intervals                        = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/intervals/wgs_calling_regions_Sarek.list"
        mappability                      = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/Control-FREEC/out100m2_hg19.gem"
        ascat_alleles                    = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/ASCAT/G1000_alleles_hg19.zip"
        ascat_genome                     = 'hg19'
        ascat_loci                       = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/ASCAT/G1000_loci_hg19.zip"
        ascat_loci_gc                    = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/ASCAT/GC_G1000_hg19.zip"
        ascat_loci_rt                    = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/ASCAT/RT_G1000_hg19.zip"
	    snpeff_db                        = '87'
        snpeff_genome                    = 'GRCh37'
        vep_cache_version                = '111'
        vep_genome                       = 'GRCh37'
        vep_species                      = 'homo_sapiens'
        indel_mask                       = "${params.mski_base}/SVABA/hg19/snowman_blacklist.bed"
        germ_sv_db                       = "${params.mski_base}/SVABA/hg19/snowman_germline_mini_160413.bed"
        simple_seq_db                    = "${params.mski_base}/SVABA/hg19/repeat_masker_hg19_Simple.bed"
        blacklist_gridss                 = "${params.mski_base}/GRIDSS/blacklist/hg19/human_g1k_v37_decoy.fasta.bed"
        pon_gridss                       = "${params.mski_base}/GRIDSS/pon/hg19/"
        gnomAD_sv_db                     = "${params.mski_base}/junction_filter/gnomAD/hg19/GRCh19.gnomad_v2.1_sv.merged.rds"
        junction_pon_gridss              = "${params.mski_base}/junction_filter/GRIDSS/hg19/ggnome_gridss_pon_hg19_new.rds"
        junction_pon_svaba               = "${params.mski_base}/junction_filter/SvABA/hg19/ggnome_svaba_pon_hg19.rds"
        gcmapdir_frag                    = "${params.mski_base}/fragcounter/gcmapdir/hg19/"
        build_dryclean                   = 'hg19'
        hapmap_sites                     = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/GATKBundle/hapmap_3.3.b37.vcf.gz"
        ref_genome_version               = "37"
        genome_ver_amber                 = "V37"
        het_sites_amber                  = "${params.mski_base}/PURPLE/hg19/AmberGermlineSites.37.tsv.gz"
        gc_profile                       = "${params.mski_base}/PURPLE/hg19/GC_profile.1000bp.37.cnp"
        diploid_bed                       = "${params.mski_base}/PURPLE/hg19/DiploidRegions.37.bed.gz"
        sigprofilerassignment_genome     = "GRCh37"
        ensembl_data_dir                 = "${params.mski_base}/SAGE/hg19/ensembl_data/"
        somatic_hotspots                 = "${params.mski_base}/SAGE/hg19/KnownHotspots.somatic.37.vcf.gz"
        germline_hotspots                = "${params.mski_base}/SAGE/hg19/KnownHotspots.germline.37.vcf.gz"
        panel_bed                        = "${params.mski_base}/SAGE/hg19/ActionableCodingPanel.37.bed.gz"
        high_confidence_bed              = "${params.mski_base}/SAGE/hg19/NA12878_GIAB_highconf_IllFB-IllGATKHC-CG-Ion-Solid_ALLCHROM_v3.2.2_highconf.bed.gz"
        ensembl_data_resources           = "${params.mski_base}/SAGE/hg19/ensembl_data/"
        gnomAD_snv_db                    = "${params.mski_base}/SAGE/hg19/PON_tumoronly/GRCh19.gnomad.snv.merged.vcf.gz"
        gnomAD_snv_db_tbi                = "${params.mski_base}/SAGE/hg19/PON_tumoronly/GRCh19.gnomad.snv.merged.vcf.gz.tbi"
        sage_germline_pon                = "${params.mski_base}/SAGE/hg19/PON_tumoronly/SAGE_germline_PON_hg19.vcf.gz"
        sage_germline_pon_tbi            = "${params.mski_base}/SAGE/hg19/PON_tumoronly/SAGE_germline_PON_hg19.vcf.gz.tbi"
        sage_pon 			             = "${params.mski_base}/PAVE/hg19/SageGermlinePon.1000x.37.tsv.gz"
        sage_blocklist_regions 			 = "${params.mski_base}/PAVE/hg19/KnownBlacklist.germline.37.bed"
        sage_blocklist_sites 			 = "${params.mski_base}/PAVE/hg19/KnownBlacklist.germline.37.vcf.gz"
        clinvar_annotations 			 = "${params.mski_base}/PAVE/hg19/clinvar.37.vcf.gz"
        segment_mappability 			 = "${params.mski_base}/PAVE/hg19/mappability_150.37.bed.gz"
        driver_gene_panel 			     = "${params.mski_base}/PAVE/hg19/DriverGenePanel.37.tsv"
        gnomad_resource 			     = "${params.mski_base}/PAVE/hg19/gnomad_variants_v37.csv.gz"
        pon_dryclean                     = "${params.mski_base}/dryclean/pon/hg19/fixed.detergent.rds"
        blacklist_coverage_jabba         = "${params.mski_base}/JaBbA/blacklist_coverage/hg19/maskA_re.rds"
        gencode_fusions                  = "${params.mski_base}/fusions/hg19/gencode.v29lift37.annotation.nochr.rds"
        build_non_integer_balance        = "hg19"
        mask_non_integer_balance         = "${params.mski_base}/allelic_cn/non_integer_balance/hg19/mask_with_segdups.rds"
        mask_lp_phased_balance           = "${params.mski_base}/allelic_cn/lp_phased_balance/lp_phased_balance_maskA_re.rds"
        ref_hrdetect                     = "hg19"
	}
	'GATK.GRCh38' {
        fasta                        = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Sequence/WholeGenomeFasta/Homo_sapiens_assembly38.fasta"
        fasta_fai                    = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Sequence/WholeGenomeFasta/Homo_sapiens_assembly38.fasta.fai"
        chr_dir                      = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Sequence/Chromosomes"
        dict                         = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Sequence/WholeGenomeFasta/Homo_sapiens_assembly38.dict"
        bwa                          = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Sequence/BWAIndex/"
        bwamem2                      = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Sequence/BWAmem2Index/"
        cf_chrom_len                 = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Sequence/Length/Homo_sapiens_assembly38.len"
        dbsnp                        = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/GATKBundle/dbsnp_146.hg38.vcf.gz"
        dbsnp_tbi                    = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/GATKBundle/dbsnp_146.hg38.vcf.gz.tbi"
        known_snps                   = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/GATKBundle/1000G_omni2.5.hg38.vcf.gz"
        known_snps_tbi               = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/GATKBundle/1000G_omni2.5.hg38.vcf.gz.tbi"
        known_indels                 = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/GATKBundle/{Mills_and_1000G_gold_standard.indels.hg38,beta/Homo_sapiens_assembly38.known_indels}.vcf.gz"
        known_indels_tbi             = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/GATKBundle/{Mills_and_1000G_gold_standard.indels.hg38,beta/Homo_sapiens_assembly38.known_indels}.vcf.gz.tbi"
        germline_resource            = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/GATKBundle/af-only-gnomad.hg38.vcf.gz"
        germline_resource_tbi        = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/GATKBundle/af-only-gnomad.hg38.vcf.gz.tbi"
        intervals                    = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/intervals/wgs_calling_regions_noseconds.hg38.bed"
        mappability                  = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/Control-FREEC/out100m2_hg38.gem"
        pon                          = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/GATKBundle/1000g_pon.hg38.vcf.gz"
        pon_tbi                      = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/GATKBundle/1000g_pon.hg38.vcf.gz.tbi"
        ascat_alleles                = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/ASCAT/G1000_alleles_hg38.zip"
        ascat_genome                 = 'hg38'
        ascat_loci                   = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/ASCAT/G1000_loci_hg38.zip"
        ascat_loci_gc                = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/ASCAT/GC_G1000_hg38.zip"
        ascat_loci_rt                = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/ASCAT/RT_G1000_hg38.zip"
        snpeff_db                    = 105
        snpeff_genome                = 'GRCh38'
        vep_cache_version            = '110'
        vep_genome                   = 'GRCh38'
        vep_species                  = 'homo_sapiens'
        indel_mask                   = "${params.mski_base}/SVABA/hg38/snowman_blacklist.hg38.bed"
        germ_sv_db                   = "${params.mski_base}/SVABA/hg38/snowman_germline_mini_hg38.bed"
        simple_seq_db                = "${params.mski_base}/SVABA/hg38/repeat_masker_hg38_simple.bed"
        blacklist_gridss             = "${params.mski_base}/GRIDSS/blacklist/hg38/ENCFF356LFX_hg38.bed"
        pon_gridss                   = "${params.mski_base}/GRIDSS/pon/hg38/"
        gnomAD_sv_db                 = "${params.mski_base}/junction_filter/gnomAD/hg38/gnomad.v4.0.sv.rds"
        junction_pon_gridss          = "${params.mski_base}/junction_filter/GRIDSS/hg38/atac_gridss_pon_unfiltered_updated_hg38.rds"
        junction_pon_svaba           = "${params.mski_base}/junction_filter/SvABA/hg38/atac_svaba_pon_unfiltered_updated.rds"
        gcmapdir_frag                = "${params.mski_base}/fragcounter/gcmapdir/hg38/"
        build_dryclean               = 'hg38'
        hapmap_sites                 = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/GATKBundle/hapmap_3.3.hg38.vcf.gz"
        ref_genome_version               = "38"
        genome_ver_amber                 = "V38"
        het_sites_amber                  = "${params.igenomes_base}/PURPLE/hg38/AmberGermlineSites.38.tsv.gz"
        gc_profile                       = "${params.igenomes_base}/PURPLE/hg38/GC_profile.1000bp.38.cnp"
        diploid_bed                       = "${params.mski_base}/PURPLE/hg38/DiploidRegions.38.bed.gz"
        sigprofilerassignment_genome     = "GRCh38"
        ensembl_data_dir                 = "${params.mski_base}/SAGE/hg38/ensembl_data/"
        somatic_hotspots                 = "${params.mski_base}/SAGE/hg38/KnownHotspots.somatic.38.vcf.gz"
        germline_hotspots                = "${params.mski_base}/SAGE/hg38/KnownHotspots.germline.38.vcf.gz"
        panel_bed                        = "${params.mski_base}/SAGE/hg38/ActionableCodingPanel.38.bed.gz"
        high_confidence_bed              = "${params.mski_base}/SAGE/hg38/NA12878_GIAB_highconf_IllFB-IllGATKHC-CG-Ion-Solid_ALLCHROM_v3.2.2_highconf.bed.gz"
        ensembl_data_resources           = "${params.mski_base}/SAGE/hg38/ensembl_data/"
        gnomAD_snv_db                    = "${params.mski_base}/SAGE/hg38/PON_tumoronly/GRCh19.gnomad.snv.merged.vcf.gz"
        gnomAD_snv_db_tbi                = "${params.mski_base}/SAGE/hg38/PON_tumoronly/GRCh19.gnomad.snv.merged.vcf.gz.tbi"
        sage_germline_pon                = "${params.mski_base}/SAGE/hg38/PON_tumoronly/SAGE_germline_PON_hg38.vcf.gz"
        sage_germline_pon_tbi            = "${params.mski_base}/SAGE/hg38/PON_tumoronly/SAGE_germline_PON_hg38.vcf.gz.tbi"
        sage_pon 			             = "${params.mski_base}/PAVE/hg38/SageGermlinePon.1000x.38.tsv.gz"
        sage_blocklist_regions 			 = "${params.mski_base}/PAVE/hg38/KnownBlacklist.germline.38.bed"
        sage_blocklist_sites 			 = "${params.mski_base}/PAVE/hg38/KnownBlacklist.germline.38.vcf.gz"
        clinvar_annotations 			 = "${params.mski_base}/PAVE/hg38/clinvar.38.vcf.gz"
        segment_mappability 			 = "${params.mski_base}/PAVE/hg38/mappability_150.38.bed.gz"
        driver_gene_panel 			     = "${params.mski_base}/PAVE/hg38/DriverGenePanel.38.tsv"
        gnomad_resource 			     = "${params.mski_base}/PAVE/hg38/gnomad_variants_v38.csv.gz"
        pon_dryclean                 = "${params.mski_base}/dryclean/pon/hg38/detergent.rds"
        blacklist_coverage_jabba     = "${params.mski_base}/JaBbA/blacklist_coverage/hg38/hg38.coverage.mask.rds"
        build_non_integer_balance    = "hg38"
        mask_non_integer_balance     = "${params.mski_base}/allelic_cn/non_integer_balance/hg38/mask_with_segdups.rds"
        mask_lp_phased_balance       = "${params.mski_base}/allelic_cn/lp_phased_balance/lp_phased_balance_maskA_re.rds"
        ref_hrdetect                     = "hg38"
	}
```

To modify the parameters, simply change/add the values in the `params.json` file
e.g "pon_dryclean": "new/path/to/pon.rds"
e.g "rescue_window_jabba": 5000
