Metadata-Version: 2.1
Name: qbatch
Version: 2.2.1
Summary: Execute shell command lines in parallel on Slurm, S(un|on of) Grid Engine (SGE) and PBS/Torque clusters
Home-page: https://github.com/pipitone/qbatch
Author: Jon Pipitone, Gabriel A. Devenyi
Author-email: jon@pipitone.ca, gdevenyi@gmail.com
License: Unlicense
Description: # qbatch
        Execute shell command lines in parallel on Slurm, S(un|on of) Grid Engine (SGE),
        and PBS/Torque clusters
        
        [![Travis CI build status](https://travis-ci.org/pipitone/qbatch.svg?branch=master)](https://travis-ci.org/pipitone/qbatch)
        
        qbatch is a tool for executing commands in parallel across a compute cluster.
        It takes as input a list of **commands** (shell command lines or executable
        scripts) in a file or piped to ``qbatch``. The list of commands is divided into
        arbitrarily sized **chunks** which are submitted as jobs to the cluster either as
        individual submissions or an array. Each job runs the commands in its chunk in
        parallel according to **cores**. Commands can also be run locally on systems
        with no cluster capability via GNU Parallel.
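        
        To make the relationship between commands, chunks, and cores concrete, here is
        a minimal Python sketch of the arithmetic described above. It is an
        illustration only, not qbatch's actual implementation; the function names
        ``chunk_commands`` and ``parallel_width`` are hypothetical.
        
        ```python
        def chunk_commands(commands, chunksize):
            """Split a list of commands into consecutive chunks of at most `chunksize`."""
            if chunksize < 1:
                raise ValueError("chunksize must be at least 1")
            return [commands[i:i + chunksize] for i in range(0, len(commands), chunksize)]
        
        
        def parallel_width(chunk, cores):
            """Commands from one chunk that can run at once: limited by chunk size and cores."""
            return min(len(chunk), cores)
        
        
        # Ten placeholder commands, packed four per job and run three at a time
        commands = ["echo {}".format(n) for n in range(10)]
        chunks = chunk_commands(commands, chunksize=4)
        widths = [parallel_width(chunk, cores=3) for chunk in chunks]
        ```
        
        With these inputs the sketch yields three chunks (one job each), and the last,
        shorter chunk runs fewer commands in parallel than the requested core count.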
        
        ``qbatch`` can also be used from Python through the ``qbatch.qbatchParser`` and
        ``qbatch.qbatchDriver`` functions. ``qbatchParser`` accepts a list of
        command line options identical to the shell interface, parses them, and submits
        jobs. ``qbatchDriver`` accepts key-value pairs corresponding to the outputs of
        the argument parser, plus the ``task_list`` option, which provides a list of
        command strings to run.
        
        ## Installation
        
        ```sh
        $ pip install qbatch
        ```
        
        ## Dependencies
        ``qbatch`` requires python (>2.7) and [GNU Parallel](https://gnu.org/s/parallel).
        For Torque/PBS and gridengine clusters, ``qbatch`` requires the ``qsub`` and
        ``qstat`` commands. For Slurm workload manager, ``qbatch`` requires the
        ``sbatch`` and ``squeue`` commands.
        
        ## Environment variable defaults
        qbatch supports several environment variables to customize defaults for your
        local system.
        
        ```sh
        $ export QBATCH_PPJ=12                   # requested processors per job
        $ export QBATCH_CHUNKSIZE=$QBATCH_PPJ    # commands to run per job
        $ export QBATCH_CORES=$QBATCH_PPJ        # commands to run in parallel per job
        $ export QBATCH_NODES=1                  # number of compute nodes to request for the job, typically for MPI jobs
        $ export QBATCH_MEM="0"                  # requested memory per job
        $ export QBATCH_MEMVARS="mem"            # memory request variable to set
        $ export QBATCH_SYSTEM="pbs"             # queuing system to use ("pbs", "sge", "slurm", or "local")
        $ export QBATCH_SGE_PE="smp"             # (SGE-only) parallel environment name
        $ export QBATCH_QUEUE="1day"             # Name of submission queue
        $ export QBATCH_OPTIONS=""               # Arbitrary cluster options to embed in all jobs
        $ export QBATCH_SCRIPT_FOLDER=".qbatch/" # Location to generate jobfiles for submission
        $ export QBATCH_SHELL="/bin/sh"          # Shell to use to evaluate jobfile
        ```
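        
        The sketch below illustrates the general pattern of resolving a setting from a
        `QBATCH_*` environment variable with a fallback. It is not qbatch's own code:
        the helper ``env_default`` is hypothetical, and the fallback values are the
        defaults listed in the command line help.
        
        ```python
        import os
        
        
        def env_default(name, fallback, environ=None):
            """Return the value of environment variable `name`, or `fallback` if unset."""
            environ = os.environ if environ is None else environ
            return environ.get(name, fallback)
        
        
        # Example environment, passed explicitly so the sketch does not depend on your shell
        example_env = {"QBATCH_PPJ": "12", "QBATCH_SYSTEM": "slurm"}
        
        ppj = int(env_default("QBATCH_PPJ", "1", example_env))              # set in example_env
        chunksize = int(env_default("QBATCH_CHUNKSIZE", "1", example_env))  # unset, uses fallback
        system = env_default("QBATCH_SYSTEM", "local", example_env)         # set in example_env
        ```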
        
        ## Command line help
        
        ```
        usage: qbatch [-h] [-w WALLTIME] [-c CHUNKSIZE] [-j CORES] [--ppj PPJ]
                      [-N JOBNAME] [--mem MEM] [-q QUEUE] [-n] [-v] [--version]
                      [--depend DEPEND] [-d WORKDIR] [--logdir LOGDIR] [-o OPTIONS]
                      [--header HEADER] [--footer FOOTER] [--nodes NODES]
                      [--sge-pe SGE_PE] [--memvars MEMVARS]
                      [--pbs-nodes-spec PBS_NODES_SPEC] [-i]
                      [-b {pbs,sge,slurm,local,container}] [--env {copied,batch,none}]
                      [--shell SHELL]
                      ...
        
        Submits a list of commands to a queueing system. The list of commands can be
        broken up into 'chunks' when submitted, so that the commands in each chunk run
        in parallel (using GNU parallel). The job script(s) generated by qbatch are
        stored in the folder .qbatch/
        
        positional arguments:
          command_file          An input file containing a list of shell commands to
                                be submitted, - to read the command list from stdin or
                                -- followed by a single command
        
        optional arguments:
          -h, --help            show this help message and exit
          -w WALLTIME, --walltime WALLTIME
                                Maximum walltime for an array job element or
                                individual job (default: None)
          -c CHUNKSIZE, --chunksize CHUNKSIZE
                                Number of commands from the command list that are
                                wrapped into each job (default: 1)
          -j CORES, --cores CORES
                                Number of commands each job runs in parallel. If the
                                chunk size (-c) is smaller than -j then only chunk
                                size commands will run in parallel. This option can
                                also be expressed as a percentage (e.g. 100%) of the
                                total available cores (default: 1)
          --ppj PPJ             Requested number of processors per job (aka ppn on
                                PBS, slots on SGE, cpus per task on SLURM). Cores can
                                be over subscribed if -j is larger than --ppj (useful
                                to make use of hyper-threading on some systems)
                                (default: 1)
          -N JOBNAME, --jobname JOBNAME
                                Set job name (defaults to name of command file, or
                                STDIN) (default: None)
          --mem MEM             Memory required for each job (e.g. --mem 1G). This
                                value will be set on each variable specified in
                                --memvars. To not set any memory requirement, set this
                                to 0 (default: 0)
          -q QUEUE, --queue QUEUE
                                Name of queue to submit jobs to (defaults to no queue)
                                (default: None)
          -n, --dryrun          Dry run; Create jobfiles but do not submit or run any
                                commands (default: False)
          -v, --verbose         Verbose output (default: False)
          --version             show program's version number and exit
        
        advanced options:
          --depend DEPEND       Wait for successful completion of job(s) with name
                                matching given glob pattern or job id matching given
                                job id(s) before starting (default: None)
          -d WORKDIR, --workdir WORKDIR
                                Job working directory (default:
                                current working directory)
          --logdir LOGDIR       Directory to store log files (default:
                                {workdir}/logs)
          -o OPTIONS, --options OPTIONS
                                Custom options passed directly to the queuing system
                                (e.g. --options "-l vf=8G"). This option can be given
                                multiple times (default: [])
          --header HEADER       A line to insert verbatim at the start of the script,
                                and will be run once per job. This option can be given
                                multiple times (default: None)
          --footer FOOTER       A line to insert verbatim at the end of the script,
                                and will be run once per job. This option can be given
                                multiple times (default: None)
          --nodes NODES         (PBS and SLURM only) Nodes to request per job
                                (default: 1)
          --sge-pe SGE_PE       (SGE-only) The parallel environment to use if more
                                than one processor per job is requested (default: smp)
          --memvars MEMVARS     A comma-separated list of variables to set with the
                                memory limit given by the --mem option (e.g.
                                --memvars=h_vmem,vf) (default: mem)
          --pbs-nodes-spec PBS_NODES_SPEC
                                (PBS-only) String to be inserted into nodes= line of
                                job (default: None)
          -i, --individual      Submit individual jobs instead of an array job
                                (default: False)
          -b {pbs,sge,slurm,local,container}, --system {pbs,sge,slurm,local,container}
                                The type of queueing system to use. 'pbs' and 'sge'
                                both make calls to qsub to submit jobs. 'slurm' calls
                                sbatch. 'local' runs the entire command list (without
                                chunking) locally. 'container' creates a joblist and
                                metadata file, to pass commands out of a container to
                                a monitoring process for submission to a batch system.
                                (default: local)
          --env {copied,batch,none}
                                Determines how your environment is propagated when
                                your job runs. "copied" records your environment
                                settings in the job submission script, "batch" uses
                                the cluster's mechanism for propagating your
                                environment, and "none" does not propagate any
                                environment variables. (default: copied)
          --shell SHELL         Shell to use for spawning jobs and launching single
                                commands (default: /bin/sh)
        ```
        
        ## Some examples:
        ```sh
        # Submit an array job from a list of commands (one per line)
        # Generates a job script in ./.qbatch/ and job logs appear in ./logs/
        # All defaults are inherited from QBATCH_* environment variables
        $ qbatch commands.txt
        
        # Submit a single command to the cluster
        $ qbatch -- echo hello
        
        # Set the walltime for each job
        $ qbatch -w 3:00:00 commands.txt
        
        # Run 24 commands per job
        $ qbatch -c24 commands.txt
        
        # Pack 24 commands per job, run 12 in parallel at a time
        $ qbatch -c24 -j12 commands.txt
        
        # Start jobs after successful completion of existing jobs with names starting with "stage1_"
        $ qbatch --depend 'stage1_*' commands.txt
        
        # Pipe a list of commands to qbatch
        $ parallel echo process.sh {} ::: *.dat | qbatch -
        
        # Run jobs locally with GNU Parallel, 12 commands in parallel
        $ qbatch -b local -j12 commands.txt
        
        # Many options don't make sense locally: chunking, individual vs array, nodes,
        # ppj, memory requests, and --depend are ignored
        ```
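        
        The piping example can also be driven from Python 3. The following is a minimal
        sketch, assuming ``qbatch`` is on your ``PATH``; ``process.sh`` and the
        ``*.dat`` files are placeholders carried over from the example above, and
        ``build_command_list`` and ``submit_via_stdin`` are hypothetical helper names.
        The ``-n`` flag requests a dry run, so job files are created but nothing is
        submitted.
        
        ```python
        import glob
        import shlex
        import subprocess
        
        
        def build_command_list(script, inputs):
            """Return one shell command line per input file, newline-terminated."""
            lines = ["{} {}".format(script, shlex.quote(path)) for path in inputs]
            return "".join(line + "\n" for line in lines)
        
        
        def submit_via_stdin(command_text, extra_args=("-n",)):
            """Pipe the command list to `qbatch -`; the default -n makes this a dry run."""
            return subprocess.run(
                ["qbatch"] + list(extra_args) + ["-"],
                input=command_text,
                universal_newlines=True,
                check=True,
            )
        
        
        command_text = build_command_list("process.sh", sorted(glob.glob("*.dat")))
        # submit_via_stdin(command_text)  # uncomment on a machine where qbatch is installed
        ```
        
        Quoting each path with ``shlex.quote`` keeps file names containing spaces
        intact when the job script later evaluates the command line.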
        
        A python script example:
        ```python
        # Submit jobs to a cluster using the QBATCH_* environment defaults
        import qbatch
        task_list = ['echo hello', 'echo hello2']
        qbatch.qbatchDriver(task_list=task_list)
        ```
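        
        For the ``qbatchParser`` interface, which is described earlier as accepting the
        same options as the shell, a call might look like the sketch below. The exact
        call signature is an assumption based on that description rather than on
        checked source; ``build_args`` and ``submit`` are hypothetical helpers, and the
        import is deferred so the snippet can be run without qbatch installed.
        
        ```python
        def build_args(command_file, chunksize=None, cores=None, walltime=None, dryrun=True):
            """Assemble a qbatch option list using flags documented in the command line help."""
            args = []
            if walltime is not None:
                args += ["-w", str(walltime)]
            if chunksize is not None:
                args += ["-c", str(chunksize)]
            if cores is not None:
                args += ["-j", str(cores)]
            if dryrun:
                args.append("-n")  # dry run: create job files but do not submit
            args.append(command_file)
            return args
        
        
        def submit(args):
            """Hand the option list to qbatch (assumed to take a list of strings)."""
            import qbatch  # deferred import: only needed when actually submitting
            return qbatch.qbatchParser(args)
        
        
        args = build_args("commands.txt", chunksize=24, cores=12, walltime="3:00:00")
        # submit(args)  # uncomment where qbatch is installed
        ```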
        
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Intended Audience :: Science/Research
Classifier: License :: Public Domain
Classifier: Natural Language :: English
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: System :: Clustering
Classifier: Topic :: System :: Distributed Computing
Classifier: Topic :: Utilities
Description-Content-Type: text/markdown
