Metadata-Version: 2.4
Name: opentargets-otter
Version: 25.0.15
Summary: Open Targets Task ExecutoR
Author-email: Open Targets Core Team <devs@opentargets.org>
License: Apache-2.0
License-File: LICENSE
Requires-Python: >=3.11
Requires-Dist: filelock==3.18.0
Requires-Dist: google-cloud-storage==3.1.1
Requires-Dist: loguru==0.7.3
Requires-Dist: pydantic==2.11.7
Requires-Dist: pyyaml==6.0.2
Requires-Dist: requests==2.32.4
Requires-Dist: urllib3==2.5.0
Provides-Extra: docs
Requires-Dist: autodoc-pydantic>=2.2.0; extra == 'docs'
Requires-Dist: esbonio>=0.16.5; extra == 'docs'
Requires-Dist: sphinx-autobuild>=2024.10.3; extra == 'docs'
Requires-Dist: sphinx-issues>=5.0.1; extra == 'docs'
Requires-Dist: sphinx-rtd-theme>=3.0.2; extra == 'docs'
Requires-Dist: sphinx>=8.2.3; extra == 'docs'
Provides-Extra: test
Requires-Dist: coverage==7.9.1; extra == 'test'
Requires-Dist: freezegun==1.5.2; extra == 'test'
Requires-Dist: pytest-mock==3.14.1; extra == 'test'
Requires-Dist: pytest==8.4.1; extra == 'test'
Description-Content-Type: text/markdown

# Otter — Open Targets' Task ExecutoR

[![pypi](https://raster.shields.io/pypi/v/opentargets-otter.png)](https://pypi.org/project/opentargets-otter/)
[![docs status](https://github.com/opentargets/otter/actions/workflows/docs.yaml/badge.svg)](https://opentargets.github.io/otter)
[![build](https://github.com/opentargets/otter/actions/workflows/ci.yaml/badge.svg)](https://github.com/opentargets/otter/actions/workflows/ci.yaml)
[![license](https://img.shields.io/github/license/opentargets/otter.svg)](LICENSE)

Otter is the task execution framework used in the Open Targets data pipeline.

It provides an easy-to-use API for implementing generic tasks, which are then
used by describing the flow in a YAML configuration file.

Take a look at a [simple example](https://opentargets.github.io/otter/#otter-example).


## Features

This is a list of what you get for free by using Otter:
  * **Parallel execution**: Tasks are run in parallel, and Otter will take care of
    the dependency planning.
  * **Declarative configuration**: Steps are described in a YAML file, as a list of
    tasks with different specifications. The tasks themselves are implemented
    in Python, enabling a lot of flexibility.
  * **Logging**: Otter uses the [loguru library](https://github.com/delgan/loguru)
    for logging. It handles all the logging related to the task flow, and also logs
    into the manifest (see next item).
  * **Manifest**: Otter manages a manifest file that describes a pipeline run. It
    is used both for debugging and for tracking the provenance of the data. A
    series of simple JQ queries can be used to extract information from it (see
    Useful JQ queries).
  * **Error management**: Otter will stop the execution of the pipeline if a task fails,
    and will log the error in the manifest.
  * **Scratchpad**: A place to store variables that can be substituted into the
    configuration file (something like a very simple templating engine), enabling
    easy parametrization of runs and passing of data between tasks.
  * **Utilities**: Otter provides interfaces to use Google Cloud Storage and other
    remote storage services, and a bunch of utilities to help you write tasks.
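
To make the parallel execution and error management items more concrete, here
is a minimal sketch of dependency-ordered parallel execution using only the
Python standard library. It is not Otter's implementation or API: `run_tasks`,
`tasks` and `requires` are hypothetical names chosen for illustration, and the
sketch assumes each task is a plain zero-argument callable.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_tasks(tasks, requires, max_workers=4):
    """Run callables in dependency order, in parallel where possible.

    Hypothetical helper, not part of Otter.
    tasks: dict mapping a task name to a zero-argument callable.
    requires: dict mapping a task name to the names it depends on.
    Returns the task names in the order they were marked as finished.
    """
    sorter = TopologicalSorter({name: requires.get(name, ()) for name in tasks})
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    finished = []
    running = {}  # future -> task name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every task whose dependencies have all completed.
            for name in sorter.get_ready():
                running[pool.submit(tasks[name])] = name
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                # Re-raise a task failure; no further tasks are scheduled.
                future.result()
                sorter.done(name)
                finished.append(name)
    return finished


# Usage: "parse" waits for "download", and "report" waits for "parse".
order = run_tasks(
    {"download": lambda: None, "parse": lambda: None, "report": lambda: None},
    {"parse": ["download"], "report": ["parse"]},
)
```

Tasks with no dependency between them are submitted to the pool together, so
they can run concurrently, while an exception in any task propagates to the
caller instead of letting dependent tasks start.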


## Documentation

See it [here](https://opentargets.github.io/otter).


## Development

> [!IMPORTANT]
> Remember to run `make dev` before starting development. This will set up a very
> simple git hook that does a few checks before committing.
