Metadata-Version: 2.1
Name: kfp-tekton
Version: 1.3.0
Summary: Tekton Compiler for Kubeflow Pipelines
Home-page: https://github.com/kubeflow/kfp-tekton/
Author: kubeflow.org
License: Apache 2.0
Platform: UNKNOWN
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Education
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.6.1
Description-Content-Type: text/markdown

[![PyPI](https://img.shields.io/pypi/v/kfp-tekton?label=PyPI)](https://pypi.org/project/kfp-tekton/)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/kfp-tekton?label=Downloads)](https://pypi.org/project/kfp-tekton/#files)
[![PyPI - License](https://img.shields.io/pypi/l/kfp-tekton?label=License)](https://www.apache.org/licenses/LICENSE-2.0)


<h1><a id="kubeflow-pipelines-sdk-for-tekton">Kubeflow Pipelines SDK for Tekton</a></h1>

The Kubeflow Pipelines [SDK](https://www.kubeflow.org/docs/components/pipelines/sdk/sdk-overview/)
allows data scientists to define end-to-end machine learning and data pipelines.
The output of the Kubeflow Pipelines SDK compiler is YAML for [Argo](https://github.com/argoproj/argo-workflows).

The `kfp-tekton` SDK extends the `Compiler` and the `Client` of the Kubeflow
Pipelines SDK to generate [Tekton](https://github.com/tektoncd/pipeline) YAML
and to subsequently upload and run the pipeline with the Kubeflow Pipelines engine
backed by Tekton.

<h2><a id="table-of-contents">Table of Contents</a></h2>

<!-- START of ToC generated by running ./tools/mdtoc.sh sdk/README.md -->

  - [SDK Packages Overview](#sdk-packages-overview)
  - [Project Prerequisites](#project-prerequisites)
  - [Installation](#installation)
  - [Compiling a Kubeflow Pipelines DSL Script](#compiling-a-kubeflow-pipelines-dsl-script)
  - [Big data passing workspace configuration](#big-data-passing-workspace-configuration)
  - [Running the Compiled Pipeline on a Tekton Cluster](#running-the-compiled-pipeline-on-a-tekton-cluster)
  - [List of Available Features](#list-of-available-features)
  - [List of Helper Functions for Python Kubernetes Client](#list-of-helper-functions-for-python-kubernetes-client)
  - [Tested Pipelines](#tested-pipelines)
  - [Troubleshooting](#troubleshooting)

<!-- END of ToC generated by running ./tools/mdtoc.sh sdk/README.md -->


<h2><a id="sdk-packages-overview">SDK Packages Overview</a></h2>

The `kfp-tekton` SDK is an extension to the [Kubeflow Pipelines SDK](https://www.kubeflow.org/docs/pipelines/sdk/sdk-overview/)
adding the `TektonCompiler` and the `TektonClient`:

* `kfp_tekton.compiler` includes classes and methods for compiling pipeline Python DSL into a Tekton PipelineRun YAML spec. The methods in this package
  include, but are not limited to, the following:

  * `kfp_tekton.compiler.TektonCompiler.compile` compiles your Python DSL code
    into a single static configuration (in YAML format) that the Kubeflow Pipelines service
    can process. The Kubeflow Pipelines service converts the static
    configuration into a set of Kubernetes resources for execution.

* `kfp_tekton.TektonClient` contains the Python client libraries for the [Kubeflow Pipelines API](https://www.kubeflow.org/docs/pipelines/reference/api/kubeflow-pipeline-api-spec/).
  Methods in this package include, but are not limited to, the following:

  * `kfp_tekton.TektonClient.upload_pipeline` uploads a local file to create a new pipeline in Kubeflow Pipelines.
  * `kfp_tekton.TektonClient.create_experiment` creates a pipeline
    [experiment](https://www.kubeflow.org/docs/pipelines/concepts/experiment/) and returns an
    experiment object.
  * `kfp_tekton.TektonClient.run_pipeline` runs a pipeline and returns a run object.
  * `kfp_tekton.TektonClient.create_run_from_pipeline_func` compiles a pipeline
    function and submits it for execution on Kubeflow Pipelines.
  * `kfp_tekton.TektonClient.create_run_from_pipeline_package` runs a local
    pipeline package on Kubeflow Pipelines.


<h2><a id="project-prerequisites">Project Prerequisites</a></h2>

 - Python: `3.7` or later
 - Tekton: [`v0.36.0`](https://github.com/tektoncd/pipeline/releases/tag/v0.36.0) or [later](https://github.com/tektoncd/pipeline/releases/latest)
 - Tekton CLI: [`0.23.1`](https://github.com/tektoncd/cli/releases/tag/v0.23.1)
 - Kubeflow Pipelines: [KFP with Tekton backend](https://github.com/kubeflow/kfp-tekton/blob/master/guides/kfp_tekton_install.md)

Follow the instructions for [installing project prerequisites](https://github.com/kubeflow/kfp-tekton/blob/master/sdk/python/README.md#development-prerequisites)
and take note of some important caveats.


<h2><a id="installation">Installation</a></h2>

You can install the latest release of the `kfp-tekton` compiler from
[PyPI](https://pypi.org/project/kfp-tekton/). We recommend creating a Python
virtual environment first:

    python3 -m venv .venv
    source .venv/bin/activate

    pip install kfp-tekton

Alternatively, you can install the latest version of the `kfp-tekton` compiler
from source by cloning the repository [https://github.com/kubeflow/kfp-tekton](https://github.com/kubeflow/kfp-tekton):

1. Clone the `kfp-tekton` repo:

   ```
   git clone https://github.com/kubeflow/kfp-tekton.git
   cd kfp-tekton
   ```

2. Set up a Python environment with Conda or a Python virtual environment:

   ```
   python3 -m venv .venv
   source .venv/bin/activate
   ```

3. Build the compiler:

   ```
   pip install -e sdk/python
   ```

4. Run the compiler tests (optional):

   ```
   pip install pytest
   make test
   ```

<h2><a id="compiling-a-kubeflow-pipelines-dsl-script">Compiling a Kubeflow Pipelines DSL Script</a></h2>

The `kfp-tekton` Python package comes with the `dsl-compile-tekton` command line
executable, which should be available in your terminal shell environment after
installing the `kfp-tekton` Python package.

If you cloned the `kfp-tekton` project, you can find example pipelines in the
`samples` folder or in the `sdk/python/tests/compiler/testdata` folder.

    dsl-compile-tekton \
        --py sdk/python/tests/compiler/testdata/parallel_join.py \
        --output pipeline.yaml


**Note**: If the KFP DSL script contains a `__main__` method calling the
`kfp_tekton.compiler.TektonCompiler.compile()` function:

```Python
if __name__ == "__main__":
    from kfp_tekton.compiler import TektonCompiler
    TektonCompiler().compile(pipeline_func, "pipeline.yaml")
```

... then the pipeline can be compiled by running the DSL script with the `python3`
executable from a command-line shell, producing a Tekton YAML file `pipeline.yaml`
in the same directory:

    python3 pipeline.py


<h2><a id="big-data-passing-workspace-configuration">Big data passing workspace configuration</a></h2>

When [big data files](https://github.com/kubeflow/kfp-tekton/blob/master/samples/big_data_passing/big_data_passing_description.ipynb)
are defined in KFP, Tekton creates a workspace to share these big data files
among tasks that run in the same pipeline. By default, the workspace is a
ReadWriteMany PVC with 2Gi of storage, using the `kfp-csi-s3` storage class to push artifacts to S3.
You can change this configuration using the environment variables below:

```shell
export DEFAULT_ACCESSMODES=ReadWriteMany
export DEFAULT_STORAGE_SIZE=2Gi
export DEFAULT_STORAGE_CLASS=kfp-csi-s3
```
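The same defaults can be overridden from Python before invoking the compiler;
a minimal sketch, assuming the compiler reads these variables at compile time
(the example values shown are placeholders for your cluster's settings):

```python
import os

# Override the big data passing workspace defaults before compiling.
# The variable names are the ones documented above; the values here
# are illustrative and should match your cluster's capabilities.
os.environ["DEFAULT_ACCESSMODES"] = "ReadWriteOnce"   # e.g. for a single-node cluster
os.environ["DEFAULT_STORAGE_SIZE"] = "5Gi"
os.environ["DEFAULT_STORAGE_CLASS"] = "standard"      # use your cluster's storage class
```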

To pass big data using cloud provider volumes, it's recommended to use the
[volume_based_data_passing_method](https://github.com/kubeflow/kfp-tekton/blob/master/sdk/python/tests/compiler/testdata/artifact_passing_using_volume.py)
for both the Tekton and Argo runtimes.


<h2><a id="running-the-compiled-pipeline-on-a-tekton-cluster">Running the Compiled Pipeline on a Tekton Cluster</a></h2>

After compiling the `sdk/python/tests/compiler/testdata/parallel_join.py` DSL script
in the step above, we need to deploy the generated Tekton YAML to the Kubeflow Pipelines engine.

You can run the pipeline directly using a pre-compiled file and the KFP-Tekton SDK. For more details, please see the [KFP-Tekton user guide SDK documentation](https://github.com/kubeflow/kfp-tekton/blob/master/guides/kfp-user-guide#2-run-pipelines-using-the-kfp_tektontektonclient-in-python).

```python
client = kfp_tekton.TektonClient()
experiment = client.create_experiment(name=EXPERIMENT_NAME, namespace=KUBEFLOW_PROFILE_NAME)
run = client.run_pipeline(experiment.id, 'parallel-join-pipeline', 'pipeline.yaml')
```

You can also deploy directly on a Tekton cluster with `kubectl`; the Tekton server will automatically start a pipeline run.
You can then follow the logs using the `tkn` CLI.

    kubectl apply -f pipeline.yaml

    tkn pipelinerun logs --last --follow

Once the Tekton Pipeline is running, the logs should start streaming:

    Waiting for logs to be available...

    [gcs-download : main] With which he yoketh your rebellious necks Razeth your cities and subverts your towns And in a moment makes them desolate

    [gcs-download-2 : main] I find thou art no less than fame hath bruited And more than may be gatherd by thy shape Let my presumption not provoke thy wrath

    [echo : main] Text 1: With which he yoketh your rebellious necks Razeth your cities and subverts your towns And in a moment makes them desolate
    [echo : main]
    [echo : main] Text 2: I find thou art no less than fame hath bruited And more than may be gatherd by thy shape Let my presumption not provoke thy wrath
    [echo : main]


<h2><a id="list-of-available-features">List of Available Features</a></h2>

To understand how each feature is implemented and its current status, please visit
the [FEATURES](https://github.com/kubeflow/kfp-tekton/blob/master/sdk/FEATURES.md) doc.


<h2><a id="list-of-helper-functions-for-python-kubernetes-client">List of Helper Functions for Python Kubernetes Client</a></h2>

KFP Tekton provides a set of common Kubernetes client helper functions to simplify
the process of creating certain Kubernetes resources. Please visit the
[K8S_CLIENT_HELPER](https://github.com/kubeflow/kfp-tekton/blob/master/sdk/K8S_CLIENT_HELPER.md) doc for more details.


<h2><a id="tested-pipelines">Tested Pipelines</a></h2>

We are [testing the compiler](https://github.com/kubeflow/kfp-tekton/blob/master/sdk/python/tests/README.md) on more than 80 pipelines
found in the Kubeflow Pipelines repository, specifically the pipelines in the KFP compiler
`testdata` folder, the KFP core samples, and the samples contributed by third parties.

A report card of Kubeflow Pipelines samples that are currently supported by the `kfp-tekton`
compiler can be found [here](https://github.com/kubeflow/kfp-tekton/blob/master/sdk/python/tests/test_kfp_samples_report.txt).
If you work on a PR that enables one of the missing features, please ensure that
your code changes increase the number of successfully compiled KFP pipeline samples.


<h2><a id="troubleshooting">Troubleshooting</a></h2>

- When you encounter ServiceAccount-related permission issues, refer to the
  ["Service Account and RBAC" doc](https://github.com/kubeflow/kfp-tekton/blob/master/sdk/sa-and-rbac.md).

- If you run into the error `bad interpreter: No such file or directory` when trying
  to use Python's venv, remove the current virtual environment in the `.venv` directory
  and create a new one using `virtualenv .venv`.

