Metadata-Version: 2.1
Name: kubernetes-job
Version: 0.2.0
Summary: Simple Kubernetes job creation; a Python library for starting a Kubernetes batch job as a normal Python function call.
Home-page: https://gitlab.com/roemer/kubernetes-job
Author: Roemer Claasen
Author-email: roemer.claasen@gmail.com
License: UNKNOWN
Project-URL: Documentation, https://kubernetes-job.readthedocs.io
Description: # Kubernetes-job: simple Kubernetes job creation 
        
        A library for starting a Kubernetes batch job as a normal Python function call. 
        
        ## Installation
        
        Kubernetes-job can be installed using pip:
        
        ```bash
        pip install kubernetes-job
        ```
        
        ## Quick start
        
        ```python
        from kubernetes_job import JobManager
        
        
        def add(a, b):
            return a + b
        
        
        # k8s_client is a kubernetes.client.ApiClient instance (see "Configuration")
        manager = JobManager(k8s_client=k8s_client, k8s_job_spec='job.yaml', namespace='default')
        job = manager.create_job(add, 1, 2)
        ```
        
        The `JobManager` will now create a Kubernetes job using the basic job specification in the `job.yaml` file. 
        The call to `add` is then passed on to the new job node, where the function is subsequently executed.   
        
        The `job.yaml` file should be adjusted to your needs. 
        This is the place to put Kubernetes node selectors, Docker base images, and so on. 
        Please refer to the [Kubernetes documentation](https://kubernetes.io/docs/concepts/workloads/controllers/job/) for details. 
        
        **Please note:** This is a very silly example, for two obvious reasons. 
        
        First, *`add` will take a very short time to complete*, and is therefore not a function 
        you would want to spawn a Kubernetes job for. 
        A job should be created for a task that is not easily performed on the calling machine. 
        A good example would be training Machine Learning models on a heavy CUDA node, 
        started from a web server node with modest resources.
        
        Second, *Kubernetes jobs do not return values!* This means the result of this addition will be lost. 
        In a Kubernetes job, it is up to the job to save its work. 
        In this case, the result of `(1 + 2)` will be lost to humanity.   
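        
        Because return values are discarded, a job function has to persist its own output. 
        The following is a minimal sketch of that idea, not part of Kubernetes-job itself: 
        the `add_and_save` function, the `output_path` parameter and the JSON format are illustrative assumptions.
        
        ```python
        import json
        from pathlib import Path
        
        
        def add_and_save(a, b, output_path):
            """Add two numbers and persist the result instead of returning it."""
            result = a + b
            # Hypothetical destination: in a real job this would be a mounted volume,
            # an object store, or a database reachable from the job pod.
            Path(output_path).write_text(json.dumps({"result": result}))
        ```
        
        Such a function could then be submitted with `manager.create_job(add_and_save, 1, 2, '<output_path>')`, 
        where `<output_path>` is a location the job pod can write to.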
        
        ## Configuration
        
        Kubernetes-job does not need much configuration. Basically, there are three things to be done:
        
        1. Configuring the Kubernetes job spec template (e.g. `job.yaml`);
        1. Initializing a Kubernetes `ApiClient`;
        1. Initializing the `JobManager`.
        
        ### Configuring the Kubernetes job spec template (e.g. `job.yaml`)
        
        When Kubernetes-job spawns a new job, the Kubernetes job spec template is used as the base configuration for the new job.
        
        This is an example:
        
        ```yaml
        apiVersion: batch/v1
        kind: Job
        metadata:
          # job name; a unique id will be added when launching a new job based on this template
          name: kubernetes-job
        spec:
        
          # Allow 1 retry before the job is marked as failed
          backoffLimit: 1
        
          # Active deadline (timeout), in a number of seconds.
          activeDeadlineSeconds: 3600
        
          # Clean up the finished job and its pods after this many seconds
          ttlSecondsAfterFinished: 3600
        
          template:
            spec:
              containers:
              - name: kubernetes-job
                image: registry.gitlab.com/roemer/kubernetes-job:latest
              restartPolicy: Never
        ```
        Please adjust this template to your needs by specifying the right container image, job deadlines, etc. 
        The [Kubernetes documentation](https://kubernetes.io/docs/concepts/workloads/controllers/job/) contains more information.
        
        When Kubernetes-job spawns a new job, three things are added to the template:
        
        1. A unique name, generated by adding a timestamp;
        1. The function call, serialized (using Pickle), added as an environment variable; 
        1. A `cmd` entry calling `JobManager.execute`.
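        
        The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, 
        not the library's actual implementation: the `build_job` and `execute` helpers, the `JOB_PAYLOAD` variable name 
        and the base64 encoding are all hypothetical.
        
        ```python
        import base64
        import copy
        import operator
        import pickle
        import time
        
        # Hypothetical variable name; the name Kubernetes-job actually uses may differ.
        ENV_VAR = "JOB_PAYLOAD"
        
        
        def build_job(template, func, *args, **kwargs):
            """Return a copy of the template with a unique name and a pickled call."""
            job = copy.deepcopy(template)
            # 1. Unique name: append a timestamp to the template name
            job["metadata"]["name"] += f"-{int(time.time() * 1000)}"
            # 2. Serialize the call and store it as text in an environment variable
            payload = base64.b64encode(pickle.dumps((func, args, kwargs))).decode("ascii")
            container = job["spec"]["template"]["spec"]["containers"][0]
            container.setdefault("env", []).append({"name": ENV_VAR, "value": payload})
            # 3. The entry that calls JobManager.execute is omitted in this sketch
            return job
        
        
        def execute(payload):
            """Inside the job pod: decode the payload and run the call."""
            func, args, kwargs = pickle.loads(base64.b64decode(payload))
            return func(*args, **kwargs)
        
        
        template = {
            "metadata": {"name": "kubernetes-job"},
            "spec": {"template": {"spec": {"containers": [{"name": "kubernetes-job"}]}}},
        }
        job = build_job(template, operator.add, 1, 2)
        payload = job["spec"]["template"]["spec"]["containers"][0]["env"][0]["value"]
        print(execute(payload))  # 3
        ```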
        
        A working example can be found in the [`test/` directory](test/).
        
        **Please note:** 
        Make sure the Docker image in the job template contains the same packaged Python 
        software as the process creating the job! 
        Otherwise the function cannot be executed in the new job pod.  
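        
        The reason is that Pickle serializes a function as a reference to its module and name, not as its code. 
        A small demonstration using only the standard library (`json.dumps` is just a convenient stand-in for your own function):
        
        ```python
        import json
        import pickle
        
        # Pickling a function stores a reference (module and name), not its source code
        payload = pickle.dumps(json.dumps)
        print(b"json" in payload, b"dumps" in payload)  # True True
        
        # Unpickling imports the module and looks the name up again, so the
        # module must be importable in the process that loads the payload
        restored = pickle.loads(payload)
        print(restored is json.dumps)  # True
        ```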
        
        ### Initializing a Kubernetes ApiClient  
        
        There are several ways to configure the Kubernetes client. Probably the easiest
        way is to use a **bearer token**.  This can be done as follows:
        
        ```python
        from kubernetes import client
        
        configuration = client.Configuration()
        configuration.api_key["authorization"] = '<token>'
        configuration.api_key_prefix['authorization'] = 'Bearer'
        configuration.host = 'https://<endpoint_of_api_server>'
        configuration.ssl_ca_cert = '<path_to_cluster_ca_certificate>'
        
        k8s_client = client.ApiClient(configuration)
        ```
        
        How to retrieve the correct settings for `token`, `endpoint_of_api_server`, 
        and the cluster CA certificates is explained in the section "Configuring Kubernetes for token-based authentication" below. 
        
        Another possibility is to use an existing kubectl configuration. This might be the best solution for testing purposes:
        
        ```python
        from kubernetes import client, config
        
        # Configs can be set in Configuration class directly or using helper utility
        config.load_kube_config()
        
        k8s_client = client.ApiClient()
        ```
        
        Please refer to the [Kubernetes Python client documentation](https://github.com/kubernetes-client/python) for more details.
        
        ### Initializing the `JobManager`
        
        The `JobManager` must be supplied with a YAML template file (see above) and the Kubernetes client.
        
        ```python
        from pathlib import Path
        from kubernetes_job import JobManager
        
        # Path to worker configuration
        yaml_spec = Path(__file__).parent / 'job.yml'
        
        # initialize the job manager
        manager = JobManager(k8s_client=k8s_client, k8s_job_spec=yaml_spec, namespace='default')
        ```
        
        **Please note:** 
        The `k8s_job_spec` may be a path to a file, or a `dict` instance. 
        The latter is handy for generating configuration on the fly!  
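        
        As a sketch of generating configuration on the fly, the template shown earlier can be built as a `dict` in a helper function. 
        The `build_job_spec` helper and its parameters are hypothetical and only serve as an illustration.
        
        ```python
        def build_job_spec(image, deadline_seconds=3600):
            """Build a job spec dict mirroring the YAML template shown earlier."""
            return {
                "apiVersion": "batch/v1",
                "kind": "Job",
                "metadata": {"name": "kubernetes-job"},
                "spec": {
                    "backoffLimit": 1,
                    "activeDeadlineSeconds": deadline_seconds,
                    "ttlSecondsAfterFinished": 3600,
                    "template": {
                        "spec": {
                            "containers": [{"name": "kubernetes-job", "image": image}],
                            "restartPolicy": "Never",
                        }
                    },
                },
            }
        
        
        spec = build_job_spec("registry.gitlab.com/roemer/kubernetes-job:latest", deadline_seconds=600)
        # The dict can then be passed in place of the file path:
        # manager = JobManager(k8s_client=k8s_client, k8s_job_spec=spec, namespace='default')
        ```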
        
        ## API
        
        ### Create a new job
        A job can be started by invoking `create_job` on the `JobManager` instance:
        
        ```python
        # function to pass to the job
        def add(a, b):
            result = a + b
            print(result)
            return result
        
        # create a new job
        job = manager.create_job(add, 123, 456)
        ```
        
        `create_job` takes a *function* (the function object itself, not the result of calling it). The function and all arguments 
        (`*args` and `**kwargs`) are then "pickled", and merged into the [job template](test/job.yml).
        
        Our job is now running on the Kubernetes cluster!
        
        ### Listing jobs 
        
        ```python
        # list all jobs
        for job in manager.list_jobs():
            print(f"Found: {job.metadata.name}")
        ```
        
        ### Retrieving job status
        
        ```python
        from kubernetes_job import is_active, is_succeeded, is_failed, is_completed 
        
        # get the status of a job; `name` is the job's name,
        # e.g. `job.metadata.name` as printed in the listing above
        job = manager.read_job(name)
        print(f"Running: {is_active(job)} Succeeded: {is_succeeded(job)} Failed: {is_failed(job)} Completed: {is_completed(job)}")
        ```
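        
        To wait for a job to finish, the status can be polled at an interval. 
        The `wait_until` helper below is a hypothetical, generic sketch and not part of Kubernetes-job; 
        it only assumes that a status check is available as a callable returning `True` or `False`.
        
        ```python
        import time
        
        
        def wait_until(check, timeout=3600.0, interval=5.0):
            """Call check() every `interval` seconds until it returns True or `timeout` elapses."""
            deadline = time.monotonic() + timeout
            while True:
                if check():
                    return True
                if time.monotonic() >= deadline:
                    return False
                time.sleep(interval)
        
        
        # Intended usage (requires a configured `manager` and a job `name`):
        # finished = wait_until(lambda: is_completed(manager.read_job(name)))
        ```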
        
        ### Deleting jobs
        ```python
        # delete a job
        manager.delete_job(name)
        ```
        
        
        ## Configuring Kubernetes for token-based authentication
        
        ### Create a service account
        First, create a service account: 
        
        ```bash
        # Create a service account
        kubectl create -f service_account.yml --namespace=default
        ```
        
        An example of `service_account.yml` can be found [here](test/service_account.yml).
        
        Kubernetes generates a secret with a unique name for the new service account. 
        We need to retrieve that unique name, and to do that, we need to ask Kubernetes for its secrets:
        
        ```bash
        # retrieve secret 
        kubectl get secrets --namespace=default | grep kubernetes-job-service-account
        ```
        
        This returns something like this:
        
        ```
        kubernetes-job-service-account-token-XXXXX   kubernetes.io/service-account-token   3      66s
        ```
        
        **kubernetes-job-service-account-token-XXXXX** is the name of the secret generated by Kubernetes.
        
        ### Retrieving the access token
        Now we are able to retrieve the access token for this service account:
        
        ```bash 
        kubectl describe secret/kubernetes-job-service-account-token-XXXXX | grep token
        ```
        
        This returns something like:
        
        ```
        token:      <token>
        ```
        
        This token is the one we're looking for.
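        
        When the secret is read as JSON instead (`kubectl get secret <name> -o json`), the values under `data` are base64-encoded 
        and the token has to be decoded first. A minimal sketch; the `extract_token` helper is hypothetical and the sample 
        below is made up to stand in for real `kubectl` output:
        
        ```python
        import base64
        import json
        
        
        def extract_token(secret_json):
            """Decode the bearer token from the JSON representation of a secret."""
            secret = json.loads(secret_json)
            # Values under `data` in a Secret are base64-encoded
            return base64.b64decode(secret["data"]["token"]).decode("utf-8")
        
        
        # Made-up sample standing in for real kubectl output
        sample = json.dumps({"data": {"token": base64.b64encode(b"example-token").decode("ascii")}})
        print(extract_token(sample))  # example-token
        ```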
        
        ### Cluster endpoint and cluster CA certificates
        
        To connect to the cluster we also need the **cluster endpoint** and the **CA certificates**. 
        Both can easily be retrieved through the Kubernetes dashboard, through the "cluster details" page.
        
        
        
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
