Metadata-Version: 2.1
Name: cognite-extractor-manager
Version: 0.1.2
Summary: A project manager for Python based extractors
License: Apache-2.0
Author: Mathias Lohne
Author-email: mathias.lohne@cognite.com
Requires-Python: >=3.6.1,<4.0.0
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Requires-Dist: cognite-extractor-utils (>=1.5.3,<2.0.0)
Requires-Dist: requests (>=2.26.0,<3.0.0)
Requires-Dist: termcolor (>=1.1.0,<2.0.0)
Description-Content-Type: text/markdown

# `cogex`

`cogex` is a tool for managing Python extractors for Cognite Data Fusion. It provides utilities for
initializing a new extractor project and for building self-contained executables of Python-based
extractors.


## Important note for users running `pyenv`

`pyenv` is a neat tool for managing Python installations.

Since `cogex` uses PyInstaller to build executables, we need Python to be installed with a shared
instance of `libpython`, which `pyenv` does not do by default. To fix this, make sure to add the
`--enable-shared` flag when installing new Python versions with `pyenv`, like so:

```bash
env PYTHON_CONFIGURE_OPTS="--enable-shared" pyenv install 3.9.0
```

You can read more about it in the [PyInstaller documentation](https://pyinstaller.readthedocs.io/en/stable/development/venv.html#pyenv-and-pyinstaller).
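
Before building, it can be useful to confirm that the active interpreter was in fact built with a
shared `libpython`. The following is a minimal sketch, assuming a Unix-like platform where the
standard library's `sysconfig` module records the `Py_ENABLE_SHARED` build variable (on Windows the
variable is typically unset, so the check does not apply there). The helper name
`has_shared_libpython` is hypothetical and not part of `cogex`:

```python
import sysconfig


def has_shared_libpython() -> bool:
    """Return True if this interpreter was configured with --enable-shared."""
    # Py_ENABLE_SHARED is 1 for shared builds, 0 for static builds, and
    # None on platforms that do not record it.
    return bool(sysconfig.get_config_var("Py_ENABLE_SHARED"))


if __name__ == "__main__":
    if has_shared_libpython():
        print("Shared libpython available")
    else:
        print("No shared libpython; reinstall Python with --enable-shared")
```

Run it with the interpreter that `pyenv` has selected for your project, since each installed Python
version has its own build configuration.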


## Overview of features


### Start a new extractor project

To start a new extractor project, move to the desired directory and run

```bash
cogex init
```

`cogex` will prompt you for some information and then initialize a new project.


### Add dependencies

Extractor projects initialized with `cogex` use `poetry` to manage dependencies. Running
`cogex init` automatically installs the Cognite SDK and the extractor-utils framework. If your
extractor needs any other dependencies, add them using `poetry`, like so:

```bash
poetry add requests
```


### Type checking and code style

It is recommended that you run code checkers on your extractor, in particular:

 * `black` is an opinionated code formatter that enforces a consistent code style throughout
   your project. This is useful for avoiding unnecessary changes and minimizing PR diffs.
 * `isort` is a tool that sorts your imports, also contributing to a consistent code style and
   minimal PR diffs.
 * `mypy` is a static type checker for Python which ensures that you are not making any type errors
   in your code that would go unnoticed before suddenly breaking your extractor in production.

`cogex` will install all of these and automatically run them on every commit. If you need to commit
despite one of these checks failing, you can run `git commit --no-verify`, although this is not
recommended.
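
To illustrate the kind of error `mypy` is meant to catch, here is a small sketch. The functions
`find_unit` and `unit_label` and the sensor names are hypothetical and only serve as an example. If
the `None` check in `unit_label` were removed, `mypy` would flag the call to `.upper()` on a value
that may be `None`, whereas plain Python would only fail at runtime when an unknown sensor is
looked up:

```python
from typing import Dict, Optional


def find_unit(units: Dict[str, str], sensor: str) -> Optional[str]:
    """Look up the unit for a sensor; returns None for unknown sensors."""
    return units.get(sensor)


def unit_label(units: Dict[str, str], sensor: str) -> str:
    """Return an upper-case unit label, or 'unknown' if the sensor has no unit."""
    unit = find_unit(units, sensor)
    # Without this check, mypy reports that `unit` may be None on the next line.
    if unit is None:
        return "unknown"
    return unit.upper()


if __name__ == "__main__":
    units = {"pressure-01": "bar"}
    print(unit_label(units, "pressure-01"))
    print(unit_label(units, "temperature-07"))
```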


### Build and package an extractor project

It is not always possible to rely on a Python installation on the machine where your extractor will
be deployed. For those scenarios it is useful to package the extractor, including its dependencies
and the Python runtime, into a single self-contained executable. To do this, run

```bash
cogex build
```

This will create a new executable (for the operating system you ran `cogex build` from) in the
`dist` directory.


### Creating a new version of your extractor

To keep track of which version of the code base is running at a given deployment, it is very useful
to version your extractor. When releasing a new version, run

```bash
poetry version [patch/minor/major]
```

This automatically bumps the corresponding version number. Note that this only updates the version
number in `pyproject.toml`. When you run `cogex build`, the new version number is propagated
through the rest of the code base.

Any extractor project should follow semantic versioning, which means you should bump

 * `patch` for any minor bug fixes or improvements
 * `minor` for new features or bigger improvements that __don't__ break compatibility
 * `major` for new features or improvements that break compatibility with previous versions, in
   other words for those scenarios where the new version is not a drop-in replacement for an old
   version. For example:
   - When adding a new required config field
   - When removing a config field
   - When changing defaults in a way that could break existing deployments
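
The effect of each bump level on the version number can be sketched as follows. This is an
illustration of the semantic versioning rules above, not the implementation used by `poetry` or
`cogex`. The function name `bump_version` is hypothetical, and the sketch assumes a plain
`MAJOR.MINOR.PATCH` version string without pre-release or build suffixes:

```python
def bump_version(version: str, part: str) -> str:
    """Bump one part of a plain MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        # Breaking change: reset both minor and patch.
        return f"{major + 1}.0.0"
    if part == "minor":
        # Backwards-compatible feature: reset patch.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # Bug fix or small improvement.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown version part: {part!r}")


if __name__ == "__main__":
    print(bump_version("1.4.2", "patch"))  # 1.4.3
    print(bump_version("1.4.2", "minor"))  # 1.5.0
    print(bump_version("1.4.2", "major"))  # 2.0.0
```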

