# -*- coding: utf-8 -*-
from setuptools import setup

packages = \
['carling', 'carling.io', 'carling.iter_utils', 'carling.test_utils']

package_data = \
{'': ['*']}

install_requires = \
['apache-beam>=2.33.0,<3.0.0', 'deepdiff>=5.6.0,<6.0.0']

setup_kwargs = {
    'name': 'carling',
    'version': '0.3.2',
    'description': 'Useful transforms for supporting apache beam pipelines.',
    'long_description': '# Carling\n\n[![CI](https://github.com/mc-digital/carling/actions/workflows/ci.yml/badge.svg)](https://github.com/mc-digital/carling/actions?query=workflow%3ACI)\n[![versions](https://img.shields.io/pypi/pyversions/carling.svg)](https://pypi.org/project/carling/)\n[![pypi](https://img.shields.io/pypi/v/carling)](https://pypi.org/project/carling/)\n[![license](https://img.shields.io/pypi/l/carling)](https://github.com/mc-digital/carling/blob/main/LICENSE)\n\nVia [Wikipedia](<https://en.wikipedia.org/wiki/Carling_(sailing)>):\n\n> Carlings are pieces of timber laid fore and aft under the deck of a ship, from one beam to another.\n> They serve as a foundation for the whole body of the ship.\n\nUseful transforms for supporting our apache beam pipelines.\n\n## Mapping transform utils\n\n#### `carling.Label`\n\nLabels all elements.\n\n#### `carling.Select`\n\nRemoves all columns which are not specified in `*keys`.\n\n#### `carling.Project`\n\nTransforms each element into a tuple of values of the specified columns.\n\n#### `carling.IndexBy`\n\nTransforms each element `V` into a tuple `(K, V)`.\n\n`K` is the projection of `V` by `*keys`, which is equal to the tuple\nproduced by the `Project` transform.\n\n#### `carling.Stringify`\n\nTransforms each element into its JSON representation.\n\n#### `carling.IndexBySingle`\n\nTransforms each element `V` into a tuple `(K, V)`.\n\nThe difference between `IndexBySingle(key)` and `IndexBy(key)` with a single\nkey is as follows:\n\n- `IndexBySingle` produces the index as a plain value.\n- `IndexBy` produces the index as a single-element tuple.\n\n#### `carling.RenameFromTo`\n\nRename columns according to `from_to_key_mapping`.\n\n#### `carling.Exclude`\n\nRemoves all columns specified in `*keys`.\n\n## Grouping transform utils\n\nGeneric grouping transform utils\n\n#### `carling.UniqueOnly`\n\nProduces elements that are the only elements per key after deduplication.\n\nGiven a `PCollection` of `(K, V)`,\nthis transform produces the collection of all `V`s that do not share\nthe same corresponding `K`s with any other elements after deduplicating\nall equivalent `(K, V)` pairs.\n\nThis transform is equivalent to `SingletonOnly` with `apache_beam.Distinct`.\n\n`[(1, "A"), (2, "B1"), (2, "B2"), (3, "C"), (3, "C"), (4, "A")]` will be\ntransformed into `["A", "C", "A"]`.\n\n#### `carling.SingletonOnly`\n\nProduces elements that are the only elements per key.\n\nGiven a `PCollection` of `(K, V)`,\nthis transform produces the collection of all `V`s that do not share\nthe same corresponding `K`s with any other elements.\n\n`[(1, "A"), (2, "B1"), (2, "B2"), (3, "C"), (3, "C"), (4, "A")]` will be\ntransformed into `["A", "A"]`.\n\n#### `carling.Intersection`\n\nProduces the intersection of given `PCollection`s.\n\nGiven a list of `PCollection`s,\nthis transform produces every element that appears in all collections of\nthe list.\nElements are deduplicated before taking the intersection.\n\n#### `carling.FilterByKey`\n\nFilters elements by their keys.\n\nThe constructor receives one or more `PCollection`s of `K`s,\nwhich are regarded as key lists.\nGiven a `PCollection` of `(K, V)`,\nthis transform discards all elements with `K`s that do not appear\nin the key lists.\n\nIf multiple collections are given to the constructor,\nthis transform treats the intersection of them as the key list.\n\n#### `carling.FilterByKeyUsingSideInput`\n\nFilters a single collection by a single lookup collection, using a common key.\n\nGiven: - a `PCollection` (lookup_entries) of `(V)`, as a lookup collection - a `PCollection` (pcoll) of `(V)`, as values to be filtered - a common key (filter_key)\n\nA dictionary called `filter_dict` - is created by mapping the value of `filter_key`\nfor each entry in `lookup_entries` to True.\n\nThen, for each item in pcoll, the value associated with `filter_key` checkd against\n`filter_dict`, and if it is found, the entry passes through. Otherwise, the entry is\ndiscarded.\n\nNote: `lookup_entries` will be used as a **side input**, so care\nmust be taken regarding the size of the `lookup_entries`\n\n#### `carling.DifferencePerKey`\n\nProduces the difference per key between two `PCollection`s.\n\nGiven two `PCollection`s of `V`,\nthis transform indexes the collections by the specified keys `primary_keys`,\ncompares corresponding two `V` lists for every `K`,\nand produces the difference per `K`.\nIf there is no difference, this transform produces nothing.\n\nTwo `V` lists are considered to be different if the numbers of elements\ndiffer or two elements of the lists with a same index differ\nat one of the specified columns `columns`.\n\n#### `carling.SortedSelectPerKey`\n\n- Groups items by a set of `keys` -- column names per row\n- Emits the "MAX" _value_ for each collection as defined by the `key_fn`\n- Can emit "MIN" by passing `reverse=True` kwarg\n\n#### `carling.PartitionRowsContainingNone`\n\nEmits two tagged PCollections:\n\n- Default (`result[None]`): Rows are guaranteed not to have any `None` values\n- `result["contains_none"]`: Rows for which at least one column had a `None` value\n\n## Categorical\n\n#### `carling.CreateCategoricalDicts`\n\nFor a set of columnular data inputs this function takes:\n\n    - cat_cols:\n\n        Type: `[str]`\n\n        An array of "categorical" columns\n\n    - existing_dicts:\n\n        Type: `PCollection[(string, string, int)]`\n\n        Rows of tuples of type:\n        (column, previously_seen_value, mapped_unique_int)\n\n        Mapping a set of "previously seen" keys to unique int values for each\n        column.\n        Not optional.\n        If none exist, pass an empty PCollection.\n\nIt then creates a transform which takes a pcollection and\n\n    - looks at the input pcoll for unseen values in each categorical column\n    - creates new unique integers for each distinct unseen value, starting at max(previous value for column)+1\n    - ammends the existing mappings with (col, unseen_value, new_unique_int)\n\nOutput is:\n\n    - Type: `PCollection[(string, string, int)]`\n\nThis is useful for preparing data to be trained by eg. LightGBM\n\n#### `carling.ReplaceCategoricalColumns`\n\n- Utilizes the "categorical dictionary rows" generated by `CreateCategoricalDicts` which is a list of pairs of type of `(column, value,unique_int)`.\n\n- Replaces each column with the appropriate value found in the mapping.\n\n## Test Utils\n\n#### `carling.test_utils.pprint_equal_to`\n\nThis module contains the `equal_to` function from apache beam, but adapted to output results using pretty print. Reading the results as a large, unformatted string makes it harder to pick out what changed/is missing.\n\n## General Util\n\n#### `carling.LogSample`\n\nPrint items of the given `PCollection` to the log.\n\n`LogSample` prints the JSON representations of the input items to the Python\'s\nstandard logging system.\n\nTo avoid too much log entries being printed, `LogSample` limits the number of\nlogged items. The constructor parameter `n` determines the limit.\n\nBy default, `LogSample` prints logs with the `INFO` log level. The constructor\nparameter `level` determines the level.\n\n#### `carling.ReifyMultiValueOption`\n\nPrepares multi-value delimited options. Useful in contexts where\nyou want to create a multi-value option in a template environment.\n\n- inputs:\n  - delimited string option\n  - optional delimiter string (default is "|")\n\n* output:\n  - Type: `PCollection[str]`\n',
    'author': 'Adam Moore',
    'author_email': 'adam@mcdigital.jp',
    'maintainer': None,
    'maintainer_email': None,
    'url': 'https://github.com/mc-digital/carling',
    'packages': packages,
    'package_data': package_data,
    'install_requires': install_requires,
    'python_requires': '>=3.7,<4.0',
}


setup(**setup_kwargs)
