# -*- coding: utf-8 -*-
from setuptools import setup

packages = \
['spark_frame',
 'spark_frame.data_diff',
 'spark_frame.examples',
 'spark_frame.fp',
 'spark_frame.graph_impl',
 'spark_frame.nested_functions_impl',
 'spark_frame.nested_impl',
 'spark_frame.transformations_impl']

package_data = \
{'': ['*'], 'spark_frame': ['templates/*']}

setup_kwargs = {
    'name': 'spark-frame',
    'version': '0.3.1',
    'description': 'A library containing various utility functions for playing with PySpark DataFrames',
    'long_description': '# Spark-frame\n\n[![PyPI version](https://badge.fury.io/py/spark-frame.svg)](https://badge.fury.io/py/spark-frame)\n[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/spark-frame.svg)](https://pypi.org/project/spark-frame/)\n[![GitHub Build](https://img.shields.io/github/actions/workflow/status/FurcyPin/spark-frame/build_and_validate.yml?branch=main)](https://github.com/FurcyPin/spark-frame/actions)\n[![SonarCloud Coverage](https://sonarcloud.io/api/project_badges/measure?project=FurcyPin_spark-frame&metric=coverage)](https://sonarcloud.io/component_measures?id=FurcyPin_spark-frame&metric=coverage&view=list)\n[![SonarCloud Bugs](https://sonarcloud.io/api/project_badges/measure?project=FurcyPin_spark-frame&metric=bugs)](https://sonarcloud.io/component_measures?metric=reliability_rating&view=list&id=FurcyPin_spark-frame)\n[![SonarCloud Vulnerabilities](https://sonarcloud.io/api/project_badges/measure?project=FurcyPin_spark-frame&metric=vulnerabilities)](https://sonarcloud.io/component_measures?metric=security_rating&view=list&id=FurcyPin_spark-frame)\n[![PyPI - Downloads](https://img.shields.io/pypi/dm/spark-frame)](https://pypi.org/project/spark-frame/)\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n\n\n## What is it ?\n\n**[Spark-frame](https://furcypin.github.io/spark-frame/) is a library that super-charges your Spark DataFrames!**\n\nIt brings several utility methods and transformation functions for PySpark DataFrames.\nThese methods were initially part of the [karadoc](https://github.com/FurcyPin/karadoc) project \nused at [Younited](https://medium.com/younited-tech-blog), but they were fully independent from karadoc, \nso it made more sense to keep them as a standalone library.\n\nSeveral of these methods were my initial inspiration to make the cousin project \n[bigquery-frame](https://github.com/FurcyPin/bigquery-frame), which was first made to illustrate\nthis [blog article](https://medium.com/towards-data-science/sql-jinja-is-not-enough-why-we-need-dataframes-4d71a191936d).\nThis is why you will find similar methods in both `spark_frame` and `bigquery_frame`, \nexcept the former runs on PySpark while the latter runs on BigQuery (obviously).\nI try to keep both projects consistent together, and new eventually port new developments made on \none project to the other one.\n\n## Getting Started\n\nVisit the official Spark-frame website [documentation](https://furcypin.github.io/spark-frame/) \nfor [use cases examples](https://furcypin.github.io/spark-frame/use_cases/intro/) \nand [reference](https://furcypin.github.io/spark-frame/reference/functions/).\n\n## Installation\n\n[spark-frame is available on PyPi](https://pypi.org/project/spark-frame/).\n\n```bash\npip install spark-frame\n```\n\n## Compatibilities and requirements\n\nThis library does not depend on any other library.\n**Pyspark must be installed separately to use it.**\nIt is compatible with the following versions:\n\n- Python: requires 3.8.1 or higher (tested against Python 3.9, 3.10 and 3.11)\n- pyspark: requires 3.3.0 or higher\n\nThis library is tested against Windows, Mac and Linux.\n\n\n**Some features require extra libraries to be installed alongside this project.**\n**We chose to not include them as direct dependencies for security and flexibility reasons.**\n**This way, users who are not using these features don\'t need to worry about these dependencies.**\n\n| feature                               | Method                      | module required |\n|---------------------------------------|-----------------------------|----------------:|\n| Generating HTML reports for data diff | `DiffResult.export_to_html` |          jinja2 |\n\n\n# Release notes\n\n# v0.3.1\n\nFixes and improvements on data_diff\n\n- The `export_html_diff_report` method now accepts arguments to specify the path and encoding of the output html report. \n- Data-diff join now works correctly with null values\n- Visual improvements to HTML diff report\n\n\n# v0.3.0\n\nFixes and improvements on data_diff\n\n- Fixed incorrect diff results\n- Column values are not truncated at all, this was causing incorrect results. The possibility to limit the size \n  of the column values will be added back in a later version\n- Made sure that the most frequent values per column are now displayed by decreasing order of frequency\n\n\n# v0.2.0\n\nTwo new exciting features: *analyze* and *data_diff*. \nThey are still in experimental stage and will be improved in future releases.\n\n- Added a new transformation `spark_frame.transformations.analyze`.\n- Added new *data_diff* feature. Example:\n\n```python\nfrom pyspark.sql import DataFrame\nfrom spark_frame.data_diff import DataframeComparator\ndf1: DataFrame = ...\ndf2: DataFrame = ...\ndiff_result = DataframeComparator().compare_df(df1, df2) # Produces a DiffResult object\ndiff_result.display() # Print a diff report in the terminal\ndiff_result.export_to_html() # Generates a html diff report file named diff_report.html\n```\n\n\n# v0.1.1\n\n- Added a new transformation `spark_frame.transformations.flatten_all_arrays`.\n- Added support for multi-arg transformation to `nested.select` and `nested.with_fields` \n  With this feature, we can now access parent fields from higher levels\n  when applying a transformation. Example:\n  \n```\n>>> nested.print_schema(df)\n"""\nroot\n |-- id: integer (nullable = false)\n |-- s1!.average: integer (nullable = false)\n |-- s1!.values!: integer (nullable = false)\n"""\n>>> df.show(truncate=False)\n+---+--------------------------------------+\n|id |s1                                    |\n+---+--------------------------------------+\n|1  |[{2, [1, 2, 3]}, {3, [1, 2, 3, 4, 5]}]|\n+---+--------------------------------------+\n>>> new_df = df.transform(nested.with_fields, {\n>>>     "s1!.values!": lambda s1, value: value - s1["average"]  # This transformation takes 2 arguments\n>>> })\n+---+-----------------------------------------+\n|id |s1                                       |\n+---+-----------------------------------------+\n|1  |[{2, [-1, 0, 1]}, {3, [-2, -1, 0, 1, 2]}]|\n+---+-----------------------------------------+\n```\n\n# v0.1.0\n\n- Added a new _amazing_ module called `spark_frame.nested`, \n  which makes manipulation of nested data structure much easier!\n  Make sure to check out the [reference](https://furcypin.github.io/spark-frame/reference/nested/)\n  and the [use-cases](https://furcypin.github.io/spark-frame/use_cases/working_with_nested_data/).\n\n- Also added a new module called `spark_frame.nested_functions`,\n  which contains aggregation methods for nested data structures\n  ([See Reference](https://furcypin.github.io/spark-frame/reference/nested_functions/)).\n\n- New [transformations](https://furcypin.github.io/spark-frame/reference/transformations/):\n  - `spark_frame.transformations.transform_all_field_names`\n  - `spark_frame.transformations.transform_all_fields`\n  - `spark_frame.transformations.unnest_field`\n  - `spark_frame.transformations.unnest_all_fields`\n  - `spark_frame.transformations.union_dataframes`\n\n# v0.0.3\n\n- New transformation: `spark_frame.transformations.convert_all_maps_to_arrays`.\n- New transformation: `spark_frame.transformations.sort_all_arrays`.\n- New transformation: `spark_frame.transformations.harmonize_dataframes`.\n',
    'author': 'FurcyPin',
    'author_email': 'None',
    'maintainer': 'None',
    'maintainer_email': 'None',
    'url': 'https://github.com/FurcyPin/spark-frame',
    'packages': packages,
    'package_data': package_data,
    'python_requires': '>=3.8.1,<3.11',
}


setup(**setup_kwargs)
