# -*- coding: utf-8 -*-
from setuptools import setup

packages = \
['pdcast']

package_data = \
{'': ['*']}

install_requires = \
['pandas>=0.24']

extras_require = \
{':python_version < "3.7"': ['numpy>=1.16.5', 'dataclasses'],
 ':python_version >= "3.7"': ['numpy>=1.17']}

setup_kwargs = {
    'name': 'pandas-downcast',
    'version': '1.2.1',
    'description': 'Shrink Pandas DataFrames with precision safe schema inference.',
    'long_description': 'Pandas Downcast\n===============\n\n[![image](https://img.shields.io/pypi/v/pandas-downcast.svg)](https://pypi.python.org/pypi/pandas-downcast)\n[![PyPI pyversions](https://img.shields.io/pypi/pyversions/pandas-downcast.svg)](https://pypi.python.org/pypi/pandas-downcast/)\n[![Build Status](https://travis-ci.com/domvwt/pandas-downcast.svg?branch=main)](https://travis-ci.com/domvwt/pandas-downcast)\n[![codecov](https://codecov.io/gh/domvwt/pandas-downcast/branch/main/graph/badge.svg?token=TQPLURKQ9Z)](https://codecov.io/gh/domvwt/pandas-downcast)\n\nShrink [Pandas](https://pandas.pydata.org/) DataFrames with precision safe schema inference.\n`pandas-downcast` finds the minimum viable type for each column, ensuring that resulting values\nare within tolerance of original values.\n\n## Installation\n\n```bash\npip install pandas-downcast\n```\n\n## Dependencies\n\n* python >= 3.6\n* pandas\n* numpy\n\n## License\n\n[MIT](https://opensource.org/licenses/MIT)\n\n## Usage\n\n```python\nimport pdcast as pdc\nimport numpy as np\nimport pandas as pd\n\ndata = {\n    "integers": np.linspace(1, 100, 100),\n    "floats": np.linspace(1, 1000, 100).round(2),\n    "booleans": np.random.choice([1, 0], 100),\n    "categories": np.random.choice(["foo", "bar", "baz"], 100),\n}\n\ndf = pd.DataFrame(data)\n\n# Downcast DataFrame to minimum viable schema.\ndf_downcast = pdc.downcast(df)\n\n# Infer minimum schema for DataFrame.\nschema = pdc.infer_schema(df)\n\n# Coerce DataFrame to schema - required if converting float to Pandas Integer.\ndf_new = pdc.coerce_df(df, schema)\n```\n\nSmaller data types -> smaller memory footprint.\n\n```python\ndf.info()\n# <class \'pandas.core.frame.DataFrame\'>\n# RangeIndex: 100 entries, 0 to 99\n# Data columns (total 4 columns):\n#  #   Column      Non-Null Count  Dtype  \n# ---  ------      --------------  -----  \n#  0   integers    100 non-null    float64\n#  1   floats      100 non-null    float64\n#  2   booleans    100 non-null    int64  \n#  3   categories  100 non-null    object \n# dtypes: float64(2), int64(1), object(1)\n# memory usage: 3.2+ KB\n\ndf_downcast.info()\n# <class \'pandas.core.frame.DataFrame\'>\n# RangeIndex: 100 entries, 0 to 99\n# Data columns (total 4 columns):\n#  #   Column      Non-Null Count  Dtype   \n# ---  ------      --------------  -----   \n#  0   integers    100 non-null    uint8   \n#  1   floats      100 non-null    float32 \n#  2   booleans    100 non-null    bool    \n#  3   categories  100 non-null    category\n# dtypes: bool(1), category(1), float32(1), uint8(1)\n# memory usage: 932.0 bytes\n```\n\nNumerical data types will be downcast if the resulting values are within tolerance of the original values.\nFor details on tolerance for numeric comparison, see the notes on [`np.allclose`](https://numpy.org/doc/stable/reference/generated/numpy.allclose.html).\n\n```python\nprint(df.head())\n#    integers  floats  booleans categories\n# 0       1.0    1.00         1        foo\n# 1       2.0   11.09         0        baz\n# 2       3.0   21.18         1        bar\n# 3       4.0   31.27         0        bar\n# 4       5.0   41.36         0        foo\n\nprint(df_downcast.head())\n#    integers     floats  booleans categories\n# 0         1   1.000000      True        foo\n# 1         2  11.090000     False        baz\n# 2         3  21.180000      True        bar\n# 3         4  31.270000     False        bar\n# 4         5  41.360001     False        foo\n\n\nprint(pdc.options.ATOL)\n# >>> 1e-08\n\nprint(pdc.options.RTOL)\n# >>> 1e-05\n```\n\nTolerance can be set at module level or passed in function arguments.\n\n```python\npdc.options.ATOL = 1e-10\npdc.options.RTOL = 1e-10\ndf_downcast_new = pdc.downcast(df)\n```\n\nOr\n\n```python\ninfer_dtype_kws = {\n    "ATOL": 1e-10,\n    "RTOL": 1e-10\n}\ndf_downcast_new = pdc.downcast(df, infer_dtype_kws=infer_dtype_kws)\n```\n\nThe `floats` column is now kept as `float64` to meet the tolerance requirement.\nValues in the `integers` column are still safely cast to `uint8`.\n\n```python\ndf_downcast_new.info()\n# <class \'pandas.core.frame.DataFrame\'>\n# RangeIndex: 100 entries, 0 to 99\n# Data columns (total 4 columns):\n#  #   Column      Non-Null Count  Dtype   \n# ---  ------      --------------  -----   \n#  0   integers    100 non-null    uint8   \n#  1   floats      100 non-null    float64 \n#  2   booleans    100 non-null    bool    \n#  3   categories  100 non-null    category\n# dtypes: bool(1), category(1), float64(1), uint8(1)\n# memory usage: 1.3 KB\n```\n\nInferred schemas can be restricted to Numpy data types only.\n\n```python\n# Downcast DataFrame to minimum viable Numpy schema.\ndf_downcast = pdc.downcast(df, numpy_dtypes_only=True)\n\n# Infer minimum  Numpy schema for DataFrame.\nschema = pdc.infer_schema(df, numpy_dtypes_only=True)\n```\n\n## Example\n\nThe following example shows how downcasting data often leads to size reductions of **greater than 70%**, depending on the original types.\n\n```python\nimport pdcast as pdc\nimport pandas as pd\nimport seaborn as sns\n\ndf_dict = {df: sns.load_dataset(df) for df in sns.get_dataset_names()}\n\nresults = []\n\nfor name, df in df_dict.items():\n    size_pre = df.memory_usage(deep=True).sum()\n    df_post = pdc.downcast(df)\n    size_post = df_post.memory_usage(deep=True).sum()\n    shrinkage = int((1 - (size_post / size_pre)) * 100)\n    results.append(\n        {"dataset": name, "size_pre": size_pre, "size_post": size_post, "shrink_pct": shrinkage}\n    )\n\nresults_df = pd.DataFrame(results).sort_values("shrink_pct", ascending=False).reset_index(drop=True)\nprint(results_df)\n```\n\n```\n           dataset  size_pre  size_post  shrink_pct\n0             fmri    213232      14776          93\n1          titanic    321240      28162          91\n2        attention      5888        696          88\n3         penguins     75711       9131          87\n4             dots    122240      17488          85\n5           geyser     21172       3051          85\n6           gammas    500128     108386          78\n7         anagrams      2048        456          77\n8          planets    112663      30168          73\n9         anscombe      3428        964          71\n10            iris     14728       5354          63\n11        exercise      3302       1412          57\n12         flights      3616       1888          47\n13             mpg     75756      43842          42\n14            tips      7969       6261          21\n15        diamonds   3184588    2860948          10\n16  brain_networks   4330642    4330642           0\n17     car_crashes      5993       5993           0\n```\n',
    'author': 'Dominic Thorn',
    'author_email': 'dominic.thorn@gmail.com',
    'maintainer': None,
    'maintainer_email': None,
    'url': 'https://github.com/domvwt/pandas-downcast',
    'packages': packages,
    'package_data': package_data,
    'install_requires': install_requires,
    'extras_require': extras_require,
    'python_requires': '>=3.6.1,<4',
}


setup(**setup_kwargs)
