Metadata-Version: 2.1
Name: pyspark-ds-toolbox
Version: 0.0.2a0
Summary: A Pyspark companion for data science tasks.
Home-page: https://github.com/viniciusmsousa/pyspark-ds-toolbox
License: GPL-3.0-only
Author: vinicius.sousa
Author-email: vinisousa04@gmail.com
Requires-Python: >=3.7.1,<4.0
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Requires-Dist: numpy (==1.21.0)
Requires-Dist: pandas (>=1.3.4,<2.0.0)
Requires-Dist: pyspark (>=3.1.1,<4.0.0)
Requires-Dist: typeguard (>=2.13.2,<3.0.0)
Project-URL: Documentation, https://viniciusmsousa.github.io/pyspark-ds-toolbox/index.html
Project-URL: Repository, https://github.com/viniciusmsousa/pyspark-ds-toolbox
Description-Content-Type: text/markdown

# Pyspark DS Toolbox

The goal of this package is to provide tools that help with day-to-day data science work in Spark.

## Package Structure
```
pyspark-ds-toolbox
├─ .git/
├─ .github
│  └─ workflows
│     └─ package-tests.yml
├─ .gitignore
├─ LICENSE.md
├─ README.md
├─ examples
│  └─ ml_eval_estimate_shapley_values.ipynb
├─ poetry.lock
├─ pyproject.toml
├─ docs/
├─ pyspark_ds_toolbox
│  ├─ __init__.py
│  ├─ causal_inference
│  │  ├─ __init__.py
│  │  ├─ diff_in_diff.py
│  │  └─ ps_matching.py
│  ├─ ml
│  │  ├─ __init__.py
│  │  ├─ data_prep.py
│  │  └─ eval.py
│  └─ wrangling.py
├─ requirements.txt
└─ tests
   ├─ __init__.py
   ├─ conftest.py
   ├─ data
   ├─ test_causal_inference
   │  ├─ test_diff_in_diff.py
   │  └─ test_ps_matching.py
   ├─ test_ml
   │  ├─ test_data_prep.py
   │  └─ test_ml_eval.py
   ├─ test_pyspark_ds_toolbox.py
   └─ test_wrangling.py
```
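
As a rough illustration of how this layout maps to import paths, the sketch below derives dotted module names from the source files listed in the tree. It assumes the standard Python rule that a directory containing `__init__.py` is importable as a package. The names `SOURCE_FILES`, `to_module_name`, and `MODULES` are hypothetical helpers written for this example and are not part of the package; the sketch says nothing about which functions each module exposes.

```python
from pathlib import PurePosixPath

# Source files copied from the tree above (paths relative to the repository root).
SOURCE_FILES = [
    "pyspark_ds_toolbox/__init__.py",
    "pyspark_ds_toolbox/causal_inference/__init__.py",
    "pyspark_ds_toolbox/causal_inference/diff_in_diff.py",
    "pyspark_ds_toolbox/causal_inference/ps_matching.py",
    "pyspark_ds_toolbox/ml/__init__.py",
    "pyspark_ds_toolbox/ml/data_prep.py",
    "pyspark_ds_toolbox/ml/eval.py",
    "pyspark_ds_toolbox/wrangling.py",
]


def to_module_name(path: str) -> str:
    """Convert a source file path into its dotted import name (hypothetical helper)."""
    parts = list(PurePosixPath(path).with_suffix("").parts)
    # A package's __init__.py is imported under the package's own name.
    if parts[-1] == "__init__":
        parts.pop()
    return ".".join(parts)


# Deduplicated, sorted list of importable module names.
MODULES = sorted({to_module_name(p) for p in SOURCE_FILES})

if __name__ == "__main__":
    for name in MODULES:
        print(name)
```

With the package installed, each printed name should be usable in an `import` statement, for example `import pyspark_ds_toolbox.wrangling`.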
