Metadata-Version: 2.1
Name: pydrift
Version: 0.1.6
Summary: How do we measure the degradation of a machine learning process? Why does the performance of our predictive models decrease? Maybe it is that a data source has changed (one or more variables) or maybe what changes is the relationship of these variables with the target we want to predict. `pydrift` tries to facilitate this task to the data scientist, performing this kind of checks and somehow measuring that degradation.
Home-page: https://github.com/sergiocalde94/Data-And-Model-Drift-Checker
License: MIT
Author: sergiocalde94
Author-email: sergiocalde94@gmail.com
Requires-Python: >=3.6.1,<4.0.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Requires-Dist: catboost (>=0.23,<0.24)
Requires-Dist: coveralls (>=2.0.0,<3.0.0)
Requires-Dist: flake8 (>=3.8.1,<4.0.0)
Requires-Dist: jupyter (>=1.0.0,<2.0.0)
Requires-Dist: missingno (>=0.4.2,<0.5.0)
Requires-Dist: pandas (>=1.0.3,<2.0.0)
Requires-Dist: plotly_express (>=0.4.1,<0.5.0)
Requires-Dist: pre-commit (>=2.4.0,<3.0.0)
Requires-Dist: pytest (>=5.4.2,<6.0.0)
Requires-Dist: shap (>=0.35.0,<0.36.0)
Requires-Dist: sklearn (>=0.0,<0.1)
Requires-Dist: sphinx (>=3.0.3,<4.0.0)
Requires-Dist: sphinx_press_theme (>=0.5.1,<0.6.0)
Requires-Dist: typing-extensions (>=3.7.4,<4.0.0)
Project-URL: Documentation, https://sergiocalde94.github.io/Data-And-Model-Drift-Checker/
Project-URL: Repository, https://github.com/sergiocalde94/Data-And-Model-Drift-Checker
Description-Content-Type: text/markdown

# Welcome to `pydrift` 0.1.6

How do we measure the degradation of a machine learning process? Why does the performance of our predictive models decrease? A data source may have changed (one or more variables), or the relationship between those variables and the target we want to predict may have changed. `pydrift` helps data scientists run these checks and quantify that degradation.

# Install `pydrift` :v:

`pip install pydrift`

# Structure :triangular_ruler:

`pydrift` is intended to be user-friendly. It is divided into **DataDriftChecker** and **ModelDriftChecker**:

- **DataDriftChecker**: searches for drift in the variables by checking whether their distributions have changed
- **ModelDriftChecker**: searches for drift in the relationship between the variables and the target by checking that the model behaves the same way on both data sets

Both can use a discriminative model (defined by the parent class **DriftChecker**) whose binary target marks which of the two data sets a row belongs to: 1 if it comes from the left set and 0 otherwise. If the model cannot tell the two sets apart, no difference has been detected between them!

![Class inheritance](/images/class_inheritance.png)
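
The discriminative check described above can be sketched in a few lines. This is not `pydrift`'s own implementation: it is a standard-library-only illustration that swaps in a tiny logistic regression, and the names `fit_logistic`, `auc`, `discriminative_auc` and `sample` are hypothetical helpers made up for this example. Rows from the left set are labelled 1 and rows from the right set 0, and the AUC is measured on a held-out half of the data.

```python
import math
import random


def fit_logistic(rows, labels, epochs=200, lr=0.1):
    """Fit logistic regression with plain batch gradient descent."""
    n_features = len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            error = 1.0 / (1.0 + math.exp(-z)) - y  # predicted probability minus label
            grad_b += error
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
        bias -= lr * grad_b / len(rows)
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
    return weights, bias


def auc(labels, scores):
    """Chance that a random positive row outscores a random negative one (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def discriminative_auc(left, right, seed=0):
    """Label left rows 1 and right rows 0, train on one half, score the other half."""
    rows = left + right
    labels = [1] * len(left) + [0] * len(right)
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    half = len(order) // 2
    train, test = order[:half], order[half:]
    weights, bias = fit_logistic([rows[i] for i in train],
                                 [labels[i] for i in train])
    scores = [bias + sum(w * xi for w, xi in zip(weights, rows[i]))
              for i in test]
    return auc([labels[i] for i in test], scores)


rng = random.Random(42)  # fixed seed so the synthetic data is reproducible


def sample(n, shift):
    """Return n rows with two Gaussian features; only the first one is shifted."""
    return [[rng.gauss(shift, 1.0), rng.gauss(0.0, 1.0)] for _ in range(n)]


same_auc = discriminative_auc(sample(300, 0.0), sample(300, 0.0))
drift_auc = discriminative_auc(sample(300, 2.0), sample(300, 0.0))
print(f"no drift: {same_auc:.2f}, drift: {drift_auc:.2f}")
```

An AUC close to 0.5 means the classifier cannot separate the two sets, which is the "no difference" case. An AUC well above 0.5 suggests that at least one variable is distributed differently in the left and right data. Where to draw the line between the two is a judgment call that this sketch leaves open.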

# Usage :book:

Take a look at the `notebooks` folder, where you can find one example for `DataDriftChecker` and another for `ModelDriftChecker`.

For more information, check the docs available [here](https://sergiocalde94.github.io/Data-And-Model-Drift-Checker/).

More demos and code improvements are coming. If you want to contribute, you can contact me; in the future I will upload a file explaining how contributions will work.

