[![Documentation Status](https://readthedocs.org/projects/pywedge/badge/?version=main)](https://pywedge.readthedocs.io/en/main/?badge=main)  [![Downloads](https://pepy.tech/badge/pywedge)](https://pepy.tech/project/pywedge) [![PyPI version](https://badge.fury.io/py/pywedge.svg)](https://badge.fury.io/py/pywedge) [![License: MIT](https://img.shields.io/badge/License-MIT-brightgreen.svg)](https://opensource.org/licenses/MIT)

# Pywedge

# [Pywedge-Make_Charts Heroku Web App - Demo](https://pywedge-make-charts.herokuapp.com/)

[![badge](https://img.shields.io/badge/try_pywedge_make_charts_in%20-binder-579ACA.svg?logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAFkAAABZCAMAAABi1XidAAAB8lBMVEX///9XmsrmZYH1olJXmsr1olJXmsrmZYH1olJXmsr1olJXmsrmZYH1olL1olJXmsr1olJXmsrmZYH1olL1olJXmsrmZYH1olJXmsr1olL1olJXmsrmZYH1olL1olJXmsrmZYH1olL1olL0nFf1olJXmsrmZYH1olJXmsq8dZb1olJXmsrmZYH1olJXmspXmspXmsr1olL1olJXmsrmZYH1olJXmsr1olL1olJXmsrmZYH1olL1olLeaIVXmsrmZYH1olL1olL1olJXmsrmZYH1olLna31Xmsr1olJXmsr1olJXmsrmZYH1olLqoVr1olJXmsr1olJXmsrmZYH1olL1olKkfaPobXvviGabgadXmsqThKuofKHmZ4Dobnr1olJXmsr1olJXmspXmsr1olJXmsrfZ4TuhWn1olL1olJXmsqBi7X1olJXmspZmslbmMhbmsdemsVfl8ZgmsNim8Jpk8F0m7R4m7F5nLB6jbh7jbiDirOEibOGnKaMhq+PnaCVg6qWg6qegKaff6WhnpKofKGtnomxeZy3noG6dZi+n3vCcpPDcpPGn3bLb4/Mb47UbIrVa4rYoGjdaIbeaIXhoWHmZYHobXvpcHjqdHXreHLroVrsfG/uhGnuh2bwj2Hxk17yl1vzmljzm1j0nlX1olL3AJXWAAAAbXRSTlMAEBAQHx8gICAuLjAwMDw9PUBAQEpQUFBXV1hgYGBkcHBwcXl8gICAgoiIkJCQlJicnJ2goKCmqK+wsLC4usDAwMjP0NDQ1NbW3Nzg4ODi5+3v8PDw8/T09PX29vb39/f5+fr7+/z8/Pz9/v7+zczCxgAABC5JREFUeAHN1ul3k0UUBvCb1CTVpmpaitAGSLSpSuKCLWpbTKNJFGlcSMAFF63iUmRccNG6gLbuxkXU66JAUef/9LSpmXnyLr3T5AO/rzl5zj137p136BISy44fKJXuGN/d19PUfYeO67Znqtf2KH33Id1psXoFdW30sPZ1sMvs2D060AHqws4FHeJojLZqnw53cmfvg+XR8mC0OEjuxrXEkX5ydeVJLVIlV0e10PXk5k7dYeHu7Cj1j+49uKg7uLU61tGLw1lq27ugQYlclHC4bgv7VQ+TAyj5Zc/UjsPvs1sd5cWryWObtvWT2EPa4rtnWW3JkpjggEpbOsPr7F7EyNewtpBIslA7p43HCsnwooXTEc3UmPmCNn5lrqTJxy6nRmcavGZVt/3Da2pD5NHvsOHJCrdc1G2r3DITpU7yic7w/7Rxnjc0kt5GC4djiv2Sz3Fb2iEZg41/ddsFDoyuYrIkmFehz0HR2thPgQqMyQYb2OtB0WxsZ3BeG3+wpRb1vzl2UYBog8FfGhttFKjtAclnZYrRo9ryG9uG/FZQU4AEg8ZE9LjGMzTmqKXPLnlWVnIlQQTvxJf8ip7VgjZjyVPrjw1te5otM7RmP7xm+sK2Gv9I8Gi++BRbEkR9EBw8zRUcKxwp73xkaLiqQb+kGduJTNHG72zcW9LoJgqQxpP3/Tj//c3yB0tqzaml05/+orHLksVO+95kX7/7qgJvnjlrfr2Ggsyx0eoy9uPzN5SPd86aXggOsEKW2Prz7du3VID3/tzs/sSRs2w7ovVHKtjrX2pd7ZMlTxAYfBAL9jiDwfLkq55Tm7ifhMlTGPyCAs7RFRhn47JnlcB9RM5T97ASuZXIcVNuUDIndpDbdsfrqsOppeXl5Y+XVKdjFCTh+zGaVuj0d9zy05PPK3QzBamxdwtTCrzyg/2Rvf2EstUjordGwa/kx9mSJLr8mLLtCW8HHGJc2R5hS219IiF6PnTusOqcMl57gm
0Z8kanKMAQg0qSyuZfn7zItsbGyO9QlnxY0eCuD1XL2ys/MsrQhltE7Ug0uFOzufJFE2PxBo/YAx8XPPdDwWN0MrDRYIZF0mSMKCNHgaIVFoBbNoLJ7tEQDKxGF0kcLQimojCZopv0OkNOyWCCg9XMVAi7ARJzQdM2QUh0gmBozjc3Skg6dSBRqDGYSUOu66Zg+I2fNZs/M3/f/Grl/XnyF1Gw3VKCez0PN5IUfFLqvgUN4C0qNqYs5YhPL+aVZYDE4IpUk57oSFnJm4FyCqqOE0jhY2SMyLFoo56zyo6becOS5UVDdj7Vih0zp+tcMhwRpBeLyqtIjlJKAIZSbI8SGSF3k0pA3mR5tHuwPFoa7N7reoq2bqCsAk1HqCu5uvI1n6JuRXI+S1Mco54YmYTwcn6Aeic+kssXi8XpXC4V3t7/ADuTNKaQJdScAAAAAElFTkSuQmCC)](https://mybinder.org/v2/gh/taknev83/pywedge_make_charts_heroku_demo/81ecbd94801f44d7fc05840cf884e82346a30f28?filepath=notebooks%2FPywedge_Make_Charts_Demo.ipynb)

# Installation

```
pip install pywedge
```

For JupyterLab, run the following commands in the Anaconda Prompt to enable the JupyterLab extensions required to display the interactive chart widgets:

```
conda install -c conda-forge nodejs

jupyter labextension install @jupyter-widgets/jupyterlab-manager

jupyter labextension install jupyterlab-plotly@4.14.1

jupyter labextension install @jupyter-widgets/jupyterlab-manager plotlywidget@4.14.1
```


Pywedge is a [pip installable](https://pypi.org/project/pywedge/) Python package that intends to:

1. Make multiple charts in a single line of code, so that the user can quickly read through the charts and make informed choices about pre-processing steps.

2. Quickly preprocess the data using the user’s preferred pre-processing techniques and return the cleaned datasets to the user in the first step.

3. Make a baseline model summary that returns ten different baseline models, pointing the user to the best performing baseline model to explore further.

Pywedge intends to help the user by quickly making charts, preprocessing the data, and pointing out the best performing baseline model for the given dataset, so that the user can spend quality time tuning that model.

# Pywedge Features
Pywedge cleans the raw data frame so that it can be fed into ML models. The following features are available:
1) Makes 8 different types of ***interactive charts*** with interactive axis selection widgets
2) Interactive pre-processing & 10 different baseline models 
    - Missing values imputation for numeric & categorical columns
    - Standardization
    - Feature importance
    - Class oversampling using SMOTE
    - Computes 10 different baseline models
3) Interactive hyperparameter tuning & hyperparameter tracking using integrated MLflow
    - Classification / Regression hyperparameter tuning
        - Baseline estimators currently available for interactive hyperparameter tuning (more will be added soon):
        - Logistic / Linear Regression
        - Decision Tree Classifier / Regressor
        - Random Forest Classifier / Regressor
        - KNN Classifier / Regressor
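
To illustrate what the SMOTE oversampling step listed above does conceptually, here is a minimal pure-Python sketch. It is not Pywedge's implementation: the function `smote_sketch` and the toy points are hypothetical and were written only for this illustration. Each synthetic sample is placed at a random position on the line segment between a minority-class sample and one of its nearest minority-class neighbours.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Illustrative only: build n_new synthetic minority-class points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])  # one of the k nearest neighbours
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Toy minority class with two features per sample (made-up values).
minority = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0)]
new_points = smote_sketch(minority, n_new=4)
print(new_points)
```

Because every synthetic point is an interpolation between two existing minority samples, the new points stay inside the region already occupied by the minority class.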

# Make_Charts()
Makes 8 different types of interactive Charts with interactive axis selection widgets in a single line of code for the given dataset. 

The chart types are:
1) Scatter Plot
2) Pie Chart
3) Bar Plot
4) Violin Plot
5) Box Plot
6) Distribution Plot
7) Histogram 
8) Correlation Plot
    
Arguments:
1) Dataframe
2) c = any redundant column to be removed (such as an ID column; at present only a single column can be removed, and a subsequent version will support removing multiple columns)
3) y = target column name as a string 
        
Returns:

Charts widget

Pywedge-Make_Charts demo on YouTube:

<div align="left">
      <a href="https://youtu.be/-3rrQqyMTVk">
     <img 
      src="https://raw.githubusercontent.com/taknev83/pywedge/main/images/mq1.jpg" 
      alt="Pywedge-Make_Charts" 
      style="width:100%;">
      </a>
    </div>



Read more about the Pywedge-Make_Charts module in this article published in [Analytics India Magazine](https://analyticsindiamag.com/how-to-build-interactive-eda-in-2-lines-of-code-using-pywedge/).

# baseline_model()
The baseline_model class starts with interactive pre-processing steps:
![baseline_model](https://raw.githubusercontent.com/taknev83/pywedge/main/images/baseline_models_inputs.jpg)

Instantiate the baseline_model class and call its classification_summary method:

```python
import pywedge as pw  # the `pw` alias used by the calls in this README

blm = pw.baseline_model(train, test, c, y, type)
blm.classification_summary()
```

Args:
1) train = train dataframe
2) test = test dataframe
3) c = any redundant column to be removed (such as an ID column; at present only a single column can be removed, and a subsequent version will support removing multiple columns)
4) y = target column name as a string
5) type = Classification (default) / Regression


- For classification - classification_summary() 
- For Regression - Regression_summary()

User Inputs:
1) Categorical columns conversion options
    -   Using Pandas Catcodes
    -   Using Pandas Get Dummies
2) Standardization options
    -   Standard scaler
    -   Min-max scaler
    -   Robust scaler
    -   No standardization
3) For classification, class balancing using SMOTE
    -   Yes
    -   No
4) Test size for the train-test split
    -   test size as a float
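
As a rough guide to what the categorical conversion and standardization choices above mean, the following sketch reimplements the ideas with the Python standard library only. It is not Pywedge's code: all function names and the toy data are hypothetical, and it assumes categories are ordered alphabetically, that standard scaling uses the population standard deviation, and that robust scaling uses the median and the interquartile range.

```python
from statistics import mean, median, pstdev, quantiles


def cat_codes(values):
    """One integer code per category, with categories sorted alphabetically."""
    lookup = {cat: i for i, cat in enumerate(sorted(set(values)))}
    return [lookup[v] for v in values]


def dummies(values):
    """One 0/1 indicator column per category."""
    return {cat: [int(v == cat) for v in values] for cat in sorted(set(values))}


def standard_scale(xs):
    """Subtract the mean and divide by the population standard deviation."""
    mu, sigma = mean(xs), pstdev(xs)
    return [(x - mu) / sigma for x in xs]


def minmax_scale(xs):
    """Map the smallest value to 0 and the largest to 1."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def robust_scale(xs):
    """Subtract the median and divide by the interquartile range."""
    q1, _, q3 = quantiles(xs, n=4, method="inclusive")
    med = median(xs)
    return [(x - med) / (q3 - q1) for x in xs]


# Made-up toy columns: one categorical, one numeric with an outlier.
colours = ["red", "green", "blue", "green"]
xs = [1.0, 2.0, 3.0, 4.0, 100.0]

print(cat_codes(colours))
print(dummies(colours))
print(minmax_scale(xs))
print(robust_scale(xs))
```

In this toy column the outlier `100.0` squeezes the other min-max scaled values close to 0, while the robust version keeps them spread out, which is the usual reason for choosing robust scaling on data with outliers.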

Returns:

1) Baseline models tab - metrics for the various baseline models
2) Predict Baseline model tab - the user can select a preferred baseline model from the available choices to make predictions

![baseline_output](https://raw.githubusercontent.com/taknev83/pywedge/main/images/baseline_model_output.gif)


# Pywedge_HP()

* Pywedge_HP is an interactive hyperparameter tuning class with the following two methods:
    - HP_Tune_Classification
    - HP_Tune_Regression

Instantiate the Pywedge_HP class and call its HP_Tune_Classification method:

```python
pph = pw.Pywedge_HP(train, test, c, y)
pph.HP_Tune_Classification()
```

Args:
1) train = train dataframe
2) test = test dataframe
3) c = any redundant column to be removed (such as an ID column; at present only a single column can be removed, and a subsequent version will support removing multiple columns)
4) y = target column name as a string 


- For classification - HP_Tune_Classification() 
- For Regression - HP_Tune_Regression()

![HP_Tune](https://raw.githubusercontent.com/taknev83/pywedge/main/images/HP_tune.gif)  
    
As seen in the GIF above, the user can interactively enter hyperparameter values without having to track them manually, as the integrated MLflow automatically tracks the hyperparameter values.

Regression hyperparameter tuning follows the same steps.
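
What "tracking" means here can be sketched without MLflow: each tuning attempt is recorded as a run with its hyperparameters and the resulting metric, so that runs can be compared afterwards. The `RunLog` class and the `toy_score` function below are hypothetical stand-ins written for this illustration (the score formula is made up and does not train a model); they are not MLflow's or Pywedge's API.

```python
from dataclasses import dataclass, field


@dataclass
class RunLog:
    """Hypothetical in-memory stand-in for an experiment tracker."""

    runs: list = field(default_factory=list)

    def log_run(self, params, metric):
        # Copy params so later changes by the caller do not alter the record.
        self.runs.append({"params": dict(params), "metric": metric})

    def best_run(self):
        # The recorded run with the highest metric value.
        return max(self.runs, key=lambda run: run["metric"])


def toy_score(max_depth):
    """Made-up score peaking at max_depth == 4; stands in for model evaluation."""
    return 1.0 - abs(max_depth - 4) / 10


log = RunLog()
for depth in (2, 4, 8):
    # Each loop iteration plays the role of one interactive tuning attempt.
    log.log_run({"max_depth": depth}, toy_score(depth))

best = log.best_run()
print(best["params"])
```

A real tracker additionally persists the runs and offers a UI for comparing them; the sketch only shows the record-and-compare idea.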


### The following additions to pywedge are planned:
- [X] A separate method to produce good charts
- [ ] To handle NLP column
- [ ] To handle time series dataset
- [ ] To handle stock prices specific analysis





Requires 64-bit Python.

Pywedge is currently in beta.
