Metadata-Version: 2.1
Name: feature_engine
Version: 0.5.1
Summary: Feature engineering package with Scikit-learn's fit transform functionality
Home-page: http://github.com/solegalli/feature_engine
Author: Soledad Galli
Author-email: solegalli@protonmail.com
License: BSD 3 clause
Description: # Feature Engine
        
        ![Python 3.6](https://img.shields.io/badge/python-3.6-success.svg)
        ![Python 3.7](https://img.shields.io/badge/python-3.7-success.svg)
        ![Python 3.8](https://img.shields.io/badge/python-3.8-success.svg)
        ![License](https://img.shields.io/badge/license-BSD-success.svg)
        ![CircleCI](https://img.shields.io/circleci/build/github/solegalli/feature_engine/master.svg?token=5a1c2accc2c97450e52d2cb1b47c333ab495d2c2)
        ![Documentation Status](https://readthedocs.org/projects/feature-engine/badge/?version=latest)
        
        
        Feature-engine is a Python library with multiple transformers to engineer features for use in machine learning models. Feature-engine's transformers follow Scikit-learn's API, with fit() and transform() methods that first learn the transformation parameters from the data and then apply them to transform the data.
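        
        The fit()/transform() pattern can be illustrated with a minimal sketch. `MedianImputerSketch` and its `medians_` attribute are hypothetical names used only for illustration; they are not part of Feature-engine, and the library's own imputers may differ in detail. The sketch assumes only pandas:
        
        ```python
        import pandas as pd


        class MedianImputerSketch:
            """Hypothetical transformer illustrating the fit()/transform() pattern."""

            def __init__(self, variables):
                self.variables = variables

            def fit(self, X, y=None):
                # Learn one median per variable from the training data only.
                self.medians_ = X[self.variables].median().to_dict()
                return self

            def transform(self, X):
                # Apply the stored medians to a copy; the input frame is not modified.
                X = X.copy()
                for var in self.variables:
                    X[var] = X[var].fillna(self.medians_[var])
                return X


        train = pd.DataFrame({"age": [20.0, 30.0, None, 40.0]})
        test = pd.DataFrame({"age": [None, 50.0]})

        imputer = MedianImputerSketch(variables=["age"]).fit(train)
        test_imputed = imputer.transform(test)
        ```
        
        Because the median is learned in fit() and stored on the object, the same value is reused for any later data set, which avoids leaking information from test data into the transformation.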
        
        
        ## Feature-engine features in the following resources:
        
        * [Feature Engineering for Machine Learning, Online Course](https://www.udemy.com/feature-engineering-for-machine-learning/?couponCode=FEATENGREPO).
        
        * [Python Feature Engineering Cookbook](https://www.packtpub.com/data/python-feature-engineering-cookbook)
        
        ## Blogs about Feature-engine:
        
        * [Feature-engine: A new open source Python package for feature engineering](https://www.trainindata.com/post/feature-engine-a-new-open-source-python-package-for-feature-engineering)
        
        * [Open Source Python libraries for Feature Engineering: Comparisons and Walkthroughs](https://www.trainindata.com/post/feature-engineering-python-libraries-comparisons)
        
        ## Documentation
        
        * Documentation: http://feature-engine.readthedocs.io
        * Home page: https://www.trainindata.com/feature-engine
        
        
        ## Feature-engine's transformers currently include functionality for:
        
        * Missing data imputation
        * Categorical variable encoding
        * Outlier removal
        * Discretisation
        * Numerical Variable Transformation
        
        ### Imputing Methods
        
        * MeanMedianImputer
        * RandomSampleImputer
        * EndTailImputer
        * AddNaNBinaryImputer
        * CategoricalVariableImputer
        * FrequentCategoryImputer
        * ArbitraryNumberImputer
        
        ### Encoding Methods
        * CountFrequencyCategoricalEncoder
        * OrdinalCategoricalEncoder 
        * MeanCategoricalEncoder
        * WoERatioCategoricalEncoder
        * OneHotCategoricalEncoder
        * RareLabelCategoricalEncoder
        
        ### Outlier Handling methods
        * Winsorizer
        * ArbitraryOutlierCapper
        * OutlierTrimmer
        
        ### Discretisation methods
        * EqualFrequencyDiscretiser
        * EqualWidthDiscretiser
        * DecisionTreeDiscretiser
        * UserInputDiscretiser
        
        ### Variable Transformation methods
        * LogTransformer
        * ReciprocalTransformer
        * PowerTransformer
        * BoxCoxTransformer
        * YeoJohnsonTransformer
        
        
        ### Scikit-learn Wrapper:
        
        * SklearnTransformerWrapper
        
        
        ### Installing
        
        ```
        pip install feature_engine
        ```
        or
        
        ```
        git clone https://github.com/solegalli/feature_engine.git
        ```
        
        ### Usage
        
        ```python
        >>> from feature_engine.categorical_encoders import RareLabelCategoricalEncoder
        >>> import pandas as pd
        
        >>> data = {'var_A': ['A'] * 10 + ['B'] * 10 + ['C'] * 2 + ['D'] * 1}
        >>> data = pd.DataFrame(data)
        >>> data['var_A'].value_counts()
        ```
        
        ```
        Out[1]:
        A    10
        B    10
        C     2
        D     1
        Name: var_A, dtype: int64
        ```
            
        ```python 
        >>> rare_encoder = RareLabelCategoricalEncoder(tol=0.10, n_categories=3)
        >>> data_encoded = rare_encoder.fit_transform(data)
        >>> data_encoded['var_A'].value_counts()
        ```
        
        ```
        Out[2]:
        A       10
        B       10
        Rare     3
        Name: var_A, dtype: int64
        ```
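        
        The grouping shown above can be approximated with plain pandas. This is a sketch of the idea, not the library's implementation: it assumes that `tol` is the minimum relative frequency a category needs in order to be kept, and it ignores `n_categories`, which is assumed to apply the grouping only when a variable has enough distinct categories:
        
        ```python
        import pandas as pd

        data = pd.DataFrame({"var_A": ["A"] * 10 + ["B"] * 10 + ["C"] * 2 + ["D"] * 1})
        tol = 0.10  # assumed meaning: minimum relative frequency to keep a category

        # Relative frequency of each category (C is 2/23, D is 1/23, both below tol).
        freq = data["var_A"].value_counts(normalize=True)
        frequent = freq[freq >= tol].index

        # Keep frequent categories and replace every other value with "Rare".
        grouped = data["var_A"].where(data["var_A"].isin(frequent), "Rare")
        counts = grouped.value_counts()
        ```
        
        Here `counts` holds 10 for A, 10 for B and 3 for Rare, matching the encoder output above.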
        
        See more usage examples in the Jupyter notebooks in the **example** folder of this repository, or in the documentation: http://feature-engine.readthedocs.io
        
        ## Contributing
        
        ### Local Setup Steps
        - Clone the repo and cd into it
        - Run `pip install tox`
        - Run `tox`; if the tests pass, your local setup is complete
        
        ### Opening Pull Requests
        PRs are welcome! Please make sure the CI tests pass on your branch.
        
        ## License
        
        BSD 3-Clause
        
        ## Authors
        
        * **Soledad Galli** - *Initial work* - [Feature Engineering for Machine Learning, Online Course](https://www.udemy.com/feature-engineering-for-machine-learning/?couponCode=FEATENGREPO).
        
        
        ### References
        
        Much of the engineering and encoding functionality is inspired by this [series of articles from the 2009 KDD competition](http://www.mtome.com/Publications/CiML/CiML-v3-book.pdf).
        
        To learn more about the rationale, functionality, pros and cons of each imputer, encoder and transformer, refer to the [Feature Engineering for Machine Learning, Online Course](https://www.udemy.com/feature-engineering-for-machine-learning/?couponCode=FEATENGREPO).
        
        For a summary of the methods, check this [presentation](https://speakerdeck.com/solegalli/engineering-and-selecting-features-for-machine-learning) and this [article](https://www.trainindata.com/post/feature-engineering-comprehensive-overview).
        
        To stay informed about the latest releases, sign up at [trainindata](https://www.trainindata.com).
        
Platform: UNKNOWN
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Requires-Python: >=3.6.0
Description-Content-Type: text/markdown
