Metadata-Version: 2.1
Name: MAHA
Version: 1.3
Summary: Performing ETL using Machine Learning
Home-page: https://github.com/user/FlintyTub49
Author: Mithesh R, Arth Akhouri, Heetansh Jhaveri, Ayaan Khan
Author-email: arthakhouri@gmail.com
License: MIT
Download-URL: https://github.com/FlintyTub49/MAHA/archive/1.3.tar.gz
Description: # MAHA
        
        MAHA is an in-progress ETL package that uses machine learning to clean your dataset with a one-line command. Features of MAHA include:
        
          - Drop all the index columns
          - Drop columns with too many missing values
          - Use regression to predict the missing values in the data and fill them in
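        
        As a rough illustration of these three steps, here is a minimal sketch in plain pandas and scikit-learn. It is not MAHA's API: the function names, the 50% missing-value threshold, the reading of "index column" as a plain row counter, and the choice of linear regression are all assumptions made for this example.
        
        ```python
        import numpy as np
        import pandas as pd
        from sklearn.linear_model import LinearRegression
        
        
        def drop_index_columns(df):
            """Drop columns that only count rows (0..n-1 or 1..n). Assumed meaning of 'index column'."""
            n = len(df)
            counters = [np.arange(n), np.arange(1, n + 1)]
            index_cols = [
                col for col in df.columns
                if any(np.array_equal(df[col].to_numpy(), counter) for counter in counters)
            ]
            return df.drop(columns=index_cols)
        
        
        def drop_sparse_columns(df, max_missing=0.5):
            """Drop columns whose fraction of missing values exceeds max_missing (hypothetical threshold)."""
            return df.loc[:, df.isna().mean() <= max_missing]
        
        
        def impute_with_regression(df, target):
            """Fill missing values in `target` by regressing it on all other columns."""
            df = df.copy()
            predictors = df.columns.drop(target)
            # Only rows with complete predictors can be used for fitting or predicting.
            complete = df[predictors].notna().all(axis=1)
            known = df[target].notna() & complete
            missing = df[target].isna() & complete
            if not known.any() or not missing.any():
                return df
            model = LinearRegression().fit(df.loc[known, predictors], df.loc[known, target])
            df.loc[missing, target] = model.predict(df.loc[missing, predictors])
            return df
        
        
        df = pd.DataFrame({
            "row_id": [0, 1, 2, 3, 4],
            "x": [0.5, 1.0, 1.5, 2.0, 2.5],
            "y": [2.0, 4.0, np.nan, 8.0, 10.0],  # y = 4 * x, with one value missing
            "mostly_empty": [np.nan, np.nan, np.nan, np.nan, 1.0],
        })
        
        cleaned = drop_sparse_columns(drop_index_columns(df))
        cleaned = impute_with_regression(cleaned, "y")
        ```
        
        Here `row_id` is dropped as a row counter, `mostly_empty` is dropped because 80% of its values are missing, and the gap in `y` is filled from the line fitted on the remaining rows.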
        
        ## Prerequisites
        
          - Data is in pandas DataFrame format
          - All the categorical variables are label encoded
          - All the columns are in the desired data type of the output
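        
        One way to get a DataFrame into this shape, sketched with plain pandas. The column names and values are made up for illustration, and this is only one of several ways to label encode a column.
        
        ```python
        import pandas as pd
        
        df = pd.DataFrame({
            "city": ["Pune", "Mumbai", "Pune", "Delhi"],  # categorical, stored as text
            "age": ["31", "45", "27", "38"],              # numbers stored as strings
        })
        
        # Label encode the categorical column: each distinct value becomes an integer code.
        df["city"] = df["city"].astype("category").cat.codes
        
        # Cast each column to the data type wanted in the output.
        df["age"] = df["age"].astype(int)
        ```
        
        Note that `cat.codes` assigns the code `-1` to missing values, so convert those back to NA before imputing.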
        
        You can also:
          - Find the mean and mode of every column
          - Fill the NA values with the mean or mode of each column, depending on its data type
          - Fit a model for every column, using all other columns as the independent variables
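        
        The mean and mode fill can be sketched in plain pandas as follows. This is not MAHA's API: the function name is hypothetical, and the rule used here (mean for float columns, mode for everything else) is an assumption about how the data type decides the fill value.
        
        ```python
        import numpy as np
        import pandas as pd
        
        
        def fill_with_mean_or_mode(df):
            """Fill NA values with the mean (float columns) or the mode (all other columns)."""
            df = df.copy()
            for col in df.columns:
                if df[col].isna().all():
                    continue  # nothing to compute a mean or mode from
                if pd.api.types.is_float_dtype(df[col]):
                    fill_value = df[col].mean()
                else:
                    fill_value = df[col].mode().iloc[0]
                df[col] = df[col].fillna(fill_value)
            return df
        
        
        df = pd.DataFrame({
            "height": [1.5, np.nan, 2.5],                    # continuous: filled with the mean
            "group": pd.array([1, None, 1], dtype="Int64"),  # label encoded: filled with the mode
        })
        filled = fill_with_mean_or_mode(df)
        ```
        
        The nullable `Int64` dtype keeps the label-encoded column integer-typed even with missing values. A plain integer column containing NA would be stored as float and would receive the mean under this rule.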
        
        ## Dependencies
        
        MAHA uses a number of open source projects to work properly:
        
        * [NumPy] - NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.
        * [Pandas] - Pandas is a software library written for the Python programming language for data manipulation and analysis.
        * [Sklearn] - scikit-learn is a machine learning library that includes various classification, regression, and clustering algorithms.
        
        ## Installation
        
        MAHA requires pandas, NumPy, and scikit-learn.
        
        Use pip to install the packages:
        
        ```sh
        $ pip3 install pandas
        ```
        ```sh
        $ pip3 install numpy
        ```
        ```sh
        $ pip3 install scikit-learn
        ```
        
        If you have not installed pip, first download its installer:
        
        ```sh
        $ curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
        ```
        Then run the following command in the directory where you downloaded get-pip.py:
        ```sh
        $ python get-pip.py
        ```
        
        ## Development
        
        Developed by:
        [Mithesh R],
        [Arth Akhouri],
        [Heetansh Jhaveri],
        [Ayaan Khan]
        
        Want to contribute? Visit our GitHub repository for more information.
        GitHub repository: [MAHA]
        
        ## License
        
        MIT
        
Keywords: ETL,Machine Learning,Regression,Pandas,Numpy
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Build Tools
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Description-Content-Type: text/markdown
