Metadata-Version: 2.1
Name: lda-classification
Version: 0.0.29
Summary: UNKNOWN
Home-page: https://github.com/FeryET/lda_classification
Author: Farhood Etaati
Author-email: farhoodet@gmail.com
License: UNKNOWN
Description: # lda_classification
        
        Train an LDA model quickly with a scikit-learn-compatible wrapper around gensim's LDA implementation.
        
        
        * Preprocess Your Documents
        * Train an LDA Model
        * Evaluate Your LDA Model
        * Extract Document Vectors 
        * Select the Most Informative Features
        * Classify Your Documents
        
        All in a few lines of code, completely compatible with `sklearn`'s Transformer API.
        
        ---------------------
        
        
        ### Installation:
        
        
        To install from PyPI, run:
        
        ```pip install lda_classification```
        
        To install from source:
        ```
        git clone https://github.com/FeryET/lda_classification.git
        cd lda_classification/
        python setup.py install
        ```
        ------------------------------------
        
        
        ### Requirements:
        
        
        ```
        gensim == 3.8.0
        matplotlib == 3.1.2
        numpy == 1.19.1
        setuptools ~= 49.6.0
        spacy == 2.3.1
        tqdm == 4.48.2
        scikit-learn ~= 0.23.1
        tomotopy ~= 0.9.1
        ```
        
        ##### Optional:
        
        To automate feature selection with this package, also install `xgboost`, which the `XGBoostFeatureSelector` utility class requires.
        ```
        xgboost == 1.1.1
        ```
         ------------------------------------
        
        
        ### Example: 
        
        
        ```python
        from lda_classification.model import GensimLDAVectorizer
        from lda_classification.preprocess import SpacyCleaner
        from lda_classification.utils import XGBoostFeatureSelector
        
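        # docs, labels = FETCH YOUR DATASET
        # y = ENCODED_LABELS
        # Preprocess the raw documents.
        docs = SpacyCleaner().transform(docs)
        # Train the LDA model and extract document vectors.
        X = GensimLDAVectorizer(200, return_dense=False).fit_transform(docs)
        # Select the most informative features (requires xgboost).
        X_transform = XGBoostFeatureSelector().fit_transform(X, y)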
        ```
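        
        The `y = ENCODED_LABELS` placeholder in the example stands for class labels encoded as integers. A minimal sketch of one way to produce them with plain Python, assuming string labels (the sample `labels` values and the `label_to_id` name are illustrative and not part of this package):
        
        ```python
        # Hypothetical raw labels; replace them with your dataset's labels.
        labels = ["sports", "politics", "sports", "tech"]
        
        # Assign each distinct label an integer id, in sorted order so the
        # mapping is reproducible between runs.
        classes = sorted(set(labels))
        label_to_id = {name: idx for idx, name in enumerate(classes)}
        
        # y holds one integer per document, aligned with the order of `labels`.
        y = [label_to_id[name] for name in labels]
        ```
        
        `sklearn.preprocessing.LabelEncoder` produces an equivalent mapping if you prefer a scikit-learn tool.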
        
        There are also a `DataReader` class and a `BaseData` class that
        automate reading your data files from disk. Extend `BaseData`,
        implement its abstract methods in the subclass, and pass the
        subclass to `DataReader` to simplify fetching your dataset.
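        
        The subclass-and-feed pattern described here can be sketched in plain Python. This is only an illustration of the general idea: `ToyBaseData`, `ToyDataReader`, `documents`, and `read` are hypothetical stand-ins, and the actual method names and signatures of `BaseData` and `DataReader` in this package may differ.
        
        ```python
        from abc import ABC, abstractmethod
        
        
        class ToyBaseData(ABC):
            """Hypothetical stand-in for a dataset description (not the package's BaseData)."""
        
            @abstractmethod
            def documents(self):
                """Return an iterable of (text, label) pairs."""
        
        
        class ToyDataReader:
            """Hypothetical stand-in reader that collects texts and labels."""
        
            def __init__(self, data):
                self.data = data
        
            def read(self):
                docs, labels = [], []
                for text, label in self.data.documents():
                    docs.append(text)
                    labels.append(label)
                return docs, labels
        
        
        class InMemoryData(ToyBaseData):
            """Concrete subclass that implements the abstract method."""
        
            def documents(self):
                # A real subclass would read files from disk here.
                return [
                    ("the match ended in a draw", "sports"),
                    ("parliament passed the bill", "politics"),
                ]
        
        
        docs, labels = ToyDataReader(InMemoryData()).read()
        ```
        
        Keeping the dataset-specific logic in the subclass lets the reader stay unchanged when you switch datasets.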
        
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.5
Description-Content-Type: text/markdown
