Metadata-Version: 2.1
Name: dostoevsky
Version: 0.6.0
Summary: Sentiment analysis library for the Russian language
Home-page: https://github.com/bureaucratic-labs/dostoevsky
Author: Bureaucratic Labs
Author-email: hello@b-labs.pro
License: MIT
Description: # Dostoevsky [![Build Status](https://travis-ci.org/bureaucratic-labs/dostoevsky.svg?branch=master)](https://travis-ci.org/bureaucratic-labs/dostoevsky)
        
        <img align="right" src="https://i.imgur.com/uLMWPuL.png">
        
        Sentiment analysis library for the Russian language
        
        ## Install
        
        Note that `Dostoevsky` supports only Python 3.6+, on both Linux and Windows.
        
        ```bash
        $ pip install dostoevsky
        ```
        
        ## Social network model [FastText]
        
        This model was trained on the [RuSentiment dataset](https://github.com/text-machine-lab/rusentiment) and achieves an F1 score of up to ~0.71.
        
        ### Usage
        
        First, download the binary model:
        
        ```bash
        $ python -m dostoevsky download fasttext-social-network-model
        ```
        
        Then you can use the sentiment analyzer:
        
        ```python
        from dostoevsky.tokenization import RegexTokenizer
        from dostoevsky.models import FastTextSocialNetworkModel
        
        tokenizer = RegexTokenizer()
        tokens = tokenizer.split('всё очень плохо')  # [('всё', None), ('очень', None), ('плохо', None)]
        
        model = FastTextSocialNetworkModel(tokenizer=tokenizer)
        
        messages = [
            'привет',
            'я люблю тебя!!',
            'малолетние дебилы'
        ]
        
        results = model.predict(messages, k=2)
        
        for message, sentiment in zip(messages, results):
            # привет -> {'speech': 1.0000100135803223, 'skip': 0.0020607432816177607}
            # я люблю тебя!! -> {'positive': 0.9886782765388489, 'skip': 0.005394937004894018}
            # малолетние дебилы -> {'negative': 0.9525841474533081, 'neutral': 0.13661839067935944}
            print(message, '->', sentiment)
        ```
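        
        Each result is a dict that maps a label to its score, so a common next step is to keep only the most probable label per message. The following is a minimal sketch, not part of the library: `top_label` is a hypothetical helper, and the sample dicts are copied from the output shown above so the snippet runs without the model.
        
        ```python
        # Hypothetical helper (not provided by dostoevsky): pick the best label.
        def top_label(sentiment):
            """Return the (label, score) pair with the highest score."""
            return max(sentiment.items(), key=lambda item: item[1])
        
        
        # Sample dicts copied from the example output above.
        results = [
            {'speech': 1.0000100135803223, 'skip': 0.0020607432816177607},
            {'negative': 0.9525841474533081, 'neutral': 0.13661839067935944},
        ]
        
        labels = [top_label(sentiment)[0] for sentiment in results]
        print(labels)  # ['speech', 'negative']
        ```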
        
        If you use the library in a research project, please include the following citation for the RuSentiment data:
        
        ```
        @inproceedings{rogers-etal-2018-rusentiment,
            title = "{R}u{S}entiment: An Enriched Sentiment Analysis Dataset for Social Media in {R}ussian",
            author = "Rogers, Anna  and
              Romanov, Alexey  and
              Rumshisky, Anna  and
              Volkova, Svitlana  and
              Gronas, Mikhail  and
              Gribov, Alex",
            booktitle = "Proceedings of the 27th International Conference on Computational Linguistics",
            month = aug,
            year = "2018",
            address = "Santa Fe, New Mexico, USA",
            publisher = "Association for Computational Linguistics",
            url = "https://www.aclweb.org/anthology/C18-1064",
            pages = "755--763",
        }
        
        ```
        
Keywords: natural language processing,sentiment analysis
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Text Processing :: Linguistic
Description-Content-Type: text/markdown
