Urduhack: NLP library for ( 🇵🇰 ) Urdu language
================================================

[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/urduhack/urduhack/blob/master/LICENSE)
[![image](https://img.shields.io/pypi/v/urduhack.svg)](https://pypi.org/project/urduhack/)
[![image](https://img.shields.io/pypi/pyversions/urduhack.svg)](https://pypi.org/project/urduhack/)
[![Build Status](https://travis-ci.org/urduhack/urduhack.svg?branch=master)](https://travis-ci.org/urduhack/urduhack)
[![codecov](https://codecov.io/gh/urduhack/urduhack/branch/master/graph/badge.svg)](https://codecov.io/gh/urduhack/urduhack)
![Last commit](https://img.shields.io/github/last-commit/urduhack/urduhack.svg)
[![image](https://img.shields.io/github/contributors/urduhack/urduhack.svg)](https://github.com/urduhack/urduhack/graphs/contributors)
[![Downloads](https://pepy.tech/badge/urduhack)](https://pepy.tech/project/urduhack)
[![Join Slack](https://img.shields.io/badge/join-us%20on%20slack-gray.svg?longCache=true&logo=slack&colorB=red)](https://join.slack.com/t/urduhack/shared_invite/enQtNDE5NDg4NzU2Mzg4LTk3ZDNlYzBhOWM5MGY0ZGE0ZmNmNzU2ZTViYjAwMTg3NTBmZGU4OTM0M2E0MzQ0NDI1MDIyYzVkYTVmZTkyZjg)

Urduhack is an NLP library for the Urdu language. It comes with many batteries-included features to help you process
Urdu data as easily as possible.


Features Support
----------------
- [x] Normalization
    - [x] Arabic and Urdu Unicode Redundancy Problem
    - [x] Character Normalization
    - [x] Combined Characters Normalization
    - [x] Diacritics Removal
    - [x] Spaces Before & After Digits
    - [x] Spaces After Punctuations
    - [x] Joined Words Fix
- [x] Tokenization
    - [x] Sentence Tokenization
    - [x] Word Tokenization
- [x] Data Pre-processing
    - [x] Handles all kinds of numbers, emails, currencies, URLs, etc.
- [ ] Tasks
    - [x] Sentiment Analysis
    - [ ] Sentence Classification
    - [ ] Document Classification
    - [ ] Named Entity Recognition
    - [ ] Image to Text
    - [ ] Speech to Text
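
To give a feel for what a few of the normalization items above involve, here is a minimal standard-library sketch. It is not Urduhack's implementation or API: the function names, the two-entry character mapping, and the diacritic range are assumptions chosen for illustration only. See the documentation for the library's actual functions.

```python
import re

# Assumed, deliberately tiny mapping from Arabic code points to the
# look-alike Urdu code points; a real table would cover many more characters.
ARABIC_TO_URDU = str.maketrans({
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
})

# Assumed diacritic set: fathatan..sukun (U+064B-U+0652) plus superscript alef.
DIACRITICS = re.compile("[\u064B-\u0652\u0670]")

# A "letter" here is any word character that is not a digit or underscore.
LETTER_THEN_DIGIT = re.compile(r"(?<=[^\W\d_])(?=\d)")
DIGIT_THEN_LETTER = re.compile(r"(?<=\d)(?=[^\W\d_])")


def normalize_characters(text: str) -> str:
    """Replace Arabic code points with their Urdu counterparts."""
    return text.translate(ARABIC_TO_URDU)


def remove_diacritics(text: str) -> str:
    """Strip the diacritic marks listed in DIACRITICS."""
    return DIACRITICS.sub("", text)


def space_around_digits(text: str) -> str:
    """Insert a space wherever a letter touches a run of digits."""
    text = LETTER_THEN_DIGIT.sub(" ", text)
    return DIGIT_THEN_LETTER.sub(" ", text)


if __name__ == "__main__":
    print(normalize_characters("\u0643\u062A\u0627\u0628"))
    print(remove_diacritics("\u0627\u064F\u0631\u062F\u064F\u0648"))
    print(space_around_digits("سال2020میں"))
```

Keeping each step as a small pure function makes it easy to apply them in any order or to test them one at a time.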


Installation
------------
Urduhack officially supports Python 3.6–3.7, and runs great on PyPy.

```bash
$ pip install urduhack
```

Documentation
-------------
Fantastic documentation is available at <https://urduhack.readthedocs.io/>

How to Contribute
-----------------
1.  Check for open issues or open a fresh issue to start a discussion
    around a feature idea or a bug. There is a [Contributor Friendly](https://github.com/urduhack/urduhack/issues)
    tag for issues that should be ideal for people who are not very
    familiar with the codebase yet.
2.  Write a test which shows that the bug was fixed or that the feature
    works as expected.
3.  Send a pull request and bug the maintainer until it gets merged and
    published. :)
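
As a rough idea of what such a test could look like, here is a small sketch written as plain `assert`-based test functions, which runners such as pytest can collect. `collapse_spaces` is a hypothetical stand-in defined inside the example, not a function from Urduhack; in a real contribution you would import the function you changed instead.

```python
import re


def collapse_spaces(text: str) -> str:
    """Hypothetical function under test: squeeze runs of spaces into one."""
    return re.sub(r" {2,}", " ", text).strip()


def test_collapse_spaces_fixes_repeated_spaces():
    # Regression check: the reported input with extra spaces is cleaned up.
    assert collapse_spaces("اردو   زبان") == "اردو زبان"


def test_collapse_spaces_leaves_clean_text_unchanged():
    # Guard against over-correction: already clean text must not change.
    assert collapse_spaces("اردو زبان") == "اردو زبان"
```

One test reproduces the reported problem and a second confirms that correct input is left alone, which makes the intent of the fix clear to reviewers.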

Contributors
-------------
Special thanks to everyone who contributed to getting Urduhack to its current state.

Backers [![Backers on Open Collective](https://opencollective.com/urduhack/backers/badge.svg)](#backers)
---------------------------------------------------------------------------------------------------------
Thank you to all our backers! 🙏 [[Become a backer](https://opencollective.com/urduhack#backer)]
<a href="https://opencollective.com/urduhack#backers" target="_blank"><img src="https://opencollective.com/urduhack/backers.svg?width=890"></a>

Sponsors [![Sponsors on Open Collective](https://opencollective.com/urduhack/sponsors/badge.svg)](#sponsors)
-----------------------------------------------------------------------------------------------------------------
Support this project by becoming a sponsor. Your logo will show up here with a link to your website. [[Become a sponsor](https://opencollective.com/urduhack#sponsor)]
<a href="https://opencollective.com/urduhack/sponsor/0/website" target="_blank"><img src="https://opencollective.com/urduhack/sponsor/0/avatar.svg"></a>
<a href="https://opencollective.com/urduhack/sponsor/1/website" target="_blank"><img src="https://opencollective.com/urduhack/sponsor/1/avatar.svg"></a>

Copyright and license
---------------------
Code released under the [MIT License](https://github.com/urduhack/urduhack/blob/master/LICENSE).