Metadata-Version: 2.1
Name: text-sensitivity
Version: 0.2.4
Summary: Extension of text_explainability for sensitivity testing (robustness, fairness)
Home-page: https://git.science.uu.nl/m.j.robeer/text_sensitivity
Author: Marcel Robeer
Author-email: m.j.robeer@uu.nl
License: GNU LGPL v3
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Programming Language :: Python
Classifier: License :: OSI Approved :: GNU Lesser General Public License v3 or later (LGPLv3+)
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE

<p align="center">
  <img src="https://git.science.uu.nl/m.j.robeer/text_sensitivity/-/raw/main/img/TextLogo-Logo_large_sensitivity.png" alt="T_xt Sensitivity logo" width="70%">
</p>

<h3 align="center">
Sensitivity testing (fairness & robustness) for text machine learning models
</h3>

[![PyPI](https://img.shields.io/pypi/v/text_sensitivity)](https://pypi.org/project/text-sensitivity/)
[![Python_version](https://img.shields.io/badge/python-3.8%20%7C%203.9%20%7C%203.10-blue)](https://pypi.org/project/text-sensitivity/)
[![Build_passing](https://img.shields.io/badge/build-passing-brightgreen)](https://git.science.uu.nl/m.j.robeer/text_sensitivity/-/pipelines)
[![License](https://img.shields.io/pypi/l/text_sensitivity)](https://www.gnu.org/licenses/lgpl-3.0.en.html)
[![Docs_passing](https://img.shields.io/badge/docs-external-blueviolet)](https://marcelrobeer.github.io/text_sensitivity)
[![Code style: flake8](https://img.shields.io/badge/code%20style-flake8-aa0000)](https://github.com/PyCQA/flake8)

---

> Extension of [text_explainability](https://git.science.uu.nl/m.j.robeer/text_explainability)

Uses the **generic architecture** of `text_explainability` to also include tests of **robustness** (_how generalizable the model is in production_, e.g. the ability to handle different input characters, stability when adding typos, or the effect of adding random unrelated data) and **fairness** (_whether equal individuals are treated equally by the model_, e.g. subgroup fairness on sex and nationality).

&copy; Marcel Robeer, 2021

## Quick tour

**Robustness**: test whether your model is able to handle different data types...

```python
from text_sensitivity import RandomAscii, RandomEmojis, combine_generators

# Generate 10 strings with random ASCII characters
RandomAscii().generate_list(n=10)

# Generate 5 strings with random ASCII characters and emojis
combine_generators(RandomAscii(), RandomEmojis()).generate_list(n=5)
```

... whether your model performs equally for different entities ...
```python
from text_sensitivity import RandomAddress, RandomEmail

# Random address of your current locale (default = 'nl')
RandomAddress(sep=', ').generate_list(n=5)

# Random e-mail addresses in Spanish ('es') and Portuguese ('pt'), also returning which country each e-mail address is from
RandomEmail(languages=['es', 'pt']).generate_list(n=10, attributes=True)
```

... and if it is robust under simple perturbations.
```python
from text_sensitivity import compare_accuracy
from text_sensitivity.perturbation import to_upper, add_typos

# Is model accuracy equal when we change all sentences to uppercase?
compare_accuracy(env, model, to_upper)

# Is model accuracy equal when we add typos in words?
compare_accuracy(env, model, add_typos)
```
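
To make concrete what such a comparison measures, here is a minimal, library-independent sketch. It does not use the `text_sensitivity` API: `toy_model`, `accuracy` and `accuracy_under_perturbation` are hypothetical names chosen for illustration, and the toy classifier is deliberately case-sensitive so that uppercasing changes its predictions.

```python
from typing import Callable, List, Tuple


def toy_model(text: str) -> str:
    """Hypothetical case-sensitive classifier: 'positive' only if lowercase 'good' occurs."""
    return 'positive' if 'good' in text else 'negative'


def accuracy(model: Callable[[str], str], texts: List[str], labels: List[str]) -> float:
    """Fraction of texts for which the model predicts the given label."""
    correct = sum(model(text) == label for text, label in zip(texts, labels))
    return correct / len(texts)


def accuracy_under_perturbation(model: Callable[[str], str],
                                texts: List[str],
                                labels: List[str],
                                perturb: Callable[[str], str]) -> Tuple[float, float]:
    """Return (accuracy on the original texts, accuracy on the perturbed texts)."""
    original = accuracy(model, texts, labels)
    perturbed = accuracy(model, [perturb(text) for text in texts], labels)
    return original, perturbed


# Hand-written illustrative data, not taken from any real dataset
texts = ['a good movie', 'a bad movie', 'good acting', 'terrible plot']
labels = ['positive', 'negative', 'positive', 'negative']

original, perturbed = accuracy_under_perturbation(toy_model, texts, labels, str.upper)
print(f'original={original:.2f}, uppercase={perturbed:.2f}')
```

A large gap between the two numbers suggests the model relies on surface features (here, letter case) that the perturbation removes.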

**Fairness**: see if performance is equal among subgroups.

```python
from text_sensitivity import RandomName

# Generate random Dutch ('nl') and Russian ('ru') names, both 'male' and 'female' (+ return attributes)
RandomName(languages=['nl', 'ru'], sex=['male', 'female']).generate_list(n=10, attributes=True)
```
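
Once each generated instance carries attributes such as language or sex, "equal performance among subgroups" can be checked by grouping predictions per attribute value. The following is a rough sketch under that assumption, not part of the `text_sensitivity` API: `subgroup_accuracy` is a hypothetical helper and the records are hand-written for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def subgroup_accuracy(records: Iterable[Tuple[str, str, str]]) -> Dict[str, float]:
    """Compute accuracy per subgroup from (subgroup, true_label, predicted_label) records."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for group, y_true, y_pred in records:
        total[group] += 1
        correct[group] += int(y_true == y_pred)
    return {group: correct[group] / total[group] for group in total}


# Hand-written illustrative records, not real model output
records = [
    ('nl', 'pos', 'pos'),
    ('nl', 'neg', 'neg'),
    ('ru', 'pos', 'neg'),
    ('ru', 'neg', 'neg'),
]

scores = subgroup_accuracy(records)
gap = max(scores.values()) - min(scores.values())
print(scores, f'gap={gap:.2f}')
```

The gap between the best and worst subgroup is one simple way to summarize how unevenly a model performs; other fairness measures may be more appropriate depending on the task.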

## Installation
| Method | Instructions |
|--------|--------------|
| `pip` | Install from [PyPI](https://pypi.org/project/text-sensitivity/) via `pip3 install text_sensitivity`. |
| Local | Clone this repository and install via `pip3 install -e .` or locally run `python3 setup.py install`. |

## Documentation
Full documentation of the latest version is provided at [https://marcelrobeer.github.io/text_sensitivity/](https://marcelrobeer.github.io/text_sensitivity/).

## Example usage
See [example_usage.md](example_usage.md) for an example of how the package can be used, or run the lines in `example_usage.py` to explore it interactively.

## Releases
`text_sensitivity` is officially released through [PyPI](https://pypi.org/project/text-sensitivity/).

See [CHANGELOG.md](CHANGELOG.md) for a full overview of the changes for each version.

## Citation
```bibtex
@misc{text_sensitivity,
  title = {Python package text\_sensitivity},
  author = {Marcel Robeer},
  howpublished = {\url{https://git.science.uu.nl/m.j.robeer/text_sensitivity}},
  year = {2021}
}
```

## Maintenance
### Contributors
- [Marcel Robeer](https://www.uu.nl/staff/MJRobeer) (`@m.j.robeer`)
- Elize Herrewijnen (`@e.herrewijnen`)

### Todo
Tasks yet to be done:

* Word-level perturbations
* Add fairness-specific metrics:
    - Subgroup fairness
    - Counterfactual fairness
* Add expected behavior
    - Robustness: equal to the prior prediction, although in some cases a deviation may be expected
    - Fairness: may deviate from original prediction
* Tests
    - Add tests for perturbations
    - Add tests for sensitivity testing schemes
* Add visualization ability

## Credits
- Edward Ma. _[NLP Augmentation](https://github.com/makcedward/nlpaug)_. 2019.
- Daniele Faraglia and other contributors. _[Faker](https://github.com/joke2k/faker)_. 2012.
- Marco Tulio Ribeiro, Tongshuang Wu, Carlos Guestrin and Sameer Singh. [Beyond Accuracy: Behavioral Testing of NLP models with CheckList](https://paperswithcode.com/paper/beyond-accuracy-behavioral-testing-of-nlp). _Association for Computational Linguistics_ (_ACL_). 2020.


