# -*- coding: utf-8 -*-
from setuptools import setup

packages = \
['comet',
 'comet.cli',
 'comet.encoders',
 'comet.models',
 'comet.models.ranking',
 'comet.models.regression',
 'comet.modules']

package_data = \
{'': ['*']}

install_requires = \
['jsonargparse==3.13.1',
 'numpy>=1.20.0',
 'pandas==1.1.5',
 'pytorch-lightning==1.3.5',
 'sacrebleu>=2.0.0',
 'scipy>=1.5.4',
 'sentencepiece>=0.1.96,<0.2.0',
 'torch>=1.6.0,<=1.10.0',
 'torchmetrics==0.6',
 'transformers>=4.8,<4.11']

entry_points = \
{'console_scripts': ['comet-compare = comet.cli.compare:compare_command',
                     'comet-score = comet.cli.score:score_command',
                     'comet-train = comet.cli.train:train_command']}

setup_kwargs = {
    'name': 'unbabel-comet',
    'version': '1.0.1',
    'description': 'High-quality Machine Translation Evaluation',
    'long_description': '<p align="center">\n  <img src="https://raw.githubusercontent.com/Unbabel/COMET/master/docs/source/_static/img/COMET_lockup-dark.png">\n  <br />\n  <br />\n  <a href="https://github.com/Unbabel/COMET/blob/master/LICENSE"><img alt="License" src="https://img.shields.io/github/license/Unbabel/COMET" /></a>\n  <a href="https://github.com/Unbabel/COMET/stargazers"><img alt="GitHub stars" src="https://img.shields.io/github/stars/Unbabel/COMET" /></a>\n  <a href=""><img alt="PyPI" src="https://img.shields.io/pypi/v/unbabel-comet" /></a>\n  <a href="https://github.com/psf/black"><img alt="Code Style" src="https://img.shields.io/badge/code%20style-black-black" /></a>\n</p>\n\n> Version 1.0 is finally out 🥳! What\'s new?\n> 1) `comet-compare` command for statistical comparison between two models\n> 2) `comet-score` with multiple hypothesis/systems\n> 3) Embeddings caching for faster inference (thanks to [@jsouza](https://github.com/jsouza)).\n> 4) Length Batching for faster inference (thanks to [@CoderPat](https://github.com/CoderPat))\n> 5) Integration with SacreBLEU for dataset downloading (thanks to [@mjpost](https://github.com/mjpost))\n> 6) Monte Carlo dropout for uncertainty estimation (thanks to [@glushkovato](https://github.com/glushkovato) and [@chryssa-zrv](https://github.com/chryssa-zrv))\n> 7) Some code refactoring \n\n## Quick Installation\n\nDetailed usage examples and instructions can be found in the [Full Documentation](https://unbabel.github.io/COMET/html/index.html).\n\nSimple installation from PyPI:\n\n```bash\npip install unbabel-comet\n```\nor\n```bash\npip install unbabel-comet==1.0.1 --use-feature=2020-resolver\n```\n\nTo develop locally, install [Poetry](https://python-poetry.org/docs/#installation) and run the following commands:\n```bash\ngit clone https://github.com/Unbabel/COMET\ncd COMET\npoetry install\n```\n\nAlternatively, for development, you can run the CLI tools directly, e.g.,\n\n```bash\nPYTHONPATH=. ./comet/cli/score.py\n```\n\n## Scoring MT outputs:\n\n### CLI Usage:\n\nTest examples:\n\n```bash\necho -e "Dem Feuer konnte Einhalt geboten werden\\nSchulen und Kindergärten wurden eröffnet." >> src.de\necho -e "The fire could be stopped\\nSchools and kindergartens were open" >> hyp1.en\necho -e "The fire could have been stopped\\nSchools and pre-school were open" >> hyp2.en\necho -e "They were able to control the fire.\\nSchools and kindergartens opened" >> ref.en\n```\n\n```bash\ncomet-score -s src.de -t hyp1.en -r ref.en --gpus 0\n```\n\nScoring multiple systems:\n\n```bash\ncomet-score -s src.de -t hyp1.en hyp2.en -r ref.en\n```\n\nWMT test sets via [SacreBLEU](https://github.com/mjpost/sacrebleu):\n\n```bash\ncomet-score -d wmt20:en-de -t PATH/TO/TRANSLATIONS\n```\n\nYou can select another model/metric with the `--model` flag; for reference-free (QE-as-a-metric) models, you don\'t need to pass a reference.\n\n```bash\ncomet-score -s src.de -t hyp1.en --model wmt20-comet-qe-da\n```\n\nFollowing the work on [Uncertainty-Aware MT Evaluation](https://aclanthology.org/2021.findings-emnlp.330/), you can use the `--mc_dropout` flag to get a variance/uncertainty value for each segment score. If this value is high, it means that the metric is less confident in that prediction.\n\n```bash\ncomet-score -s src.de -t hyp1.en -r ref.en --mc_dropout 30\n```\n\nWhen comparing two MT systems, we encourage you to run the `comet-compare` command to get **statistical significance** with a paired t-test and bootstrap resampling [(Koehn, 2004)](https://aclanthology.org/W04-3250/).\n\n```bash\ncomet-compare -s src.de -x hyp1.en -y hyp2.en -r ref.en\n```\n\nFor even more detailed MT contrastive evaluation please take a look at our new tool [MT-Telescope](https://github.com/Unbabel/MT-Telescope).\n\n#### Multi-GPU Inference:\n\nCOMET is optimized to be used on a single GPU by taking advantage of length batching and embedding caching. When using multiple GPUs, the data is spread across GPUs, so we typically get fewer cache hits, and the length batching sampler is replaced by a [DistributedSampler](https://pytorch-lightning.readthedocs.io/en/latest/common/trainer.html#replace-sampler-ddp). Because of that, according to our experiments, using 1 GPU is faster than using 2 GPUs, especially when scoring multiple systems for the same source and reference.\n\nNonetheless, if your data does not have repetitions and you have more than 1 GPU available, you can **run multi-GPU inference with the following command**:\n\n```bash\ncomet-score -s src.de -t hyp1.en -r ref.en --gpus 2\n```\n\n#### Changing Embedding Cache Size:\nYou can change the cache size of COMET using the following env variable:\n\n```bash\nexport COMET_EMBEDDINGS_CACHE="2048"\n```\nBy default, the COMET cache size is 1024.\n\n\n### Scoring within Python:\n\n```python\nfrom comet import download_model, load_from_checkpoint\n\nmodel_path = download_model("wmt20-comet-da")\nmodel = load_from_checkpoint(model_path)\ndata = [\n    {\n        "src": "Dem Feuer konnte Einhalt geboten werden",\n        "mt": "The fire could be stopped",\n        "ref": "They were able to control the fire."\n    },\n    {\n        "src": "Schulen und Kindergärten wurden eröffnet.",\n        "mt": "Schools and kindergartens were open",\n        "ref": "Schools and kindergartens opened"\n    }\n]\nseg_scores, sys_score = model.predict(data, batch_size=8, gpus=1)\n```\n\n### Languages Covered:\n\nAll the above-mentioned models are built on top of XLM-R, which covers the following languages:\n\nAfrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Basque, Belarusian, Bengali, Bengali Romanized, Bosnian, Breton, Bulgarian, Burmese, Burmese, Catalan, Chinese (Simplified), Chinese (Traditional), Croatian, Czech, Danish, Dutch, English, Esperanto, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Hausa, Hebrew, Hindi, Hindi Romanized, Hungarian, Icelandic, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Kurdish (Kurmanji), Kyrgyz, Lao, Latin, Latvian, Lithuanian, Macedonian, Malagasy, Malay, Malayalam, Marathi, Mongolian, Nepali, Norwegian, Oriya, Oromo, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Sanskrit, Scottish Gaelic, Serbian, Sindhi, Sinhala, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swedish, Tamil, Tamil Romanized, Telugu, Telugu Romanized, Thai, Turkish, Ukrainian, Urdu, Urdu Romanized, Uyghur, Uzbek, Vietnamese, Welsh, Western Frisian, Xhosa, Yiddish.\n\n**Thus, results for language pairs containing uncovered languages are unreliable!**\n\n## COMET Models:\n\nWe recommend the following two models to evaluate your translations:\n\n- `wmt20-comet-da`: **DEFAULT** Reference-based Regression model built on top of XLM-R (large) and trained on Direct Assessments from WMT17 to WMT19. Same as `wmt-large-da-estimator-1719` from previous versions.\n- `wmt20-comet-qe-da`: **Reference-FREE** Regression model built on top of XLM-R (large) and trained on Direct Assessments from WMT17 to WMT19. Same as `wmt-large-qe-estimator-1719` from previous versions.\n\nThese two models were developed to participate in the WMT20 Metrics shared task [(Mathur et al. 2020)](https://aclanthology.org/2020.wmt-1.77.pdf) and were among the best metrics that year. Also, in a large-scale study performed by Microsoft Research, these two metrics are ranked 1st and 2nd in terms of system-level decision accuracy [(Kocmi et al. 2021)](https://arxiv.org/pdf/2107.10821.pdf). At segment-level, these systems also correlate well with expert evaluations based on MQM data [(Freitag et al. 2021)](https://arxiv.org/pdf/2104.14478.pdf).\n\nFor more information about the available COMET models, read our metrics descriptions [here](METRICS.md).\n\n## Train your own Metric: \n\nInstead of using pretrained models, you can train your own model with the following command:\n```bash\ncomet-train --cfg configs/models/{your_model_config}.yaml\n```\n\nYou can then use your own metric to score:\n\n```bash\ncomet-score -s src.de -t hyp1.en -r ref.en --model PATH/TO/CHECKPOINT\n```\n\n**Note:** Please contact ricardo.rei@unbabel.com if you wish to host your own metric among the available COMET metrics!\n\n## unittest:\nTo run the toolkit tests, run the following commands:\n\n```bash\ncoverage run --source=comet -m unittest discover\ncoverage report -m\n```\n\n## Publications\nIf you use COMET please cite our work! Also, don\'t forget to say which model you used to evaluate your systems.\n\n- [Are References Really Needed? Unbabel-IST 2021 Submission for the Metrics Shared Task](http://statmt.org/wmt21/pdf/2021.wmt-1.111.pdf)\n\n- [Uncertainty-Aware Machine Translation Evaluation](https://aclanthology.org/2021.findings-emnlp.330/) \n\n- [COMET - Deploying a New State-of-the-art MT Evaluation Metric in Production](https://www.aclweb.org/anthology/2020.amta-user.4)\n\n- [Unbabel\'s Participation in the WMT20 Metrics Shared Task](https://aclanthology.org/2020.wmt-1.101/)\n\n- [COMET: A Neural Framework for MT Evaluation](https://www.aclweb.org/anthology/2020.emnlp-main.213)\n\n\n\n',
    'author': 'Ricardo Rei, Craig Stewart, Catarina Farinha, Alon Lavie',
    'author_email': None,
    'maintainer': None,
    'maintainer_email': None,
    'url': 'https://github.com/Unbabel/COMET',
    'packages': packages,
    'package_data': package_data,
    'install_requires': install_requires,
    'entry_points': entry_points,
    'python_requires': '>=3.7.0,<4.0.0',
}


setup(**setup_kwargs)
