Metadata-Version: 2.4
Name: clemcore
Version: 3.2.0
Summary: The cLLM (chat-optimized Large Language Model, 'clem') framework tests such models' ability to engage in games, that is, rule-constituted activities played using language.
Author-email: Philipp Sadler <first.last@uni-potsdam.de>, Jonathan Jordan <first.last@uni-potsdam.de>, Sherzod Hakimov <first.last@uni-potsdam.de>, Anne Beyer <first.last@uni-potsdam.de>, "L. Pfennigschmidt" <first.last@uni-potsdam.de>, Kushal Koshti <first.last@uni-potsdam.de>
License: MIT
Project-URL: Homepage, https://github.com/clp-research/clemcore
Requires-Python: <3.13,>=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pyyaml>=6.0
Requires-Dist: numpy<2.0.0,>=1.24.3
Requires-Dist: retry>=0.9.2
Requires-Dist: tqdm>=4.65.0
Requires-Dist: nltk==3.8.1
Requires-Dist: aleph-alpha-client==7.0.1
Requires-Dist: openai==1.75.0
Requires-Dist: anthropic==0.47.1
Requires-Dist: cohere==4.48
Requires-Dist: google-generativeai==0.8.4
Requires-Dist: mistralai==1.8.0
Requires-Dist: matplotlib==3.7.1
Requires-Dist: pandas==2.0.1
Requires-Dist: seaborn==0.12.2
Provides-Extra: vllm
Requires-Dist: torch~=2.6.0; extra == "vllm"
Requires-Dist: transformers==4.51.1; extra == "vllm"
Requires-Dist: vllm==0.8.4; extra == "vllm"
Provides-Extra: huggingface
Requires-Dist: torch~=2.1.1; extra == "huggingface"
Requires-Dist: sentencepiece==0.1.99; extra == "huggingface"
Requires-Dist: accelerate==1.2.1; extra == "huggingface"
Requires-Dist: protobuf==4.21.6; extra == "huggingface"
Requires-Dist: einops==0.6.1; extra == "huggingface"
Requires-Dist: bitsandbytes==0.45.3; extra == "huggingface"
Requires-Dist: peft==0.15.2; extra == "huggingface"
Requires-Dist: transformers==4.51.1; extra == "huggingface"
Requires-Dist: torchvision==0.16.1; extra == "huggingface"
Requires-Dist: timm>=1.0.15; extra == "huggingface"
Provides-Extra: slurk
Requires-Dist: python-engineio==4.4.0; extra == "slurk"
Requires-Dist: python-socketio==5.7.2; extra == "slurk"
Requires-Dist: websocket-client; extra == "slurk"
Dynamic: license-file

### Updates
(March 2025): Version 2.0 of the benchmark has been [released](https://clembench.github.io/), and the framework is now pip-installable. The games 
that make up the benchmark now have their own [repository](https://github.com/clp-research/clembench).

(February 2024): We have updated the framework code. If you have written games using the initial release version, see 
[this guide](docs/howto_update_to_v1.md) on how to update your game.

# clembench: A Framework for the Systematic Evaluation of Chat-Optimized Language Models as Conversational Agents

The cLLM (chat-optimized Large Language Model, "clem") framework tests such models' ability to engage in games – 
rule-constituted activities played using language.
The framework is a systematic way of probing the situated language understanding of language-using agents.

This repository contains Clemcore, the core framework code used to run the games discussed in:
> Chalamalasetti, K., Götze, J., Hakimov, S., Madureira, B., Sadler, P., & Schlangen, D. (2023). clembench: Using Game 
> Play to Evaluate Chat-Optimized Language Models as Conversational Agents (arXiv:2305.13455). arXiv. 
> https://doi.org/10.48550/arXiv.2305.13455

### Clembench benchmark game set
The main set of games on which the [leaderboard](https://clembench.github.io/leaderboard.html) is based now lives in a separate repository, the 
[Clembench repository](https://github.com/clp-research/clembench). You can find details of the contained games there.

### Evaluation Results
Results of Clembench benchmark runs can be found on the [main project website](https://clembench.github.io), under [leaderboard](https://clembench.github.io/leaderboard.html).

# Using the clemcore CLI
**Clemcore is now available as a library on PyPI, making it installable using pip.**  
We highly recommend installing Clemcore in its own separate Python 3.10 virtual environment, to ensure that the dependencies 
of the framework and the games are managed well. The following examples assume that a Python venv named `myclem` has 
been created and activated.  
Install the packaged library from a terminal:
```
(myclem) pip install clemcore
```
This means that there is no need to check out this repository to run the framework.

> **Note to framework developers:** 
> 
> Framework developers who want to contribute to the clemcore framework should still fork and check out the repository, install the framework locally using `pip install -e .` for testing, and then create a pull request with their changes.

Additional installation options are:
```
(myclem) pip install "clemcore[huggingface]" # dependencies for the local huggingface transformers backend
(myclem) pip install "clemcore[vllm]"        # dependencies for the local vllm backend
(myclem) pip install "clemcore[slurk]"       # dependencies for the slurk backend
```
After the installation you will have access to the `clem` CLI tool. The main functions are:
```
(myclem) clem list games               # list the games available for a run
(myclem) clem list backends            # list the backends available for a run
(myclem) clem list models              # list the models available for a run
(myclem) clem run -g <game> -m <model> # run the specified game with the specified model
(myclem) clem transcribe               # translate interactions into HTML files
(myclem) clem score                    # compute individual performance measures
(myclem) clem eval                     # compute overall performance measures; requires scores
```
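
The commands listed above are typically used in sequence: run a game, transcribe the interactions, score them, and then evaluate. As a minimal sketch, the steps could be chained in a small shell function that stops at the first failing step. The function name `run_clem_pipeline` is hypothetical and not part of clemcore; it only wraps the `clem` subcommands shown above, and the game and model names must be ones reported by `clem list games` and `clem list models`.

```shell
# Hypothetical helper (not part of clemcore): chains the CLI steps listed above.
# Arguments: $1 = game name, $2 = model name.
run_clem_pipeline() {
    if [ "$#" -ne 2 ]; then
        echo "usage: run_clem_pipeline <game> <model>" >&2
        return 2
    fi
    # Run the specified game with the specified model.
    clem run -g "$1" -m "$2" || return 1
    # Translate the recorded interactions into HTML files.
    clem transcribe || return 1
    # Compute individual performance measures.
    clem score || return 1
    # Compute overall performance measures; this step requires the scores.
    clem eval
}
```

With the `myclem` environment active, the function would be called as `run_clem_pipeline <game> <model>`, with both placeholders replaced by real names.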

The games to `run` can be checked out from the [clembench repository](https://github.com/clp-research/clembench).

This repository is tested on `Python 3.10`.
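
The package metadata declares a supported range of Python 3.10 up to, but not including, 3.13. Before creating the virtual environment, an interpreter version can be checked against that range with a sketch like the following. The helper name `python_version_supported` is hypothetical and only illustrates the comparison; it is not provided by clemcore.

```shell
# Hypothetical helper (not part of clemcore): succeeds if the given version
# string (e.g. "3.10" or "3.10.12") lies in the declared range >=3.10,<3.13.
python_version_supported() {
    major="${1%%.*}"      # text before the first dot
    rest="${1#*.}"        # text after the first dot
    minor="${rest%%.*}"   # text up to the next dot, if any
    [ "$major" -eq 3 ] && [ "$minor" -ge 10 ] && [ "$minor" -lt 13 ]
}

# Usage (assumes a `python3` executable is on the PATH):
#   python_version_supported "$(python3 -c 'import platform; print(platform.python_version())')" \
#       || echo "unsupported Python version" >&2
```

The check accepts any version inside the declared range, although only Python 3.10 is stated as tested.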
