Metadata-Version: 2.1
Name: triton-bert
Version: 0.1.0
Summary: easy to use bert with nvidia triton server
License: MIT
Author: yanyongwen712
Author-email: yanyongwen712@pingan.com.cn
Requires-Python: >=3.10,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Dist: more-itertools (>=10.1.0,<11.0.0)
Requires-Dist: protobuf (>=4.25.1,<5.0.0)
Requires-Dist: transformers (>=4.36.2,<5.0.0)
Requires-Dist: tritonclient[grpc,http] (>=2.41.0,<3.0.0)
Description-Content-Type: text/markdown

It is now easy to use BERT with NVIDIA Triton Inference Server.
An algorithm engineer only needs to focus on writing the preprocess function to make a model work.

Please see the `examples` directory.
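To make the idea of a preprocess step concrete, here is a minimal sketch of what such a function typically has to produce for a BERT model: rectangular `input_ids` and `attention_mask` batches. This is not the triton-bert API. The function name `pad_batch`, the padding ID `0`, the `max_length` default, and the token IDs are all assumptions chosen for illustration. In practice a `transformers` tokenizer would normally do this work.

```python
from typing import Dict, List, Sequence


def pad_batch(
    token_ids: Sequence[Sequence[int]],
    pad_id: int = 0,        # assumed padding ID; check your tokenizer's pad_token_id
    max_length: int = 128,  # assumed limit; use the length your model was exported with
) -> Dict[str, List[List[int]]]:
    """Truncate and pad tokenized sentences into rectangular BERT-style inputs."""
    # Truncate every sequence first so no row exceeds max_length.
    truncated = [list(ids[:max_length]) for ids in token_ids]
    # Pad to the longest sequence in this batch (0 for an empty batch).
    width = max((len(ids) for ids in truncated), default=0)

    input_ids: List[List[int]] = []
    attention_mask: List[List[int]] = []
    for ids in truncated:
        padding = width - len(ids)
        input_ids.append(ids + [pad_id] * padding)
        # 1 marks a real token, 0 marks padding.
        attention_mask.append([1] * len(ids) + [0] * padding)

    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Hypothetical token IDs for two sentences of different length.
batch = pad_batch([[101, 7592, 102], [101, 7592, 2088, 999, 102]])
```

The shorter sentence is padded to the length of the longer one, and its mask marks the padded positions with `0` so the model can ignore them.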



# install dependencies
poetry shell
poetry install

# run examples
## run triton server
```bash
# for example; change the host path after -v to your own Triton model folder
docker run -d --name triton-server --rm \
  --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
  -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v /home/yanyongwen712/triton_models:/models \
  nvcr.io/nvidia/tritonserver:22.08-py3 \
  tritonserver --model-repository=/models --model-control-mode=poll --exit-on-error=false --log-verbose 1
```
## prepare model for triton server
```bash
cd examples
python save_model_for_triton_server.py
# sftp put examples/model/cpu/xxx triton_server_model_folder
docker logs -f triton-server
# check whether the model was loaded successfully
```
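Besides reading the container logs, readiness can also be checked over HTTP. The sketch below uses only the Python standard library and the health endpoints of the HTTP/REST inference protocol that Triton serves on port 8000 (`/v2/health/ready` and `/v2/models/<name>/ready`). The helper names, the `localhost:8000` address, and the model name `bert` are placeholders, not names taken from this project.

```python
import http.client
import urllib.error
import urllib.request
from typing import Optional


def readiness_url(base_url: str, model_name: Optional[str] = None) -> str:
    """Build the readiness URL for the server, or for one model if a name is given."""
    base = base_url.rstrip("/")
    if model_name is None:
        return f"{base}/v2/health/ready"
    return f"{base}/v2/models/{model_name}/ready"


def is_ready(base_url: str, model_name: Optional[str] = None, timeout: float = 2.0) -> bool:
    """Return True only if the readiness endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(readiness_url(base_url, model_name), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or a non-2xx status all count as "not ready".
        return False


if __name__ == "__main__":
    server = "http://localhost:8000"  # placeholder: your Triton server address
    print("server ready:", is_ready(server))
    print("model ready:", is_ready(server, "bert"))  # "bert" is a placeholder model name
```

With `--model-control-mode=poll`, a newly uploaded model may take a short while to appear, so it can be worth repeating the model check for a few seconds before concluding that loading failed.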

## prepare PostgreSQL with the pgvector extension
...

## run example
```bash
# change the Triton server IP, the Triton model name, and the local transformer model folder
python retrieval_pgvector.py
```


