Metadata-Version: 2.1
Name: edu-segmentation
Version: 0.0.108
Summary: Improve EDU segmentation performance using Segbot. Because Segbot has an encoder-decoder architecture, its bidirectional GRU encoder can be replaced with generative pretrained models such as BART and T5. The new model is evaluated on the RST dataset in a few-shot setting (e.g. 100 training examples) rather than with the full dataset.
Author: Your Name
Author-email: you@example.com
Requires-Python: >=3.9,<4.0
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Dist: CacheControl (==0.12.11)
Requires-Dist: Jinja2 (==3.1.2)
Requires-Dist: MarkupSafe (==2.1.2)
Requires-Dist: PyYAML (==6.0)
Requires-Dist: Pygments (==2.15.1)
Requires-Dist: attrs (==23.1.0)
Requires-Dist: bleach (==6.0.0)
Requires-Dist: build (==0.10.0)
Requires-Dist: certifi (==2022.12.7)
Requires-Dist: charset-normalizer (==3.1.0)
Requires-Dist: cleo (==2.0.1)
Requires-Dist: click (==8.1.3)
Requires-Dist: colorama (==0.4.6)
Requires-Dist: crashtest (==0.4.1)
Requires-Dist: distlib (==0.3.6)
Requires-Dist: docutils (==0.19)
Requires-Dist: dulwich (==0.21.3)
Requires-Dist: filelock (==3.12.0)
Requires-Dist: fsspec (==2023.4.0)
Requires-Dist: html5lib (==1.1)
Requires-Dist: huggingface-hub (==0.14.1)
Requires-Dist: idna (==3.4)
Requires-Dist: importlib-metadata (==6.6.0)
Requires-Dist: installer (==0.7.0)
Requires-Dist: joblib (==1.2.0)
Requires-Dist: jsonschema (==4.17.3)
Requires-Dist: keyring (==23.13.1)
Requires-Dist: lockfile (==0.12.2)
Requires-Dist: markdown-it-py (==2.2.0)
Requires-Dist: mdurl (==0.1.2)
Requires-Dist: more-itertools (==9.1.0)
Requires-Dist: mpmath (==1.3.0)
Requires-Dist: msgpack (==1.0.5)
Requires-Dist: networkx (==3.1)
Requires-Dist: nltk (==3.8.1)
Requires-Dist: numpy (==1.24.3)
Requires-Dist: packaging (==23.1)
Requires-Dist: pexpect (==4.8.0)
Requires-Dist: pkginfo (==1.9.6)
Requires-Dist: platformdirs (==2.6.2)
Requires-Dist: poetry (==1.4.2)
Requires-Dist: poetry-core (==1.5.2)
Requires-Dist: poetry-plugin-export (==1.3.1)
Requires-Dist: ptyprocess (==0.7.0)
Requires-Dist: pyproject_hooks (==1.0.0)
Requires-Dist: pyrsistent (==0.19.3)
Requires-Dist: pywin32-ctypes (==0.2.0)
Requires-Dist: rapidfuzz (==2.15.1)
Requires-Dist: readme-renderer (==37.3)
Requires-Dist: regex (==2023.3.23)
Requires-Dist: requests (==2.29.0)
Requires-Dist: requests-toolbelt (==0.10.1)
Requires-Dist: rfc3986 (==2.0.0)
Requires-Dist: rich (==13.3.5)
Requires-Dist: shellingham (==1.5.0.post1)
Requires-Dist: six (==1.16.0)
Requires-Dist: sympy (==1.11.1)
Requires-Dist: tokenizers (==0.13.3)
Requires-Dist: tomlkit (==0.11.8)
Requires-Dist: torch (==2.0.0)
Requires-Dist: tqdm (==4.65.0)
Requires-Dist: transformers (==4.28.1)
Requires-Dist: trove-classifiers (==2023.4.25)
Requires-Dist: twine (==4.0.2)
Requires-Dist: typing_extensions (==4.5.0)
Requires-Dist: urllib3 (==1.26.15)
Requires-Dist: virtualenv (>20.4.5)
Requires-Dist: webencodings (==0.5.1)
Requires-Dist: zipp (==3.15.0)
Description-Content-Type: text/markdown

Final Year Project on EDU Segmentation:

The goal is to improve EDU segmentation performance using Segbot. Because Segbot has an encoder-decoder architecture, its bidirectional GRU encoder can be replaced with generative pretrained models such as BART and T5. The new model is evaluated on the RST dataset in a few-shot setting (e.g. 100 training examples) rather than with the full dataset.

Segbot: <br>
http://138.197.118.157:8000/segbot/ <br>
https://www.ijcai.org/proceedings/2018/0579.pdf

----
### Authors
Liu Qingyi, Patria Lim

### How to Use
- `from edu_segmentation import download`: call `download.download_models()` to download all models.
- `from edu_segmentation import main`: call `main.run_segbot(user_input, granularity_level="default", model="bart")` to perform EDU segmentation.
- Options:
  - `granularity_level`: `"default"` or `"conjunction words"`
  - `model`: `"bart"`, `"bert_uncased"`, or `"bert_cased"`
  - `device`: `"cuda"` or `"cpu"`
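
The calls above can be combined into a short script. The following is a minimal sketch, not an official example: `check_options` and `segment` are hypothetical helpers introduced here for illustration, the sample sentence is made up, and the return format of `run_segbot` is not documented in this README, so the result is simply printed. The `device` option is left out because the README does not show how it is passed.

```python
import importlib.util

# Option values listed in this README.
GRANULARITY_LEVELS = ("default", "conjunction words")
MODELS = ("bart", "bert_uncased", "bert_cased")


def check_options(granularity_level="default", model="bart"):
    """Hypothetical helper: raise ValueError for values not listed above."""
    if granularity_level not in GRANULARITY_LEVELS:
        raise ValueError(f"granularity_level must be one of {GRANULARITY_LEVELS}")
    if model not in MODELS:
        raise ValueError(f"model must be one of {MODELS}")


def segment(text, granularity_level="default", model="bart"):
    """Hypothetical wrapper around main.run_segbot, called as shown in this README."""
    check_options(granularity_level, model)
    # Imported lazily so the option check works without the package installed.
    from edu_segmentation import download, main

    download.download_models()  # downloads all models, per this README
    return main.run_segbot(text, granularity_level=granularity_level, model=model)


if __name__ == "__main__":
    if importlib.util.find_spec("edu_segmentation") is None:
        print("edu_segmentation is not installed; skipping the demo.")
    else:
        result = segment("The food is good, but the service is bad.")
        print(result)  # output format is not documented in this README
```

Validating the options before importing the package means a mistyped model name fails immediately instead of after the model download.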
