Metadata-Version: 2.3
Name: polars-splitters
Version: 0.2.3rc1
Summary: Polars-based splitter functionalities for polars LazyFrames and DataFrames, similar to `sklearn.model_selection.train_test_split` and `sklearn.model_selection.StratifiedKFold`.
Author-email: "Jonas M. Miguel" <charter-shushes0n@icloud.com>
Requires-Python: <3.12,>=3.10
Requires-Dist: loguru>=0.7.2
Requires-Dist: polars<=1.6.0,>=1.1.0
Provides-Extra: test
Requires-Dist: pytest-check==2.3.1; extra == 'test'
Requires-Dist: pytest==8.2.2; extra == 'test'
Description-Content-Type: text/markdown

# polars-splitters

Polars-based splitting utilities for Polars LazyFrames and DataFrames, similar to `sklearn.model_selection.train_test_split` and `sklearn.model_selection.StratifiedKFold`.

## features

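- `split_into_train_eval`: splits a frame into a train set and an eval set (the counterpart of `train_test_split`)
- `split_into_k_folds`: splits a frame into k folds (the counterpart of `StratifiedKFold`)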

## installation

```bash
pip install polars-splitters
```

## usage

```python
import polars as pl
from polars_splitters import split_into_train_eval, split_into_k_folds

df = pl.DataFrame(
    {
        "feature_1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
        "treatment": [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1],
        "outcome": [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
    }
)

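# Hold out roughly 30% of the rows for evaluation, stratified on the
# combinations of "treatment" and "outcome".
df_train, df_eval = split_into_train_eval(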
    df,
    eval_rel_size=0.3,
    stratify_by=["treatment", "outcome"],
    shuffle=True,
    validate=True,
    as_lazy=False,
    rel_size_deviation_tolerance=0.1,
)

folds = split_into_k_folds(
    df,
    k=3,
    stratify_by=["treatment", "outcome"],
    shuffle=False,
    as_lazy=False,
)
```
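
To make the idea of stratification concrete, here is a minimal conceptual sketch in plain Python. It is not how polars-splitters is implemented and does not use Polars at all: `stratified_split` and its `seed` parameter are hypothetical names introduced only for illustration. It groups rows by the values of the stratification columns and holds out roughly the same fraction from each group.

```python
import random
from collections import defaultdict


def stratified_split(rows, stratify_by, eval_rel_size, seed=0):
    """Split a list of dict rows into (train, eval), one stratum at a time.

    Hypothetical helper for illustration only; not part of polars-splitters.
    """
    rng = random.Random(seed)

    # Group rows by the tuple of values in the stratification columns.
    strata = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in stratify_by)
        strata[key].append(row)

    train, eval_rows = [], []
    for members in strata.values():
        members = members[:]  # copy so the caller's rows are not reordered
        rng.shuffle(members)
        # Hold out about eval_rel_size of each stratum.
        n_eval = round(len(members) * eval_rel_size)
        eval_rows.extend(members[:n_eval])
        train.extend(members[n_eval:])
    return train, eval_rows


# Same toy data as in the usage example above, as a list of dicts.
rows = [
    {"feature_1": f, "treatment": t, "outcome": o}
    for f, t, o in zip(
        range(1, 12),
        [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1],
        [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
    )
]

train, eval_rows = stratified_split(
    rows, stratify_by=["treatment", "outcome"], eval_rel_size=0.3
)
print(len(train), len(eval_rows))  # prints "8 3"
```

With 11 rows in strata of sizes 4, 3 and 4, each stratum contributes one row to the eval set, so the realized eval share is 3/11 (about 0.27) rather than exactly 0.3. On small frames the requested and realized sizes can differ in this way, which is presumably the kind of gap that `rel_size_deviation_tolerance` is meant to bound.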
