Metadata-Version: 2.4
Name: rubicon-ml
Version: 0.13.1
Summary: AI/ML lifecycle metadata logger with configurable backends
Author: Ryan Soley, Capital One
License: Apache License, Version 2.0
Project-URL: Bug Tracker, https://github.com/capitalone/rubicon-ml/issues
Project-URL: Documentation, https://capitalone.github.io/rubicon-ml/
Project-URL: Source Code, https://github.com/capitalone/rubicon-ml
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Software Development :: Build Tools
Classifier: Topic :: Software Development :: Documentation
Requires-Python: >=3.10.0
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: click<=8.3.0,>=7.1
Requires-Dist: fsspec<=2025.10.0,>=2021.4.0
Requires-Dist: intake<=2.0.8,>=0.5.2
Requires-Dist: jsonpath-ng<=1.7.0,>=1.5.3
Requires-Dist: numpy<=2.3.4,>=1.22.0
Requires-Dist: pandas<=2.3.3,>=1.0.0
Requires-Dist: pyarrow<=22.0.0,>=14.0.1
Requires-Dist: pyyaml<=6.0.3,>=5.4.0
Requires-Dist: scikit-learn<=1.7.2,>=0.22.0
Provides-Extra: all
Requires-Dist: rubicon-ml[s3,viz]; extra == "all"
Provides-Extra: build
Requires-Dist: build; extra == "build"
Requires-Dist: setuptools; extra == "build"
Requires-Dist: twine; extra == "build"
Requires-Dist: wheel; extra == "build"
Provides-Extra: dev
Requires-Dist: rubicon-ml[build,docs,ops,s3,test,viz]; extra == "dev"
Provides-Extra: docs
Requires-Dist: furo; extra == "docs"
Requires-Dist: ipython; extra == "docs"
Requires-Dist: nbsphinx; extra == "docs"
Requires-Dist: numpydoc; extra == "docs"
Requires-Dist: pandoc; extra == "docs"
Requires-Dist: rubicon-ml[viz]; extra == "docs"
Requires-Dist: sphinx; extra == "docs"
Requires-Dist: sphinx-copybutton; extra == "docs"
Provides-Extra: ops
Requires-Dist: bumpver; extra == "ops"
Requires-Dist: edgetest; extra == "ops"
Requires-Dist: pre-commit; extra == "ops"
Requires-Dist: pyproject-fmt; extra == "ops"
Requires-Dist: ruff; extra == "ops"
Provides-Extra: s3
Requires-Dist: s3fs<=2025.10.0,>=0.4; extra == "s3"
Provides-Extra: test
Requires-Dist: dask[dataframe,distributed]<2025.4.0; extra == "test"
Requires-Dist: h2o; extra == "test"
Requires-Dist: ipykernel; extra == "test"
Requires-Dist: jupyterlab; extra == "test"
Requires-Dist: kaleido==0.2.1; extra == "test"
Requires-Dist: lightgbm; extra == "test"
Requires-Dist: nbconvert; extra == "test"
Requires-Dist: palmerpenguins; extra == "test"
Requires-Dist: pillow; extra == "test"
Requires-Dist: polars<1.0; extra == "test"
Requires-Dist: pytest; extra == "test"
Requires-Dist: pytest-cov; extra == "test"
Requires-Dist: xgboost; extra == "test"
Provides-Extra: ui
Requires-Dist: rubicon-ml[viz]; extra == "ui"
Provides-Extra: viz
Requires-Dist: dash<=2.18.2,>=2.11.0; extra == "viz"
Requires-Dist: dash-bootstrap-components<=1.7.1,>=1.0.0; extra == "viz"
Dynamic: license-file

# rubicon-ml

[![Test Package](https://github.com/capitalone/rubicon-ml/actions/workflows/test-package.yml/badge.svg)](https://github.com/capitalone/rubicon-ml/actions/workflows/test-package.yml)
[![Publish Package](https://github.com/capitalone/rubicon-ml/actions/workflows/publish-package.yml/badge.svg)](https://github.com/capitalone/rubicon-ml/actions/workflows/publish-package.yml)
[![Publish Docs](https://github.com/capitalone/rubicon-ml/actions/workflows/publish-docs.yml/badge.svg)](https://github.com/capitalone/rubicon-ml/actions/workflows/publish-docs.yml)
[![edgetest](https://github.com/capitalone/rubicon-ml/actions/workflows/edgetest.yml/badge.svg)](https://github.com/capitalone/rubicon-ml/actions/workflows/edgetest.yml)

[![Conda Version](https://img.shields.io/conda/vn/conda-forge/rubicon-ml.svg)](https://anaconda.org/conda-forge/rubicon-ml)
[![PyPi Version](https://img.shields.io/pypi/v/rubicon_ml.svg)](https://pypi.org/project/rubicon-ml/)
[![ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
[![uv](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/uv/main/assets/badge/v0.json)](https://github.com/astral-sh/uv)
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/capitalone/rubicon-ml/main?labpath=binder%2Fwelcome.ipynb)

## Purpose

rubicon-ml is a data science tool that captures and stores model training and
execution information, like parameters and outcomes, in a repeatable and
searchable way. Its `git` integration associates these inputs and outputs
directly with the model code that produced them to ensure full auditability and
reproducibility for developers and stakeholders alike. During experimentation,
the dashboard makes it easy to explore, filter, visualize, and share
recorded work.

---

## Components

rubicon-ml is composed of three parts:

* A Python library for storing and retrieving model inputs, outputs, and
  analyses on filesystems, powered by
  [`fsspec`](https://filesystem-spec.readthedocs.io/en/latest/?badge=latest)
* A dashboard for exploring, comparing, and visualizing logged data, built with
  [`dash`](https://dash.plotly.com/)
* A process for sharing a selected subset of logged data with collaborators
  or reviewers that leverages [`intake`](https://intake.readthedocs.io/en/latest/)

## Workflow

Use `rubicon_ml` to capture model inputs and outputs over time. It can be
easily integrated into existing Python models or pipelines and supports both
concurrent logging (so multiple experiments can be logged in parallel) and
asynchronous communication with S3 (so network reads and writes won’t block).

Meanwhile, periodically review the logged data within the Rubicon dashboard to
steer the model tweaking process in the right direction. The dashboard lets you
quickly spot trends by exploring and filtering your logged results and
visualizes how the model inputs impacted the model outputs.

When the model is ready for review, Rubicon makes it easy to share specific
subsets of the data with model reviewers and stakeholders, giving them the
context necessary for a complete model review and approval.

## Use

Check out the [interactive notebooks in this Binder](https://mybinder.org/v2/gh/capitalone/rubicon-ml/main?labpath=binder%2Fwelcome.ipynb)
to try `rubicon_ml` for yourself.

Here's a simple example:

```python
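from rubicon_ml import Rubicon
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# `auto_git_enabled=True` expects this code to run inside a git repository
rubicon = Rubicon(
    persistence="filesystem", root_dir="/rubicon-root", auto_git_enabled=True
)

project = rubicon.create_project(
    "Hello World", description="Using rubicon to track model results over time."
)

experiment = project.log_experiment(
    # training metadata is passed as a list of tuples describing the data
    training_metadata=[("sklearn.datasets", "my-data-set")],
    model_name="My Model Name",
    tags=["my_model_name"],
)

# illustrative data and model; substitute your own
X, y = load_wine(return_X_y=True)
n_estimators, n_features, random_state = 100, X.shape[1], 42
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=random_state)
rfc = RandomForestClassifier(n_estimators=n_estimators, random_state=random_state)
rfc.fit(X_train, y_train)

experiment.log_parameter("n_estimators", n_estimators)
experiment.log_parameter("n_features", n_features)
experiment.log_parameter("random_state", random_state)

accuracy = rfc.score(X_test, y_test)
experiment.log_metric("accuracy", accuracy)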
```

Then explore the project by running the dashboard:

```
rubicon_ml ui --root-dir /rubicon-root
```

## Documentation

For a full overview, visit the [docs](https://capitalone.github.io/rubicon-ml/). If
you have suggestions or find a bug, [please open an
issue](https://github.com/capitalone/rubicon-ml/issues/new/choose).

## Install

The Python library is available on conda-forge via `conda` and on PyPI via `pip`.

```
conda config --add channels conda-forge
conda install rubicon-ml
```

or

```
pip install rubicon-ml
```

## Develop

To contribute, check out our
[developer guide](https://capitalone.github.io/rubicon-ml/developer-guide.html)
for the latest instructions on setting up your local developer environment.

## Tests

The tests are separated into unit and integration tests. Run either suite
directly in the `uv` environment via `uv run pytest tests/unit` or
`uv run pytest tests/integration`, or run `uv run pytest` to execute all of them.

**Note**: some integration tests are intentionally marked with `pytest` markers
to control when they are run (i.e. not during CI/CD). These tests include:

* Integration tests that write to physical filesystems - local and S3. Local
  files will be written to `./test-rubicon` relative to where the tests are run.
  An S3 path must also be provided to run these tests. By default, these
  tests are disabled. To enable them, run:

    ```
    uv run pytest -m "write_files" --s3-path "s3://my-bucket/my-key"
    ```

* Integration tests that run Jupyter notebooks. These tests are a bit slower
  than the rest of the tests in the suite as they need to launch Jupyter servers.
  By default, they are enabled. To disable them, run:

    ```
    uv run pytest -m "not run_notebooks and not write_files"
    ```

    **Note**: `uv run pytest` applies `-m "not write_files"` by default, so the
    marker expression must also include it when disabling notebook tests.

## Style

We use `ruff` for linting and formatting. To check and update the code style, run:

```
uv run ruff check --fix
uv run ruff format
```

Install and configure pre-commit to automatically run `ruff` during commits:

* [install pre-commit](https://pre-commit.com/#installation)
* run `uv run pre-commit install` to set up the git hook scripts

Now `pre-commit` will run automatically on `git commit` and ensure a consistent
code format throughout the project. You can format without committing via
`uv run pre-commit run --all-files` or skip these checks with
`git commit --no-verify`.

---

If you're looking for Rubicon, the Java & Objective-C to Python bridge, visit
[here](https://pypi.org/project/rubicon/).
