Metadata-Version: 2.1
Name: nb-workflows
Version: 0.6.0
Summary: Schedule parameterized notebooks programmatically using cli or a REST API
Home-page: https://github.com/nuxion/nb_workflows
License: Apache-2.0
Keywords: papermill,jupyter,workflows,data
Author: nuxion
Author-email: nuxion@gmail.com
Requires-Python: >=3.8,<3.10
Classifier: Environment :: Console
Classifier: Environment :: Web Environment
Classifier: Framework :: Jupyter :: JupyterLab
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Provides-Extra: fsspec
Provides-Extra: server
Requires-Dist: Jinja2 (>=3.0.3,<4.0.0)
Requires-Dist: PyJWT (>=2.1.0,<2.2.0)
Requires-Dist: PyYAML (>=6.0,<7.0)
Requires-Dist: SQLAlchemy-serializer (>=1.4.1,<2.0.0); extra == "server"
Requires-Dist: SQLAlchemy[asyncio] (>=1.4.26,<2.0.0); extra == "server"
Requires-Dist: aiofiles (>=0.8.0,<0.9.0)
Requires-Dist: aioredis[hiredis] (>=2.0.1,<3.0.0)
Requires-Dist: alembic (>=1.6.5,<2.0.0); extra == "server"
Requires-Dist: asyncpg (>=0.24.0,<0.25.0); extra == "server"
Requires-Dist: click (>=8.0.1,<9.0.0)
Requires-Dist: cloudpickle (>=2.0.0,<3.0.0)
Requires-Dist: cryptography (>=36.0.1,<37.0.0)
Requires-Dist: dateparser (>=1.1.0,<2.0.0)
Requires-Dist: docker (>=5.0.3,<6.0.0)
Requires-Dist: fsspec (>=2022.2.0,<2023.0.0); extra == "fsspec"
Requires-Dist: httpx (>=0.22.0,<0.23.0)
Requires-Dist: ipykernel (>=6.9.1,<7.0.0)
Requires-Dist: ipython (>=8.1.1,<9.0.0)
Requires-Dist: jupytext (>=1.13.0,<2.0.0)
Requires-Dist: loky (>=3.0.0,<4.0.0)
Requires-Dist: nanoid (>=2.0.0,<3.0.0)
Requires-Dist: nbconvert (>=6.2.0,<7.0.0)
Requires-Dist: papermill (>=2.3.3,<3.0.0)
Requires-Dist: psycopg2-binary (>=2.9.1,<3.0.0); extra == "server"
Requires-Dist: pydantic (>=1.9.0,<2.0.0)
Requires-Dist: pytz (>=2021.1,<2022.0)
Requires-Dist: rq (>=1.10.0,<2.0.0)
Requires-Dist: rq-scheduler (>=0.11.0,<0.12.0); extra == "server"
Requires-Dist: sanic (>=21.6.2,<22.0.0); extra == "server"
Requires-Dist: sanic-ext (>=21.9.0,<22.0.0); extra == "server"
Requires-Dist: sanic-jwt (>=1.7.0,<2.0.0); extra == "server"
Requires-Dist: sanic-openapi (>=21.6.1,<22.0.0); extra == "server"
Requires-Dist: tqdm (>=4.62.3,<5.0.0)
Requires-Dist: uvloop (>=0.16.0,<0.17.0); extra == "server"
Project-URL: Documentation, https://nb-workflows.readthedocs.io/
Project-URL: Repository, https://github.com/nuxion/nb_workflows
Description-Content-Type: text/markdown

# NB Workflows

[![nb-workflows](https://github.com/nuxion/nb_workflows/actions/workflows/main.yaml/badge.svg)](https://github.com/nuxion/nb_workflows/actions/workflows/main.yaml)
![readthedocs](https://readthedocs.org/projects/nb_workflows/badge/?version=latest)
![PyPI - Format](https://img.shields.io/pypi/format/nb_workflows)
![PyPI - Status](https://img.shields.io/pypi/status/nb_workflows)

[![codecov](https://codecov.io/gh/nuxion/nb_workflows/branch/main/graph/badge.svg?token=F025Y1BF9U)](https://codecov.io/gh/nuxion/nb_workflows)


## Description 

If SQL is a lingua franca for querying data, Jupyter should be a lingua franca for data exploration, model training, and other complex or unique data tasks.

NB Workflows is a library and a platform that allows you to run parameterized notebooks in a distributed way.
A notebook can be launched remotely on demand, or scheduled at fixed intervals or with cron syntax.

Internally it uses [Sanic](https://sanicframework.org) as the web server, [papermill](https://papermill.readthedocs.io/en/latest/) as the notebook executor, and [RQ](https://python-rq.org/)
for task distribution and coordination.
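
To make the division of labor concrete, here is a minimal sketch of how a parameterized notebook run could be executed with papermill and handed to an RQ worker. It is not the nb-workflows API: `NotebookTask`, `papermill_kwargs`, `run_notebook`, and `enqueue_notebook` are hypothetical names introduced for illustration, and the sketch assumes a Redis server reachable on localhost with default settings.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class NotebookTask:
    """Hypothetical description of one run; not part of the nb-workflows API."""

    input_path: str
    output_path: str
    parameters: Dict[str, Any] = field(default_factory=dict)


def papermill_kwargs(task: NotebookTask) -> Dict[str, Any]:
    """Build the keyword arguments passed to papermill.execute_notebook."""
    return {
        "input_path": task.input_path,
        "output_path": task.output_path,
        "parameters": dict(task.parameters),
    }


def run_notebook(task: NotebookTask) -> None:
    """Execute the notebook in the current process with papermill."""
    import papermill as pm  # imported lazily; requires papermill to be installed

    pm.execute_notebook(**papermill_kwargs(task))


def enqueue_notebook(task: NotebookTask, queue_name: str = "default"):
    """Hand the run to an RQ worker instead of executing it locally."""
    from redis import Redis  # assumption: Redis on localhost, default port
    from rq import Queue

    queue = Queue(queue_name, connection=Redis())
    return queue.enqueue(run_notebook, task)
```

With this sketch, a call such as `enqueue_notebook(NotebookTask("train.ipynb", "train.out.ipynb", {"learning_rate": 0.01}))` would queue the run, where the notebook paths and the parameter are placeholders. For an RQ worker to pick up the job, `run_notebook` and `NotebookTask` must live in a module the worker can import.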

### Goal

Empower the different data roles in a project to put code into production, and reduce the time required to do so. NB Workflows lets people go from a data exploration session to an entire pipeline deployed in production, iterating on the same notebook file made by a data scientist, analyst, or anyone else working with data.

### Features

- Define a notebook like a function, and execute it on demand
- Automatic Dockerfile generation. A project should share a unique environment
- Docker building and versioning: it builds and tracks each release
- Execution history and notifications to Slack or Discord

### Roadmap

See [Roadmap](/ROADMAP.md) *draft*

## Architecture

![nb_workflows architecture](/docs/platform-workflows.jpg)



## References & inspirations
- [Notebook Innovation - Netflix](https://netflixtechblog.com/notebook-innovation-591ee3221233)
- [Tensorflow metastore](https://www.tensorflow.org/tfx/guide/mlmd)
- [Maintainable and collaborative pipelines](https://blog.jupyter.org/ploomber-maintainable-and-collaborative-pipelines-in-jupyter-acb3ad2101a7)



