Metadata-Version: 2.1
Name: dask-lightgbm
Version: 0.2.0
Summary: LightGBM distributed training on Dask
Home-page: https://github.com/dask/dask-lightgbm
License: BSD-3-Clause
Author: Jan Stiborek
Author-email: honza.stiborek@gmail.com
Requires-Python: >=3.6,<4.0
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: System :: Distributed Computing
Provides-Extra: sparse
Requires-Dist: dask (>=2.6.0,<3.0.0)
Requires-Dist: distributed (>=2.6.0,<3.0.0)
Requires-Dist: lightgbm (>=2.3.0,<3.0.0)
Requires-Dist: numpy (>=1.17.3,<2.0.0)
Requires-Dist: scipy (>=1.3.1,<2.0.0); extra == "sparse"
Requires-Dist: sparse (==0.5.0); extra == "sparse"
Requires-Dist: toolz (>=0.10.0,<0.11.0)
Project-URL: Repository, https://github.com/dask/dask-lightgbm
Description-Content-Type: text/markdown

Dask-LightGBM - DEPRECATED
==========================

THIS REPOSITORY IS DEPRECATED
-----------------------------

This repository is deprecated and no longer maintained. The code was migrated into the LightGBM package: <https://github.com/microsoft/LightGBM>

[![Build Status](https://github.com/dask/dask-lightgbm/workflows/CI/badge.svg)](https://github.com/dask/dask-lightgbm/actions?query=workflow%3ACI)

Distributed training with LightGBM and Dask.distributed

This repository enables you to perform distributed training with LightGBM on
Dask.Array and Dask.DataFrame collections. It is based on the dask-xgboost package.

Usage
-----
1. Load your data into a distributed data structure, either a Dask.Array or a Dask.DataFrame.
2. Connect to a Dask cluster using Dask.distributed.Client.
3. Let dask-lightgbm train a model or make predictions for you.

See the system tests for sample code:
<https://github.com/dask/dask-lightgbm/blob/main/system_tests/test_fit_predict.py>

How this works
--------------
Dask is used mainly for accessing the cluster and managing data.
The library ensures that the features and the label for each sample are located on the same worker.
It also lets each worker know the addresses and available ports of all other workers.
The distributed training itself is performed by the LightGBM library using sockets.
For more details on distributed training in LightGBM, see:
<https://github.com/microsoft/LightGBM/blob/main/docs/Parallel-Learning-Guide.rst>
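To illustrate how each worker can learn the addresses and ports of its peers, here is a minimal sketch that turns a list of Dask worker addresses into LightGBM network parameters. `machines`, `local_listen_port`, `num_machines`, and `time_out` are LightGBM parameters. The helper name `build_network_params`, the example addresses, and the scheme of assigning consecutive ports starting at a base port are assumptions made for illustration, not necessarily what this library does internally.

```python
from urllib.parse import urlparse


def build_network_params(worker_addresses, local_address, base_port=12400, time_out=120):
    """Build LightGBM network parameters for one worker (illustrative sketch)."""
    # Assumption: each worker gets its own listen port, counting up from base_port.
    ports = {addr: base_port + i for i, addr in enumerate(worker_addresses)}
    # LightGBM expects `machines` as a comma-separated list of ip:port pairs.
    machines = ",".join(
        f"{urlparse(addr).hostname}:{port}" for addr, port in ports.items()
    )
    return {
        "machines": machines,
        "local_listen_port": ports[local_address],
        "num_machines": len(ports),
        "time_out": time_out,
    }


# Hypothetical worker addresses in the form Dask reports them.
workers = ["tcp://10.0.0.1:40001", "tcp://10.0.0.2:40002"]
params = build_network_params(workers, local_address="tcp://10.0.0.2:40002")
print(params)
```

Every worker receives the same `machines` list but its own `local_listen_port`, which is what lets the LightGBM processes connect to one another over sockets.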

