Metadata-Version: 2.2
Name: druhg
Version: 1.7.1
Summary: Universal clustering based on dialectical materialism
Home-page: https://github.com/artamono1/druhg
Maintainer: Pavel Artamonov
Maintainer-email: druhg.p@gmail.com
License: BSD
Keywords: cluster clustering density dialectics
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved
Classifier: Programming Language :: C
Classifier: Programming Language :: Python
Classifier: Topic :: Software Development
Classifier: Topic :: Scientific/Engineering
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX
Classifier: Operating System :: Unix
Classifier: Operating System :: MacOS
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3.8
Requires-Dist: cython>=0.27
Requires-Dist: numpy>=1.24
Requires-Dist: scipy>=1.10
Requires-Dist: scikit-learn>=1.3
Dynamic: classifier
Dynamic: description
Dynamic: home-page
Dynamic: keywords
Dynamic: license
Dynamic: maintainer
Dynamic: maintainer-email
Dynamic: requires-dist
Dynamic: summary

.. image:: https://img.shields.io/pypi/v/druhg.svg
    :target: https://pypi.python.org/pypi/druhg/
    :alt: PyPI Version
.. image:: https://img.shields.io/pypi/l/druhg.svg
    :target: https://github.com/artamono1/druhg/blob/master/LICENSE
    :alt: License

=====
DRUHG
=====

| DRUHG - Dialectical Reflection Universal Hierarchical Grouping (друг).
| Performs clustering based on densities and builds a minimum spanning tree.
| **Does not require parameters.** *(The only input is the space metric, e.g. euclidean.)*
| The user can filter clusters by size with ``size_range``; for the genuine result and genuine outliers, set it to ``[1, 1]``.
| The ``fix_outliers`` parameter assigns outliers to their closest clusters via mstree edges.

-------------
Basic Concept
-------------

| There are some optional tuning parameters, but the algorithm itself requires none and is universal.
| It works by applying **the universal society rule: treat others how you want to be treated**.
| The core of the algorithm is to rank the subject's closest subjective similarities and amalgamate them accordingly.
| The ``max_ranking`` parameter controls the balance between precision and speed; beyond some value, neither the precision nor the result changes.
| todo: the ``algorithm`` parameter can be set to 'slow' to further enhance the precision.
|
|
| The **dialectical distance** reflects the opposite density.
| Max( r/R d(r); d(R) ), where r and R are ranks from A to B and from B to A.
| This orders outliers last and equal densities first.
| It is a great **replacement for DBSCAN** and suits **global outlier detection**.
|
| Those ordered connections become trees. Two trees reflect each other in their totality and can transform into a cluster.
| D N₂ K₁/(K₁+K₂) sum 1/dᵢ > N₁ - 1, where N is size of a tree, K is number of clusters in a tree.
| This allows newly formed clusters to resist reshaping.
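
The dialectical distance above can be illustrated with a small NumPy sketch. This is not the druhg implementation, only one possible reading of the formula. It assumes that ``r`` is the position of B in A's neighbor ordering, that ``R`` is the position of A in B's ordering, that a point occupies position 0 in its own ordering, and that ``d(k)`` is the distance from A to its own k-th neighbor. The function name ``dialectical_distance`` is hypothetical.

.. code:: python

    import numpy as np

    def dialectical_distance(X, a, b):
        """Illustrative reading of Max(r/R * d(r), d(R)), seen from point a."""
        assert a != b, "the two points must differ"
        # Euclidean distances from a and from b to every point
        dist_a = np.linalg.norm(X - X[a], axis=1)
        dist_b = np.linalg.norm(X - X[b], axis=1)
        # Neighbor orderings; with distinct points, position 0 is the point itself
        order_a = np.argsort(dist_a, kind="stable")
        order_b = np.argsort(dist_b, kind="stable")
        r = int(np.where(order_a == b)[0][0])  # rank of b as seen from a
        R = int(np.where(order_b == a)[0][0])  # rank of a as seen from b
        d_r = dist_a[order_a[r]]  # distance from a to b
        # ASSUMPTION: d(R) is the distance from a to its own R-th neighbor
        d_R = dist_a[order_a[R]]
        return max(r / R * d_r, d_R)

    # Three evenly spaced points and one far away point
    X = np.array([[0.0], [1.0], [2.0], [10.0]])
    near = dialectical_distance(X, 0, 1)  # mutual nearest neighbors
    far = dialectical_distance(X, 3, 2)   # the far point and its nearest neighbor

Under these assumptions, the pair with equal mutual ranks keeps its plain distance, while the pair involving the far point receives a larger value, so it would be ordered later.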


----------------
How to use DRUHG
----------------
.. code:: python

             import sklearn.datasets as datasets
             import druhg

             iris = datasets.load_iris()
             XX = iris['data']

             clusterer = druhg.DRUHG(max_ranking=50)
             labels = clusterer.fit(XX).labels_

It will build the tree and label the points. Now you can manipulate clusters by relabeling.

.. code:: python

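             from sklearn.metrics import adjusted_rand_score

             # reuse the fitted clusterer; cap cluster size at half the dataset
             labels = clusterer.relabel(size_range=[1, len(XX)/2], fix_outliers=1)
             ari = adjusted_rand_score(iris['target'], labels)
             print('iris ari', ari)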

It will relabel the clusters, restricting their size.
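
The score printed above is the adjusted Rand index from scikit-learn. The short sketch below, using made-up label arrays rather than druhg output, shows why it suits comparing cluster labels with a reference: the score depends only on how points are grouped, not on the label ids themselves. Treating ``-1`` as an outlier label is an assumption here; the metric sees it as just another group.

.. code:: python

    import numpy as np
    from sklearn.metrics import adjusted_rand_score

    truth = np.array([0, 0, 0, 1, 1, 1])
    # Same grouping as truth, but with different label ids
    renamed = np.array([5, 5, 5, 2, 2, 2])
    # One point pulled out of its group (hypothetical outlier label -1)
    with_outlier = np.array([0, 0, -1, 1, 1, 1])

    same = adjusted_rand_score(truth, renamed)
    lower = adjusted_rand_score(truth, with_outlier)
    print('renamed ari', same)
    print('with outlier ari', lower)

Identical groupings score 1 regardless of the ids used, and any point left out of its reference group lowers the score.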

.. code:: python

            clusterer.plot(labels)

It will draw the mstree with druhg-edges.

.. code:: python

            clusterer.plot()

It will provide interactive sliders for exploration.

.. image:: https://raw.githubusercontent.com/artamono1/druhg/master/docs/source/pics/chameleon-sliders.png
    :width: 300px
    :align: center
    :height: 200px
    :alt: chameleon-sliders

-----------
Performance
-----------
| It can be slow on highly structured data.
| The ``max_ranking`` parameter can be decreased for better performance.

.. image:: https://raw.githubusercontent.com/artamono1/druhg/master/docs/source/pics/comparison_ver.png
    :width: 300px
    :align: center
    :height: 200px
    :alt: comparison

----------
Installing
----------

PyPI install, presuming you have an up-to-date pip:

.. code:: bash

    pip install druhg


-----------------
Running the Tests
-----------------

The package tests can be run after installation using the command:

.. code:: bash

    pytest -k "test_name"


The tests may fail :-D

--------------
Python Version
--------------

The druhg library supports Python 3.


------------
Contributing
------------

We welcome contributions in any form! Assistance with documentation, particularly expanding tutorials,
is always welcome. To contribute, please `fork the project <https://github.com/artamono1/druhg/issues#fork-destination-box>`_,
make your changes and submit a pull request. We will do our best to work through any issues with
you and get your code merged into the main branch.

---------
Licensing
---------

The druhg package is 3-clause BSD licensed.
