Metadata-Version: 2.4
Name: mcmst_clust
Version: 1.0.1
Summary: Multi-Center Minimum Spanning Tree Clustering algorithm
Home-page: https://github.com/senolali/MCMSTClustering
Author: Ali Şenol
Author-email: Ali Şenol <alisenol@tarsus.edu.tr>
Project-URL: Homepage, https://github.com/senolali/MCMSTClustering
Project-URL: Documentation, https://github.com/senolali/MCMSTClustering
Requires-Python: >=3.7
Description-Content-Type: text/markdown
Requires-Dist: numpy
Requires-Dist: scipy
Requires-Dist: networkx
Dynamic: author
Dynamic: home-page
Dynamic: requires-python

# Motivation

MCMSTClustering is a clustering algorithm based on the minimum spanning tree (MST).  
It uses MST distances and optional DBSCAN to detect clusters in high-dimensional data.

## Installation

```bash
pip install mcmst_clust
```

## Usage

```python
from mcmst_clust import MCMSTClustering, normalize
import numpy as np

# Generate random data
X = np.random.rand(10, 2)
X = normalize(X)

# Initialize and fit the clustering model
model = MCMSTClustering(min_samples=2)
model.fit(X)

# Predict cluster labels
labels = model.predict(X)
print(labels)

```

## Overview

MCMSTClustering (Defining Non-Spherical Clusters by using Minimum Spanning Tree over KD-Tree-based Micro-Clusters) is designed to overcome limitations of conventional clustering algorithms when handling:

- High-dimensional data
- Imbalanced datasets
- Clusters with varying densities
- Noisy data and outliers
- Arbitrary-shaped clusters

The algorithm consists of three main steps:

1. Micro-cluster Formation: Defines micro-clusters using a KD-Tree data structure with range search.
2. Macro-cluster Construction: Builds a minimum spanning tree (MST) over the micro-clusters to form macro-clusters.
3. Cluster Regulation: Refines the clusters to improve accuracy and overall clustering quality.
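
The three steps above can be illustrated with a small sketch built on NumPy and SciPy. This is not the package's implementation: the function name `sketch_mst_over_micro_clusters` and the parameters `radius`, `min_points`, and `cut` are hypothetical, the micro-clusters are formed greedily, and the regulation step is replaced by a simple stand-in that attaches leftover points to a nearby micro-cluster.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components, minimum_spanning_tree
from scipy.spatial import cKDTree
from scipy.spatial.distance import cdist


def sketch_mst_over_micro_clusters(X, radius=0.1, min_points=3, cut=0.25):
    """Illustrative sketch only; names and parameters are assumptions."""
    X = np.asarray(X, dtype=float)
    tree = cKDTree(X)
    micro = np.full(len(X), -1)  # micro-cluster id per point, -1 = unassigned
    centers = []

    # Step 1: greedy micro-clusters from KD-tree range searches.
    for i in range(len(X)):
        if micro[i] != -1:
            continue
        free = [j for j in tree.query_ball_point(X[i], r=radius) if micro[j] == -1]
        if len(free) >= min_points:
            micro[free] = len(centers)
            centers.append(X[free].mean(axis=0))

    labels = np.full(len(X), -1)  # final labels, -1 = noise
    if not centers:
        return labels
    centers = np.asarray(centers)

    # Step 2: MST over micro-cluster centers. Edges longer than `cut` are
    # removed, and the remaining connected components are the macro-clusters.
    # (Zero entries are treated as missing edges by scipy.sparse.csgraph.)
    mst = minimum_spanning_tree(csr_matrix(cdist(centers, centers))).toarray()
    mst[mst > cut] = 0.0
    _, macro = connected_components(csr_matrix(mst), directed=False)

    assigned = micro >= 0
    labels[assigned] = macro[micro[assigned]]

    # Step 3 (stand-in for cluster regulation): attach each leftover point to
    # the macro-cluster of its nearest center if that center is within `radius`.
    leftover = np.flatnonzero(~assigned)
    if leftover.size:
        dist, nearest = cKDTree(centers).query(X[leftover])
        close = dist <= radius
        labels[leftover[close]] = macro[nearest[close]]
    return labels
```

Calling the function on two well-separated groups of points should yield two distinct non-negative labels, with `-1` reserved for points that were never absorbed into a cluster.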
	

Extensive experiments against state-of-the-art algorithms show that MCMSTClustering achieves high-quality clustering results with acceptable runtime.

## Key Features

- Clusters datasets with high quality
- Detects arbitrary-shaped clusters
- Robust against outliers and noisy data
- Handles clusters with varying densities
- Efficient on imbalanced datasets


## Cite

If you use this code in your work, please cite the following paper:

```text
Şenol, A. MCMSTClustering: defining non-spherical clusters by using minimum 
spanning tree over KD-tree-based micro-clusters. Neural Comput & Applic 35, 
13239–13259 (2023). https://doi.org/10.1007/s00521-023-08386-3
```

## BibTeX

```bibtex
@article{csenol2023mcmstclustering,
  title={MCMSTClustering: defining non-spherical clusters by using minimum spanning tree over KD-tree-based micro-clusters},
  author={{\c{S}}enol, Ali},
  journal={Neural Computing and Applications},
  volume={35},
  number={18},
  pages={13239--13259},
  year={2023},
  publisher={Springer}
}
```
