Metadata-Version: 2.1
Name: copulas
Version: 0.3.1
Summary: A Python library for building different types of copulas and using them for sampling.
Home-page: https://github.com/sdv-dev/Copulas
Author: MIT Data To AI Lab
Author-email: dailabmit@gmail.com
License: MIT license
Description: <p align="left">
        <img width=20% src="https://dai.lids.mit.edu/wp-content/uploads/2018/06/Logo_DAI_highres.png" alt="sdv-dev" />
        <i>An open source project from Data to AI Lab at MIT.</i>
        </p>
        
        [![Development Status](https://img.shields.io/badge/Development%20Status-2%20--%20Pre--Alpha-yellow)](https://pypi.org/search/?c=Development+Status+%3A%3A+2+-+Pre-Alpha)
        [![PyPi Shield](https://img.shields.io/pypi/v/copulas.svg)](https://pypi.python.org/pypi/copulas)
        [![Travis CI Shield](https://travis-ci.org/sdv-dev/Copulas.svg?branch=master)](https://travis-ci.org/sdv-dev/Copulas)
        [![Coverage Status](https://codecov.io/gh/sdv-dev/Copulas/branch/master/graph/badge.svg)](https://codecov.io/gh/sdv-dev/Copulas)
        [![Downloads](https://pepy.tech/badge/copulas)](https://pepy.tech/project/copulas)
        
        # Copulas
        
        * License: [MIT](https://github.com/sdv-dev/Copulas/blob/master/LICENSE)
        * Development Status: [Pre-Alpha](https://pypi.org/search/?c=Development+Status+%3A%3A+2+-+Pre-Alpha)
        * Documentation: https://sdv-dev.github.io/Copulas
        * Homepage: https://github.com/sdv-dev/Copulas
        
        # Overview
        
        **Copulas** is a Python library for modeling multivariate distributions and sampling from them
        using [copula functions](https://en.wikipedia.org/wiki/Copula_%28probability_theory%29).
        Given a table containing numerical data, we can use Copulas to learn the distribution and
        later on generate new synthetic rows following the same statistical properties.
        
        Some of the features provided by this library include:
        
        * A variety of distributions for modeling univariate data.
        * Multiple Archimedean copulas for modeling bivariate data.
        * Gaussian and Vine copulas for modeling multivariate data.
        * Automatic selection of univariate distributions and bivariate copulas.
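        
        As an illustration of what automatic selection of a univariate distribution can look like,
        the sketch below fits a few candidate distributions to a column and keeps the one with the
        smallest Kolmogorov-Smirnov statistic. It uses only the Python standard library. The helper
        names (`ks_statistic`, `select_distribution`, `CANDIDATES`), the set of candidates and the
        selection criterion are assumptions made for this example; this is not the **Copulas** API
        and not necessarily the rule that the library applies.
        
        ```python3
        import math
        import random
        from statistics import NormalDist, fmean, stdev
        
        
        def ks_statistic(sample, cdf):
            """Largest gap between the empirical CDF of `sample` and a candidate `cdf`."""
            ordered = sorted(sample)
            n = len(ordered)
            gap = 0.0
            for i, x in enumerate(ordered):
                model = cdf(x)
                # The empirical CDF jumps from i / n to (i + 1) / n at x; check both sides.
                gap = max(gap, abs(model - i / n), abs((i + 1) / n - model))
            return gap
        
        
        # Each hypothetical fitter estimates its parameters from the sample and returns a CDF.
        # All of them assume that the sample is not constant.
        def fit_gaussian(sample):
            return NormalDist(fmean(sample), stdev(sample)).cdf
        
        
        def fit_uniform(sample):
            low, high = min(sample), max(sample)
            return lambda x: min(1.0, max(0.0, (x - low) / (high - low)))
        
        
        def fit_shifted_exponential(sample):
            low = min(sample)
            rate = 1.0 / (fmean(sample) - low)
            return lambda x: 1.0 - math.exp(-rate * (x - low)) if x > low else 0.0
        
        
        CANDIDATES = {
            'gaussian': fit_gaussian,
            'uniform': fit_uniform,
            'shifted_exponential': fit_shifted_exponential,
        }
        
        
        def select_distribution(sample):
            """Return the best candidate name and the KS statistic of every candidate."""
            scores = {name: ks_statistic(sample, fit(sample)) for name, fit in CANDIDATES.items()}
            best = min(scores, key=scores.get)
            return best, scores
        
        
        rng = random.Random(0)
        column = [rng.gauss(10.0, 2.0) for _ in range(500)]
        best, scores = select_distribution(column)
        print(best, {name: round(score, 3) for name, score in scores.items()})
        ```
        
        A smaller statistic means that the fitted CDF stays closer to the empirical CDF of the
        column, so the candidate with the lowest score is kept.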
        
        ## Supported Distributions
        
        ### Univariate
        
        * Gaussian
        * Student T
        * Beta
        * Gamma
        * Gaussian KDE
        * Truncated Gaussian
        
        ### Archimedean Copulas (Bivariate)
        
        * Clayton
        * Frank
        * Gumbel
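        
        To give a rough idea of how a bivariate Archimedean copula couples two variables, the sketch
        below draws pairs from a Clayton copula by inverting its conditional distribution, and then
        measures the resulting dependence with Kendall's tau. It relies only on the standard
        library. `sample_clayton` and `kendall_tau` are hypothetical helpers written for this
        example, not part of the **Copulas** API, and the sketch only covers `theta > 0`.
        
        ```python3
        import random
        
        
        def sample_clayton(theta, size, seed=None):
            """Draw `size` pairs (u, v) on the unit square from a Clayton copula."""
            if theta <= 0:
                raise ValueError('this sketch only covers theta > 0')
            rng = random.Random(seed)
            pairs = []
            for _ in range(size):
                # 1 - random() lies in (0, 1], which avoids raising 0 to a negative power.
                u = 1.0 - rng.random()
                w = 1.0 - rng.random()
                # Solve C(v | u) = w for v, where C(v | u) is the conditional CDF of v given u.
                inner = u ** (-theta) * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0
                pairs.append((u, inner ** (-1.0 / theta)))
            return pairs
        
        
        def kendall_tau(pairs):
            """Kendall's tau: (concordant - discordant) pairs over all pairs of rows."""
            concordant = discordant = 0
            for i in range(len(pairs)):
                for j in range(i + 1, len(pairs)):
                    sign = (pairs[i][0] - pairs[j][0]) * (pairs[i][1] - pairs[j][1])
                    if sign > 0:
                        concordant += 1
                    elif sign < 0:
                        discordant += 1
            return (concordant - discordant) / (len(pairs) * (len(pairs) - 1) / 2)
        
        
        pairs = sample_clayton(theta=2.0, size=500, seed=0)
        print(round(kendall_tau(pairs), 3))
        ```
        
        For a Clayton copula, Kendall's tau equals `theta / (theta + 2)`, so with `theta=2.0` the
        printed estimate should land near 0.5, up to sampling noise.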
        
        ### Multivariate
        
        * Gaussian
        * D-Vine
        * C-Vine
        * R-Vine
        
        # Install
        
        ## Requirements
        
        **Copulas** has been developed and tested on [Python 3.5, 3.6 and 3.7](https://www.python.org/downloads/).
        
        Also, although it is not strictly required, the usage of a [virtualenv](https://virtualenv.pypa.io/en/latest/)
        is highly recommended in order to avoid interfering with other software installed in the system where **Copulas**
        is run.
        
        ## Install with pip
        
        The easiest and recommended way to install **Copulas** is using [pip](https://pip.pypa.io/en/stable/):
        
        ```bash
        pip install copulas
        ```
        
        This will pull and install the latest stable release from [PyPI](https://pypi.org/).
        
        If you want to install from source or contribute to the project please read the
        [Contributing Guide](https://sdv-dev.github.io/Copulas/contributing.html#get-started).
        
        
        # Quickstart
        
        In this short quickstart, we show how to model a multivariate dataset and then generate
        synthetic data that resembles it.
        
        ```python3
        import warnings
        warnings.filterwarnings('ignore')
        
        from copulas.datasets import sample_trivariate_xyz
        from copulas.multivariate import GaussianMultivariate
        from copulas.visualization import compare_3d
        
        # Load a dataset with 3 columns that are not independent
        real_data = sample_trivariate_xyz()
        
        # Fit a gaussian copula to the data
        copula = GaussianMultivariate()
        copula.fit(real_data)
        
        # Sample synthetic data
        synthetic_data = copula.sample(len(real_data))
        
        # Plot the real and the synthetic data to compare
        compare_3d(real_data, synthetic_data)
        ```
        
        The output is a figure with two plots that show the real data and the synthetic data
        that you just generated:
        
        ![Quickstart](docs/images/quickstart.png)
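        
        For intuition about what fitting and sampling a Gaussian copula involves, here is a
        two-column sketch that uses only the standard library. It maps each column to normal scores
        through its ranks, measures the correlation of those scores, and then samples correlated
        normals that are mapped back through the empirical quantiles of each column. This is a
        simplified illustration under those assumptions, not the implementation behind
        `GaussianMultivariate`; `fit_gaussian_copula`, `sample_gaussian_copula` and the other
        helpers are hypothetical names.
        
        ```python3
        import math
        import random
        from statistics import NormalDist
        
        STANDARD_NORMAL = NormalDist()
        
        
        def to_normal_scores(column):
            """Map a column to standard normal scores through its ranks."""
            n = len(column)
            order = sorted(range(n), key=column.__getitem__)
            scores = [0.0] * n
            for rank, index in enumerate(order, start=1):
                # rank / (n + 1) stays strictly inside (0, 1), so inv_cdf is defined.
                scores[index] = STANDARD_NORMAL.inv_cdf(rank / (n + 1))
            return scores
        
        
        def pearson(a, b):
            mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
            cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
            var_a = sum((x - mean_a) ** 2 for x in a)
            var_b = sum((y - mean_b) ** 2 for y in b)
            return cov / math.sqrt(var_a * var_b)
        
        
        def fit_gaussian_copula(xs, ys):
            """Keep the sorted marginals and the correlation of the normal scores."""
            rho = pearson(to_normal_scores(xs), to_normal_scores(ys))
            return {'x': sorted(xs), 'y': sorted(ys), 'rho': rho}
        
        
        def quantile(sorted_column, u):
            """Empirical quantile: the observed value at relative position u."""
            index = min(int(u * len(sorted_column)), len(sorted_column) - 1)
            return sorted_column[index]
        
        
        def sample_gaussian_copula(model, size, seed=None):
            rng = random.Random(seed)
            rho = model['rho']
            # max() guards against a tiny negative value caused by rounding when |rho| is 1.
            scale = math.sqrt(max(0.0, 1.0 - rho ** 2))
            rows = []
            for _ in range(size):
                z1 = rng.gauss(0.0, 1.0)
                z2 = rho * z1 + scale * rng.gauss(0.0, 1.0)
                u1, u2 = STANDARD_NORMAL.cdf(z1), STANDARD_NORMAL.cdf(z2)
                rows.append((quantile(model['x'], u1), quantile(model['y'], u2)))
            return rows
        
        
        # Toy data: a skewed column and a second column that depends on it.
        rng = random.Random(0)
        real_x = [rng.lognormvariate(0.0, 0.5) for _ in range(1000)]
        real_y = [x ** 2 + rng.gauss(0.0, 0.3) for x in real_x]
        
        model = fit_gaussian_copula(real_x, real_y)
        synthetic_rows = sample_gaussian_copula(model, len(real_x), seed=1)
        print(round(model['rho'], 3), synthetic_rows[:3])
        ```
        
        Because the marginals in this sketch are purely empirical, every synthetic value is one of
        the observed values; a parametric or kernel-based marginal would be needed to produce
        values that were never observed.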
        
        
        # What's next?
        
        For more details about **Copulas** and all its possibilities and features, please check the
        [documentation site](https://sdv-dev.github.io/Copulas/).
        
        There you can learn more about [how to contribute to Copulas](https://sdv-dev.github.io/Copulas/contributing.html)
        in order to help us develop new features or cool ideas.
        
        # Credits
        
        Copulas is an open source project from the Data to AI Lab at MIT which has been built and maintained
        over the years by the following team:
        
        * Manuel Alvarez <manuel@pythiac.com>
        * Carles Sala <carles@pythiac.com>
        * José David Pérez <jose@pythiac.com>
        * (Alicia) Yi Sun <yis@mit.edu>
        * Andrew Montanez <amontane@mit.edu>
        * Kalyan Veeramachaneni <kalyan@csail.mit.edu>
        * paulolimac <paulolimac@gmail.com>
        * Kevin Alex Zhang <kevz@mit.edu>
        * Gabriele Bonomi <gbonomib@gmail.com>
        
        # Related Projects
        
        ## SDV
        
        [SDV](https://github.com/HDI-Project/SDV), for Synthetic Data Vault, is the end-user library for
        synthesizing data in development under the [HDI Project](https://hdi-dai.lids.mit.edu/).
        SDV allows you to easily model and sample relational datasets using Copulas through a simple API.
        Other features include anonymization of Personally Identifiable Information (PII) and preserving
        relational integrity on sampled records.
        
        ## CTGAN
        
        [CTGAN](https://github.com/sdv-dev/CTGAN) is a GAN-based model for synthesizing tabular data.
        It's also developed by [MIT's Data to AI Lab](https://sdv-dev.github.io/) and is under
        active development.
        
        
        # History
        
        ## 0.3.1 (2020-07-09)
        
        ### General Improvements
        
        * Raise numpy version upper bound to 2 - Issue [#178](https://github.com/sdv-dev/Copulas/issues/178) by @csala
        
        ### New Features
        
        * Add Student T Univariate - Issue [#172](https://github.com/sdv-dev/Copulas/issues/172) by @gbonomib
        
        ### Bug Fixes
        
        * Error in Quickstarts : Unknown projection '3d' - Issue [#174](https://github.com/sdv-dev/Copulas/issues/174) by @csala
        
        ## 0.3.0 (2020-03-27)
        
        Important revamp of the internal implementation of the project, the testing
        infrastructure and the documentation by Kevin Alex Zhang @k15z, Carles Sala
        @csala and Kalyan Veeramachaneni @kveerama.
        
        ### Enhancements
        
        * Reimplementation of the existing Univariate distributions.
        * Addition of new Beta and Gamma Univariates.
        * New Univariate API with automatic selection of the optimal distribution.
        * Several improvements and fixes on the Bivariate and Multivariate Copulas implementation.
        * New visualization module with simple plotting patterns to visualize probability distributions.
        * New datasets module with toy datasets sampling functions.
        * New testing infrastructure with end-to-end, numerical and large scale testing.
        * Improved tutorials and documentation.
        
        ## 0.2.5 (2020-01-17)
        
        ### General Improvements
        
        * Convert import_object to get_instance - Issue [#114](https://github.com/sdv-dev/Copulas/issues/114) by @JDTheRipperPC
        
        ## 0.2.4 (2019-12-23)
        
        ### New Features
        
        * Allow creating copula classes directly - Issue [#117](https://github.com/sdv-dev/Copulas/issues/117) by @csala
        
        ### General Improvements
        
        * Remove `select_copula` from `Bivariate` - Issue [#118](https://github.com/sdv-dev/Copulas/issues/118) by @csala
        
        * Rename TruncNorm to TruncGaussian and make it non standard - Issue [#102](https://github.com/sdv-dev/Copulas/issues/102) by @csala @JDTheRipperPC
        
        ### Bugs fixed
        
        * Error on Frank and Gumbel sampling - Issue [#112](https://github.com/sdv-dev/Copulas/issues/112) by @csala
        
        ## 0.2.3 (2019-09-17)
        
        ### New Features
        
        * Add support to Python 3.7 - Issue [#53](https://github.com/sdv-dev/Copulas/issues/53) by @JDTheRipperPC
        
        ### General Improvements
        
        * Document RELEASE workflow - Issue [#105](https://github.com/sdv-dev/Copulas/issues/105) by @JDTheRipperPC
        
        * Improve serialization of univariate distributions - Issue [#99](https://github.com/sdv-dev/Copulas/issues/99) by @ManuelAlvarezC and @JDTheRipperPC
        
        ### Bugs fixed
        
        * The method 'select_copula' of Bivariate returns the wrong CopulaType - Issue [#101](https://github.com/sdv-dev/Copulas/issues/101) by @JDTheRipperPC
        
        ## 0.2.2 (2019-07-31)
        
        ### New Features
        
        * `truncnorm` distribution and a generic wrapper for `scipy.stats.rv_continuous` distributions - Issue [#27](https://github.com/sdv-dev/Copulas/issues/27) by @amontanez, @csala and @ManuelAlvarezC
        * `Independence` bivariate copulas - Issue [#46](https://github.com/sdv-dev/Copulas/issues/46) by @aliciasun, @csala and @ManuelAlvarezC
        * Option to select seed on random number generator - Issue [#63](https://github.com/sdv-dev/Copulas/issues/63) by @echo66 and @ManuelAlvarezC
        * Option on Vine copulas to select number of rows to sample - Issue [#77](https://github.com/sdv-dev/Copulas/issues/77) by @ManuelAlvarezC
        * Make copulas accept both scalars and arrays as arguments - Issues [#85](https://github.com/sdv-dev/Copulas/issues/85) and [#90](https://github.com/sdv-dev/Copulas/issues/90) by @ManuelAlvarezC
        
        ### General Improvements
        
        * Ability to properly handle constant data - Issues [#57](https://github.com/sdv-dev/Copulas/issues/57) and [#82](https://github.com/sdv-dev/Copulas/issues/82) by @csala and @ManuelAlvarezC
        * Tests for analytics properties of copulas - Issue [#61](https://github.com/sdv-dev/Copulas/issues/61) by @ManuelAlvarezC
        * Improved documentation - Issue [#96](https://github.com/sdv-dev/Copulas/issues/96) by @ManuelAlvarezC
        
        ### Bugs fixed
        
        * Fix bug on Vine copulas, that made it crash during the bivariate copula selection - Issue [#64](https://github.com/sdv-dev/Copulas/issues/64) by @echo66 and @ManuelAlvarezC
        
        ## 0.2.1 - Vine serialization
        
        * Add serialization to Vine copulas.
        * Add `distribution` as argument for the Gaussian Copula.
        * Improve Bivariate Copulas code structure to remove code duplication.
        * Fix bug in Vine Copulas sampling: 'Edge' object has no attribute 'index'
        * Improve code documentation.
        * Improve code style and linting tools configuration.
        
        ## 0.2.0 - Unified API
        
        * New API for stats methods.
        * Standardize input and output to `numpy.ndarray`.
        * Increase unittest coverage to 90%.
        * Add methods to load/save copulas.
        * Improve Gaussian copula sampling accuracy.
        
        ## 0.1.1 - Minor Improvements
        
        * Different Copula types separated in subclasses.
        * Extensive unit testing.
        * More pythonic names in the public API.
        * Stop using third party elements that will be deprecated soon.
        * Add methods to sample new data on bivariate copulas.
        * New KDE Univariate copula.
        * Improved examples with additional demo data.
        
        ## 0.1.0 - First Release
        
        * First release on PyPI.
        
Keywords: copulas
Platform: UNKNOWN
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Requires-Python: >=3.5,<3.8
Description-Content-Type: text/markdown
Provides-Extra: tutorials
Provides-Extra: test
Provides-Extra: dev
