Metadata-Version: 1.2
Name: icdcodex
Version: 0.4.0
Summary: icd embedding for machine learning
Home-page: https://github.com/icd-codex/icd-codex
Author: Jeremy Fisher
Author-email: jeremyf@cmu.edu
License: MIT license
Description: [![PyPI version fury.io](https://badge.fury.io/py/icdcodex.svg)](https://pypi.python.org/pypi/icdcodex/) [![Documentation Status](https://readthedocs.org/projects/icd-codex/badge/?version=latest)](http://icd-codex.readthedocs.io/?badge=latest) [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black) [![GitHub license](https://img.shields.io/github/license/icd-codex/icd-codex.svg)](https://github.com/icd-codex/icd-codex/blob/master/LICENSE)
        
        `icdcodex` was the first prize winner in the Data Driven Healthcare Track of Johns Hopkins' [MedHacks 2020](https://medhacks2020.devpost.com).
        
        ```{admonition} Experimental 
        This is experimental software, and a stable API is not expected until version 1.0.
        ```
        
        ## Motivation
        Thousands of Americans are misquoted on their health insurance yearly due to ICD miscodes. Although manual ICD coding is laborious, it is difficult to automate with machine learning because the output space is enormous. For example, ICD-10-CM (Clinical Modification) has over 70,000 codes, and the number is growing. There are [many strategies](https://maxhalford.github.io/blog/target-encoding/) for label embedding that address this problem.
        
        `icdcodex` has two features that make ICD classification more amenable to modeling:
        - Access to a `networkx` tree representation of the ICD9 and ICD10 hierarchies
        - Vector embeddings of ICD codes (including pre-computed embeddings and an interface to create new embeddings)
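        
        To illustrate why a tree representation is useful, here is a minimal sketch in plain Python. It does not use the `icdcodex` API or its `networkx` graph: the `PARENT` map is a hand-written, simplified slice of ICD-9, and `path_to_root` and `tree_distance` are hypothetical helper names. The point is only that codes under the same parent are fewer hops apart than codes under different parents.
        
        ```python
        # Toy child -> parent map for a tiny, simplified slice of ICD-9 (illustration only).
        PARENT = {
            "001-139": "root",
            "001": "001-139",
            "001.0": "001",
            "001.1": "001",
            "002": "001-139",
            "002.0": "002",
        }
        
        
        def path_to_root(code):
            """Return the nodes from `code` up to the root, inclusive."""
            path = [code]
            while path[-1] in PARENT:
                path.append(PARENT[path[-1]])
            return path
        
        
        def tree_distance(a, b):
            """Count the edges on the path between two codes in the toy tree."""
            path_a, path_b = path_to_root(a), path_to_root(b)
            depth_in_b = {node: i for i, node in enumerate(path_b)}
            for i, node in enumerate(path_a):
                if node in depth_in_b:  # the first shared ancestor is the lowest one
                    return i + depth_in_b[node]
            raise ValueError("codes do not share a root")
        
        
        print(tree_distance("001.0", "001.1"))  # siblings under 001
        print(tree_distance("001.0", "002.0"))  # related only through 001-139
        ```
        
        An embedding learned from the hierarchy aims to preserve this kind of closeness in vector space.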
        
        ## Example Code
        ```python
        from icdcodex import icd2vec, hierarchy
        embedder = icd2vec.Icd2Vec(num_embedding_dimensions=64)
        embedder.fit(*hierarchy.icd9())
        X = get_patient_covariates()  # placeholder for your own function that returns patient features
        y = embedder.to_vec(["001.0"])  # Cholera due to Vibrio cholerae
        ```
        In this case, `y` is a 64-dimensional vector close to other `Infectious And Parasitic Diseases` codes. 
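        
        One way such vectors could be used downstream is to have a model predict a vector and then map that prediction back to the nearest code. The sketch below shows the idea in plain Python. It assumes a small hand-made table of 2-dimensional vectors: the values in `EMBEDDINGS` and the `nearest_code` helper are hypothetical and are neither output of nor part of `icdcodex`.
        
        ```python
        # Hypothetical 2-D embeddings, made up for illustration (real vectors would have more dimensions).
        EMBEDDINGS = {
            "001.0": (0.9, 0.1),
            "001.1": (0.8, 0.2),
            "410.0": (-0.7, 0.6),
        }
        
        
        def squared_distance(u, v):
            """Squared Euclidean distance between two equal-length vectors."""
            return sum((a - b) ** 2 for a, b in zip(u, v))
        
        
        def nearest_code(vector):
            """Return the code whose embedding is closest to `vector`."""
            return min(EMBEDDINGS, key=lambda code: squared_distance(EMBEDDINGS[code], vector))
        
        
        predicted = (0.85, 0.05)  # stand-in for a model's predicted vector
        print(nearest_code(predicted))
        ```
        
        Because similar codes sit near each other, a slightly wrong prediction tends to land on a clinically related code rather than an arbitrary one.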
        
        ## Related Work
        - node2vec [Paper](https://cs.stanford.edu/people/jure/pubs/node2vec-kdd16.pdf), [Website](https://snap.stanford.edu/node2vec/), [Code](https://github.com/snap-stanford/snap/tree/master/examples/node2vec), [Alternate Code](https://github.com/eliorc/node2vec)
        - Learning Low-Dimensional Representations of Medical Concepts: [Paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5001761/), [Code](https://github.com/clinicalml/embeddings)
        - Projection Word Embedding Model With Hybrid Sampling Training for Classifying ICD-10-CM Codes [Paper](https://pubmed.ncbi.nlm.nih.gov/31339103/)
        
        ## The Hackathon Team
        - Jeremy Fisher (Maintainer)
        - Alhusain Abdalla
        - Natasha Nehra
        - Tejas Patel
        - Hamrish Saravanakumar
        
        ## Documentation
        
        See the full documentation: [https://icd-codex.readthedocs.io/en/latest/](https://icd-codex.readthedocs.io/en/latest/)
        
        ## Contributions
        
        [Contributions are always welcome!](https://icd-codex.readthedocs.io/en/latest/contributing.html)
        
        
        =======
        History
        =======
        
        0.1.0 (2020-09-04)
        ------------------
        
        * First release on PyPI.
        
        0.3.0 (2020-09-05)
        ------------------
        
        * Finesse API, now consistent between documentation and implementation
Keywords: icdcodex
Platform: UNKNOWN
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Requires-Python: >=3.7
