Metadata-Version: 2.1
Name: pmapper
Version: 1.0.0
Summary: pmapper: 3D pharmacophore hashes and fingerprints
Home-page: https://github.com/DrrDom/pmapper
Author: Pavel Polishchuk
Author-email: pavel_polishchuk@ukr.net
License: UNKNOWN
Description: # Pmapper - 3D pharmacophore signatures and fingerprints
        
        Pmapper is a Python module to generate 3D pharmacophore signatures and fingerprints.
        Signatures uniquely encode 3D pharmacophores with hashes suitable for fast identification of identical pharmacophores.
        
        ## Dependency
        
        `rdkit >= 2017.09`  
        `networkx >= 2`
        
        ## Installation
        ```text
        pip install pmapper
        ```
        
        ## Changelog
        **1.0.0**
        - added functionality to calculate 3D pharmacophore descriptors for molecules with exclusion of single atoms (for the purpose of model interpretation)
        - added convenience function get_feature_ids
        - added function add_feature to manually edit/construct a pharmacophore
        - added save/load of pharmit pharmacophore models
        
        
        - IMPORTANT: changed the hashing procedure to make it more stable (pickle dependency was removed). This breaks compatibility with previously generated md5 hashes with `get_signature_md5`, `iterate_pharm` and `iterate_pharm1` functions, all other functionality was not affected. 
        
        ## Examples
        
        ### Load modules
        ```python
        from pmapper.pharmacophore import Pharmacophore as P
        from rdkit import Chem
        from rdkit.Chem import AllChem
        from pprint import pprint
        ```
        ### Create pharmacophore from a single conformer using default feature definitions
        ```python
        # load a molecule from SMILES and generate 3D coordinates
        mol = Chem.MolFromSmiles('C1CC(=O)NC(=O)C1N2C(=O)C3=CC=CC=C3C2=O')  # talidomide
        mol = Chem.AddHs(mol)
        AllChem.EmbedMolecule(mol, randomSeed=42)
        
        # create pharmacophore
        p = P()
        p.load_from_mol(mol)
        ```
        ### Get 3D pharmacophore signature
        ```python
        # get 3D pharmacophore signature
        sig = p.get_signature_md5()
        print(sig)
        ```
        Output:
        ```text
        98504647beeb143ae50bb6b7798ca0f0
        ```
        ### Get 3D pharmacophore signature with non-zero tolerance
        ```python
        sig = p.get_signature_md5(tol=5)
        print(sig)
        ```
        Output:
        ```text
        bc54806ba01bf59736a7b62b017d6e1d
        ```
        ### Create pharmacophores for a multiple conformer compound
        ```python
        from pmapper.utils import load_multi_conf_mol
        
        # create multiple conformer molecule
        AllChem.EmbedMultipleConfs(mol, numConfs=10, randomSeed=1024)
        
        ps = load_multi_conf_mol(mol)
        
        sig = [p.get_signature_md5() for p in ps]
        
        pprint(sig)  # identical signatures occur
        ```
        Output:
        ```text
        ['d5f5f9d65e39cb8605f1fa9db5b2fbb0',
         '6204791002d1e343b2bde323149fa780',
         'abfabd8a4fcf5719ed6bf2c71a60852c',
         'dfe9f17d30210cb94b8dd7acf77feae9',
         'abfabd8a4fcf5719ed6bf2c71a60852c',
         'e739fb5f9985ce0c65a16da41da4a33f',
         '2297ddf0e437b7fc32077f75e3924dcd',
         'e739fb5f9985ce0c65a16da41da4a33f',
         '182a00bd9057abd0c455947d9cfa457c',
         '68f226d474808e60ab1256245f64c2b7']
        ```
        Identical hashes should correspond to pharmacophores with low RMSD. Pharmacophores #2 and #4 have identical hash `abfabd8a4fcf5719ed6bf2c71a60852c`. Let's check RMSD.
        ```python
        from pmapper.utils import get_rms
        for i in range(len(ps)):
            print("rmsd bewteen 2 and %i pharmacophore:" % i, round(get_rms(ps[2], ps[i]), 2))
        ```
        Output
        ```text
        rmsd bewteen 2 and 0 pharmacophore: 0.63
        rmsd bewteen 2 and 1 pharmacophore: 0.99
        rmsd bewteen 2 and 2 pharmacophore: 0.0
        rmsd bewteen 2 and 3 pharmacophore: 0.41
        rmsd bewteen 2 and 4 pharmacophore: 0.18
        rmsd bewteen 2 and 5 pharmacophore: 0.19
        rmsd bewteen 2 and 6 pharmacophore: 1.15
        rmsd bewteen 2 and 7 pharmacophore: 0.32
        rmsd bewteen 2 and 8 pharmacophore: 0.69
        rmsd bewteen 2 and 9 pharmacophore: 0.36
        ```
        They really have RMSD < binning step (1A by default). However, other pharmacophores with distinct hashes also have low RMSD to #2. Identical hashes guarantee low RMSD between corresponding pharmacophores, but not vice versa.
        
        ### Pharmacophore match
        Create a two-point pharmacophore model and match with a pharmacophore of a molecule (both pharmacophores should have identical binning steps)
        ```python
        q = P()
        q.load_from_feature_coords([('a', (3.17, -0.23, 0.24)), ('D', (-2.51, -1.28, -1.14))])
        p.fit_model(q)
        ```
        Output
        ```text
        (0, 1)
        ```
        If they do not match `None` will be returned
        
        ### Generate 3D pharmacophore fingerprint
        ```python
        # generate 3D pharmacophore fingerprint which takes into account stereoconfiguration
        b = p.get_fp(min_features=4, max_features=4)   # set of activated bits
        print(b)
        ```
        Output (a set of activated bit numbers):
        ```text
        {259, 1671, 521, 143, 912, 402, 278, 406, 1562, 1692, 1835, 173, 558, 1070, 942, 1202, 1845, 823, 1476, 197, 968, 1355, 845, 1741, 1364, 87, 1881, 987, 1515, 378, 628, 1141, 1401, 1146, 2043}
        ```
        Change settings:
        ```python
        b = p.get_fp(min_features=4, max_features=4, nbits=4096, activate_bits=2)
        print(b)
        ```
        Output (a set of activated bit numbers):
        ```text
        {389, 518, 2821, 1416, 2952, 395, 3339, 511, 3342, 1937, 1042, 2710, 1817, 1690, 3482, 3737, 286, 1824, 1700, 804, 1318, 2729, 3114, 812, 556, 175, 3763, 2356, 3124, 1077, 1975, 3384, 1081, 185, 65, 1223, 713, 1356, 1998, 1487, 2131, 85, 3670, 1877, 3030, 2395, 1116, 2141, 1885, 347, 2404, 1382, 1257, 3049, 2795, 3691, 2541, 1646, 2283, 241, 113, 3698, 756, 2548, 4086, 2293, 1528, 2802, 127}
        ```
        ### Save/load pharmacophore
        ```python
        p.save_to_pma('filename.pma')
        ```
        Output is a text file having json format.
        ```python
        p = P()
        p.load_from_pma('filename.pma')
        ```
        ### Support other formats
        Pharmacophores can be saved/loaded from LigandScout pml-files. Also pharmacophores can be read from xyz-files.
        
        ### Caching
        Pharmacophores can be created with enabled `cached` argument. This will speed up all futher repeated calls to retrive hash, fingerprints or descriptors.
        ```python
        p = P(cached=True)
        ```
        
        ## Speed tests
        Generation of pharmacophore signatures (hashes) is a CPU-bound task. The computation speed depends on the number of features in pharmacophores.  
        Tests were run on a random subset of compounds from Drugbank. Up to 50 conformers were generated for each compound.   
        Laptop configuration:
        - Intel(R) Core(TM) i7-5500U CPU @ 2.40GHz
        - 12 GB RAM
        - calculation was run in 1 thread (the module is thread safe and calculations can be parallelized)
        
        To run the speed test use `pmapper_speed_test` command line tool
        
        ```text
        ========== Reading of conformers of molecules ==========
        329 molecules were read in 0.0134 s
        
        ========== Creation of pharmacophores (with enabled caching) ==========
        1938 pharmacophores were created in 3.17065 s
        
        ========== First calculation of hashes ==========
        2 pharmacophores with 0 features - 0.00014s or 7e-05s per pharmacophore
        2 pharmacophores with 1 features - 0.0001s or 5e-05s per pharmacophore
        12 pharmacophores with 2 features - 0.00042s or 3e-05s per pharmacophore
        44 pharmacophores with 3 features - 0.00212s or 5e-05s per pharmacophore
        100 pharmacophores with 4 features - 0.00933s or 9e-05s per pharmacophore
        103 pharmacophores with 5 features - 0.05155s or 0.0005s per pharmacophore
        105 pharmacophores with 6 features - 0.10857s or 0.00103s per pharmacophore
        109 pharmacophores with 7 features - 0.25322s or 0.00232s per pharmacophore
        117 pharmacophores with 8 features - 0.59508s or 0.00509s per pharmacophore
        101 pharmacophores with 9 features - 0.8795s or 0.00871s per pharmacophore
        105 pharmacophores with 10 features - 1.61349s or 0.01537s per pharmacophore
        100 pharmacophores with 11 features - 2.24937s or 0.02249s per pharmacophore
        103 pharmacophores with 12 features - 3.53308s or 0.0343s per pharmacophore
        117 pharmacophores with 13 features - 6.49837s or 0.05554s per pharmacophore
        103 pharmacophores with 14 features - 7.54796s or 0.07328s per pharmacophore
        142 pharmacophores with 15 features - 14.92654s or 0.10512s per pharmacophore
        104 pharmacophores with 16 features - 13.86378s or 0.13331s per pharmacophore
        100 pharmacophores with 17 features - 17.94023s or 0.1794s per pharmacophore
        120 pharmacophores with 18 features - 28.01455s or 0.23345s per pharmacophore
        136 pharmacophores with 19 features - 42.53481s or 0.31276s per pharmacophore
        113 pharmacophores with 20 features - 45.88228s or 0.40604s per pharmacophore
        
        ========== Second calculation of hashes of the same pharmacophores ==========
        2 pharmacophores with 0 features - 5e-05s or 2e-05s per pharmacophore
        2 pharmacophores with 1 features - 3e-05s or 1e-05s per pharmacophore
        12 pharmacophores with 2 features - 0.00012s or 1e-05s per pharmacophore
        44 pharmacophores with 3 features - 0.00041s or 1e-05s per pharmacophore
        100 pharmacophores with 4 features - 0.00089s or 1e-05s per pharmacophore
        103 pharmacophores with 5 features - 0.00166s or 2e-05s per pharmacophore
        105 pharmacophores with 6 features - 0.00316s or 3e-05s per pharmacophore
        109 pharmacophores with 7 features - 0.00707s or 6e-05s per pharmacophore
        117 pharmacophores with 8 features - 0.0166s or 0.00014s per pharmacophore
        101 pharmacophores with 9 features - 0.02005s or 0.0002s per pharmacophore
        105 pharmacophores with 10 features - 0.03527s or 0.00034s per pharmacophore
        100 pharmacophores with 11 features - 0.05271s or 0.00053s per pharmacophore
        103 pharmacophores with 12 features - 0.08097s or 0.00079s per pharmacophore
        117 pharmacophores with 13 features - 0.13274s or 0.00113s per pharmacophore
        103 pharmacophores with 14 features - 0.1588s or 0.00154s per pharmacophore
        142 pharmacophores with 15 features - 0.32687s or 0.0023s per pharmacophore
        104 pharmacophores with 16 features - 0.29255s or 0.00281s per pharmacophore
        100 pharmacophores with 17 features - 0.38286s or 0.00383s per pharmacophore
        120 pharmacophores with 18 features - 0.61327s or 0.00511s per pharmacophore
        136 pharmacophores with 19 features - 0.93486s or 0.00687s per pharmacophore
        113 pharmacophores with 20 features - 0.94041s or 0.00832s per pharmacophore
        ```
        
        ## Documentation
        More documentation can be found here - https://pmapper.readthedocs.io/en/latest/
        
        ## Citation
        Ligand-Based Pharmacophore Modeling Using Novel 3D Pharmacophore Signatures  
        Alina Kutlushina, Aigul Khakimova, Timur Madzhidov, Pavel Polishchuk  
        *Molecules* **2018**, 23(12), 3094  
        https://doi.org/10.3390/molecules23123094
        
        ##### Further publications
        Virtual Screening Using Pharmacophore Models Retrieved from Molecular Dynamic Simulations  
        Pavel Polishchuk, Alina Kutlushina, Dayana Bashirova, Olena Mokshyna, Timur Madzhidov  
        *Int. J. Mol. Sci.* **2019**, 20(23), 5834  
        https://doi.org/10.3390/ijms20235834
        
        ## License
        BSD-3 clause
        
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Scientific/Engineering :: Chemistry
Requires-Python: >=3.6
Description-Content-Type: text/markdown
Provides-Extra: rdkit
