Metadata-Version: 2.1
Name: disteval
Version: 0.1
Summary: DISTEVAL: For inter-residue protein distance evaluation
Home-page: https://github.com/ba-lab/disteval
Author: ba-lab
Author-email: adhikarib@umsl.edu
License: UNKNOWN
Description: <p align="center">
        <img src="https://github.com/ba-lab/disteval/blob/main/banner.png" alt="DISTEVAL BANNER" width=800/>
        </p>
        
        # DISTEVAL: Protein distance evaluation
        
        ## Project abstract
        **Background:** Protein inter-residue contact and distance prediction are two key intermediate steps essential to accurate protein structure prediction. Distance prediction comes in two forms: real-valued distances and 'binned' distograms, which are a more finely grained variant of the binary contact prediction problem. The latter has been introduced as a new challenge in the 14<sup>th</sup> Critical Assessment of Techniques for Protein Structure Prediction (CASP14) 2020 experiment. Despite the recent proliferation of methods for predicting distances, few methods exist for evaluating these predictions.  Currently only numerical metrics, which evaluate the entire prediction at once, are used.  These give no insight into the structural details of a prediction. For this reason, new methods and tools are needed.  
        **Results:** We have developed a web server for evaluating predicted inter-residue distances. Our server, DISTEVAL, accepts predicted contacts, distances, and a true structure as optional inputs to generate informative heatmaps, chord diagrams, and 3D models. All of these outputs facilitate visual and qualitative assessment. The server also evaluates predictions using other metrics such as mean absolute error, root mean squared error, and contact precision.  
        **Conclusions:** The visualizations generated by DISTEVAL complement each other and collectively serve as a powerful tool for both quantitative and qualitative assessments of predicted contacts and distances, even in the absence of a true 3D structure.
        
        ## Webserver
        [http://deep.cs.umsl.edu/disteval/](http://deep.cs.umsl.edu/disteval/)
        
        # Distance/contact evaluation using `disteval.py`
        
        ## Download
        Download from [https://github.com/ba-lab/disteval/releases](https://github.com/ba-lab/disteval/releases)
        
        ## Prerequisites
        - [x] Python3
        - [x] Numpy
        - [x] Scikit-learn
        
        ## Installation from PIP
        ```bash
        pip install disteval
        ```
        
        ## Test
        
        ### Example 0. See help
           ```bash
           disteval -h
           ```
        ### Download the test files
           Download the test files from [https://github.com/ba-lab/disteval/blob/main/test/](https://github.com/ba-lab/disteval/blob/main/test/)
           
        
        ### Example 1. Evaluate a predicted RR contacts file
           ```bash
           disteval -n ./test/1guuA.pdb -c ./test/1guuA.contact.rr
           ```
           Expected output:
           ```
           Evaluating contacts..
           min-seq-sep: 12 xL: Top-L/5 {'precision': 1.0, 'count': 9}
           min-seq-sep: 12 xL: Top-L   {'precision': 1.0, 'count': 9}
           min-seq-sep: 12 xL: Top-NC  {'precision': 1.0, 'count': 9}
           min-seq-sep: 24 xL: Top-L/5 {'precision': 1.0, 'count': 1}
           min-seq-sep: 24 xL: Top-L   {'precision': 1.0, 'count': 1}
           min-seq-sep: 24 xL: Top-NC  {'precision': 1.0, 'count': 1}
           ```
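        
        Each precision value above is the fraction of the top-ranked predicted contacts that are also contacts in the native structure. The following is a minimal sketch of that calculation, not the `disteval` implementation. It assumes an 8 Å contact threshold and ranking by predicted score; the function name `top_k_precision` and its arguments are hypothetical.
        
        ```python
        def top_k_precision(predicted, native_dist, k, min_seq_sep=12, threshold=8.0):
            """Precision of the k highest-scoring predicted contacts.
        
            predicted   -- list of (i, j, score) tuples with 0-based residue indices
            native_dist -- dict mapping (i, j), i < j, to the native distance in angstroms
            threshold   -- assumed contact cutoff in angstroms (an assumption of this sketch)
            """
            # Keep pairs that satisfy the sequence separation and exist in the native map.
            candidates = [
                (i, j, score)
                for i, j, score in predicted
                if abs(i - j) >= min_seq_sep and (min(i, j), max(i, j)) in native_dist
            ]
            # Rank by score, highest first, and keep the top k.
            top = sorted(candidates, key=lambda c: c[2], reverse=True)[:k]
            if not top:
                return float("nan"), 0
            hits = sum(
                1 for i, j, _ in top
                if native_dist[(min(i, j), max(i, j))] < threshold
            )
            return hits / len(top), len(top)
        
        
        # Toy data for illustration only.
        native = {(0, 15): 6.2, (2, 20): 7.9, (3, 30): 11.4}
        pred = [(0, 15, 0.95), (2, 20, 0.80), (3, 30, 0.60), (1, 4, 0.99)]
        precision, count = top_k_precision(pred, native, k=2)
        print({"precision": precision, "count": count})  # {'precision': 1.0, 'count': 2}
        ```
        
        The pair `(1, 4)` has the highest score but is dropped because its sequence separation is below `min_seq_sep`.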
        ### Example 2. Evaluate a predicted distance map
           ```bash
           disteval -n ./test/1guuA.pdb -d ./test/1guuA.predicted.npy
           ```
           Expected output:
           ```
           Evaluating distances..
           min-seq-sep: 12 xL: Top-L/5 {'mae': 0.9403, 'mse': 1.5143, 'rmse': 1.2306, 'count': 10}
           min-seq-sep: 12 xL: Top-L   {'mae': 1.7522, 'mse': 5.6841, 'rmse': 2.3841, 'count': 50}
           min-seq-sep: 12 xL: Top-NC  {'mae': 1.9263, 'mse': 6.6872, 'rmse': 2.586, 'count': 603}
           min-seq-sep: 24 xL: Top-L/5 {'mae': 1.8154, 'mse': 4.6469, 'rmse': 2.1557, 'count': 10}
           min-seq-sep: 24 xL: Top-L   {'mae': 2.1541, 'mse': 8.1816, 'rmse': 2.8603, 'count': 50}
           min-seq-sep: 24 xL: Top-NC  {'mae': 2.4536, 'mse': 9.6231, 'rmse': 3.1021, 'count': 295}
           Evaluating contacts..
           min-seq-sep: 12 xL: Top-L/5 {'precision': 0.9, 'count': 10}
           min-seq-sep: 12 xL: Top-L   {'precision': 0.6, 'count': 30}
           min-seq-sep: 12 xL: Top-NC  {'precision': 0.6, 'count': 30}
           min-seq-sep: 24 xL: Top-L/5 {'precision': 0.5, 'count': 10}
           min-seq-sep: 24 xL: Top-L   {'precision': 0.38462, 'count': 13}
           min-seq-sep: 24 xL: Top-NC  {'precision': 0.38462, 'count': 13}
           ```
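        
        The `mae`, `mse`, and `rmse` fields above are standard error metrics over paired predicted and native distances, with `rmse` being the square root of `mse`. The sketch below shows how they relate, assuming the residue pairs have already been selected; the function name `distance_errors` is hypothetical and is not part of the `disteval` API.
        
        ```python
        import math
        
        
        def distance_errors(predicted, native):
            """Return MAE, MSE and RMSE for paired predicted and native distances."""
            if len(predicted) != len(native) or not predicted:
                raise ValueError("need two non-empty lists of equal length")
            diffs = [p - n for p, n in zip(predicted, native)]
            mae = sum(abs(d) for d in diffs) / len(diffs)
            mse = sum(d * d for d in diffs) / len(diffs)
            return {
                "mae": round(mae, 4),
                "mse": round(mse, 4),
                "rmse": round(math.sqrt(mse), 4),
                "count": len(diffs),
            }
        
        
        # Toy distances in angstroms, for illustration only.
        result = distance_errors([5.0, 7.5, 10.0], [6.0, 7.5, 8.0])
        print(result)  # {'mae': 1.0, 'mse': 1.6667, 'rmse': 1.291, 'count': 3}
        ```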
        ### Example 3. Evaluate trRosetta prediction
           ```bash
           disteval -n ./test/1guuA.pdb -r ./test/1guuA.npz 
           ```
           Expected output:
           ```
           Evaluating distances..
           min-seq-sep: 12 xL: Top-L/5 {'mae': 0.5485, 'mse': 0.5375, 'rmse': 0.7331, 'count': 10}
           min-seq-sep: 12 xL: Top-L   {'mae': 0.6789, 'mse': 0.7678, 'rmse': 0.8762, 'count': 50}
           min-seq-sep: 12 xL: Top-NC  {'mae': 1.2951, 'mse': 3.8733, 'rmse': 1.9681, 'count': 741}
           min-seq-sep: 24 xL: Top-L/5 {'mae': 0.537, 'mse': 0.4237, 'rmse': 0.6509, 'count': 10}
           min-seq-sep: 24 xL: Top-L   {'mae': 0.6691, 'mse': 0.6725, 'rmse': 0.8201, 'count': 50}
           min-seq-sep: 24 xL: Top-NC  {'mae': 1.2281, 'mse': 3.2863, 'rmse': 1.8128, 'count': 351}
        
           Evaluating contacts..
           min-seq-sep: 12 xL: Top-L/5 {'precision': 1.0, 'count': 10}
           min-seq-sep: 12 xL: Top-L   {'precision': 0.8, 'count': 30}
           min-seq-sep: 12 xL: Top-NC  {'precision': 0.8, 'count': 30}
           min-seq-sep: 24 xL: Top-L/5 {'precision': 1.0, 'count': 10}
           min-seq-sep: 24 xL: Top-L   {'precision': 0.84615, 'count': 13}
           min-seq-sep: 24 xL: Top-NC  {'precision': 0.84615, 'count': 13}
           ```
        
        ### Example 4. Evaluate a CASP14 RR file
           ```bash
           wget http://deep.cs.umsl.edu/disteval/static/data/casp14/T1024/RaptorX_RR1
           wget http://deep.cs.umsl.edu/disteval/static/data/casp14/casp14_pdbs/T1024.pdb
           
           disteval -n ./T1024.pdb -c ./RaptorX_RR1
           ```
           Expected output:
           ```
           Evaluating distances..
           min-seq-sep: 12 xL: Top-L/5 {'mae': 1.7837, 'mse': 4.9053, 'rmse': 2.2148, 'count': 78}
           min-seq-sep: 12 xL: Top-L   {'mae': 2.4797, 'mse': 13.0069, 'rmse': 3.6065, 'count': 392}
           min-seq-sep: 12 xL: Top-NC  {'mae': 3.6061, 'mse': 16.4059, 'rmse': 4.0504, 'count': 5459}
           min-seq-sep: 24 xL: Top-L/5 {'mae': 1.7837, 'mse': 4.9053, 'rmse': 2.2148, 'count': 78}
           min-seq-sep: 24 xL: Top-L   {'mae': 2.4398, 'mse': 12.8404, 'rmse': 3.5834, 'count': 392}
           min-seq-sep: 24 xL: Top-NC  {'mae': 3.6114, 'mse': 16.4634, 'rmse': 4.0575, 'count': 4906}
           Evaluating contacts..
           min-seq-sep: 12 xL: Top-L/5 {'precision': 0.9359, 'count': 78}
           min-seq-sep: 12 xL: Top-L   {'precision': 0.82143, 'count': 392}
           min-seq-sep: 12 xL: Top-NC  {'precision': 0.68562, 'count': 633}
           min-seq-sep: 24 xL: Top-L/5 {'precision': 0.9359, 'count': 78}
           min-seq-sep: 24 xL: Top-L   {'precision': 0.80357, 'count': 392}
           min-seq-sep: 24 xL: Top-NC  {'precision': 0.68631, 'count': 577}
           ```
        
        # Evaluation through 3D modeling using `disteval.py`
        
        ## Prerequisites
        - [x] Install csh
           ```bash
           sudo apt install csh
           ```
        - [x] Download 'dssp-2.0.4-linux-amd64' from https://osf.io/qydjv/
           ```bash
           chmod +x dssp-2.0.4-linux-amd64
           ```
        - [x] Download TM-score from https://zhanglab.ccmb.med.umich.edu/TM-score/TMscore.gz
            ```bash
            wget https://zhanglab.ccmb.med.umich.edu/TM-score/TMscore.gz
            gunzip TMscore.gz
            chmod +x TMscore
            ```
        - [x] DISTFOLD
            - Follow instructions [here](DISTFOLD.md) to download DISTFOLD, an updated version of CONFOLD.
        
        ## Test
        
        ### Example 1. Predicted contacts (RR file) & Secondary structure
           ```bash
           disteval -f ./test/1guuA.fasta -n ./test/1guuA.pdb -c ./test/1guuA.contact.rr -s ./test/1guuA.ss -o ./build-1guuA  -b
           ```
        
           Expected output:
           ```
           TM-score RMSD    GDT-TS MODEL
           0.297    10.100  0.385  1guuA_11.pdb
           0.320     7.729  0.460  1guuA_8.pdb
           ...
           0.465     3.935  0.630  1guuA_model1.pdb
           0.483     5.776  0.600  1guuA_model2.pdb
           0.550     4.534  0.665  1guuA_5.pdb
           ```
        
        ### Example 2. Predicted distance map (up to 12Å) without local distances & Secondary structure
           ```bash
           disteval -f ./test/1guuA.fasta -n ./test/1guuA.pdb -d ./test/1guuA.predicted.npy -s ./test/1guuA.ss -o ./build-1guuA -b -m 6 -t 12
           ```
        
           Expected output:
           ```
           TM-score RMSD    GDT-TS MODEL
           0.107    37.610  0.155  extended.pdb
           0.630     3.016  0.745  1guuA_11.pdb
           ...
           0.681     2.528  0.785  1guuA_6.pdb
           0.681     2.489  0.790  1guuA_9.pdb
           ```
        
        ### Example 3. Predicted distance map (up to 12Å) including local distances
           ```bash
           disteval -f ./test/1guuA.fasta -n ./test/1guuA.pdb -d ./test/1guuA.predicted.npy -s ./test/1guuA.ss -o ./build-1guuA -b -m 2 -t 12
           ```
        
           Expected output:
           ```
           TM-score RMSD    GDT-TS MODEL
           0.107    37.610  0.155  extended.pdb
           0.253    10.230  0.340  1guuA_11.pdb
           ...
           0.681     3.349  0.775  1guuA_13.pdb
           0.684     2.330  0.795  1guuA_3.pdb
           ```
           
        ### Example 4. Reconstruction using a native (true) distance map
           ```bash
           disteval -f ./test/1guuA.fasta -n ./test/1guuA.pdb -o ./build-1guuA -p -b -m 2 -t 18
           ```
        
           Expected output:
           ```
           TM-score RMSD    GDT-TS MODEL
           0.107    37.610  0.155  extended.pdb
           ...
           0.987     0.265  1.000  1guuA_model2.pdb
           0.991     0.214  1.000  1guuA_16.pdb
           ```
        
        ### Example 5. Distances predicted by trRosetta method
           ```bash
           disteval -f ./test/1guuA.fasta -n ./test/1guuA.pdb -r ./test/1guuA.npz -o ./build-1guuA -b -m 2 -t 12
           ```
           Expected output:
           ```
           TM-score RMSD    GDT-TS MODEL
           0.107    37.610  0.155  extended.pdb
           0.268     9.724  0.375  1guuA_14.pdb
           ...
           0.876     0.979  0.940  1guuA_model1.pdb
           0.880     1.151  0.950  1guuA_16.pdb
           ```
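        
        The score tables above list one model per row. If the output is saved as text, the rows can be ranked programmatically. The sketch below assumes whitespace-separated columns in the order shown (TM-score, RMSD, GDT-TS, model name); the helper `parse_scores` is hypothetical and is not provided by `disteval`.
        
        ```python
        def parse_scores(text):
            """Parse a score table into a list of dicts, skipping non-data lines."""
            rows = []
            for line in text.splitlines():
                parts = line.split()
                if len(parts) != 4:
                    continue  # e.g. the "..." placeholder or blank lines
                try:
                    tm, rmsd, gdt = (float(p) for p in parts[:3])
                except ValueError:
                    continue  # header line
                rows.append(
                    {"model": parts[3], "tm_score": tm, "rmsd": rmsd, "gdt_ts": gdt}
                )
            return rows
        
        
        # Rows copied from the trRosetta example above.
        table = "\n".join([
            "TM-score RMSD    GDT-TS MODEL",
            "0.107    37.610  0.155  extended.pdb",
            "0.876     0.979  0.940  1guuA_model1.pdb",
            "0.880     1.151  0.950  1guuA_16.pdb",
        ])
        best = max(parse_scores(table), key=lambda row: row["tm_score"])
        print(best["model"])  # 1guuA_16.pdb
        ```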
        
        ## Using as a Library
        ### Usage
        
        ### Example 1. Convert PDB file to distance map
           ```python
           from disteval import pdb2dmap
        
           pdb2dmap('path_to_pdb_file')
           ```
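        
        A distance map is an L × L matrix of pairwise distances between the L residues of a structure. The sketch below illustrates the idea with the standard library only; it is not the `pdb2dmap` implementation. It assumes each residue is represented by its Cα atom, and the helpers `atom_line`, `ca_coordinates`, and `distance_map` are hypothetical names used for illustration.
        
        ```python
        import math
        
        
        def atom_line(serial, name, res, resseq, x, y, z):
            """Build a fixed-width PDB ATOM record (chain A) for the demo below."""
            return (
                f"ATOM  {serial:5d} {name:<4s} {res:3s} A{resseq:4d}    "
                f"{x:8.3f}{y:8.3f}{z:8.3f}"
            )
        
        
        def ca_coordinates(pdb_text):
            """Extract (x, y, z) of every C-alpha atom from PDB-format text."""
            coords = []
            for line in pdb_text.splitlines():
                # Atom name is in columns 13-16; x, y, z are in columns 31-54.
                if line.startswith("ATOM") and line[12:16].strip() == "CA":
                    coords.append(
                        (float(line[30:38]), float(line[38:46]), float(line[46:54]))
                    )
            return coords
        
        
        def distance_map(coords):
            """Return a symmetric matrix (list of lists) of Euclidean distances."""
            return [[math.dist(a, b) for b in coords] for a in coords]
        
        
        # Two residues with toy coordinates; the backbone N atom is ignored.
        demo = "\n".join([
            atom_line(1, "N", "ALA", 1, 0.0, 1.0, 0.0),
            atom_line(2, "CA", "ALA", 1, 0.0, 0.0, 0.0),
            atom_line(3, "CA", "GLY", 2, 3.0, 4.0, 0.0),
        ])
        dmap = distance_map(ca_coordinates(demo))
        print(dmap)  # [[0.0, 5.0], [5.0, 0.0]]
        ```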
        
        ### Example 2. Convert trRosetta prediction file (.npz) to distance map
           ```python
           from disteval import pdb2dmap
           
           pdb2dmap('path_to_pdb_file')
           ```
           
        For other functions, please check [https://github.com/ba-lab/disteval/blob/main/disteval.py](https://github.com/ba-lab/disteval/blob/main/disteval.py)
        
        ## Contact  
        Badri Adhikari  
        adhikarib@umsl.edu  
        University of Missouri-St. Louis  
        
        ## Published By  
        Bikash Shrestha  
        bsmmy@umsystem.edu  
        University of Missouri-St. Louis  
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Description-Content-Type: text/markdown
