Metadata-Version: 2.1
Name: paste-bio
Version: 1.0.1
Summary: A computational method to align and integrate spatial transcriptomics experiments.
Home-page: https://github.com/raphael-group/paste
Author: Max Land
Author-email: max.ruikang.land@gmail.com
License: UNKNOWN
Project-URL: Bug Tracker, https://github.com/raphael-group/paste/issues
Description: # PASTE
        
        PASTE is a computational method that leverages both gene expression similarity and spatial distances between spots to align and integrate spatial transcriptomics (ST) data. In particular, it provides two methods:
        1. `pairwise_align`: align spots between pairs of ST layers.
        2. `center_align`: integrate multiple ST layers into one center layer.
        
        You can read our preprint [here](https://www.biorxiv.org/content/10.1101/2021.03.16.435604v1). 
        
        PASTE is under active development, and further updates are planned.
        
        ### Dependencies
        
        To run PASTE, you will need the following Python packages:
        1. POT: Python Optimal Transport (https://PythonOT.github.io/)
        2. NetworkX (https://networkx.org/)
        3. NumPy
        4. pandas
        5. SciPy (`scipy.spatial`)
        6. scikit-learn (`sklearn.preprocessing`)
        
        ### Installation
        
        The easiest way to install PASTE is from PyPI: https://pypi.org/project/paste-bio/.
        
        `pip install paste-bio`
        
        Check out Tutorial.ipynb for an example of how to use PASTE.
        
        Alternatively, you can clone the repository and run PASTE from the command line (see below).
        
        
        ### Command Line
        
        We provide the option of running PASTE from the command line. 
        
        First, clone the repository:
        
        `git clone https://github.com/raphael-group/paste.git`
        
        Sample execution: `python paste-cmd-line.py -m pairwise -f file1.csv file2.csv file3.csv`
        
        Note: `pairwise` will return pairwise alignment between each consecutive pair of files (e.g. \[file1,file2\], \[file2,file3\]).
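        
        As a minimal sketch of which file pairs the `pairwise` mode aligns, the consecutive pairing described in the note can be written as follows. The helper name `consecutive_pairs` is hypothetical and is not part of PASTE.
        
        ```python
        def consecutive_pairs(files):
            """Return each consecutive pair of files, in input order."""
            # zip stops at the shorter sequence, so the last file only
            # appears as the second element of the final pair.
            return list(zip(files, files[1:]))
        
        
        pairs = consecutive_pairs(["file1.csv", "file2.csv", "file3.csv"])
        print(pairs)
        ```
        
        With three input files this yields two pairs, so `n` files lead to `n - 1` pairwise alignments.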
        
        | Flag | Name | Description | Default Value |
        | --- | --- | --- | --- |
        | -m | mode | Select either `pairwise` or `center` | (str) `pairwise` |
        | -f | files | Path to data files (.csv) | None |
        | -d | direc | Directory to store output files | Current Directory |
        | -a | alpha | alpha parameter for PASTE | (float) `0.1` |
        | -p | n_components | n_components for NMF step in `center_align` | (int) `15` |
        | -l | lmbda | lambda parameter in `center_align` | (floats) probability vector of length `n`  |
        | -i | intial_layer | Specify which file also serves as the initial layer in `center_align` | (int) `1` |
        | -t | threshold | Convergence threshold for `center_align` | (float) `0.001` |
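        
        The `-l` flag expects a probability vector with one weight per input file. The sketch below checks that property before a vector is passed on the command line. The helper names `uniform_weights` and `is_probability_vector` are hypothetical, and the uniform vector is only an illustrative choice, not a statement about what PASTE uses by default.
        
        ```python
        import math
        
        
        def uniform_weights(n):
            """Build a length-n vector that gives every layer equal weight."""
            return [1.0 / n] * n
        
        
        def is_probability_vector(weights, n):
            """Check length, non-negativity, and that the weights sum to 1."""
            return (
                len(weights) == n
                and all(w >= 0 for w in weights)
                and math.isclose(sum(weights), 1.0, abs_tol=1e-9)
            )
        
        
        print(is_probability_vector(uniform_weights(4), 4))
        ```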
        
        Input files are .csv files of the form:
        
        ```
               	'gene_a'  'gene_b'
        '2x5'	   0         9      
        '2x7'	   2         6      
        ```
        Here, the column indexes are gene names (str), the row indexes are spatial coordinates (str), and the entries are gene counts (int). In particular, row indexes are of the form `AxB`, where `A` and `B` are floats.
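        
        To illustrate the input format, here is a minimal sketch that parses such a file into coordinates, gene names, and counts using only the standard library. It assumes a comma-separated file whose header row starts with an empty cell and whose labels are unquoted; the function name `parse_layer` is hypothetical and is not part of PASTE.
        
        ```python
        import csv
        import io
        
        
        def parse_layer(text):
            """Parse CSV text into (coords, genes, counts)."""
            rows = list(csv.reader(io.StringIO(text)))
            genes = rows[0][1:]  # header row; the first cell is the empty index label
            coords, counts = [], []
            for row in rows[1:]:
                a, b = row[0].split("x")  # row index has the form AxB
                coords.append((float(a), float(b)))
                counts.append([int(value) for value in row[1:]])
            return coords, genes, counts
        
        
        sample = ",gene_a,gene_b\n2x5,0,9\n2x7,2,6\n"
        coords, genes, counts = parse_layer(sample)
        print(coords, genes, counts)
        ```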
        
        `pairwise_align` outputs a (.csv) file containing the mapping of spots between each consecutive pair of layers. The rows correspond to spots of the first layer, and the columns to spots of the second.
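        
        One way to read such a mapping is to pick, for each spot of the first layer, the spot of the second layer with the largest entry in its row. The sketch below does this on a small made-up matrix. It assumes the entries are non-negative weights where larger values indicate a stronger correspondence; the helper name `best_matches` and the example values are hypothetical.
        
        ```python
        def best_matches(mapping):
            """For each row, return the index of the column with the largest weight."""
            return [max(range(len(row)), key=row.__getitem__) for row in mapping]
        
        
        # Rows: two spots of the first layer; columns: three spots of the second.
        mapping = [
            [0.4, 0.1, 0.0],
            [0.0, 0.2, 0.3],
        ]
        matches = best_matches(mapping)
        print(matches)
        ```
        
        A row can spread its weight over several columns, so taking the single largest entry discards information and is only a convenient summary.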
        
        `center_align` outputs two files containing the low-dimensional representation (NMF decomposition) of the center layer gene expression, and files containing a mapping of spots between the center layer (rows) and each input layer (columns).
        
        ### Sample Dataset
        
        We include a sample spatial transcriptomics dataset consisting of four breast cancer layers, courtesy of:
        
        Ståhl, Patrik & Salmén, Fredrik & Vickovic, Sanja & Lundmark, Anna & Fernandez Navarro, Jose & Magnusson, Jens & Giacomello, Stefania & Asp, Michaela & Westholm, Jakub & Huss, Mikael & Mollbrink, Annelie & Linnarsson, Sten & Codeluppi, Simone & Borg, Åke & Pontén, Fredrik & Costea, Paul & Sahlén, Pelin Akan & Mulder, Jan & Bergmann, Olaf & Frisén, Jonas. (2016). Visualization and analysis of gene expression in tissue sections by spatial transcriptomics. Science. 353. 78-82. 10.1126/science.aaf2403. 
        
        Note: The original data is in (.tsv) format; we converted it to (.csv).
        
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
