Metadata-Version: 2.1
Name: diffacto
Version: 1.0.6
Summary: A protein summarization method for shotgun proteomics experiments
Home-page: https://github.com/statisticalbiotechnology/diffacto
Author: Bo Zhang, Lukas Käll, KTH
Author-email: lukas.kall@scilifelab.se
Maintainer: Lukas Käll, KTH
Maintainer-email: lukas.kall@scilifelab.se
License: Apache
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Build Tools
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
License-File: LICENSE.txt

Diffacto: Differential Factor Analysis for Comparative Shotgun Proteomics
==========================================================================

Requirements
--------------

Anaconda_ Python3.5+

Packages needed:

- numpy 1.10+
- scipy 0.17+
- pandas 0.18+
- networkx 1.10+
- scikit-learn 0.17+
- pyteomics_ 3.3+

.. _Anaconda: https://www.anaconda.com/
.. _pyteomics: https://pyteomics.readthedocs.io/

Installation via ``pip``
*************************

::

  pip install diffacto


Usage
-----

::

  diffacto.py [-h] -i I [-db [DB]] [-samples [SAMPLES]] [-log2 LOG2]
                       [-normalize {average,median,GMM,None}]
                       [-farms_mu FARMS_MU] [-farms_alpha FARMS_ALPHA]
                       [-reference REFERENCE] [-min_samples MIN_SAMPLES]
                       [-use_unique USE_UNIQUE]
                       [-impute_threshold IMPUTE_THRESHOLD]
                       [-cutoff_weight CUTOFF_WEIGHT] [-fast FAST] [-out OUT]
                       [-mc_out MC_OUT]
  optional arguments:
  -h, --help            show this help message and exit
  -i I                  Peptides abundances in CSV format. The first row
                        should contain names for all samples. The first column
                        should contain unique peptide sequences. Missing
                        values should be empty instead of zeros. (default:
                        None)
  -db [DB]              Protein database in FASTA format. If None, the peptide
                        file must have protein ID(s) in the second column.
                        (default: None)
  -samples [SAMPLES]    File of the sample list. One run and its sample group
                        per line, separated by tab. If None, read from peptide
                        file headings, then each run will be summarized as a
                        group. (default: None)
  -log2 LOG2            Input abundances are in log scale (True) or linear
                        scale (False) (default: False)
  -normalize {average,median,GMM,None}
                        Method for sample-wise normalization. (default: None)
  -farms_mu FARMS_MU    Hyperparameter mu (default: 0.1)
  -farms_alpha FARMS_ALPHA
                        Hyperparameter weight of prior probability (default:
                        0.1)
  -reference REFERENCE  Names of reference sample groups (separated by
                        semicolon) (default: average)
  -min_samples MIN_SAMPLES
                        Minimum number of samples peptides needed to be
                        quantified in (default: 1)
  -use_unique USE_UNIQUE
                        Use unique peptides only (default: False)
  -impute_threshold IMPUTE_THRESHOLD
                        Minimum fraction of missing values in the group.
                        Impute missing values if missing fraction is larger
                        than the threshold. (default: 0.99)
  -cutoff_weight CUTOFF_WEIGHT
                        Peptides weighted lower than the cutoff will be
                        excluded (default: 0.5)
  -fast FAST            Allow early termination in EM calculation when noise
                        is sufficiently small. (default: False)
  -out OUT              Path to output file (writing in TSV format).
  -mc_out MC_OUT        Path to MCFDR output (writing in TSV format).
                        (default: None)


Example
-------

Examples are given in the example_ directory.

.. _example: ./example


