Metadata-Version: 2.1
Name: BRACoD
Version: 0.3.6
Summary: BRACoD is a method to identify associations between bacteria and physiological variables in Microbiome data
Home-page: https://github.com/ajverster/BRACoD/tree/main
Author: ['Adrian Verster']
Author-email: adrian.verster@hc-sc.gc.ca
License: UNKNOWN
Platform: UNKNOWN
Requires-Python: >3.6
Description-Content-Type: text/markdown

# BRACoD: Bayesian Regression Analysis of Compositional Data

### Installation

Installation in python: 

    pip install BRACoD

There is also an R interface, which depends on the python version being installed. There is a helper function that will do it for you, but it might be easier to do it with pip. You can install via CRAN.

    install.packages("BRACoD.R")

Or you can install via github.

    devtools::install_github("ajverster/BRACoD/BRACoD.R")

### Python Walkthrough

1. Simulate some data and normalize it

    ```python
    import BRACoD
    import numpy as np
    sim_counts, sim_y, contributions = BRACoD.simulate_microbiome_counts(BRACoD.df_counts_obesity)
    sim_relab = BRACoD.scale_counts(sim_counts)
    ```

2. Run BRACoD

    ```python
    trace = BRACoD.run_bracod(sim_relab, sim_y, n_sample = 1000, n_burn=1000, njobs=4)
    ```
    
3. Examine the diagnostics

    ```python
    BRACoD.convergence_tests(trace, sim_relab)
    ```

4. Examine the results

    ```python
    df_results = BRACoD.summarize_trace(trace, sim_counts.columns, 0.3)
    ```

5. Compare the results to the simulated truth

    ```python
    taxon_identified = df_results["taxon_num"].values
    taxon_actual = np.where(contributions != 0)[0]

    precision, recall, f1 = BRACoD.score(taxon_identified, taxon_actual)
    print("Precision: {}, Recall: {}, F1: {}".format(precision, recall, f1))
    ```

6. Try with your real data. We have included some functions to help you threshold and process your data
    
    ```python
    df_counts = BRACoD.threshold_count_data(BRACoD.df_counts_obesity)
    df_rel = BRACoD.scale_counts(df_counts)
    df_rel, Y = BRACoD.remove_null(df_rel, BRACoD.df_scfa_obesity["butyric"].values)
    trace = BRACoD.run_bracod(df_rel, Y, n_sample = 1000, n_burn=1000, njobs=4)
    df_results = BRACoD.summarize_trace(trace, df_rel.columns, 0.3)
    ```

The taxonomy information for these OTUs is available at ```BRACoD.df_taxonomy```
    
### R Walkthrough

1. Simulate some data and normalize it

    ```R
    library('BRACoD.R')
    data(obesity)
    r <- simulate_microbiome_counts(df_counts_obesity)

    sim_counts <- r$sim_counts
    sim_y <- r$sim_y
    contributions <- r$contributions
    sim_relab <- scale_counts(sim_counts)
    ```

2. Run BRACoD

    ```R
    trace <- run_bracod(sim_relab, sim_y, n_sample = 1000, n_burn=1000, njobs=4)
    ```
    
3. Examine the diagnostics

    ```R
    convergence_tests(trace, sim_relab)
    ```

4. Examine the results

    ```R
    df_results <- summarize_trace(trace, colnames(sim_counts))
    ```

5. Compare the results to the simulated truth

    ```R
    taxon_identified <- df_results$taxon_num
    taxon_actual <- which(contributions != 0)

    r <- score(taxon_identified, taxon_actual)
    
    print(sprintf("Precision: %.2f, Recall: %.2f, F1: %.2f", r$precision, r$recall, r$f1))
    ```

6. Try with your real data. We have included some functions to help you threshold and process your data
    
    ```R
    df_counts_obesity_sub <- threshold_count_data(df_counts_obesity)
    df_rel <- scale_counts(df_counts_obesity_sub)
    r <- remove_null(df_rel, df_scfa$butyric)
    df_rel <- r$df_rel
    Y <- r$Y
    
    trace <- run_bracod(df_rel, Y, n_sample = 1000, n_burn=1000, njobs=4)
    df_results <- summarize_trace(trace, colnames(df_counts_obesity_sub), 0.3)
    ```



