{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# How to use `indigopy`\n",
    "\n",
    "Example code for how to use the `indigopy` package. The sample data used in this example notebook is derived from the [INDIGO](https://doi.org/10.1007/978-1-4939-8891-4_13), [INDIGO-MTB](https://doi.org/10.1128/mbio.02627-19), and [MAGENTA](https://doi.org/10.1371/journal.pcbi.1006677) publications.  \n",
    "\n",
    "## Set up environment"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Import dependencies\n",
    "import pandas as pd\n",
    "from scipy.stats import spearmanr\n",
    "from sklearn.metrics import r2_score, classification_report\n",
    "from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor\n",
    "import seaborn as sns\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "# Import package functions\n",
    "import sys\n",
    "sys.path.append('c:/Users/carol/github/INDIGOpy/') # modify if testing locally in different machine; remove once package is published\n",
    "from indigopy.core import load_sample, featurize, classify"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example: *E. coli*\n",
    "\n",
    "The following analysis and results were originally reported in the [INDIGO](https://doi.org/10.1007/978-1-4939-8891-4_13) paper.  \n",
    "- **Training dataset**: 105 two-way interactions between 15 antibiotics  \n",
    "- **Testing dataset**: 66 two-way interactions between the 15 antibiotics in the training set + 4 new antibiotics  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Defining INDIGO features: 100%|██████████| 105/105 [00:00<00:00, 406.95it/s]\n",
      "Defining INDIGO features: 100%|██████████| 66/66 [00:00<00:00, 464.53it/s]\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Regression results:\n",
      "\tSpearman R = 0.6575\n",
      "\tSpearman p = 2e-09\n",
      "\tR2 = 0.3895\n",
      "Classification results:\n",
      "              precision    recall  f1-score   support\n",
      "\n",
      "           A       0.50      0.31      0.38        13\n",
      "           N       0.68      0.90      0.78        42\n",
      "           S       0.50      0.09      0.15        11\n",
      "\n",
      "    accuracy                           0.65        66\n",
      "   macro avg       0.56      0.43      0.44        66\n",
      "weighted avg       0.61      0.65      0.59        66\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# Load sample data\n",
    "sample = load_sample('ecoli')\n",
    "\n",
    "# Define input arguments\n",
    "key             = sample['key']\n",
    "profiles        = sample['profiles']\n",
    "feature_names   = sample['feature_names']\n",
    "train_ixns      = sample['train']['interactions']\n",
    "train_scores    = sample['train']['scores']\n",
    "test_ixns       = sample['test']['interactions']\n",
    "test_scores     = sample['test']['scores']\n",
    "\n",
    "# Determine ML features\n",
    "train_data      = featurize(train_ixns, profiles, feature_names=feature_names, key=key, silent=True)\n",
    "test_data       = featurize(test_ixns, profiles, feature_names=feature_names, key=key, silent=True)\n",
    "X_train, X_test = train_data['feature_df'].to_numpy().transpose(), test_data['feature_df'].to_numpy().transpose()\n",
    "\n",
    "# Determine class labels\n",
    "thresh, classes = (-0.5, 2), ('S', 'N', 'A')\n",
    "train_labels    = classify(train_scores, thresholds=thresh, classes=classes)\n",
    "test_labels     = classify(test_scores, thresholds=thresh, classes=classes)\n",
    "\n",
    "# Train and apply a regression-based model\n",
    "reg_model = RandomForestRegressor()\n",
    "reg_model.fit(X_train, train_scores)\n",
    "reg_y = reg_model.predict(X_test)\n",
    "r, p = spearmanr(test_scores, reg_y)\n",
    "r2 = r2_score(test_scores, reg_y)\n",
    "print('Regression results:')\n",
    "print('\\tSpearman R = {}'.format(round(r, 4)))\n",
    "print('\\tSpearman p = {:.3g}'.format(p))\n",
    "print('\\tR2 = {}'.format(round(r2, 4)))\n",
    "\n",
    "# Train and apply a classification-based model\n",
    "class_model = RandomForestClassifier()\n",
    "class_model.fit(X_train, train_labels)\n",
    "class_y = class_model.predict(X_test)\n",
    "print('Classification results:')\n",
    "print(classification_report(test_labels, class_y))"
   ]
  },
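  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The call to `classify` above turns continuous interaction scores into the labels `S` (synergy), `N` (neutral), and `A` (antagonism) using two cut points. The following is a minimal sketch of that kind of binning, not the `indigopy` implementation: `classify_sketch` is a hypothetical helper, and its handling of scores that fall exactly on a threshold is an assumption."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical sketch of threshold-based labeling; NOT the indigopy implementation.\n",
    "import numpy as np\n",
    "\n",
    "def classify_sketch(scores, thresholds=(-0.5, 2), classes=('S', 'N', 'A')):\n",
    "    # np.digitize (right=False) returns 0 if x < t0, 1 if t0 <= x < t1, 2 if x >= t1.\n",
    "    # Treating boundary values this way is an assumption made for illustration.\n",
    "    bins = np.digitize(scores, thresholds)\n",
    "    return [classes[i] for i in bins]\n",
    "\n",
    "print(classify_sketch([-1.2, 0.3, 2.5]))  # ['S', 'N', 'A']"
   ]
  },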
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example: *M. tuberculosis*\n",
    "\n",
    "The following analysis and results were originally reported in the [INDIGO-MTB](https://doi.org/10.1128/mbio.02627-19) paper.  \n",
    "- **Training dataset**: 196 two- to five-way interactions between 40 antibacterials  \n",
    "- **Testing dataset**: 36 two- to three-way interactions between the 13 antibacterials  \n",
    "- **Clinical dataset**: clinical outcomes for 57 two- to five-way interactions between 7 antibacterials  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Defining INDIGO features: 100%|██████████| 196/196 [00:00<00:00, 512.41it/s]\n",
      "Defining INDIGO features: 100%|██████████| 36/36 [00:00<00:00, 538.73it/s]\n",
      "Defining INDIGO features: 100%|██████████| 57/57 [00:00<00:00, 552.91it/s]\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Regression results:\n",
      "\tSpearman R = 0.5348\n",
      "\tSpearman p = 0.000779\n",
      "\tR2 = 0.122\n",
      "Classification results:\n",
      "              precision    recall  f1-score   support\n",
      "\n",
      "           A       0.62      0.31      0.42        16\n",
      "           N       0.00      0.00      0.00         1\n",
      "           S       0.73      0.84      0.78        19\n",
      "\n",
      "    accuracy                           0.58        36\n",
      "   macro avg       0.45      0.38      0.40        36\n",
      "weighted avg       0.66      0.58      0.60        36\n",
      "\n",
      "Clinical results:\n",
      "\tSpearman R = 0.5035\n",
      "\tSpearman p = 6.55e-05\n"
     ]
    }
   ],
   "source": [
    "# Load sample data\n",
    "sample = load_sample('mtb')\n",
    "\n",
    "# Define input arguments\n",
    "key             = sample['key']\n",
    "profiles        = sample['profiles']\n",
    "feature_names   = sample['feature_names']\n",
    "train_ixns      = sample['train']['interactions']\n",
    "train_scores    = sample['train']['scores']\n",
    "test_ixns       = sample['test']['interactions']\n",
    "test_scores     = sample['test']['scores']\n",
    "clinical_ixns   = sample['clinical']['interactions']\n",
    "clinical_scores = sample['clinical']['scores']\n",
    "\n",
    "# Determine ML features\n",
    "train_data      = featurize(train_ixns, profiles, feature_names=feature_names, key=key, silent=True)\n",
    "test_data       = featurize(test_ixns, profiles, feature_names=feature_names, key=key, silent=True)\n",
    "clinical_data   = featurize(clinical_ixns, profiles, feature_names=feature_names, key=key, silent=True)\n",
    "X_train, X_test = train_data['feature_df'].to_numpy().transpose(), test_data['feature_df'].to_numpy().transpose()\n",
    "X_clinical      = clinical_data['feature_df'].to_numpy().transpose()\n",
    "\n",
    "# Determine class labels\n",
    "thresh, classes = (0.9, 1.1), ('S', 'N', 'A')\n",
    "train_labels    = classify(train_scores, thresholds=thresh, classes=classes)\n",
    "test_labels     = classify(test_scores, thresholds=thresh, classes=classes)\n",
    "\n",
    "# Train and apply a regression-based model\n",
    "reg_model = RandomForestRegressor()\n",
    "reg_model.fit(X_train, train_scores)\n",
    "reg_y = reg_model.predict(X_test)\n",
    "r, p = spearmanr(test_scores, reg_y)\n",
    "r2 = r2_score(test_scores, reg_y)\n",
    "print('Regression results:')\n",
    "print('\\tSpearman R = {}'.format(round(r, 4)))\n",
    "print('\\tSpearman p = {:.3g}'.format(p))\n",
    "print('\\tR2 = {}'.format(round(r2, 4)))\n",
    "\n",
    "# Train and apply a classification-based model\n",
    "class_model = RandomForestClassifier()\n",
    "class_model.fit(X_train, train_labels)\n",
    "class_y = class_model.predict(X_test)\n",
    "print('Classification results:')\n",
    "print(classification_report(test_labels, class_y))\n",
    "\n",
    "# Apply model to clinical data\n",
    "clinical_y = reg_model.predict(X_clinical)\n",
    "r, p = spearmanr(clinical_scores, clinical_y)\n",
    "print('Clinical results:')\n",
    "print('\\tSpearman R = {}'.format(round(-r, 4)))\n",
    "print('\\tSpearman p = {:.3g}'.format(p))"
   ]
  },
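  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`RandomForestRegressor()` and `RandomForestClassifier()` are created above without a `random_state`, so rerunning the notebook will likely give metrics that differ slightly from the printed outputs. The following is a minimal sketch on synthetic data (the arrays `X` and `y` are made up for illustration and are unrelated to the INDIGO datasets) showing that fixing `random_state` makes repeated fits agree."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.ensemble import RandomForestRegressor\n",
    "\n",
    "# Synthetic data for illustration only (not INDIGO data)\n",
    "rng = np.random.default_rng(0)\n",
    "X = rng.normal(size=(50, 5))\n",
    "y = X[:, 0] + 0.1 * rng.normal(size=50)\n",
    "\n",
    "# Two independent fits with the same seed build the same forest\n",
    "pred_a = RandomForestRegressor(random_state=0).fit(X, y).predict(X)\n",
    "pred_b = RandomForestRegressor(random_state=0).fit(X, y).predict(X)\n",
    "print(np.allclose(pred_a, pred_b))  # True"
   ]
  },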
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example: *S. aureus*\n",
    "\n",
    "The following analysis and results were originally reported in the [INDIGO](https://doi.org/10.1007/978-1-4939-8891-4_13) paper.  \n",
    "- **Training dataset**: 171 two-way interactions between 19 antibiotics measured in *E. coli*  \n",
    "- **Testing dataset**: 45 two-way interactions between the 10 antibiotics measured in *S. aureus*  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Defining INDIGO features: 100%|██████████| 171/171 [00:00<00:00, 508.65it/s]\n",
      "Defining INDIGO features: 100%|██████████| 45/45 [00:00<00:00, 569.02it/s]\n",
      "Mapping orthologous genes: 100%|██████████| 1/1 [00:01<00:00,  1.99s/it]\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Regression results:\n",
      "\tSpearman R = 0.5428\n",
      "\tSpearman p = 0.000117\n",
      "\tR2 = -1.257\n",
      "Classification results:\n",
      "              precision    recall  f1-score   support\n",
      "\n",
      "           A       0.00      0.00      0.00         2\n",
      "           N       0.49      1.00      0.66        22\n",
      "           S       0.00      0.00      0.00        21\n",
      "\n",
      "    accuracy                           0.49        45\n",
      "   macro avg       0.16      0.33      0.22        45\n",
      "weighted avg       0.24      0.49      0.32        45\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "c:\\Users\\carol\\AppData\\Local\\Programs\\PythonCodingPack\\lib\\site-packages\\sklearn\\metrics\\_classification.py:1221: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.\n",
      "  _warn_prf(average, modifier, msg_start, len(result))\n"
     ]
    }
   ],
   "source": [
    "# Load sample data\n",
    "sample = load_sample('saureus')\n",
    "\n",
    "# Define input arguments\n",
    "key             = sample['key']\n",
    "profiles        = sample['profiles']\n",
    "feature_names   = sample['feature_names']\n",
    "train_ixns      = sample['train']['interactions']\n",
    "train_scores    = sample['train']['scores']\n",
    "test_ixns       = sample['test']['interactions']\n",
    "test_scores     = sample['test']['scores']\n",
    "strains         = sample['orthology']['strains']\n",
    "orthology_map   = sample['orthology']['map']\n",
    "\n",
    "# Determine ML features\n",
    "train_data      = featurize(train_ixns, profiles, feature_names=feature_names, key=key, silent=True)\n",
    "test_data       = featurize(test_ixns, profiles, feature_names=feature_names, key=key, silent=True, \n",
    "                            strains=strains, orthology_map=orthology_map)\n",
    "X_train, X_test = train_data['feature_df'].to_numpy().transpose(), test_data['feature_df'].to_numpy().transpose()\n",
    "\n",
    "# Determine class labels\n",
    "thresh, classes = (-0.5, 2), ('S', 'N', 'A')\n",
    "train_labels    = classify(train_scores, thresholds=thresh, classes=classes)\n",
    "test_labels     = classify(test_scores, thresholds=thresh, classes=classes)\n",
    "\n",
    "# Train and apply a regression-based model\n",
    "reg_model = RandomForestRegressor()\n",
    "reg_model.fit(X_train, train_scores)\n",
    "reg_y = reg_model.predict(X_test)\n",
    "r, p = spearmanr(test_scores, reg_y)\n",
    "r2 = r2_score(test_scores, reg_y)\n",
    "print('Regression results:')\n",
    "print('\\tSpearman R = {}'.format(round(r, 4)))\n",
    "print('\\tSpearman p = {:.3g}'.format(p))\n",
    "print('\\tR2 = {}'.format(round(r2, 4)))\n",
    "\n",
    "# Train and apply a classification-based model\n",
    "class_model = RandomForestClassifier()\n",
    "class_model.fit(X_train, train_labels)\n",
    "class_y = class_model.predict(X_test)\n",
    "print('Classification results:')\n",
    "print(classification_report(test_labels, class_y))"
   ]
  },
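  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The regression output above pairs a positive Spearman correlation (0.5428) with a negative R2 (-1.257). The two metrics answer different questions: Spearman compares rankings only, while R2 also penalizes any offset or scaling error in the predicted values. The toy example below uses made-up numbers (not INDIGO data) in which the predictions rank every sample correctly but are shifted by a constant. Whether an offset of this kind explains the *S. aureus* result is not established here; the example only shows that the two metrics can disagree."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from scipy.stats import spearmanr\n",
    "from sklearn.metrics import r2_score\n",
    "\n",
    "# Made-up values for illustration only (not INDIGO data)\n",
    "y_true = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])\n",
    "y_pred = y_true + 3.0  # same ordering as y_true, shifted by a constant\n",
    "\n",
    "rho, _ = spearmanr(y_true, y_pred)\n",
    "print('Spearman R = {:.2f}'.format(rho))                # 1.00: the ranking is perfect\n",
    "print('R2 = {:.2f}'.format(r2_score(y_true, y_pred)))   # -17.00: the offset is penalized"
   ]
  },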
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example: *A. baumannii*\n",
    "\n",
    "The following analysis and results were originally reported in the [MAGENTA](https://doi.org/10.1371/journal.pcbi.1006677) paper.  \n",
    "- **Training dataset**: 338 two- to three-way interactions between 24 antibiotics measured in *E. coli* cultured in various media conditions  \n",
    "- **Testing dataset**: 45 two-way interactions between the 8 antibiotics measured in *A. baumannii*  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Defining INDIGO features: 100%|██████████| 338/338 [00:00<00:00, 497.36it/s]\n",
      "Defining INDIGO features: 100%|██████████| 45/45 [00:00<00:00, 589.56it/s]\n",
      "Mapping orthologous genes: 100%|██████████| 1/1 [00:02<00:00,  2.58s/it]\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Regression results:\n",
      "\tSpearman R = 0.614\n",
      "\tSpearman p = 7.28e-06\n",
      "\tR2 = -0.3302\n",
      "Classification results:\n",
      "              precision    recall  f1-score   support\n",
      "\n",
      "           A       0.43      0.94      0.59        17\n",
      "           N       0.00      0.00      0.00        11\n",
      "           S       0.86      0.35      0.50        17\n",
      "\n",
      "    accuracy                           0.49        45\n",
      "   macro avg       0.43      0.43      0.36        45\n",
      "weighted avg       0.49      0.49      0.41        45\n",
      "\n"
     ]
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAOkAAAEXCAYAAABI2GM+AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAAAq+klEQVR4nO3dd1gU59rH8S8sCCo2DGI3RmNJ1NgrNoIRiYhAYsEae4vljUTURLDFhj12TXyNxnJUQIzRaOyKPcZysERflaKiNCmisMz7B4c9ouJSlwHuz3V5Xezs7sy9Cz+fmd2Z5zZSFEVBCKFaxnldgBDi3SSkQqichFQIlZOQCqFyElIhVE5CKoTKSUjzqcTERGxsbBgyZEiGHr98+XJmzJiRy1XlvqVLl+Lr6wuAk5MTz549y9uCDMAkrwsQWXPw4EHq1KnDtWvXuHPnDjVq1Mjrkgxi3Lhxup/9/PzysBLDMZKTGfKnfv364eDgwO3bt0lKStI7Si5fvpwLFy6QlJREdHQ0devWxdPTEwsLC44cOcKaNWt4+fIlERERdO/enfHjx3P27FlmzpzJ3r17AdLcXr58OQ8ePODx48c8efKEjz/+mBYtWuDr60twcDDu7u507dqVp0+fMm3aNMLDw3ny5AmVKlViyZIllC1bFltbW5ydnQkICODhw4c4OTnptrt48WKqVKmie33Tp0+nSZMmeHh48OGHHzJ48GBq165NQEAAlpaWhnjL84zs7uZD//zzD3/99Rf29vZ0794dPz8/IiMj9T7vwYMHLF++HH9/fxRFYdWqVSiKwk8//cTcuXPZvXs327dvZ+3atUREROhd38WLF1mxYgU+Pj4cP36cO3fusGXLFr7//nuWL18OwG+//UbDhg3Zvn07f/75J+bm5mlGwPj4eH799Ve2bdvGTz/9RFBQEABXrlxh0KBB+Pr64uLiwuLFi7P4buV/EtJ8aOvWrXTs2JEyZcrQoEEDKleuzI4dO/Q+r1OnTlhaWmJkZISrqyunT5/GyMiI1atXc/36dX788Ufmzp2Loig8f/5c7/pat25NiRIlMDc3p1y5crRt2xaAqlWrEhUVBcCAAQNo3LgxP//8M15eXty+fZv4+HjdOj799FMArK2tKVu2LNHR0QBUrFiRunXrAvDRRx/plhdGckyaz8THx+Pn50eRIkWwtbUFIDY2ls2bNzNo0CBMTU3Tfa5Go9H9nJycjImJCfHx8Tg7O2NnZ0fTpk1xdXXl0KFDKIqCkZERrx4NJSYmpllfkSJF0tw2MXnzz2nBggVcuXIFV1dXWrRoQVJSUpp1mpmZ6X5+dXvm5uZvXV4YyUiaz/j7+1O6dGlOnDjB4cOHOXz4MIcOHSI+Pp79+/e/87mHDx8mOjoarVbLjh07aNeuHffv3yc2Npbx48dja2vL2bNnefnyJcnJyVhaWhIaGkp4eDiKovDbb79lut6TJ08yYMAAunfvTtmyZTl9+jRarTarL79QkpE0n9m6dStfffVVmlGxZMmS9OvXj40bN7Jnzx569eql2418VY0aNRg+fDjPnj2jSZMmDBs2DFNTUzp06ECXLl0oUqQItWrVombNmty/f5+2bdvSq1cvXF1dsbKyokOHDly9ejVT9Y4ePZr58+ezdOlSTE1Nady4MQ8ePMj2+1CYyKe7Qqic7O4KoXISUiFUTkIqhMpJSIVQOQmpEConIRVC5fLse9LY2Fh69erF6tWrqVy5cpr7AgMDmTp1KnFxcTRt2pTp06djYmJCaGgo7u7uhIeHU716dby9vSlevHimthsZGUdysnzrJNTD2NiIMmXS/zvOk5H077//pnfv3ty7d++t97u7uzNt2jQOHDiAoii681KnT5+Om5sb+/fvp169eqxcuTLT205OVuSf/FPdv3fJk5Du2LEDT09PypUr98Z9ISEhJCQk0LBhQwBcXFzYv38/iYmJnD9/ns6dO6dZLkRBlye7u7Nnz073
vrCwMKysrHS3raysePz4MZGRkVhYWOhO4k5dnllly1pkvmAh8pDqzt1NTk7GyMhId/vVqzFeXQ68cTsjwsNj9e5eCGFIxsZG7xw8VPfpbvny5Xny5Inu9tOnTylXrhyWlpbExMTorqB48uTJW3eXhShoVBfSSpUqYWZmxsWLF4GUeWzatWuHqakpTZs2Zd++fQD4+vrSrl27vCxVCINQTUiHDh2quwzK29ubOXPmYG9vT3x8PP379wfA09OTHTt24ODgwIULFxg/fnweViyEYRS6S9Wyekx66tRxTp48lqHHRkdHAVCqVOkMr9/Gpj1t2sieQWGU745JC4Lo6OhCPSePyFkykuaCefNmAjBp0ve5uh1RMMhIKkQ+JyEVQuUkpEKonIRUCJWTkAqhchJSIVROQiqEyklIhVA5CakQKichFULlJKRCqJyEVAiVk5AKoXISUiFUTkIqhMpJSIVQuTyZ0tPf359Vq1aRlJTEgAED6NOnj+6+wMBAPDw8dLcjIiIoVaoUe/fuxcfHh4ULF1K2bFkAOnTowIQJEwxevxCGZPCQPn78mMWLF7N7926KFClCr169aNGiBTVr1gSgbt26+Pn5AfD8+XO+/PJLvLy8ALh27RoeHh507drV0GULkWcMvrt7+vRpWrZsSenSpSlWrBidO3dOt13EmjVraNasGU2bNgXg6tWr+Pj44OjoyMSJE2UeIVEoGHwkfb2NRLly5bhy5cobj4uJiWHHjh34+/vrlllZWTFo0CAaN27MokWLmDFjBgsXLszU9g3RZsLUVAOAlVWJXN+WKPgMHtL02ki8bs+ePdjZ2emOPwFWrFih+3nIkCF06tQp09s3xERkiYmps+zH5Op2RMGguonIXm8jkV67iEOHDuHg4KC7HRMTw8aNG3W3FUVBo9Hkaq1CqIHBQ9q6dWsCAgKIiIjg+fPn/PHHH2+0i1AUhevXr9OoUSPdsmLFirF+/Xr+/vtvADZv3pylkVSI/Mbgu7vW1tZMmDCB/v37k5iYyBdffEGDBg0YOnQoY8eOpX79+kRERGBqaoqZmZnueRqNhiVLluDl5UVCQgLvv/8+8+fPN3T5QhicTI6dC2RybJEZqjsmFUJkjoRUCJWTkAqhchJSIVROQiqEyuXJVTBCfXKzSbI0SM4eGUlFpkmTZMOSkVQA0KZNuwyPdvI9sGHJSCqEyklIhVA5CakQKichFULlJKRCqJyEVAiVK9Rfwfz66yaCgu7n+HofPEhZZ+pXFTmtSpVquLn1z5V1C/Up1CENCrrPzdv/oDEvnaPrTdamTOvyT9DTHF0vgDYhKsfXKdStUIcUQGNemmLVPs3rMjIs/v6feV2CMDA5JhVC5fIkpP7+/jg4OPDZZ5+xZcuWN+7/8ccf6dixI05OTjg5OekeExoaSp8+fbC3t2fkyJHExcUZunQhDE51bSYgpZ3EokWL0swWCDB9+nTc3Nz4/PPPWbFiBStXrsTd3d3QL0EIgzJ4SF9tMwHo2kyMGTNG95hr166xZs0aQkJCaNasGZMmTcLY2Jjz58/rJsh2cXGhb9++EtJ3kE+vCwbVtZmIi4ujbt26uLu7U61aNTw8PFi5ciV9+vTBwsICE5OUkq2srHj8+LGhy89XgoLuc++fG5S3yNlfczGSAUh49E+OrhfgUWxSjq8zv1Ndm4nixYuzbt063e1BgwYxZcoU3Nzc3mhH8bb2FPq8OnVias+W/MbUVJOhPjOmphrKW5jwVQNLA1SVM36+EpHh11dYGDyk5cuX58KFC7rbr7eZCA0N5fTp03zxxRdASohNTEywtLQkJiYGrVaLRqNJtz2FPq/Ou5vasyW/SUzUZqjPTEF/fQWF6ubd1ddmwtzcnAULFhAUFISiKGzZsoVOnTphampK06ZN2bdvHwC+vr5vtKcQoiBSZZuJGTNmMHLkSBITE2ncuDFfffUVAJ6ennh4eLBq1SoqVKjAokWLslVLdHQU2oSofHWCgDYhiujoQn8O
SqGSJ79tR0dHHB0d0yx79Ti0c+fOdO7c+Y3nVapUiV9++SXX6xNCTQr1f8mlSpXmybOkfHdaYEZn6RMFQ6EOqSg88vOUpXLurhCvUduUpTKSikIhP09ZKiOpECqnN6RxcXFMnz6dAQMGEBUVxbRp0+TqEyEMSG9IZ82aRcmSJQkPD8fMzIzY2FimTZtmiNqEEGQgpIGBgUyYMAETExOKFi2Kt7c3gYGBhqhNCEEGQmpsnPYhWq32jWVCiNyj99PdZs2asWDBAhISEjhx4gRbtmyhRYsWhqhNZFN0dBSRsUn8fCUir0vJsEexSZT5z/eUIoXeIXHixIkUK1aMEiVKsHjxYmrXrs23335riNqEEGRgJF22bBnffPMNo0ePNkQ9IgeVKlUas+dP8931pOZy2mMaekfSo0ePGqAMIUR69I6klStXZtCgQTRu3JjixYvrlqdePiaEyF16Q5o6YVhISEhu1yKEeAu9IZ0zZw6QEtKkpCSqVauW60UZUm5c9J2clACAsYl5jq4XUttMvJfj682P8uNsiFmZCVFvSO/fv8+oUaMICwsjOTmZMmXKsGbNGmrUqJHlQtWiSpXc+Q8n9ZdctUpuhOm9XKs7vwkKus+tuzfRlCqSo+tN1qTMDXUn/P9ydL3a6JdZep7ekM6YMYMhQ4bg7OwMwK5du5g+fTqbNm3K0gbVJLfmdlXbVRQFmaZUEUq1q5jXZWRI9PHQLD1P76e74eHhuoACuLq6EhkZmaWNpdLXZuLQoUM4OTnRrVs3Ro0apbu2z8fHBxsbG137icWLF2erDiHyA70jqVarJSoqSvcBUkRE9s5e0ddmIjY2Fi8vL3bt2oW1tTVLly5l+fLlfPfdd1y7dg0PDw+6du2arRqEyE/0jqR9+/alZ8+eLFmyhKVLl9K7d2969+6d5Q2+2maiWLFiujYTqRITE/H09MTa2hqA2rVr8/DhQwCuXr2Kj48Pjo6OTJw4UVVXzwuRW/SGtGfPnkyfPp3ExEQSEhLw8vLCzc0tyxt8W5uJV9tFlClThk6dOgGQkJDA2rVrsbOzA1JaS4waNYo9e/ZQoUIFZsyYkeU6hMgv9O7uPn78mP379+Pl5cXdu3fx9vamZs2aaYKWGfraTKSKiYlh9OjR1KlTR3dMnNqsCWDIkCG6MGfGu2YKzymp7SvyulWCqamGhDytIGsy00Yjv8lKCw29IZ00aRK2trZAyry3zZs3Z8qUKWnmyc0MfW0mIGW0HTx4MC1btmTKlClASmh37drFwIEDgZRwazSZ/yW92mYit6S2d8jrVgmJiVoe5cJVMLEvUxo2WRTJ+UsWH8UmYV6A22i8rYWGvjYTekMaGRlJ//4pX1WYmZkxcOBAfH19s1xk69atWb58ORERERQtWpQ//viDmTP/+6WxVqtlxIgRdOnShVGjRumWFytWjPXr19OoUSM++eQTNm/enKWRtDDJre9Tw/7zPfB75XN+/e+Te3XnVxn6dPfx48e6D3KePn2KomR9JNLXZuLRo0f8+9//RqvVcuDAAQDq1avH7NmzWbJkCV5eXiQkJPD+++8zf/78LNdRGMj3wAWD3pAOHDiQ7t2707ZtW4yMjDh9+nS2ryd9V5uJ+vXrc+PGjbc+r2nTpvj4+GRr20LkN3pD+sUXX1CvXj3OnDmDRqNh8ODB1KpVyxC1CSHI4Ly7xYsXZ+DAgVSuXJmDBw8SE1N4ekcKkdf0hnTatGmsW7eOO3fu8P333xMcHKz7xFUIkfv0hvTatWt4eXlx8OBBnJ2dmTNnjlxbKoQB6Q2poigYGxtz6tQpWrZsCaScCSSEMAy9Ia1atSpDhw4lODiY5s2b880331CnTh1D1CaEIIMzMxw8eJAmTZpgampK06ZN6d69uwFKE0JABkJarFgxnJycdLezcwWMECLzpF+EEConIRVC5SSkQqhcusek/fr1e+t1nqkKwkRkQuQH6Ya0b9++ABw8eJDY2FhcXV3RaDT4+flRsmRJgxUo
RGGXbkg7d+4MwIYNG9i2bZuuJ2mHDh3o2bOnYaoTQug/Jo2MjOTFixe623FxcTIBmBAGpPd70q5du9KjRw86deqEoijs37+fHj16GKI2YUCnTh3n5MljGXpsZtsw2Ni0p02bdlmurbDTG9Jx48ZRr149AgICAPDw8KB9+/a5XphQr1KlSuV1CYWK3pBCylSaNWvWxMXFhevXr+d2TaqUmyMN5P1o06ZNOxntVErvMemuXbuYPHky69evJyYmhlGjRrFjx45sbVRfm4nAwEBcXFzo3LkzU6dOJSkpCYDQ0FD69OmDvb09I0eOJC4uLlt15JZSpUrJaCNyjN6RdPPmzWzfvp2+fftStmxZdu/ezZAhQ7J8XKqvzQSAu7s7s2bNomHDhkyZMoUdO3bg5ubG9OnTcXNz4/PPP2fFihWsXLkSd3f3LNWRWTLSiLyidyQ1NjbGwuK/c4JWqFAhS/PdptLXZiIkJISEhAQaNmwIgIuLC/v37ycxMZHz58/rvhpKXS5EQac3pKVLlyYwMFB39tGePXuytSunr83E6/dbWVnx+PFjIiMjsbCwwMTEJM1yIQo6vbu7U6ZMYdy4cTx48AAbGxvMzMxYuXJlljeor81Eeve/rR3Fu05bTI8h2kwIw5A2E//xwQcf4Ofnx71799BqtVSvXp34+PgsF6mvzUT58uV58uSJ7vbTp08pV64clpaWxMTEoNVq0Wg0b21PkRGGaDMhDEPaTPyHi4sLPj4+1KhRQ7esT58+7N27N0tF6mszUalSJczMzLh48SJNmjTBz8+Pdu3a6WaF2LdvH46Ojvj6+tKunXyQU5hFR0eRFPUiyx20DS0p6gXRJlGZfl66IR0wYABXr14lISGBxo0b65YnJydTv379LBUJ+ttM1K9fH29vb7777jtiY2P5+OOPdb1oPD098fDwYNWqVVSoUIFFixZluQ4h8gsjJZ3GLrGxsURFRTFlyhTmzJmjW25iYoKVlZXuhPv8RnZ3C45582ZyJ/z/KNWuYl6XkiHRx0OpUbb6Gz109O3upps0CwsLKleuzMqVK9m7dy+VKlUCYP369TKlpxAGpHc4nDx5MlFRUQCULFkSIyMjvv9eumkJYSh6Q3rv3j0mTZoEQIkSJZgyZQq3b9/O9cLys6ioSObOnUF0dFRelyIKAL0hTUpKIjY2Vnc7Li4uW/1JCwN/fx9u377Jnj2787oUUQDo/Qqme/fufPnll9jb22NkZMTBgwdxcXExRG35UlRUJCdPHkNRFE6ePE63bi6UKlU6r8sS+ZjekXT48OFMnDiRmJgY4uPjmThxIl999ZUhasuX/P19dJ8eJycny2gqsi3dkTQ2NhYLCwuioqJo0qQJTZo00d0XFRVF6dKlDVFfvhMQcAqtNuXSOq02iYCAU/TrNyiPqxL52Tun9PTx8aFly5ZvPZc2MDDQIAXmN61ateH48aNotUloNCa0atUmr0sS+Vy6IfXx8QHgxo0bBiumIHB0dObkyWNotSmX+XXrJsfvInvSDamvr+87nyid1d6udOky2Ni05+jRP7GxaScfGolsSzekqRdUP3nyhLt379KyZUtMTEw4e/YsdevWlZC+g6OjMyEhwTKKihyRbkhXr14NwLBhw1i8eDFVq1YFUuYZkjOO3q106TJ4eEzL6zJEAaH3K5iHDx/qAgpQsWJFHj16lKtFCXWTM6oMS29IraysWLZsGUFBQQQFBeHt7U2VKlUMUZtQKTmjyrD0hnTu3LncvHkTJycnnJ2dCQkJ4YcffjBEbUKFXj+jSkbT3Kf3tMBy5cqxYsUKoqOjZS5Z8dYzquRkjdylN6R3795lzJgxxMTEsHPnTgYOHMiPP/6YZjoVUXio7YwqbfTLHJ8+JTkhZe4kY/OcnehMG/0Symb+eXpDOmvWLKZOncqCBQuwtramb9++TJs27a0zz4uCT01nVFWpUi1X1pvaJqRq2Rxef9ms1aw3pFFRUbRp04YFCxYAKZOQZafNRGho
KO7u7oSHh1O9enW8vb0pXrx4mseEhYUxefJknj59irGxMd9++y2tWrUiMTGRFi1apPngavfu3dmarFtkjprOqHJz658r603t4fP6NCd5JUMTFb148UJ3/u6TJ09ITk7O8gZTW0Xs37+fevXqvXUO3/nz52Nra4ufnx8LFy5k4sSJaLVabt68SaNGjfDz89P9k4AaVuoZVUZGRnJGlYHoDWnv3r0ZPHgw4eHhLFy4kJ49e9K7d+8sbSyjrSI6depE165dAahWrRovXrwgPj6eq1evEhERgYuLCz169ODcuXNZqkNkj6OjMx9+WFvOqDIQvbu7X375Je+//z5Hjx4lKSmJmTNn0qZN1o5DMtoqIjXEABs2bKBu3bqUKFECIyMjPv30U4YPH87t27cZOnQo/v7+WFpaZqkekTVyRpVh6Q3pgAED+N///V+aNWuWqRX//vvvaaYChZRRMTOtIjZu3Mj27dvZvHkzAL169dLd99FHH9GgQQMuXbqEnZ1dhuuSNhNCn9T2FZltB5Fb9IY0dUaGYsWKZWrFXbp0oUuXLmmWpX7wk5FWEfPnz+fYsWNs2bKF8uXLAylX5jRu3Fh3mqKiKJiammaqLpl3V+iT2r7i9XYQuSXbbSaKFi1Kx44dqV27dpqgpp6AnxkZbRWxceNGzp49y9atWylZsqRu+c2bN7l8+TJeXl7cvXuXwMDANDNGCFEQpTuDfarUi79f5+zsnKUNhoSE4OHhQXh4uK5VRKlSpdi6dSthYWGMHTuW5s2bY2FhkSaga9eupXjx4kyZMoW7d+9iZGTE1KlTadmyZaa2LyOp0MfQX8HoG0nfGdJbt25x7949PvnkE6ytrXOlQEOTkAp91BbSdL+C2bVrF3379mXdunV069aNkydP5kqBQoh3S/eY9JdffsHf3x9ra2v++usvFi9ejI2NjSFrE0Kg52SG1F3cRo0aERkZaZCChBBppRvS17+/lNPvhMgbGW4y+q6TDoQQuSfdY9KbN2+m6fCd2vE7dXLsS5cuGaRAIQq7dEN68OBBQ9YhhEhHuiFN7ewthMhbGT4mFULkDQmpEConIRVC5SSkQqichFQIlZOQCqFyElIhVE5Cmguk65jISRLSXCBdx0ROkpDmMOk6JnKahDSHva3rmBDZYfCQhoaG0qdPH+zt7Rk5ciRxcXFvPCYkJIRGjRrh5OSEk5MTgwcPBlKm8Jw3bx729vY4ODhw8eJFQ5ev19u6jgmRHQYPaUZ6wVy7dg1HR0ddv5cNGzYAcODAAe7cucO+fftYsWIFkydPJikpydAv4Z1atWqDRpNy3UJedx0TBYNBQ5rRXjBXr17l1q1bODk50b9/f27evAnAsWPHcHBwwNjYmOrVq1OhQgX++usvQ74EvRwdnTE2TrlAPq+7jomCQe/k2Dkpo71gzMzM6NatG7169eLEiROMHj2affv2ERYWlmbGeysrKx49epSpGnK7zYSVVQns7OzYv38/nTrZUbNmFf1PEqqS79pMZFV2esF8/fXXup/bt2/PwoULuXv3LsnJyWkerygKxsaZ2xkwxLy7nTp15c6d/6NTp64Ga1Ugck6+azORVdnpBfPLL7/QtWtXypQpA6SE0cTEhPLlyxMWFqZ73NOnT9PtJZOXpOuYyEkGPSZ9tRcMkG4vmPPnz7Nz504Azp07R3JyMh988AHt2rXD398frVbL/fv3uXfvHvXr1zfkSxDC4Ax6TArg6emJh4cHq1at0vWCAXS9YMaNG8fUqVPx8PDAz88PMzMzFi5ciLGxMfb29ly5coVu3boBMHv2bMzNzQ39EoQwKL0Nmwoa6QUj9Mk3vWCEEOogIRVC5SSkQqichFQIlZOQCqFyElIhVE5CKoTKSUiFUDkJqRAqJyEVQuUkpEKonIRUCJWTkAqhchJSIVROQiqEyklIhVA5CakQKichFULlDD7HUWhoKO7u7oSHh1O9enW8vb0pXrx4mseMGDGChw8fAin9VG7dusXOnTup
U6cOLVq0oEqV/85lu3v3bjQajUFfgxCGZPCQpraZ+Pzzz1mxYgUrV67E3d09zWNWr16t+3np0qU0bNiQ+vXrc+3aNRo1aqRrOyFEYaDKNhOp7t69i6+vL5MmTQJS2k9ERETg4uJCjx49OHfunEHqFiIvqbLNRKqVK1cyePBgLCxSZlIzMjLi008/Zfjw4dy+fZuhQ4fi7++PpaVlhmvI7TYTIv+TNhMZaDMBEB0dzalTp5g9e7ZuWa9evXQ/f/TRRzRo0IBLly5hZ2eX4bpkSk+hj7SZyECbCUjpoNauXTvMzMx0y3x9fWncuDFVq1YFUtpPmJqa5tZLEEIVVNlmAuDy5cs0bdo0zbKbN2/y008/ASnHq4GBgTRp0iR3ixYijxn8e1JPT0927NiBg4MDFy5cYPz48UBKm4mlS5fqHhcUFIS1tXWa544ePZqIiAi6du3KuHHjmDdvnu54VYiCStpMCPEaaTMhhMgUCakQKichFULlJKRCqJyEVAiVk5AKoXLyFYwoFE6dOs7Jk8cy9NgHD+4DULVqtQw93samPW3avP2knIzIs9MChcivSpUqldclpCEjqRB5TE5mECKfk5AKoXISUiFUTkIqhMpJSIVQOQmpEConIRVC5QrdyQzGxm+f+EyIvKLvb7LQncwgRH4ju7tCqJyEVAiVk5AKoXISUiFUTkIqhMpJSIVQOQmpEConIRVC5SSkQqhcoQvprVu3qF27NgcOHND72B07drB3714DVAWPHz9m6NChBtlWcHAwtWvX5tSpU2mW29raEhwcnOn1TZ48mZCQkEw9p3bt2pnejj779+/HxcWFbt264ejoyPr163N8G3mh0IV0165d2Nvbs337dr2PvXTpEi9fvjRAVWBtbc26desMsi1IaUP5/fffExsbm+11nT17lrw+u/Tx48fMmzePDRs2sGfPHrZt28a+ffv4888/87SunFCoTrBPTEzE39+fLVu20KtXLx48eEDVqlWxtbWlW7dunDx5kufPnzNv3jyePXvG4cOHOXPmDFZWVlhbWzNz5kzi4+OJiIhg2LBh9O7dm5iYGL799lsePHhAlSpVePToET/++CMVK1bkhx9+ICAgACMjI7p168awYcM4e/Ysa9aswdzcnDt37lC7dm28vb0JCwujf//+HD58GH9/f9avX49Go6Fy5cosWLCAy5cvs3r1akxNTQkODsbW1pZixYpx6NAhANauXct7772X4feiXLlytG7dmnnz5jFz5sw0961du5bff/8drVaLjY0N7u7uhISE6OoDWL58OQBmZmaEhYUxbNgwtmzZgqurKw0aNCAwMJBff/2VTZs2ERAQQHR0NOXKlWPx4sWZqjOjIiMjSUxMJCEhAYDixYszd+5cLl26RK9evdi2bRsAu3fv5u+//+aTTz7hxIkTREdHExQURJs2bfDy8nrn6x8yZAhlypTB3NyctWvX4unpycWLF7G2tsbIyIhRo0axc+dOmjVrRo8ePQDo168fEydO5JNPPsn6i1MKkYMHDyqurq6KoijKlClTlPnz5yuKoigdO3ZUfv75Z0VRFGXTpk3KmDFjFEVRlEmTJim7du1SFEVRZs2apZw+fVpRFEV58OCB0rBhQ0VRFGXOnDnKvHnzFEVRlCtXrih169ZVgoKClM2bNyujRo1SkpKSlPj4eMXV1VU5cuSIcubMGaVhw4bKw4cPFa1Wq7i6uip//vmnEhQUpHTs2FFRFEWxtbVVnj59qiiKosydO1f597//rZw5c0Zp1KiREhoaqsTHxysNGzZUtm7dqiiKonh4eCgbN27M8PuQuq2YmBilQ4cOysmTJ3Xvw6+//qp8/fXXSlJSkqLVapX/+Z//UXx9fdPUpyiKsmzZMmXZsmW65wUFBel+Tn3P7t27p4wZM0bRarWKoiiKu7u7smHDBkVRFKVWrVoZrjejpk2bpnz00UeKq6urMn/+fCUwMFBJTk5WbG1tlfv37yuKoij9+vVTLl++rOza
tUtp3769EhMTo8THxyvt2rVTbty4oRw7dizd11+rVi3d69y0aZMyfvx4JTk5WQkODlYaNWqknDlzRgkICFDc3NwURVGU4OBgxcHBIduvq1Dt7u7atYuuXbsC4ODgwO7du3W7s23btgXgww8/JCoq6o3nenh48OLFC9asWcOSJUuIj48H4NSpUzg5OQFQv359atWqBaTsAjo7O6PRaChatCiOjo4EBATotlG+fHmMjY2pUaMG0dHRabbVsWNHevfuzfz58+ncuTN169YFoFatWlSoUIGiRYtSpkwZWrVqBUDFihV59uxZpt8PCwsLZs6cmWa3NyAggCtXruDi4oKzszPXrl3jn3/+ydR6U0eNatWqMWnSJP71r38xd+5cLl++rHvfcsP06dM5fPgwvXv3JjQ0lB49enDw4EGcnZ3Zs2cPoaGhhIeH6+pr1KgRFhYWFC1alCpVqhAdHf3O11+2bFkqV64MpPzeHR0dMTIyolKlSrrfRYsWLQgLCyM4OBhfX1/d30Z2FJrd3fDwcE6cOMH169fZtGkTiqLw7NkzDh48CKTstgEYGb392r7x48dTsmRJOnbsiIODg+4DJY1G89bjseTk5DS3FUVBq9Wm2Vbq9l5//nfffceNGzc4duwY7u7ujBkzhvLly2NqaprmcRqNJjNvwVvZ2NjodnsBtFotAwYM4KuvvgLg2bNnaDQaoqKi0tSZlJSEicnb/3xSX9+1a9f45ptvGDhwIJ07d8bY2DjXjl2PHj1KfHw8Dg4OuLq64urqyo4dO9i5cyeenp4MGTKEIkWKpAnN234P6b3+yMhIzM3NdY/XaDRv/I5T19O9e3d+++03fv/9dzZs2JDt11ZoRlI/Pz9atmzJ8ePHOXz4MEeOHGHEiBG6Y5W30Wg0umCdOnWKsWPHYmdnx/Hjx4GUP+hWrVrh7+8PwM2bN7l9+zZGRka0bNkSX19ftFotz58/x9/fnxYtWuitMykpic8++4wyZcowfPhwnJycCAwMzIF3IH0eHh6cPHmSsLAwWrZsiZ+fH3FxcSQlJTF69GgOHDhAyZIliYqKIiIigpcvX3LixAnd8199n151/vx5mjdvTu/evXn//fc5evToWx+XE8zNzVm4cKHu02lFUQgMDKRu3bpUqlSJ8uXLs23bNr0jW3qv/3WtW7dm3759KIrC48ePOXfunO4/eBcXF7Zt20aFChWwtrbO9msrNCOpj48PEyZMSLOsT58+rF+/HguLt88e3rp1axYtWkSJEiX4+uuvcXNzw8zMjDp16lCpUiWCg4MZPXo0kydPxtHRkapVq/Lee+9hbm5Oz549uXfvHk5OTiQmJuLo6EinTp04e/bsO+s0MTFh7NixDBo0CDMzM8qWLcvcuXMzvcuZGam7vYMHD6Zjx47ExMTQo0cPtFotbdu2xdnZGSMjI4YMGcIXX3xB+fLlqV+/vu75HTp0YNiwYW985eHg4MCYMWNwdHQEoF69eln6iicjWrZsyZgxYxgxYgSJiYlAyiHM6NGjdbX88ccfekNja2vLjRs33nj9r3/F1KNHD27cuIGjoyNWVlZUrFhRN9JWqFCBChUq4OzsnDMvLttHtYWcr6+vcuHCBUVRFCUkJETp2LGj7oMSoQ6JiYnKhAkTlAMHDuTYOo8cOaIcPnxYURRFefbsmWJra6tERkYqycnJyqNHj5ROnTopL168yJFtFZqRNLd88MEHeHp6kpycjLGxMTNmzMDYuNAcRaieoii0bduW1q1bY2dnl2PrrVGjBt9++y1LliwBYOzYsZQuXZr9+/fj5eWFl5cXRYoUyZFtyRxHQqic/JcvhMpJSIVQOQmpEConHxwVILNmzeL8+fMA3Llzh0qVKum+Fti+fXuaL+Nzk4eHBx9++CGDBw82yPYKOglpAfLdd9/pfra1tcXb2zvN95kif5KQFhLLly/n8uXLhIWFUbt2bapVq0ZkZCTTpk3T3Z96OyYmhtmzZ3Pr1i0SExNp1aoV3377
7RunAcbFxTFr1iwuXbqERqPBzs7ujRNGdu7cyfbt20lMTCQ6OpqhQ4fi5ubGkydPmDRpEpGRkQC0b9+e8ePHp7u8MJNj0kIkJCQEHx8fvL293/m4H374gY8//pjdu3fj6+tLZGQkP//88xuPW7ZsGS9evGDfvn34+vpy6dIlzp07p7s/Li6Of/3rX6xduxZfX18WL17MggULgJQL6itXroyPjw9btmzh/v37xMTEpLu8MJORtBBp2LBhuifFv+ro0aNcvXqVnTt3Auiu0Xzd6dOnmTx5MhqNBo1Gw+bNm4GUUzAh5ZrO1atXc+zYMe7du8eNGzd0V8G0bduWYcOG8fDhQ1q3bs0333xDiRIl0l1emElIC5FixYrpfn796pvU810h5QqepUuXUqNGDSDlSpC3XR1kYmKSZvnDhw/TfDj16NEjevbsSY8ePWjSpAn29vYcOXIEgAYNGvDnn38SEBDAmTNn+PLLL1m3bl26y+vVq5dzb0Q+I7u7hVSZMmW4fv06iqIQGxurCw+kXL62ceNGFEXh5cuXjBw5UjdKvqpVq1b4+PiQnJzMy5cvGTt2rO7TZUi5VM3S0pJRo0ZhY2Oj24ZWq8Xb25uVK1diZ2fH1KlTqVmzJrdv3053eWEmIS2kunXrhqWlJZ999hkjRoygefPmuvumTp1KfHw8jo6OODo6UqtWLYYMGfLGOsaMGYOpqSlOTk50796d9u3b89lnn+nub9OmDdbW1tjb29OlSxcePnyIpaUl9+/fZ8CAAdy4cYOuXbvi6upK5cqV+fzzz9NdXpjJubtCqJyMpEKonIRUCJWTkAqhchJSIVROQiqEyklIhVA5CakQKichFULl/h9T5B254wPScQAAAABJRU5ErkJggg==",
      "text/plain": [
       "<Figure size 216x288 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Load sample data\n",
    "sample = load_sample('abaumannii')\n",
    "\n",
    "# Define input arguments\n",
    "key             = sample['key']\n",
    "profiles        = sample['profiles']\n",
    "feature_names   = sample['feature_names']\n",
    "train_ixns      = sample['train']['interactions']\n",
    "train_scores    = sample['train']['scores']\n",
    "test_ixns       = sample['test']['interactions']\n",
    "test_scores     = sample['test']['scores']\n",
    "strains         = sample['orthology']['strains']\n",
    "orthology_map   = sample['orthology']['map']\n",
    "\n",
    "# Determine ML features\n",
    "train_data      = featurize(train_ixns, profiles, feature_names=feature_names, key=key, silent=True)\n",
    "test_data       = featurize(test_ixns, profiles, feature_names=feature_names, key=key, silent=True, \n",
    "                            strains=strains, orthology_map=orthology_map)\n",
    "X_train, X_test = train_data['feature_df'].to_numpy().transpose(), test_data['feature_df'].to_numpy().transpose()\n",
    "\n",
    "# Determine class labels\n",
    "thresh, classes = (-0.5, 0), ('S', 'N', 'A')\n",
    "train_labels    = classify(train_scores, thresholds=thresh, classes=classes)\n",
    "test_labels     = classify(test_scores, thresholds=thresh, classes=classes)\n",
    "\n",
    "# Train and apply a regression-based model\n",
    "reg_model = RandomForestRegressor()\n",
    "reg_model.fit(X_train, train_scores)\n",
    "reg_y = reg_model.predict(X_test)\n",
    "r, p = spearmanr(test_scores, reg_y)\n",
    "r2 = r2_score(test_scores, reg_y)\n",
    "print('Regression results:')\n",
    "print('\\tSpearman R = {}'.format(round(r, 4)))\n",
    "print('\\tSpearman p = {:.3g}'.format(p))\n",
    "print('\\tR2 = {}'.format(round(r2, 4)))\n",
    "\n",
    "# Train and apply a classification-based model\n",
    "class_model = RandomForestClassifier()\n",
    "class_model.fit(X_train, train_labels)\n",
    "class_y = class_model.predict(X_test)\n",
    "print('Classification results:')\n",
    "print(classification_report(test_labels, class_y))\n",
    "\n",
    "# Visualize results\n",
    "df = pd.DataFrame({'x': test_labels, 'y': reg_y})\n",
    "df.replace({'A': 'Antagonism', 'N': 'Neutral', 'S': 'Synergy'}, inplace=True)\n",
    "sns.set(rc={'figure.figsize':(3, 4)})\n",
    "ax = sns.boxplot(x='x', y='y', data=df, order=['Antagonism', 'Neutral', 'Synergy'], )\n",
    "ax.set(title='A. baumannii', xlabel='True class', ylabel='Predicted score')\n",
    "plt.show()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.8.5 64-bit",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.5"
  },
  "orig_nbformat": 4,
  "vscode": {
   "interpreter": {
    "hash": "0ac2a46207b2ff734e5406bb8bd0909b0a981f84a860af7db5bce33c6bd25d0b"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
