Metadata-Version: 2.1
Name: ppe-match
Version: 0.1.4
Summary:  Intuitive framework that allows researchers to implement and test matching methodologies
Home-page: https://github.com/samorani/MatchingPPE
Author: Ram Bala, Michele Samorani, Rohit Jacob
Author-email: rbala@scu.edu, msamorani@scu.edu, rohitjacob92@gmail.com
License: MIT
Keywords: matching,ppe,framework,matcha,frappe
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Build Tools
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Description-Content-Type: text/markdown
License-File: LICENSE.txt

# Framework for PPE Matching

## Table of Contents
  * [Overview](#overview)
    + [What is the PPE matching problem?](#what-is-the-ppe-matching-problem)
    + [Who needs to solve the PPE matching problem?](#who-needs-to-solve-the-ppe-matching-problem)
    + [What does this software package do?](#what-does-this-software-package-do)
  * [Installation](#installation)
  * [TestingFramework Class](#testingframework-class)
    + [Parameters](#parameters)
    + [Methods](#methods)


## Overview
### What is the PPE matching problem?
The PPE Matching Problem consists of optimally matching a set of requests, D, made by donors interested in donating Personal Protective Equipment (or PPE, such as masks, gowns, gloves, etc) with a set of requests, R, made by recipients interested in receiving PPE. Requests are characterized by a timestamp (date), a type and quantity of PPE to donate or request, and a donor or recipient id. The input of the problem also includes a matrix M of distances between donors and recipients. The objectives are multiple, and include maximizing the recipients' fill rate, minimizing the total shipping distance, minimizing the holding time of PPE, and minimizing the number of shipments of each donor.   

### Who needs to solve the PPE matching problem?
During health crises like the Covid-19 pandemic, organizations such as GetUsPPE.org provide a platform that aims at connecting prospective donors of PPE to prospective recipients of PPE. Requests by donors and recipients are collected over time. Every _delta_ days, the organization solves the PPE Matching Problem, in order to direct each donor to ship a certain quantity of PPE to a given recipient.

### What does this software package do?
Our package provides an open-source framework for researchers interested in developing and testing methodologies to solve the PPE matching problem.

The user only needs to implement a function _ppestrategy(D,R,M)_, which solves the PPE matching problem. Our testing framework evaluates the performance of that user-defined solution method on real-world requests received by GetUsPPE.org in the early months of the Covid-19 pandemic (April-July 2020).

## Installation
In a virtual environment with Python 3.6+, ppe_match can be installed via pip

    pip install ppe_match

### Import the package using

    from ppe_match import TestingFramework

### Test the installation with the code snippet below

    from ppe_match import TestingFramework

	# Initiate the testing framework with default parameters
    s = TestingFramework()

    # Run the testing procedure
    s.run()

	# Retrieve the decisions made throughout the simulation
	s.get_decisions() # Pandas dataframe that can be stored

	# Retrieve the performance metrics
	s.get_metrics() # Pandas dataframe that can be stored

### Visualize the five summary metrics
The last five metrics are the average holding time, the average number of shipments per donor, the average unit-miles, the average fill rate, and the average fill rate excluding zeros.

	res = s.get_metrics()
	res.tail(5)

### Obtain a single metric
By defining a weight for each of the five summary metrics, the user can easily obtain a single metric.

	import numpy as np

	# set weights of different objectives
	avg_holding_time_weight = 1
	avg_shipments_weight = 10
	avg_unit_miles_weight = 1
	fill_rate_weight = 100
	fill_rate_0_weight = 100

	weights = np.array([avg_holding_time_weight,avg_shipments_weight,avg_unit_miles_weight,fill_rate_weight,fill_rate_0_weight])

	# retrieve the five metrics above
	metric_values = res['value'].tail(5).values

	# compute dot product
	metric_values.dot(weights)


### User-defined matching solution methods

To test a new matching solution method, start by defining a function that takes as input the current date (date, a datetime object), the current donor and recipient requests (Dt and Rt), and the distance matrix between donors and recipients, M. Dt is a DataFrame with columns (don_id,date,ppe,qty), Rt is a DataFrame with columns (rec_id,date,ppe,qty), M is a DataFrame with columns (don_id,rec_id,distance). The function must return the DataFrame Xt of matching decisions (don_id, rec_id, ppe, qty).


For example, a proximity match strategy that matches the each donor's request with the closest recipient's request is implemented as follows:

	import pandas as pd
	def proximity_match_strategy(date,Dt,Rt,M):
        # prepare the result DataFrame (X^t)
	    Xt = pd.DataFrame(columns=['don_id','rec_id','ppe','qty'])
	    ppes_to_consider = set(Dt.ppe.unique())
	    ppes_to_consider = ppes_to_consider.intersection(set(Rt.ppe.unique()))

	    # for each ppe to consider, match each donor request with the closest recipient request
	    for ppe in ppes_to_consider:
		donors_ppe = Dt[Dt.ppe == ppe].copy()
		recipients_ppe = Rt[Rt.ppe == ppe].copy()

		for _, drow in donors_ppe.iterrows():
		    if len(recipients_ppe) == 0:
			break # if we don't have any more recipient with this ppe, consider the next ppe

		    # find the closest recipient to drow.don_id
		    dr = M[(M.don_id == drow.don_id)].merge(recipients_ppe,on='rec_id').sort_values('distance').iloc[0]
		    dqty = drow.qty # donor's qty
		    rqty = recipients_ppe.loc[recipients_ppe.rec_id == dr.rec_id,'qty'].values[0] #recipient's qty
		    qty = min(dqty,rqty) #qty to ship
		    if qty == 0:
			logger.info('qty is zero')
		    if qty == rqty:
			recipients_ppe = recipients_ppe[recipients_ppe.rec_id !=  dr.rec_id] # remove recipient
		    else:
			recipients_ppe.loc[recipients_ppe.rec_id == dr.rec_id,'qty'] -= qty #update recipient's qty
		    Xt.loc[len(Xt),:] = [dr.don_id, dr.rec_id,ppe,qty]

	    return Xt

On the other hand, a first-come-first-matched (FCFM) strategy that matches the i-th donor's request with the i-th recipient's request is implemented as follows:

    import pandas as pd
    def FCFM_strategy(date,Dt,Rt,M):
        # prepare the result DataFrame (X^t)
        Xt = pd.DataFrame(columns=['don_id','rec_id','ppe','qty'])

        # the ppe to consider are the intersection of the PPEs in the table of current donors Dt (D^t) and the table of current recipients Rt (R^t)
        ppes_to_consider = set(Dt.ppe.unique())
        ppes_to_consider = ppes_to_consider.intersection(set(Rt.ppe.unique()))

        # for each ppe to consider, match the i-th donor request with the i-th recipient request
        for ppe in ppes_to_consider:
            donors_ppe = Dt[Dt.ppe == ppe]
            recipients_ppe = Rt[Rt.ppe == ppe]

            n = min(len(donors_ppe),len(recipients_ppe))
            for i in range(n):
                don = donors_ppe.iloc[i]
                rec = recipients_ppe.iloc[i]
                qty = min(don.qty,rec.qty)

                # add
                Xt.loc[len(Xt)] = [don.don_id,rec.rec_id,ppe,qty]
        return Xt



Once you have implemented your own matching strategy (let us call it _my_strategy_), run the test on the GetUsPPE.org data set by passing the function to the TestingFramework constructor:

    s = TestingFramework(strategy=my_strategy)

The ppe_match package contains the implementation of two strategies illustrated above: the first-come-first-matched strategy (strategies.FCFM_strategy) and the "proximity matching" strategy tested by Bala et al. (2021) (strategies.proximity_match_strategy).

## TestingFramework Class
### Parameters
#### donor_path
Path to the data set containing the donors' requests. See expected format in the data folder.

Expected input type: csv

*Default: anon_donors.csv (which is the anonymized table of donors' requests from GetUsPPE.org)*

---
#### recipient_path
Path to the data set containing the recipients' requests. See expected format in the data folder.

Expected input type: csv

*Default: anon_recipients.csv (which is the anonymized table of recipients' requests from GetUsPPE.org)*


---
#### distance_matrix_path
Path to distance matrix between donors and recipients. See expected format in the data folder.

Expected input type: csv

*Default: anon_distance_matrix.csv (which is the anonymized distance matrix from GetUsPPE.org)*

---
#### strategy
User defined strategy to allocate PPE
The function must have the following arguments:

    ppestrategy(date, Dt,Rt,M)

where,

- `date` is a datetime with the current date
- `Dt` is a pandas.DataFrame object whose rows contain the current donor requests
- `Rt` is a pandas.DataFrame object whose rows contain the current recipient requests
- `M` is a pandas.DataFrameobject that reports the distance between each donor and each recipient.

*Default: proximity_match_strategy*

###### Returns:
pd.dataframe of decisions with columns (don_id, rec_id, ppe, qty). Each row represents the decision of shipping from donor _don_id_ to recipient _rec_id_ _qty_ units of PPE of type _ppe_.


---
#### interval
Day Interval set for framework to iterate over.
*Default: 7 (days)*

---
#### max_donation_qty
Maximum quantity limit for donor to donate (helps filter out dummy entries or test entries)
*Default: 1000 (ppe units)*

---
#### writeFiles
Boolean flag to save intermediate input objects (donors, recipients, and distances) and outputs (decisions) as csv
*Default: False*

If set to *True* intermediate data will be saved for every iteration as follows:
```
output
├── 2020-04-09
	 ├── decisions.csv
	 ├── distance_matrix.csv
	 ├── donors.csv
	 └── recipients.csv
├── 2020-04-16
	 ├── decisions.csv
	 ├── distance_matrix.csv
	 ├── donors.csv
	 └── recipients.csv
├── ...
```
---
#### output_directory
Sets the directory where the intermediate files and results will be saved
*Default: output/*

---


### Methods

#### run()
Tests a strategy function by simulating the arrival of the requests given in the input data

---
#### get_decisions()
Returns the list of all matching decisions made during the test.

---
#### get_metrics()
Returns the performance metrics described in Section 4 of the research article. The metrics are reported at the PPE level, the recipient level, and the "overall" level (see Section 4). The "overall" metrics are at the bottom of the DataFrame.

---
#### debug(bool_flag)
Sets the logging level to DEBUG if *True*
*Default: False (Loglevel sets to WARN)*

---


