Metadata-Version: 2.1
Name: utility-bill-scraper
Version: 0.6.0
Summary: Utility bill scraper for extracting data from pdfs and websites.
License: BSD-3-Clause
Author: Ryan Fobel
Author-email: ryan@fobel.net
Requires-Python: >=3.8,<3.11
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Requires-Dist: arrow (>=1.2.0,<2.0.0)
Requires-Dist: beautifulsoup4 (>=4.10.0,<5.0.0)
Requires-Dist: google-api-python-client (>=2.27.0,<3.0.0)
Requires-Dist: matplotlib (>=3.4.3,<4.0.0)
Requires-Dist: numpy (>=1.21.2,<2.0.0)
Requires-Dist: pandas (>=1.3.3,<2.0.0)
Requires-Dist: pdfminer (>=20191125,<20191126)
Requires-Dist: python-dotenv (>=0.19.1,<0.20.0)
Requires-Dist: selenium (>=3.141.0,<4.0.0)
Description-Content-Type: text/markdown

# Utility bill scraper

[![build](https://github.com/ryanfobel/utility-bill-scraper/actions/workflows/build.yml/badge.svg?branch=main)](https://github.com/ryanfobel/utility-bill-scraper/actions/workflows/build.yml)
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/ryanfobel/utility-bill-scraper/main)
[![PyPI version shields.io](https://img.shields.io/pypi/v/utility-bill-scraper.svg)](https://pypi.python.org/pypi/utility-bill-scraper/)

Download energy usage data and estimate CO2 emissions from utility websites or pdf bills.

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- param::title::## Table of contents:: -->
<!-- param::mode::github.com:: -->

## Table of contents

- [Supported utilities](#supported-utilities)
- [Install](#install)
- [Data storage](#data-storage)
- [Getting and plotting data using the Python API](#getting-and-plotting-data-using-the-python-api)
    - [Update data](#update-data)
    - [Plot monthly gas consumption](#plot-monthly-gas-consumption)
    - [Convert gas consumption to CO2 emissions](#convert-gas-consumption-to-co2-emissions)
    - [Plot CO2 emissions versus previous years](#plot-co2-emissions-versus-previous-years)
- [Command line utilities](#command-line-utilities)
    - [Update data](#update-data-1)
    - [Export data](#export-data)
    - [Options](#options)
    - [Environment variables](#environment-variables)
- [Contributors](#contributors)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

## Supported utilities

The simplest way to get started without installing anything on your computer is to click on one of the following links, which will open a session on https://mybinder.org where you can try downloading some data.

 * [Kitchener Utilities (gas & water)](https://mybinder.org/v2/gh/ryanfobel/utility-bill-scraper/main?labpath=notebooks%2Fcanada%2Fon%2Fkitchener_utilities.ipynb)
 
## Install

```sh
pip install utility-bill-scraper
```

## Data storage

All data is stored in a `*.csv` file located at `$DATA_PATH/$UTILITY_NAME/data.csv`. Set `DATA_PATH` with the `data_path` argument when creating an API object, or with the `--data-path` command line switch or the `DATA_PATH` environment variable when using the [command line interface](#command-line-utilities).

```
└───data
    └───Kitchener Utilities
        └───data.csv
        └───statements
            │───2021-10-18 - Kitchener Utilities - $102.30.pdf
            ...
            └───2021-06-15 - Kitchener Utilities - $84.51.pdf
```
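
The layout above can be sketched as two small path helpers. This is only an illustration and not part of the package: `history_csv_path` and `statements_dir` are hypothetical names, and the `statements` folder name is taken from the example tree rather than from the package source.

```python
from pathlib import Path


def history_csv_path(data_path, utility_name):
    """Return where data.csv is expected for a utility (hypothetical helper)."""
    return Path(data_path) / utility_name / "data.csv"


def statements_dir(data_path, utility_name):
    """Return the folder holding downloaded pdf statements (hypothetical helper)."""
    return Path(data_path) / utility_name / "statements"


csv_path = history_csv_path("data", "Kitchener Utilities")
print(csv_path.as_posix())
```

A helper like this can be handy for checking that the history file exists before plotting, for example with `csv_path.exists()`.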

## Getting and plotting data using the Python API

### Update data

```python
import utility_bill_scraper.canada.on.kitchener_utilities as ku

ku_api = ku.KitchenerUtilitiesAPI(username='username', password='password')

# Get new statements.
updates = ku_api.update()
if updates is not None:
    print(f"{len(updates)} statements downloaded")
ku_api.history().tail()
```
![history tail](https://raw.githubusercontent.com/ryanfobel/utility-bill-scraper/main/notebooks/canada/on/images/history_tail.png)

### Plot monthly gas consumption

```python
import matplotlib.pyplot as plt

df_ku = ku_api.history()

plt.figure()
plt.bar(df_ku.index, df_ku["Gas Consumption"], width=0.9, alpha=0.5)
plt.xticks(rotation=90)
plt.title("Monthly Gas Consumption")
plt.ylabel("m$^3$")
```

![monthly gas consumption](https://raw.githubusercontent.com/ryanfobel/utility-bill-scraper/main/notebooks/canada/on/images/monthly_gas_consumption.svg)

### Convert gas consumption to CO2 emissions

```python
from utility_bill_scraper import GAS_KGCO2_PER_CUBIC_METER

df_ku["kgCO2"] = df_ku["Gas Consumption"] * GAS_KGCO2_PER_CUBIC_METER
```
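
The conversion is one multiplication per billing period. The sketch below works through the arithmetic with plain Python and made-up numbers. `EXAMPLE_KGCO2_PER_CUBIC_METER` is a placeholder chosen for easy arithmetic; it is not the value of the package's `GAS_KGCO2_PER_CUBIC_METER`, and the consumption figures are invented.

```python
# Placeholder emission factor (kg CO2 per m^3), NOT the package's constant.
EXAMPLE_KGCO2_PER_CUBIC_METER = 2.0

# Invented monthly gas consumption in m^3, keyed by month.
gas_consumption_m3 = {"2021-09": 50.0, "2021-10": 100.0}

# Multiply each month's consumption by the emission factor.
kg_co2 = {
    month: volume * EXAMPLE_KGCO2_PER_CUBIC_METER
    for month, volume in gas_consumption_m3.items()
}

# Convert the total from kilograms to tonnes.
total_tonnes = sum(kg_co2.values()) / 1e3
```

Dividing by `1e3` is the same kilogram-to-tonne conversion the plotting code applies to its secondary axis.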

### Plot CO2 emissions versus previous years

```python
import datetime as dt

df_ku["kgCO2"] = df_ku["Gas Consumption"] * GAS_KGCO2_PER_CUBIC_METER
df_ku["year"] = [int(x[0:4]) for x in df_ku.index]
df_ku["month"] = [int(x[5:7]) for x in df_ku.index]

n_years_history = 1

plt.figure()
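for year, df_year in df_ku.groupby("year"):
    # Only plot the current year and the previous n_years_history years.
    if year >= dt.datetime.now().year - n_years_history:
        # Sort a copy; sorting a groupby slice in place can trigger a pandas warning.
        df_year = df_year.sort_values("month")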
        plt.bar(
            df_year["month"],
            df_year["Gas Consumption"],
            label=year,
            width=0.9,
            alpha=0.5,
        )
plt.legend()
plt.ylabel("m$^3$")
plt.xlabel("Month")
ylim = plt.ylim()
ax = plt.gca()
ax2 = ax.twinx()
plt.ylabel("tCO$_2$e")
plt.ylim([GAS_KGCO2_PER_CUBIC_METER * y / 1e3 for y in ylim])
plt.title("Monthly CO$_2$e emissions from natural gas")
```
![monthly_co2_emissions](https://raw.githubusercontent.com/ryanfobel/utility-bill-scraper/main/notebooks/canada/on/images/monthly_co2_emissions.svg)

## Command line utilities

Update and export your utility data from the command line.

### Update data

```sh
> python -m utility_bill_scraper.bin.ubs --utility-name "Kitchener Utilities" update --user $USER --password $PASSWORD
```

### Export data

```sh
> python -m utility_bill_scraper.bin.ubs --utility-name "Kitchener Utilities" export --output data.csv
```

### Options

```sh
> python -m utility_bill_scraper.bin.ubs --help
usage: ubs.py [-h] [-e ENV] [--data-path DATA_PATH] [--utility-name UTILITY_NAME]
              [--google-sa-credentials GOOGLE_SA_CREDENTIALS]
              {update,export} ...

ubs (Utility bill scraper)

optional arguments:
  -h, --help            show this help message and exit
  -e ENV, --env ENV     path to .env file.
  --data-path DATA_PATH
                        folder containing the history file
  --utility-name UTILITY_NAME
                        name of the utility
  --google-sa-credentials GOOGLE_SA_CREDENTIALS
                        google service account credentials

subcommands:
  {update,export}       available sub-commands
```

### Environment variables

Note that many options can be set via environment variables (useful for continuous integration and/or working with containers). The following can be set in your shell or via a `.env` file passed using the `-e` option.

```sh
DATA_PATH
UTILITY_NAME
GOOGLE_SA_CREDENTIALS
USER
PASSWORD
SAVE_STATEMENTS
MAX_DOWNLOADS
```
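
In a CI job or container it can help to check which of these variables are set before running an update. The following is a minimal sketch using only the standard library. `collect_settings` is a hypothetical helper, not part of the package, and it makes no claim about how the command line tool itself reads or prioritizes these values.

```python
import os

# Variable names copied from the list above.
ENV_NAMES = [
    "DATA_PATH",
    "UTILITY_NAME",
    "GOOGLE_SA_CREDENTIALS",
    "USER",
    "PASSWORD",
    "SAVE_STATEMENTS",
    "MAX_DOWNLOADS",
]


def collect_settings(environ=None):
    """Return the documented variables that are present in the environment."""
    environ = os.environ if environ is None else environ
    return {name: environ[name] for name in ENV_NAMES if name in environ}


# Report only the names, so that secrets such as PASSWORD are never printed.
print(sorted(collect_settings()))
```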

## Contributors

* Ryan Fobel ([@ryanfobel](https://github.com/ryanfobel))

