# Paired T-tests for Gradle Benchmarks

[![Flake8](https://img.shields.io/badge/codestyle-flake8-yellow)](https://flake8.pycqa.org/en/latest/)
[![Maintainability](https://api.codeclimate.com/v1/badges/a9fe25bd995710be45d2/maintainability)](https://codeclimate.com/github/dotanuki-labs/gradle-profiler-pttest/maintainability)
[![codecov](https://codecov.io/gh/dotanuki-labs/gradle-profiler-pttest/branch/master/graph/badge.svg)](https://codecov.io/gh/dotanuki-labs/gradle-profiler-pttest)
[![PyPI](https://img.shields.io/pypi/v/gradle-profiler-pttest)](https://pypi.org/project/gradle-profiler-pttest/)
[![Main](https://github.com/dotanuki-labs/gradle-profiler-pttest/workflows/Main/badge.svg)](https://github.com/dotanuki-labs/gradle-profiler-pttest/actions?query=workflow%3AMain)
[![License](https://img.shields.io/github/license/dotanuki-labs/gradle-profiler-pttest)](https://choosealicense.com/licenses/mit)

## Context

> _Complete blog post to come. Stay tuned._

`gradle-profiler-pttest` can analyse the outcomes of two benchmarks generated by [Gradle Profiler](https://github.com/gradle/gradle-profiler) with the [Paired T-test statistical technique](https://en.wikipedia.org/wiki/Student%27s_t-test).

The goal is to provide an easy way to compare two benchmarks of Gradle builds, for the same task, without being [misled by simple means](https://towardsdatascience.com/why-averages-are-often-wrong-1ff08e409a5b), since the comparison rests on more robust statistical evidence than raw averages.

This tool is built on top of [pingouin](https://pingouin-stats.org/), an opinionated statistics library built on NumPy, Pandas and SciPy. Among other things, `gradle-profiler-pttest` features:

- An opinionated, left-tailed [hypothesis test](https://en.wikipedia.org/wiki/Statistical_hypothesis_testing) that checks whether the **modified build conditions** have a mean execution time **statistically smaller** than the **baseline build conditions**. The null hypothesis (_h0_) is that the modifications bring no improvement; the alternative hypothesis (_h1_) is that the modified builds are faster
- Auto-correction for benchmark samples with different sizes

Note that samples should be small, ideally between 10 and 30 measured builds, for the Student's t-test analysis to make sense.
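To make the left-tailed paired test above concrete, here is a minimal sketch using only the Python standard library. It is not the tool's implementation (which relies on pingouin). The function name, the sample data and the truncation of uneven samples to the shortest length are all assumptions made for illustration; how `gradle-profiler-pttest` actually corrects uneven sample sizes is not described here.

```python
import math
import statistics


def paired_t_statistic(baseline, modified):
    """Return (t, degrees_of_freedom) for the differences modified - baseline.

    Hypothetical helper, not part of gradle-profiler-pttest.
    """
    # Assumption: when sizes differ, keep only the first n pairs.
    n = min(len(baseline), len(modified))
    if n < 2:
        raise ValueError("need at least two paired measurements")

    diffs = [m - b for b, m in zip(baseline[:n], modified[:n])]
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    if std_err == 0:
        raise ValueError("all paired differences are identical")

    return statistics.mean(diffs) / std_err, n - 1


# Made-up build times in milliseconds, for illustration only.
baseline = [100, 102, 98, 101, 99, 103, 100, 97, 102, 98]
modified = [95, 96, 94, 97, 93, 98, 95, 92, 96, 94, 91]  # extra sample is dropped

t, dof = paired_t_statistic(baseline, modified)

# One-tailed critical value for alpha = 0.05 and 9 degrees of freedom,
# negated because the test is left-tailed.
T_CRITICAL = -1.833
significant = t < T_CRITICAL
print(f"t = {t:.3f}, dof = {dof}, significant = {significant}")
```

A strongly negative `t` means the modified builds were consistently faster across the pairs. The hard-coded critical value only applies to 9 degrees of freedom; a statistics library would compute the p-value for any sample size.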

## Installing

Install from your CLI with [pip](https://pypi.org/project/pip/):


```shell
pip install gradle-profiler-pttest
```

Requires **Python 3.8.5** or newer.

When running this tool on macOS Catalina or newer, [please check the instructions](https://docs.scipy.org/doc/scipy/reference/building/macosx.html) to have SciPy properly installed on your local machine.

## Using

- Run the benchmarks with Gradle Profiler for the status quo (`baseline`) and for the modifications applied to your Gradle project (`modified`)

- Supply the generated CSV files to `gradle-profiler-pttest`

```bash
gradle-profiler-pttest \
	-b <path/to/baseline/benchmark.csv> \
	-m <path/to/modified/benchmark.csv>
```

- Enjoy the results

![Showcase of the tool output](.github/assets/showcase.png)

## Limitations

Right now `gradle-profiler-pttest` supports **only one Gradle task per supplied benchmark sample**, taking the first task executed as reference for the analysis given a multi-task [benchmarked scenario](https://github.com/gradle/gradle-profiler#advanced-profiling-scenarios).
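As an illustration of what "taking the first task" can mean in practice, the sketch below reads only the first value column of a benchmark CSV. The CSV layout (a label column followed by one column per scenario, with rows labelled `measured build #N`) is an assumption about Gradle Profiler's output and may differ between versions. The function name and the sample data are hypothetical, and this is not the tool's actual parsing code.

```python
import csv
import io


def first_task_measurements(csv_text):
    """Collect measured build times from the first value column only.

    Hypothetical helper; assumes rows shaped like 'measured build #1,<ms>,...'.
    """
    samples = []
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip headers and warm-up builds; keep only measured builds.
        if len(row) >= 2 and row[0].startswith("measured build"):
            samples.append(float(row[1]))
    return samples


# Made-up two-scenario benchmark, for illustration only.
SAMPLE = """scenario,assemble,test
tasks,assemble,test
warm-up build #1,15000,21000
measured build #1,9876,18000
measured build #2,9650,17500
"""

print(first_task_measurements(SAMPLE))
```

With this layout, the timings of the second scenario are ignored, which mirrors the limitation described above.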

## Contributing

- Ensure you have Python 3.8.5 or newer installed
- Ensure you have [flake8](https://pypi.org/project/flake8/) installed and supported in your text editor / IDE
- Ensure you have [Poetry](https://python-poetry.org/) installed
- Check the [contribution guidelines](./CONTRIBUTING.md)
- Make sure you have a green build

```shell
make flake8
make test
```
- Submit your PR 🔥


## Credits

- [Raphael Vallat](https://github.com/raphaelvallat) for [pingouin](https://github.com/raphaelvallat/pingouin/), it made the task super easy
- [Will McGugan](https://github.com/willmcgugan) for [rich](https://github.com/willmcgugan/rich), I wish I had something awesome like this for my JVM projects

## Author

Coded by Ubiratan Soares (follow me on [Twitter](https://twitter.com/ubiratanfsoares))

## License

```
The MIT License (MIT)

Copyright (c) 2020 Dotanuki Labs

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
```
