Metadata-Version: 2.1
Name: viper-test
Version: 0.0.1
Summary: Simple, expressive pipeline syntax to transform and manipulate data with ease
Project-URL: Homepage, https://github.com/aropele/viper
Project-URL: Bug Tracker, https://github.com/aropele/viper/issues
Author-email: Andrea Ropele <andrea.ropele@gmail.com>
License-File: LICENSE
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.8
Requires-Dist: pandas
Description-Content-Type: text/markdown

# viper

<img src='docs/logo.png' align="right" width="150"/>

> Simple, expressive pipeline syntax to transform and manipulate data with ease 

## Overview

`viper` is a Python package that provides a simple, expressive way to work with data. It allows you to easily manipulate and transform data using a pipeline syntax similar to that of [dplyr](https://dplyr.tidyverse.org/).
 
Pipelining your DataFrame manipulation operations offers several benefits:

- improved code readability (no need to 'comment the what')
- no need to save intermediate dataframes
- ability to chain a long sequence of operations in a single command
- treating code as a series of transformations from the input to the desired output can improve the design and reduce coupling

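To make the contrast concrete, here is a minimal sketch in plain pandas (not `viper` itself) that compares step-by-step code with a chained version built on `DataFrame.pipe`. The helper names `add_density` and `keep_dense`, as well as the toy data, are assumptions made up for this illustration.

``` python
import pandas as pd

# Toy data, made up for illustration only.
df = pd.DataFrame({"mass": [1.0, 3.0, 4.0], "volume": [2.0, 2.0, 2.0]})


def add_density(frame):
    """Return a copy of the frame with a derived 'density' column."""
    return frame.assign(density=frame["mass"] / frame["volume"])


def keep_dense(frame, threshold):
    """Keep only the rows whose density exceeds the threshold."""
    return frame[frame["density"] > threshold]


# Step-by-step style: every stage needs a named intermediate DataFrame.
with_density = add_density(df)
dense_only = keep_dense(with_density, 1.0)
stepwise = dense_only.sort_values("density", ascending=False)

# Pipelined style: the same stages read top to bottom as one expression.
pipelined = (
    df.pipe(add_density)
    .pipe(keep_dense, threshold=1.0)
    .sort_values("density", ascending=False)
)
```

Both versions produce the same result; the chained one avoids the intermediate names and keeps the order of operations visible at a glance.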
## Docs
Complete documentation and reference are available on the package's site.

## Quick Start

Installation:
``` shell
pip install viper-test
```

Here is an example of how to use `viper` to analyze the famed `mtcars` dataset.

We want to find:
- the average consumption, expressed in US gallons per mile (the reciprocal of `mpg`)
- the average power

Furthermore:
- only consider cars that weigh more than 2000 lbs
- group the results by the number of cylinders and the number of gears
- arrange the results in descending order of the grouping variables


``` python
from viper.main import *
from viper.data import mtcars

pipeline(
    mtcars,
    rename(
        "hp = power",
        "mpg = consumption",
    ),
    mutate(
        consumption=lambda r: 1 / r["consumption"]
    ),
    filter(
        lambda r: r["wt"] > 2
    ),
    group_by("cyl", "gear"),
    summarize(
        "power = mean()",
        "consumption = mean()"
    ),
    arrange(
        "cyl desc",
        "gear desc"
    ),
)
#                power  consumption
# cyl gear
# 8   5     299.500000     0.064979
#     3     194.166667     0.068824
# 6   5     175.000000     0.050761
#     4     116.500000     0.050875
#     3     107.500000     0.050989
# 4   5      91.000000     0.038462
#     4      85.000000     0.041259
#     3      97.000000     0.046512
```
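For readers who know pandas, the same steps can be approximated with method chaining. This is only a sketch of what the pipeline appears to do, inferred from the output above, and not a description of how `viper` is implemented. The rows below are invented values shaped like `mtcars`, not the real dataset.

``` python
import pandas as pd

# Invented rows shaped like mtcars (wt is in units of 1000 lbs).
cars = pd.DataFrame(
    {
        "mpg": [20.0, 25.0, 25.0, 10.0, 40.0],
        "hp": [110, 90, 130, 200, 60],
        "wt": [2.6, 2.3, 3.0, 3.4, 1.6],
        "cyl": [6, 4, 6, 8, 4],
        "gear": [4, 4, 4, 3, 4],
    }
)

result = (
    # rename: hp -> power, mpg -> consumption
    cars.rename(columns={"hp": "power", "mpg": "consumption"})
    # mutate: turn miles per gallon into gallons per mile
    .assign(consumption=lambda d: 1 / d["consumption"])
    # filter: keep cars heavier than 2000 lbs
    .loc[lambda d: d["wt"] > 2]
    # group_by + summarize: mean of each column per (cyl, gear) group
    .groupby(["cyl", "gear"])[["power", "consumption"]]
    .mean()
    # arrange: descending by both grouping variables
    .sort_index(ascending=False)
)
print(result)
```

Each pandas call maps to one step of the pipeline, which is why the two versions read so similarly.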

Here you can find more examples, particularly on joins.

## Roadmap

The future development of the package will probably focus on:

- adding `pivot_longer` and `pivot_wider` functions
- adding more `join_*` functions

## Contributions

You are welcome to contribute to the project or open [issues](https://github.com/aropele/viper/issues) if you have any ideas.