Metadata-Version: 2.3
Name: airsql
Version: 0.4.0
Summary: A decorator-based SQL execution framework for Airflow
Keywords: airflow,sql,etl,data-engineering,bigquery,postgres
Author: 1cadumagalhaes
Author-email: 1cadumagalhaes <github@cadumagalhaes.dev>
License: MIT License
         
         Copyright (c) 2025 Carlos Eduardo Magalhães
         
         Permission is hereby granted, free of charge, to any person obtaining a copy
         of this software and associated documentation files (the "Software"), to deal
         in the Software without restriction, including without limitation the rights
         to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
         copies of the Software, and to permit persons to whom the Software is
         furnished to do so, subject to the following conditions:
         
         The above copyright notice and this permission notice shall be included in all
         copies or substantial portions of the Software.
         
         THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
         IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
         FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
         AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
         LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
         OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
         SOFTWARE.
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Database
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Framework :: Apache Airflow
Requires-Dist: apache-airflow-providers-common-sql>=1.27.1
Requires-Dist: apache-airflow-providers-postgres>=6.2.0
Requires-Dist: apache-airflow[google,standard]>=3.0.0
Requires-Dist: pandas>=2.0.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: jinja2>=3.0.0
Maintainer: 1cadumagalhaes
Maintainer-email: 1cadumagalhaes <github@cadumagalhaes.dev>
Requires-Python: >=3.12
Project-URL: Changelog, https://github.com/1cadumagalhaes/airsql/releases
Project-URL: Documentation, https://github.com/1cadumagalhaes/airsql#readme
Project-URL: Homepage, https://github.com/1cadumagalhaes/airsql
Project-URL: Issues, https://github.com/1cadumagalhaes/airsql/issues
Project-URL: Repository, https://github.com/1cadumagalhaes/airsql
Description-Content-Type: text/markdown

# AirSQL

A decorator-based SQL execution framework for Airflow that provides clean, Python-like syntax for data operations.

## Features

- 🎯 **Decorator-based syntax** - Clean, intuitive Python decorators for SQL operations
- 🔗 **Native Airflow integration** - Uses Airflow connections and follows Airflow patterns
- 🗃️ **Multi-database support** - Works with Postgres, BigQuery, and more
- 📄 **SQL file support** - Keep complex queries in separate `.sql` files with Jinja templating
- ⚡ **Flexible outputs** - Write to tables, return DataFrames, or save to files
- 🔄 **Smart operations** - Built-in support for replace, merge/upsert operations
- 🌐 **Cross-database queries** - Query across different databases (planned with DataFusion)
- 🔍 **Data quality checks** - Built-in SQL check operators compatible with dbt tests
- 📊 **Transfer operators** - Move data between BigQuery, Postgres, and GCS
- 👁️ **Smart sensors** - SQL sensors with retry logic for BigQuery and Postgres

## Installation

```bash
pip install airsql
```

Or if you're using uv:

```bash
uv add airsql
```

## Quick Start

### Basic Usage

#### 1. Simple DataFrame Query

```python
from airsql import sql, Table, File

@sql.dataframe(source_conn="postgres_conn")
def get_active_users():
    return "SELECT * FROM users WHERE active = true"

# Use in DAG
df_task = get_active_users()
```

#### 2. Query with Table References

```python
@sql.dataframe
def user_activity_analysis(users_table, events_table):
    return """
    SELECT u.id, u.name, COUNT(e.id) as event_count
    FROM {{ users_table }} u
    LEFT JOIN {{ events_table }} e ON u.id = e.user_id
    GROUP BY u.id, u.name
    """

# Use in DAG
analysis_task = user_activity_analysis(
    users_table=Table("postgres_conn", "users.active_users"),
    events_table=Table("bigquery_conn", "analytics.user_events")
)
```

#### 3. Replace Table Content

```python
@sql.replace(output_table=Table("postgres_conn", "reports.daily_summary"))
def create_daily_report(transactions_table):
    return """
    SELECT DATE(created_at) as date, SUM(amount) as total
    FROM {{ transactions_table }}
    GROUP BY DATE(created_at)
    """

# Use in DAG
report_task = create_daily_report(
    transactions_table=Table("postgres_conn", "transactions.orders")
)
```

#### 4. Data Quality Checks

```python
@sql.check(conn_id="bigquery_conn")
def test_no_nulls(table):
    return "SELECT COUNT(*) FROM {{ table }} WHERE id IS NULL"

@sql.check(conn_id="postgres_conn")
def test_row_count(table):
    return "SELECT CASE WHEN COUNT(*) > 0 THEN 1 ELSE 0 END FROM {{ table }}"

# Use in DAG
null_check = test_no_nulls(table=Table("bigquery_conn", "analytics.users"))
count_check = test_row_count(table=Table("postgres_conn", "staging.orders"))
```

#### 5. Transfer Operations

```python
from airsql import BigQueryToPostgresOperator, PostgresToBigQueryOperator

# Transfer from BigQuery to Postgres
bq_to_pg = BigQueryToPostgresOperator(
    task_id="transfer_users",
    source_project_dataset_table="my-project.analytics.users",
    postgres_conn_id="postgres_default",
    destination_table="staging.users",
    gcs_bucket="temp-bucket",
    gcp_conn_id="google_cloud_default"
)

# Transfer from Postgres to BigQuery
pg_to_bq = PostgresToBigQueryOperator(
    task_id="transfer_orders",
    postgres_conn_id="postgres_default",
    sql="SELECT * FROM orders WHERE date >= '2024-01-01'",
    destination_project_dataset_table="my-project.staging.orders",
    gcs_bucket="temp-bucket",
    gcp_conn_id="google_cloud_default"
)
```

For more examples and detailed documentation, see the [full documentation](src/airsql/).

## Migration from retize.sql

This package is the evolution of `retize.sql`. The main changes:
- Package renamed from `retize.sql` to `airsql`
- Table class `schema` field renamed to `dataset` (avoids Pydantic warnings)
- Asset URIs changed from `rtz://` to `airsql://`
- Improved organization with sensors and transfers in submodules

## License

This project is licensed under the MIT License.