Metadata-Version: 2.3
Name: unified-logger
Version: 0.1.0
Summary: Async interface to backend logging systems with support for Fluentd, Elasticsearch, Kafka, and Jaeger
Author: David Grimm
Author-email: David Grimm <David.Grimm@wellsfargo.com>
License: MIT
Requires-Dist: click>=8.0.0
Requires-Dist: aiofiles>=0.8.0
Requires-Dist: fluent-logger>=0.10.0 ; extra == 'aggregation'
Requires-Dist: msgpack>=1.0.0 ; extra == 'aggregation'
Requires-Dist: elasticsearch[async]>=8.0.0,<9.0.0 ; extra == 'all'
Requires-Dist: aiohttp>=3.8.0 ; extra == 'all'
Requires-Dist: fluent-logger>=0.10.0 ; extra == 'all'
Requires-Dist: msgpack>=1.0.0 ; extra == 'all'
Requires-Dist: glean-klient>=0.1.0 ; extra == 'all'
Requires-Dist: jaeger-client>=4.8.0 ; extra == 'all'
Requires-Dist: thrift>=0.13.0 ; extra == 'all'
Requires-Dist: pytest>=7.0.0 ; extra == 'all'
Requires-Dist: pytest-asyncio>=0.21.0 ; extra == 'all'
Requires-Dist: pytest-cov>=4.0.0 ; extra == 'all'
Requires-Dist: black>=23.0.0 ; extra == 'all'
Requires-Dist: ruff>=0.1.0 ; extra == 'all'
Requires-Dist: mypy>=1.5.0 ; extra == 'all'
Requires-Dist: elasticsearch[async]>=8.0.0,<9.0.0 ; extra == 'backends'
Requires-Dist: aiohttp>=3.8.0 ; extra == 'backends'
Requires-Dist: fluent-logger>=0.10.0 ; extra == 'backends'
Requires-Dist: msgpack>=1.0.0 ; extra == 'backends'
Requires-Dist: glean-klient>=0.1.0 ; extra == 'backends'
Requires-Dist: jaeger-client>=4.8.0 ; extra == 'backends'
Requires-Dist: thrift>=0.13.0 ; extra == 'backends'
Requires-Dist: pytest>=7.0.0 ; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.21.0 ; extra == 'dev'
Requires-Dist: pytest-cov>=4.0.0 ; extra == 'dev'
Requires-Dist: black>=23.0.0 ; extra == 'dev'
Requires-Dist: ruff>=0.1.0 ; extra == 'dev'
Requires-Dist: mypy>=1.5.0 ; extra == 'dev'
Requires-Dist: elasticsearch[async]>=8.0.0,<9.0.0 ; extra == 'elasticsearch'
Requires-Dist: aiohttp>=3.8.0 ; extra == 'elasticsearch'
Requires-Dist: fluent-logger>=0.10.0 ; extra == 'fluentd'
Requires-Dist: msgpack>=1.0.0 ; extra == 'fluentd'
Requires-Dist: jaeger-client>=4.8.0 ; extra == 'jaeger'
Requires-Dist: thrift>=0.13.0 ; extra == 'jaeger'
Requires-Dist: glean-klient>=0.1.0 ; extra == 'kafka'
Requires-Dist: elasticsearch[async]>=8.0.0,<9.0.0 ; extra == 'search'
Requires-Dist: aiohttp>=3.8.0 ; extra == 'search'
Requires-Dist: glean-klient>=0.1.0 ; extra == 'streaming'
Requires-Dist: pytest>=7.0.0 ; extra == 'test'
Requires-Dist: pytest-asyncio>=0.21.0 ; extra == 'test'
Requires-Dist: pytest-cov>=4.0.0 ; extra == 'test'
Requires-Dist: jaeger-client>=4.8.0 ; extra == 'tracing'
Requires-Dist: thrift>=0.13.0 ; extra == 'tracing'
Requires-Python: >=3.12
Project-URL: Documentation, https://github.com/your-org/unified_logger#readme
Project-URL: Homepage, https://github.com/your-org/unified_logger
Project-URL: Issues, https://github.com/your-org/unified_logger/issues
Project-URL: Repository, https://github.com/your-org/unified_logger.git
Provides-Extra: aggregation
Provides-Extra: all
Provides-Extra: backends
Provides-Extra: dev
Provides-Extra: elasticsearch
Provides-Extra: fluentd
Provides-Extra: jaeger
Provides-Extra: kafka
Provides-Extra: search
Provides-Extra: streaming
Provides-Extra: test
Provides-Extra: tracing
Description-Content-Type: text/markdown

# Unified Logger

A Python logging library designed to provide consistent, structured logging across different environments and applications. The library offers an async interface to multiple backend logging systems with support for batching, rate limiting, and structured formats.

## Features

### Core Capabilities

- **Async Interface**: Non-blocking logging operations with async/await support
  - Async context managers for automatic connection management
  - Thread pool executor for concurrent operations
  - Configurable max workers and batching parameters

- **Multiple Backends**: Support for various logging destinations:
  - **Console**: stdout/stderr with color support and real-time output
  - **File**: Automatic rotation, configurable size limits, async file operations
  - **Elasticsearch**: Structured search and analytics, time-based indices, batch operations
  - **Fluentd**: Log aggregation and forwarding, structured data processing
  - **Kafka**: High-throughput streaming, partitioned topics, compression support
  - **OpenTelemetry**: Distributed tracing via OTLP protocol (Jaeger, Tempo, etc.)
  - **Jaeger** (deprecated): Direct Jaeger integration (use OpenTelemetry instead)

- **Structured Logging**: Rich contextual information
  - JSON and plain text formatters
  - Custom extra fields for business context
  - Exception and stack trace support
  - Timestamp and log level metadata

### Advanced Features

- **Batching System**: Automatic log batching for optimal performance
  - Configurable batch size and timeout
  - Per-backend batch operations
  - Automatic flush on shutdown

- **Filtering**: Fine-grained control over log processing
  - **Level Filter**: Filter by minimum log level
  - **Field Filter**: Require/forbid specific fields
  - **Rate Limit Filter**: Prevent log flooding with burst support

- **Error Handling**: Robust error management
  - Graceful backend failure handling
  - Automatic retry with exponential backoff
  - Health checking for backend monitoring
  - Fallback to standard Python logging

- **Configuration Management**:
  - File-based configuration (JSON)
  - Programmatic configuration
  - Environment-specific settings
  - CLI template generation

- **CLI Interface**: Complete command-line tool
  - Generate configuration templates
  - Test backend connectivity
  - Send test log messages
  - Support for all backends and log levels
  - Structured logging with extra fields
  - Verbose mode for debugging

- **Thread Safety**: Safe for concurrent use
  - Asyncio locks for shared state
  - Thread pool for blocking operations
  - Safe for multi-threaded applications

- **Context Manager Support**: Clean resource management
  - Automatic connection/disconnection
  - Proper cleanup on exit
  - Exception handling during shutdown

## Installation

```bash
# Basic installation (console + file backends only)
uv add unified-logger
# or: pip install unified-logger

# Individual backends
uv add "unified-logger[elasticsearch]"  # Search & analytics
uv add "unified-logger[fluentd]"        # Log aggregation  
uv add "unified-logger[kafka]"          # Log streaming
uv add "unified-logger[otel]"           # OpenTelemetry tracing (recommended)
uv add "unified-logger[jaeger]"         # Legacy Jaeger (deprecated)

# Backend groups
uv add "unified-logger[backends]"       # All backends
uv add "unified-logger[all]"            # Backends + dev tools

# Multiple backends
uv add "unified-logger[elasticsearch,kafka]"
```

> 📖 **See [INSTALL.md](INSTALL.md) for detailed installation instructions including uv and pip usage.**

## Quick Start

### Basic Usage

```python
import asyncio
import logging
from unified_logger import UnifiedLogger, ConsoleBackend

async def main():
    # Create logger
    logger = UnifiedLogger(
        name="my_app",
        level=logging.INFO,
        format_type="json"
    )
    
    # Add console backend
    logger.add_backend(ConsoleBackend())
    
    # Use with async context manager
    async with logger:
        await logger.info("Application started")
        await logger.info(
            "User action completed",
            extra={
                "user_id": "user123",
                "action": "login",
                "duration_ms": 150
            }
        )

asyncio.run(main())
```

### Multiple Backends

```python
from unified_logger import (
    UnifiedLogger, ConsoleBackend, FileBackend, 
    ElasticsearchBackend, JSONFormatter
)
from unified_logger.handlers.file import FileBackendConfig
from unified_logger.handlers.elasticsearch import ElasticsearchBackendConfig

async def setup_logger():
    logger = UnifiedLogger(name="multi_backend_app")
    
    # Console output
    logger.add_backend(ConsoleBackend())
    
    # File output with rotation
    file_config = FileBackendConfig(
        file_path="app.log",
        max_bytes=10*1024*1024,  # 10MB
        backup_count=5
    )
    logger.add_backend(FileBackend(config=file_config))
    
    # Elasticsearch for searching/analytics
    es_config = ElasticsearchBackendConfig(
        host="localhost",
        port=9200,
        index_prefix="app-logs"
    )
    logger.add_backend(ElasticsearchBackend(
        config=es_config,
        formatter=JSONFormatter()
    ))
    
    return logger
```

### Error Handling with Context

```python
async def handle_request():
    try:
        # Process request
        result = await process_user_request()
        await logger.info(
            "Request processed successfully",
            extra={
                "request_id": "req_123",
                "processing_time_ms": 45,
                "result_count": len(result)
            }
        )
    except Exception as e:
        await logger.error(
            "Request processing failed",
            extra={
                "request_id": "req_123",
                "error_type": type(e).__name__,
                "user_id": "user456"
            },
            exc_info=(type(e), e, e.__traceback__)
        )
```

## CLI Usage

The library includes a command-line interface for testing, configuration, and sending logs to multiple backends.

### Basic CLI Options

All commands support these global options:

```bash
# Show help
unified-logger --help

# Enable verbose output
unified-logger -v <command>

# Use custom configuration file
unified-logger -c config.json <command>
```

### Available Commands

#### 1. Generate Configuration Template

Create a complete configuration file with all backend options:

```bash
# Output to stdout
unified-logger config-template

# Save to file
unified-logger config-template -o config.json
unified-logger config-template --output myconfig.json
```

The generated template includes:
- Logger settings (name, level, format type, batching)
- All backend configurations with default values
- Console, file, Elasticsearch, Fluentd, Kafka, and Jaeger options
- Comments on required dependencies for each backend

#### 2. Test Backend Connectivity

Verify that backends are accessible and healthy:

```bash
# Test all available backends
unified-logger test

# Test specific backends
unified-logger test -b console
unified-logger test -b file -b elasticsearch
unified-logger test -b kafka -b jaeger -b fluentd
```

Output shows:
- ✅ Backend is healthy and connected
- ⚠️ Backend connected but unhealthy
- ❌ Backend connection failed

#### 3. Send Log Messages

Send test log messages to configured backends:

```bash
# Basic message to console (default)
unified-logger send -m "Test message"

# Specify log level
unified-logger send -m "Error occurred" -l ERROR
unified-logger send -m "Debug info" -l DEBUG
unified-logger send -m "Critical alert" -l CRITICAL

# Available log levels: DEBUG, INFO, WARNING, ERROR, CRITICAL
```

##### Multiple Backends

```bash
# Send to multiple backends simultaneously
unified-logger send \
  -m "Application started" \
  -l INFO \
  -b console \
  -b file \
  -b elasticsearch

# Use all backends
unified-logger send -m "Test" -b console -b file -b elasticsearch -b kafka -b fluentd -b jaeger
```

##### Structured Logging with Extra Fields

Add contextual information using key=value pairs:

```bash
# Single extra field
unified-logger send -m "User login" -e "user_id=123"

# Multiple extra fields
unified-logger send \
  -m "User login" \
  -l INFO \
  -b console \
  -e "user_id=123" \
  -e "ip=192.168.1.1" \
  -e "action=login" \
  -e "duration_ms=45"

# Complex logging scenario
unified-logger send \
  -m "Payment processed" \
  -l INFO \
  -b elasticsearch \
  -b kafka \
  -e "transaction_id=tx_789" \
  -e "amount=99.99" \
  -e "currency=USD" \
  -e "status=success"
```
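Under the hood, `key=value` pairs like these are parsed into a flat dictionary of extra fields. A minimal sketch of such parsing (illustrative, not necessarily the CLI's exact implementation):

```python
def parse_extra(pairs: list[str]) -> dict[str, str]:
    """Split each "key=value" on the first '=' only, so values may contain '='."""
    extra = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got: {pair!r}")
        extra[key] = value
    return extra

fields = parse_extra(["user_id=123", "query=a=b"])
```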

##### Format Types

```bash
# JSON format (default)
unified-logger send -m "Test" -f json

# Plain text format
unified-logger send -m "Test" -f plain
```

### CLI Examples

**Development Testing**
```bash
# Quick console test
unified-logger send -m "Dev test" -l DEBUG

# Test file backend with rotation
unified-logger send -m "Testing file logging" -b file -l INFO
```

**Production Monitoring**
```bash
# Send structured logs to Elasticsearch and Kafka
unified-logger send \
  -m "Service health check" \
  -l INFO \
  -b elasticsearch \
  -b kafka \
  -e "service=api" \
  -e "status=healthy" \
  -e "response_time=23ms"
```

**Error Reporting**
```bash
# Log errors with context
unified-logger send \
  -m "Database connection failed" \
  -l ERROR \
  -b console \
  -b file \
  -b fluentd \
  -e "error_code=DB_CONN_TIMEOUT" \
  -e "retry_count=3" \
  -e "host=db.example.com"
```

**Tracing and Observability**
```bash
# Send trace to Jaeger
unified-logger send \
  -m "API request completed" \
  -l INFO \
  -b jaeger \
  -e "trace_id=abc123" \
  -e "span_id=def456" \
  -e "duration_ms=150"
```

## Configuration

Create a configuration file for easy setup:

```json
{
  "logger": {
    "name": "my_application",
    "level": "INFO", 
    "format_type": "json",
    "enable_standard_logging": true,
    "batch_size": 100,
    "batch_timeout": 5.0,
    "backends": ["console", "file", "elasticsearch"]
  },
  "backends": {
    "console": {
      "enabled": true,
      "stream": "stdout",
      "use_colors": true
    },
    "file": {
      "enabled": true,
      "file_path": "app.log",
      "max_bytes": 10485760,
      "backup_count": 5
    },
    "elasticsearch": {
      "enabled": true,
      "host": "localhost",
      "port": 9200,
      "index_prefix": "logs",
      "batch_size": 100
    }
  }
}
```
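A configuration file like this can be loaded with the standard `json` module and fed to `UnifiedLogger` programmatically. The sketch below shows the level-string-to-constant mapping; it does not assume any built-in loader helper in the library:

```python
import json
import logging

# Stands in for reading config.json from disk
config_text = """
{"logger": {"name": "my_application", "level": "INFO",
            "format_type": "json", "batch_size": 100, "batch_timeout": 5.0}}
"""

cfg = json.loads(config_text)["logger"]
level = getattr(logging, cfg["level"])  # "INFO" -> logging.INFO

# The values then map directly onto the UnifiedLogger constructor:
# logger = UnifiedLogger(
#     name=cfg["name"],
#     level=level,
#     format_type=cfg["format_type"],
#     batch_size=cfg["batch_size"],
#     batch_timeout=cfg["batch_timeout"],
# )
```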

## Supported Backends

### Console Backend
- Outputs to stdout/stderr
- Color support for different log levels
- Real-time output

### File Backend  
- Automatic log rotation
- Configurable file size and backup count
- Async file operations

### Elasticsearch Backend *(requires elasticsearch[async])*
- Structured log storage
- Time-based indices
- Batch operations for performance
- Full-text search capabilities

### Fluentd Backend *(requires fluent-logger)*
- Log aggregation and forwarding
- Structured data processing
- Enterprise log management

### Kafka Backend *(requires glean-klient)*
- High-throughput log streaming
- Partitioned topics
- Async message publishing
- Built-in compression

### OpenTelemetry Backend *(requires opentelemetry-api, opentelemetry-sdk, opentelemetry-exporter-otlp-proto-http)*
- Modern distributed tracing via OTLP protocol
- Compatible with Jaeger, Tempo, and other OTLP collectors
- Rich span context with attributes and events
- Parent-child span relationships for request flows
- Automatic batching and export

### Jaeger Backend *(deprecated - use OpenTelemetry)*
- Legacy Jaeger direct integration
- Consider migrating to OpenTelemetry backend

## Filters

Control which logs are processed:

### Level Filter
```python
from unified_logger.filters import LevelFilter

# Only allow WARNING and above
level_filter = LevelFilter(min_level=logging.WARNING)
```

### Field Filter
```python
from unified_logger.filters import FieldFilter

# Require specific fields
field_filter = FieldFilter(
    required_fields={"user_id", "request_id"},
    forbidden_fields={"password", "secret"}
)
```

### Rate Limit Filter
```python
from unified_logger.filters import RateLimitFilter

# Limit to 10 logs/second per logger
rate_filter = RateLimitFilter(
    max_logs_per_second=10.0,
    burst_size=20
)
```
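A rate limiter with a per-second rate plus a burst allowance is commonly implemented as a token bucket. The sketch below illustrates the idea only; it is not the library's actual implementation:

```python
import time

class TokenBucket:
    """Allow up to `rate` events/second, with a `burst` reserve for spikes."""
    def __init__(self, rate: float, burst: int):
        self.rate = rate
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill tokens in proportion to elapsed time, capped at the burst size
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # log would be dropped (or counted) when the bucket is empty

bucket = TokenBucket(rate=10.0, burst=2)
results = [bucket.allow() for _ in range(3)]  # third call exceeds the burst
```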

## Performance Considerations

- **Batching**: Logs are automatically batched for better performance
- **Async Operations**: Non-blocking I/O operations
- **Connection Pooling**: Efficient connection reuse for backends
- **Rate Limiting**: Prevents log flooding and resource exhaustion
- **Lazy Evaluation**: Expensive operations only performed when needed
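The batching behaviour can be pictured as a buffer that flushes whenever either `batch_size` is reached or `batch_timeout` elapses. A minimal size-and-age sketch (the library's async internals are more involved):

```python
import time

class Batcher:
    def __init__(self, batch_size: int = 100, batch_timeout: float = 5.0):
        self.batch_size = batch_size
        self.batch_timeout = batch_timeout
        self.buffer: list[str] = []
        self.oldest = 0.0
        self.flushed: list[list[str]] = []

    def add(self, record: str) -> None:
        if not self.buffer:
            self.oldest = time.monotonic()  # start the timeout clock
        self.buffer.append(record)
        age = time.monotonic() - self.oldest
        if len(self.buffer) >= self.batch_size or age >= self.batch_timeout:
            self.flush()

    def flush(self) -> None:
        if self.buffer:
            self.flushed.append(self.buffer)  # stands in for a backend write
            self.buffer = []

b = Batcher(batch_size=3)
for i in range(7):
    b.add(f"log {i}")
b.flush()  # final flush on shutdown picks up the remainder
```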

## Error Handling

The library provides graceful error handling:

- Backend failures don't stop other backends
- Automatic retry with exponential backoff
- Health checking for backend monitoring
- Fallback to standard Python logging
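Exponential backoff typically doubles the delay after each failed attempt up to a cap. A minimal async retry sketch; the delay values here are illustrative, not the library's defaults:

```python
import asyncio

async def retry_with_backoff(operation, max_attempts=4, base_delay=0.5, max_delay=8.0):
    """Retry an async operation, doubling the sleep between attempts."""
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return await operation()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up; caller may fall back to standard logging
            await asyncio.sleep(delay)
            delay = min(delay * 2, max_delay)

# Example: an operation that fails twice, then succeeds
calls = {"n": 0}
async def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("backend unavailable")
    return "ok"

result = asyncio.run(retry_with_backoff(flaky, base_delay=0.01))
```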

## Development

```bash
# Clone repository
git clone <repository-url>
cd unified_logger

# Install development dependencies
pip install -e ".[all]"

# Run example
python example.py

# Run CLI tests
unified-logger test
```

## Examples

The `examples/` directory contains real-world usage examples:

### trade_log_example.py
Demonstrates sending structured trading transaction logs to Elasticsearch:
- JSON formatted logging with rich context
- Trading-specific fields (symbol, quantity, price, etc.)
- Async logging with proper connection management

```python
# Run the example
python examples/trade_log_example.py
```

### trade_trace_example.py
Shows distributed tracing of a complete trading workflow using OpenTelemetry:
- Parent-child span relationships
- Trading flow: order → risk check → execution → settlement
- Rich span attributes and events
- OTLP export to Jaeger

```python
# Run the example
python examples/trade_trace_example.py
# View traces in Jaeger UI at http://<jaeger-host>:16686
```

See `example.py` for additional examples showing:
- Multiple backend configuration
- Structured logging patterns
- Error handling with context
- Performance monitoring
- Business event logging

## CLI Command Reference

| Command | Options | Description |
|---------|---------|-------------|
| `--help` | - | Show help message and exit |
| `--config`, `-c` | `<path>` | Specify configuration file path |
| `--verbose`, `-v` | - | Enable verbose output |
| `config-template` | `--output`, `-o` | Generate configuration file template |
| `test` | `--backend`, `-b` | Test backend connectivity (repeatable) |
| `send` | `--message`, `-m` | Send log message (required) |
| | `--level`, `-l` | Log level: DEBUG, INFO, WARNING, ERROR, CRITICAL |
| | `--backend`, `-b` | Target backend(s) - repeatable |
| | `--format-type`, `-f` | Format: json or plain |
| | `--extra`, `-e` | Extra fields in key=value format (repeatable) |

### Available Backends

- `console` - Standard output with color support
- `file` - File system with rotation
- `elasticsearch` - Search and analytics
- `fluentd` - Log aggregation
- `kafka` - Stream processing
- `otel` - OpenTelemetry distributed tracing (recommended)
- `jaeger` - Legacy Jaeger tracing (deprecated)

## Architecture

```
src/unified_logger/
├── __init__.py         # Public API exports
├── core/              # Core logging functionality
│   ├── logger.py      # UnifiedLogger main class
│   ├── backend.py     # Abstract backend interface
│   └── exceptions.py  # Custom exceptions
├── formatters/        # Log formatters
│   ├── base.py        # Abstract formatter
│   ├── json_formatter.py
│   └── plain_formatter.py
├── handlers/          # Backend implementations
│   ├── console.py     # Console output
│   ├── file.py        # File system
│   ├── elasticsearch.py
│   ├── fluentd.py
│   ├── kafka.py
│   ├── otel.py        # OpenTelemetry (OTLP)
│   └── jaeger.py      # Legacy Jaeger
├── filters/           # Log filtering logic
│   ├── level_filter.py
│   ├── field_filter.py
│   └── rate_limit.py
└── cli/              # Command-line interface
    └── main.py       # Click-based CLI implementation
```

### Key Classes

- **UnifiedLogger**: Main async logger interface with batching and backend management
- **LoggingBackend**: Abstract base class for all backend implementations
- **LogRecord**: Structured log record with timestamp, level, message, and context
- **LogFormatter**: Abstract formatter interface for JSON/plain text output
- **Filters**: Level, field, and rate-limit filtering for log processing

## API Reference

### UnifiedLogger Methods

#### Initialization

```python
logger = UnifiedLogger(
    name="app_name",           # Logger name
    level=logging.INFO,         # Minimum log level
    format_type="json",         # Default format: "json" or "plain"
    backends=None,              # List of backend instances
    enable_standard_logging=True,  # Also log to Python logging
    max_workers=4,              # Thread pool size
    batch_size=100,             # Max logs per batch
    batch_timeout=5.0           # Max seconds before flushing batch
)
```

#### Logging Methods

All logging methods support:
- `message`: Log message string (required)
- `extra`: Dictionary of additional context fields (optional)
- `exc_info`: Exception tuple for error logging (optional)

```python
# Log levels (async methods)
await logger.debug("Debug message", extra={...})
await logger.info("Info message", extra={...})
await logger.warning("Warning message", extra={...})  # or logger.warn()
await logger.error("Error message", extra={...}, exc_info=...)
await logger.critical("Critical message", extra={...})  # or logger.fatal()

# Generic log method
await logger.log(logging.INFO, "Message", extra={...})
```

#### Backend Management

```python
# Add a backend
logger.add_backend(ConsoleBackend())

# Remove a backend by name
logger.remove_backend("console")

# Connect all backends
results = await logger.connect_backends()  # Returns: {backend_name: success}

# Disconnect all backends
results = await logger.disconnect_backends()

# Flush pending logs
await logger.flush()
```

#### Context Manager

```python
# Automatic connection/disconnection
async with logger:
    await logger.info("Message")
    # Backends auto-connected on enter, disconnected on exit
```

### Backend Configuration

Each backend accepts a configuration dataclass:

```python
from unified_logger.handlers.console import ConsoleBackendConfig
from unified_logger.handlers.file import FileBackendConfig
from unified_logger.handlers.elasticsearch import ElasticsearchBackendConfig

# Console configuration
console_config = ConsoleBackendConfig(
    enabled=True,
    timeout=1.0,
    stream="stdout",  # or "stderr"
    use_colors=True
)

# File configuration
file_config = FileBackendConfig(
    enabled=True,
    timeout=5.0,
    file_path="app.log",
    max_bytes=10*1024*1024,  # 10MB
    backup_count=5,
    encoding="utf-8"
)

# Elasticsearch configuration
es_config = ElasticsearchBackendConfig(
    enabled=True,
    timeout=10.0,
    host="localhost",
    port=9200,
    index_prefix="logs",
    username=None,
    password=None,
    use_ssl=False,
    batch_size=100
)
```

### Filter Application

```python
from unified_logger.filters import LevelFilter, FieldFilter, RateLimitFilter
import logging

# Create filters
level_filter = LevelFilter(min_level=logging.WARNING)
field_filter = FieldFilter(
    required_fields={"user_id"},
    forbidden_fields={"password"}
)
rate_filter = RateLimitFilter(
    max_logs_per_second=10.0,
    burst_size=20
)

# Apply to backend (if backend supports filtering)
backend = ConsoleBackend()
backend.add_filter(level_filter)
```

## Important Notes

### UTC Timestamps
All timestamps in unified-logger are automatically converted to UTC using timezone-aware datetime objects, which keeps log correlation consistent across distributed systems and time zones.
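In practice this means timestamps behave like the following stdlib snippet; naive local datetimes are avoided entirely:

```python
from datetime import datetime, timezone

# Timezone-aware UTC timestamp, as used for log records
ts = datetime.now(timezone.utc)

# An aware datetime carries its offset in the ISO string (+00:00)
iso = ts.isoformat()
```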

### OpenTelemetry vs Jaeger Backend
The **OpenTelemetry backend** (`otel`) is the recommended approach for distributed tracing:
- Modern OTLP protocol support
- Works with Jaeger, Tempo, and other OTLP-compatible collectors
- Better span context management and attributes
- Active development and community support

The legacy **Jaeger backend** is deprecated and maintained only for backward compatibility.

## Requirements

- Python 3.12+
- Optional backend-specific dependencies (see Installation section)

### CLI Entry Point

The `unified-logger` command is automatically installed as a console script when you install the package:

```bash
# After installation, the CLI is available globally
which unified-logger

# View available commands
unified-logger --help
```

## License

[Add your license information here]

## Contributing

[Add contribution guidelines here]