Metadata-Version: 2.4
Name: market-data-core
Version: 1.2.15
Summary: Core contracts (protocols, models, errors, settings, telemetry, federation) for market data systems.
Author: Matt
License-Expression: MIT
Keywords: market-data,schemas,contracts,telemetry
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Typing :: Typed
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pydantic<3,>=2.7
Requires-Dist: python-dotenv<2,>=1.0
Provides-Extra: dev
Requires-Dist: pytest<9,>=8.2; extra == "dev"
Requires-Dist: pytest-asyncio<1,>=0.23; extra == "dev"
Requires-Dist: ruff<1,>=0.6; extra == "dev"
Requires-Dist: black<25,>=24.8; extra == "dev"
Requires-Dist: mypy<2,>=1.11; extra == "dev"
Provides-Extra: bus
Requires-Dist: redis<6,>=5.0; extra == "bus"
Provides-Extra: api
Requires-Dist: fastapi<0.116,>=0.115; extra == "api"
Requires-Dist: uvicorn[standard]<0.33,>=0.30; extra == "api"
Requires-Dist: httpx<0.29,>=0.28; extra == "api"
Requires-Dist: websockets<13,>=12; extra == "api"
Provides-Extra: observability
Requires-Dist: tenacity<9,>=8.2; extra == "observability"
Requires-Dist: loguru<0.8,>=0.7; extra == "observability"
Requires-Dist: prometheus-client<0.21,>=0.20; extra == "observability"
Requires-Dist: orjson<4,>=3.10; extra == "observability"
Provides-Extra: pandas
Requires-Dist: pandas<3,>=2.2.2; extra == "pandas"
Requires-Dist: orjson<4,>=3.10; extra == "pandas"
Provides-Extra: ibkr
Requires-Dist: ib_async>=1.0.0; extra == "ibkr"
Provides-Extra: compat
Requires-Dist: fastapi<0.116,>=0.115; extra == "compat"
Requires-Dist: uvicorn[standard]<0.33,>=0.30; extra == "compat"
Requires-Dist: httpx<0.29,>=0.28; extra == "compat"
Requires-Dist: websockets<13,>=12; extra == "compat"
Requires-Dist: tenacity<9,>=8.2; extra == "compat"
Requires-Dist: loguru<0.8,>=0.7; extra == "compat"
Requires-Dist: prometheus-client<0.21,>=0.20; extra == "compat"
Requires-Dist: orjson<4,>=3.10; extra == "compat"
Requires-Dist: pandas<3,>=2.2.2; extra == "compat"
Requires-Dist: ib_async>=1.0.0; extra == "compat"
Requires-Dist: redis<6,>=5.0; extra == "compat"
Dynamic: license-file

# 🚀 Market Data Core

> **Enterprise-grade real-time market data streaming platform with Interactive Brokers integration**

[![CI](https://github.com/mjdevaccount/market-data-core/workflows/CI/badge.svg)](https://github.com/mjdevaccount/market-data-core/actions)
[![PyPI version](https://img.shields.io/pypi/v/market-data-core.svg)](https://pypi.org/project/market-data-core/)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![FastAPI](https://img.shields.io/badge/FastAPI-0.104+-green.svg)](https://fastapi.tiangolo.com/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

A production-ready, high-performance market data streaming platform built with **SOLID principles**, **async generators**, and **enterprise-grade resilience patterns**. Features true streaming APIs, automatic reconnection, backpressure controls, and comprehensive observability.

## 📦 Installation

**From PyPI (Recommended):**
```bash
pip install market-data-core
```

**Specific version:**
```bash
pip install market-data-core==1.2.0
```

**From source:**
```bash
git clone https://github.com/mjdevaccount/market-data-core.git
cd market-data-core
pip install -e .[dev]
```

## 🐳 Docker Deployment

**NEW: Docker Compose Platform Integration** - Deploy as part of the unified market data platform!

This service is designed to run within the [market_data_infra](https://github.com/mjdevaccount/market_data_infra) Docker Compose stack:

```bash
# From the market_data_infra directory
cd market_data_infra

# Start core services (database + registry + core)
make up-core

# Check health
curl http://localhost:8081/health

# View logs
docker compose logs -f core
```

**Service Details**:
- **Port**: 8081 (changed from 8000 for platform consistency)
- **Health Check**: `GET /health` (checked every 10s)
- **Metrics**: `GET /metrics` (Prometheus-compatible)
- **Profiles**: `core` (use `--profile core` to start)

**Environment Variables**:
```bash
REGISTRY_URL=http://registry:8080      # Schema registry URL
REGISTRY_TRACK=v1                       # Registry version track
DATABASE_URL=postgresql://...           # Database connection
LOG_LEVEL=INFO                          # Logging level
```

For detailed Docker deployment documentation, see:
- **[DOCKER_DEPLOYMENT.md](DOCKER_DEPLOYMENT.md)** - Comprehensive deployment guide
- **[INFRASTRUCTURE_READINESS.md](INFRASTRUCTURE_READINESS.md)** - Platform integration overview
- **[PHASE_1_CHECKLIST.md](PHASE_1_CHECKLIST.md)** - Quick reference checklist

## 🆕 What's New in v1.2.0

**Phase 11.1: Enforcement & Drift Intelligence** - Schema governance with enforcement modes and drift detection!

### Schema Enforcement
```python
from market_data_core import get_enforcement_mode, should_enforce_strict

# Check enforcement mode (set via REGISTRY_ENFORCEMENT env var)
mode = get_enforcement_mode()  # Returns 'warn' or 'strict'

if should_enforce_strict():
    raise SchemaValidationError("Invalid payload")
else:
    logger.warning("Validation failed (warn mode)")
```

### Schema Lifecycle Events
```python
from market_data_core import (
    SchemaPublishedEvent,
    SchemaDeprecatedEvent,
    SchemaDriftEvent,
)

# Emit events when schemas change or drift
drift_event = SchemaDriftEvent(
    service="market-data-pipeline",
    schema_name="telemetry.FeedbackEvent",
    track="v1",
    local_sha="abc123...",
    registry_sha="def456...",
    drift_type="content_mismatch",
    detected_at=time.time()
)
```

### What's New in v1.1.0

**Unified Telemetry & Federation Contracts** - Now with standardized contracts for multi-component observability and federation!

### Telemetry Contracts

Standardized backpressure, health, metrics, and control contracts for unified observability:

```python
from market_data_core.telemetry import (
    FeedbackEvent,
    BackpressureLevel,
    HealthStatus,
    MetricPoint,
    ControlAction,
)

# Backpressure feedback
event = FeedbackEvent(
    coordinator_id="bars_coordinator",
    queue_size=800,
    capacity=1000,
    level=BackpressureLevel.soft,
    ts=time.time()
)

# Health status
status = HealthStatus(
    service="market-data-core",
    state="healthy",
    components=[...],
    version="1.1.0",
    ts=time.time()
)
```

### Federation Contracts

Multi-node deployment support with cluster topology contracts:

```python
from market_data_core.federation import (
    ClusterTopology,
    NodeStatus,
    NodeRole,
)

# Define cluster topology
topology = ClusterTopology(
    cluster_id=ClusterId(value="prod-us-east"),
    region=Region(name="us-east-1"),
    nodes=[
        NodeStatus(
            node_id=NodeId(value="orchestrator-01"),
            role=NodeRole.orchestrator,
            health="healthy",
            version="0.3.0",
            last_seen_ts=time.time()
        ),
    ]
)
```

### Registry Contracts

Declarative provider/sink specifications for wiring:

```python
from market_data_core.registry import ProviderSpec, SinkSpec
from market_data_core.settings import WiringPlan, ProviderConfig

# Describe providers and sinks
plan = WiringPlan(
    providers=[
        ProviderConfig(name="ibkr", params={"host": "127.0.0.1", "port": 4002}),
    ],
    sinks=[
        SinkConfig(name="bars_sink", params={"db_url": "postgresql://..."}),
    ]
)
```

**📋 See [CHANGELOG.md](CHANGELOG.md) for complete list of additions.**

**✅ 100% Backward Compatible** - All additions are optional. Existing code works unchanged.

---

## 🏗️ Architecture Highlights

### **True Streaming Architecture**
- **Async Generators**: Real-time data streams using `async for` loops
- **Shared Tickers**: Single subscription per symbol, fan-out to multiple clients
- **Backpressure Control**: Per-client rate limiting with `asyncio.Queue`
- **Hot Symbol Cache**: Instant snapshots for late joiners

### **Enterprise Resilience**
- **Circuit Breaker**: Automatic failure detection and recovery
- **Exponential Backoff**: Intelligent retry with jitter
- **Connection Rehydration**: Automatic subscription recovery after reconnects
- **Graceful Degradation**: Synthetic data fallbacks during outages

### **Production Observability**
- **Prometheus Metrics**: Request rates, error counts, latency histograms
- **Structured Logging**: JSON logs with request tracing
- **Health Endpoints**: `/healthz` and `/readyz` for monitoring
- **Performance Tracking**: WebSocket connection stats and queue metrics

## 🚀 Quick Start

### Prerequisites
- **Interactive Brokers TWS/IB Gateway** running on `127.0.0.1:4001`
- **Python 3.11+** with virtual environment
- **IBKR account** with API access enabled

### Installation
```bash
# Clone and setup
git clone https://github.com/mjdevaccount/market-data-core.git
cd market-data-core
python -m venv .venv
.\.venv\Scripts\Activate.ps1  # Windows PowerShell

# Install with development dependencies
pip install -e .[dev]

# Run pre-commit hooks
pre-commit install
```

### Start the Platform
```bash
# Development server with hot reload
uvicorn market_data_core.services.api:app --host 0.0.0.0 --port 8000 --reload

# Production server
uvicorn market_data_core.services.api:app --host 0.0.0.0 --port 8000 --workers 4
```

### API Documentation
- **Interactive Docs**: http://localhost:8000/docs
- **ReDoc**: http://localhost:8000/redoc
- **OpenAPI Schema**: http://localhost:8000/openapi.json

## 🗄️ Data Persistence Integration

Market Data Core integrates with **market-data-store** for comprehensive data persistence:

### **Market Data Store Package**
```python
# Import the market data store package
import market_data_store

# Access version information
print(f"Market Data Store version: {market_data_store.__version__}")

# Direct access to persistence operations
from mds_client import MDS, AMDS, Bar, Fundamentals, News, OptionSnap
```

### **Available Data Operations**
- **Bars/OHLCV**: Time-series price data with multiple timeframes
- **Fundamentals**: Company financial data (assets, liabilities, earnings)
- **News**: Market news with sentiment analysis
- **Options**: Options market data with Greeks (delta, gamma, IV)

### **CLI Operations** (via `mds` command)
```bash
# Health & Schema
mds ping                    # Database connectivity check
mds schema-version         # Get current schema version
mds latest-prices          # Get latest prices for symbols

# Individual Write Operations
mds write-bar              # Write single OHLCV bar
mds write-fundamental      # Write company fundamentals
mds write-news             # Write news article
mds write-option           # Write options data

# Bulk Operations
mds ingest-ndjson          # Bulk ingest from NDJSON files
mds ingest-ndjson-async    # Async bulk ingest

# Export/Import Operations
mds dump                    # Export to CSV
mds restore                 # Import from CSV
mds restore-async           # Async CSV import
mds dump-ndjson             # Export to NDJSON
mds dump-ndjson-async       # Async NDJSON export

# Job Queue Operations
mds enqueue-job             # Queue background jobs
```

### **Python Library Integration**
```python
# Synchronous operations
from mds_client import MDS
mds = MDS({"dsn": "postgresql://...", "tenant_id": "uuid"})

# Write market data
mds.upsert_bars([bar_data])
mds.upsert_fundamentals([fundamental_data])
mds.upsert_news([news_data])
mds.upsert_options([option_data])

# Read operations
latest_prices = mds.latest_prices(["AAPL", "MSFT"], vendor="ibkr")

# Async operations for high-performance scenarios
from mds_client import AMDS, AsyncBatchProcessor
amds = AMDS({"dsn": "postgresql://...", "tenant_id": "uuid", "pool_max": 10})
```

### **Key Features**
- **Tenant Isolation**: Row Level Security (RLS) ensures data separation
- **TimescaleDB Integration**: Optimized for time-series data
- **Connection Pooling**: High-performance async/sync connection management
- **Batch Processing**: Efficient bulk operations with configurable batching
- **Idempotent Operations**: Safe retry and upsert semantics
- **Production Ready**: Comprehensive error handling, logging, and monitoring

### **Quick Reference**
For detailed operation documentation, see:
- **CLI Operations**: [Market Data Store Operations Cheat Sheet](../market_data_store/cursorrules/rules/market_data_store_operations.mdc)
- **Python Library**: [Market Data Store Client Library](../market_data_store/src/mds_client/)
- **Data Models**: [Market Data Store Models](../market_data_store/src/mds_client/models.py)

## 📡 API Reference

### **REST Endpoints**

#### Market Data
```http
GET /prices?symbol=AAPL&interval=1d&limit=100&what=TRADES
GET /options?symbol=AAPL&expiry=20241220&max_contracts=50
```

#### Account Management
```http
GET /positions
GET /account?account_id=U123456
```

#### System Health
```http
GET /healthz    # IBKR connection + heartbeat
GET /readyz     # Streaming registry status
GET /metrics    # Prometheus metrics
```

### **WebSocket Streaming**

#### Real-time Quotes
```javascript
const ws = new WebSocket('ws://localhost:8000/ws/quotes/AAPL');
ws.onmessage = (event) => {
    const quote = JSON.parse(event.data);
    console.log(`AAPL: ${quote.bid}/${quote.ask} @ ${quote.last}`);
};
```

#### Market Depth (Level 2)
```javascript
const ws = new WebSocket('ws://localhost:8000/ws/depth/AAPL');
ws.onmessage = (event) => {
    const depth = JSON.parse(event.data);
    console.log('Bids:', depth.bids);  // [[price, size], ...]
    console.log('Asks:', depth.asks);  // [[price, size], ...]
};
```

#### Portfolio Updates
```javascript
const ws = new WebSocket('ws://localhost:8000/ws/portfolio/U123456');
ws.onmessage = (event) => {
    const portfolio = JSON.parse(event.data);
    console.log(`P&L: ${portfolio.unrealized_pnl}`);
    console.log(`Positions: ${portfolio.positions.length}`);
};
```

## 🏛️ Architecture Deep Dive

### **Streaming Engine**
```python
# True async generators - no polling!
async def stream_quotes(symbol: str) -> AsyncIterator[Quote]:
    """Real-time quote stream using shared tickers."""
    ticker = await get_or_create_ticker(symbol)
    while True:
        await ib.waitOnUpdate()  # Wait for IBKR updates
        yield Quote(
            symbol=symbol,
            bid=ticker.bid,
            ask=ticker.ask,
            last=ticker.last,
            volume=ticker.volume,
            delayed=False
        )
```

### **Connection Resilience**
```python
# Automatic reconnection with subscription rehydration
@retry(wait=wait_exponential(multiplier=1, min=1, max=30))
async def ensure_connection(self) -> None:
    """Reconnect and rehydrate all active subscriptions."""
    if not self.connected:
        await self.connect()
        await self.rehydrate_subscriptions()  # Restore all streams
```

### **Backpressure Control**
```python
# Per-client rate limiting with queue management
class WebSocketManager:
    def __init__(self):
        self.client_queues: dict[str, asyncio.Queue] = {}
        self.max_queue_size = 100
        self.max_messages_per_second = 10
    
    async def _process_client_queue(self, client_id: str):
        """Process client queue with rate limiting."""
        queue = self.client_queues[client_id]
        while True:
            message = await queue.get()
            await self._send_with_backpressure(client_id, message)
```

## 🔧 Configuration

### **Environment Variables**
```bash
# IBKR Connection
IB_HOST=127.0.0.1
IB_PORT=4001
IB_CLIENT_ID=7

# Options Pacing (Rate Limiting)
OPTIONS_SEMAPHORE_SIZE=5
OPTIONS_BASE_DELAY=0.1
OPTIONS_BACKOFF_MULTIPLIER=1.5
OPTIONS_MAX_CONTRACTS=50

# WebSocket Settings
WS_MAX_QUEUE_SIZE=100
WS_MAX_MESSAGES_PER_SECOND=10
WS_HEARTBEAT_INTERVAL=30

# Observability
METRICS_ENABLED=true
METRICS_PORT=8000
LOG_LEVEL=INFO
```

### **Advanced Configuration**
```python
# Custom pacing controls
config = get_options_config()
config.semaphore_size = 10
config.base_delay = 0.05
config.backoff_multiplier = 2.0

# WebSocket backpressure tuning
websocket_manager.max_queue_size = 200
websocket_manager.max_messages_per_second = 20
```

## 📊 Monitoring & Observability

### **Prometheus Metrics**
```bash
# Request metrics
market_data_requests_total{method="GET",endpoint="/prices",status_code="200"}
market_data_request_duration_seconds{method="GET",endpoint="/prices"}

# WebSocket metrics
websocket_clients_total{stream_type="quotes"}
websocket_messages_sent_total{stream_type="quotes"}
websocket_dropped_messages_total{reason="queue_full"}

# IBKR connection metrics
ibkr_connection_status{status="connected"}
ibkr_reconnects_total
ibkr_subscriptions_total{type="quotes"}
```

### **Health Checks**
```bash
# Basic health
curl http://localhost:8000/healthz
# {"status": "healthy", "ibkr_connected": true, "last_heartbeat": "2024-01-15T10:30:00Z"}

# Readiness check
curl http://localhost:8000/readyz
# {"status": "ready", "subscriptions_rehydrated": true, "active_streams": 5}
```

## 🧪 Testing

### **Run Test Suite**
```bash
# Full test suite
pytest -v

# Specific test categories
pytest tests/test_api.py -v
pytest tests/test_ibkr_streams.py -v
pytest tests/test_websocket_streaming.py -v

# With coverage
pytest --cov=src --cov-report=html
```

### **Test Categories**
- **Unit Tests**: Individual component testing
- **Integration Tests**: IBKR adapter functionality
- **WebSocket Tests**: Real-time streaming validation
- **Error Handling**: Resilience pattern testing
- **Performance Tests**: Backpressure and rate limiting

## 🚀 Production Deployment

### **Docker Deployment**
```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY src/ ./src/
CMD ["uvicorn", "market_data_core.services.api:app", "--host", "0.0.0.0", "--port", "8000"]
```

### **Kubernetes Deployment**
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: market-data-core
spec:
  replicas: 3
  selector:
    matchLabels:
      app: market-data-core
  template:
    metadata:
      labels:
        app: market-data-core
    spec:
      containers:
      - name: market-data-core
        image: market-data-core:latest
        ports:
        - containerPort: 8000
        env:
        - name: IB_HOST
          value: "ib-gateway-service"
        - name: IB_PORT
          value: "4001"
```

### **Load Balancing**
```nginx
upstream market_data {
    server market-data-1:8000;
    server market-data-2:8000;
    server market-data-3:8000;
}

server {
    listen 80;
    location / {
        proxy_pass http://market_data;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }
}
```

## 🔒 Security & Compliance

### **API Security**
- **Input Validation**: Pydantic models with strict type checking
- **Rate Limiting**: Per-endpoint and per-client throttling
- **Error Handling**: Structured error responses without data leakage
- **CORS**: Configurable cross-origin resource sharing

### **Data Privacy**
- **No Data Persistence**: All data is streamed, not stored
- **Secure Connections**: TLS/SSL for production deployments
- **Access Control**: Environment-based configuration management

## 🤝 Contributing

### **Development Setup**
```bash
# Install development dependencies
pip install -e .[dev]

# Install pre-commit hooks
pre-commit install

# Run code quality checks
ruff check src tests
black --check src tests
mypy src
pytest
```

### **Code Standards**
- **SOLID Principles**: Clean architecture with separation of concerns
- **Type Safety**: Full mypy compliance with strict type checking
- **Code Quality**: Ruff linting with 100% compliance
- **Testing**: Comprehensive test coverage with pytest
- **Documentation**: Comprehensive docstrings and type hints

## 📈 Performance Characteristics

### **Throughput**
- **REST API**: 1000+ requests/second
- **WebSocket**: 10,000+ messages/second per connection
- **Memory Usage**: <100MB base + 1MB per active stream
- **Latency**: <10ms for REST, <5ms for WebSocket

### **Scalability**
- **Horizontal**: Stateless design supports multiple instances
- **Vertical**: Async architecture scales with CPU cores
- **Connection Pooling**: Efficient IBKR connection reuse
- **Backpressure**: Automatic flow control prevents memory issues

## 🛠️ Troubleshooting

### **Common Issues**

#### Connection Problems
```bash
# Check IBKR connection
curl http://localhost:8000/healthz

# Verify TWS/IB Gateway is running
netstat -an | grep 4001
```

#### WebSocket Issues
```bash
# Check WebSocket connections
curl http://localhost:8000/metrics | grep websocket

# Monitor queue sizes
curl http://localhost:8000/readyz
```

#### Performance Issues
```bash
# Check request rates
curl http://localhost:8000/metrics | grep requests_total

# Monitor memory usage
curl http://localhost:8000/metrics | grep memory
```

## 📚 Additional Resources

- **Interactive Brokers API**: https://interactivebrokers.github.io/tws-api/
- **FastAPI Documentation**: https://fastapi.tiangolo.com/
- **WebSocket Standards**: https://tools.ietf.org/html/rfc6455
- **Prometheus Metrics**: https://prometheus.io/docs/concepts/metric_types/

## 📄 License

MIT License - see [LICENSE](LICENSE) file for details.

## 🔗 Cross-Repo Contract Testing

Core automatically validates contracts against downstream repos (Pipeline, Store, Orchestrator) when PRs are opened.

### How It Works

```
PR opened → contracts workflow → schemas exported → fanout triggered → downstream tests run
```

### For Developers

**Opening a PR:**
- Contracts automatically export 19 schemas
- Fanout workflow dispatches to downstream repos with exact Core SHA
- Downstream tests run in parallel
- Results visible in Actions tab

**Manual Testing:**
```bash
# Test contracts
gh workflow run contracts.yml --ref your-branch

# Test multiple Core versions
gh workflow run contracts_matrix.yml -f refs='["v1.1.0","base"]'
```

### Downstream Integration

Each downstream repo needs:
- `.github/workflows/_contracts_reusable.yml` — Test runner
- `.github/workflows/dispatch_contracts.yml` — Dispatcher
- `tests/contracts/` — Contract compatibility tests
- `REPO_TOKEN` secret configured

See [Cross-Repo Testing Guide](docs/CROSS_REPO_TESTING.md) for complete integration instructions.

### Key Features

- ✅ **Automatic validation** — No manual steps required
- ✅ **Exact versions** — Tests against precise Core SHA
- ✅ **Early detection** — Breaking changes caught before merge
- ✅ **Parallel execution** — All repos tested simultaneously
- ✅ **Fast** — Complete validation in ~5 minutes

---

**Built with ❤️ for high-frequency trading and real-time market data applications.**
