# Textforge Performance Guide

This document describes performance characteristics, optimizations, and measurement tools for Textforge.

## Performance Features

### Caching Systems

**Grapheme Width Cache**: An LRU cache of 4096 entries stores Unicode grapheme width results, so repeated graphemes skip the comparatively expensive width calculation.
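
The idea can be sketched with the standard library alone. The helper name `cluster_width` and the width rules below are assumptions made for illustration, not Textforge's actual implementation; only the 4096-entry LRU size comes from the description above.

```python
import unicodedata
from functools import lru_cache


@lru_cache(maxsize=4096)  # same entry count as the cache described above
def cluster_width(cluster: str) -> int:
    """Approximate terminal cell width of one grapheme cluster (hypothetical helper)."""
    width = 0
    for ch in cluster:
        # Combining marks and format characters (e.g. ZWJ) occupy no cell.
        if unicodedata.combining(ch) or unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue
        # East Asian Wide/Fullwidth characters take two cells, everything else one.
        cell = 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        width = max(width, cell)
    return width
```

Repeated calls with the same cluster are served from the cache, which `cluster_width.cache_info()` makes visible through its hit counter.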

**Color Caching**: Theme-aware color caching with automatic invalidation on theme changes. Prevents redundant ANSI escape sequence generation for repeated color requests.
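
A minimal sketch of theme-aware caching with invalidation follows. The class `ThemedColorCache`, its methods, and the use of 24-bit SGR sequences are all assumptions chosen for illustration; the real cache may be keyed and structured differently.

```python
from typing import Dict, Tuple

Rgb = Tuple[int, int, int]


class ThemedColorCache:
    """Hypothetical cache from theme color names to ANSI foreground sequences."""

    def __init__(self, theme: Dict[str, Rgb]) -> None:
        self._theme = dict(theme)
        self._cache: Dict[str, str] = {}

    def set_theme(self, theme: Dict[str, Rgb]) -> None:
        self._theme = dict(theme)
        self._cache.clear()  # invalidate so no sequence from the old theme survives

    def fg(self, name: str) -> str:
        seq = self._cache.get(name)
        if seq is None:
            r, g, b = self._theme[name]
            seq = f"\x1b[38;2;{r};{g};{b}m"  # SGR 38;2 selects a 24-bit foreground
            self._cache[name] = seq
        return seq
```

Clearing the whole cache on a theme change is the simplest invalidation policy; it trades a few recomputed sequences for the guarantee that stale colors are never returned.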

**Lazy Rendering**: `LazyRenderable` class with optional caching for deferred component construction and measurement.
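
The deferred-construction pattern might look like the sketch below. `LazySketch` is a hypothetical stand-in written for this guide; it does not reproduce the signature of `LazyRenderable`, which is not documented here.

```python
from typing import Callable, Optional


class LazySketch:
    """Hypothetical stand-in: builds its content only when render() is called."""

    def __init__(self, factory: Callable[[], str], cache: bool = True) -> None:
        self._factory = factory
        self._cache = cache
        self._value: Optional[str] = None

    def render(self) -> str:
        if self._cache and self._value is not None:
            return self._value  # reuse the earlier result
        value = self._factory()  # the expensive construction happens here
        if self._cache:
            self._value = value
        return value
```

With `cache=False` the factory runs on every call, which suits content that changes between renders.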

### Rendering Optimizations

**VDOM Diffing**: Line-level diffing algorithm for efficient live updates. Only modified lines are rewritten using ANSI cursor positioning, avoiding full screen redraws.
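
As a rough illustration of line-level diffing, the sketch below compares two frames row by row and emits output only for rows that changed. The function name `diff_lines` is hypothetical, and the sketch assumes the frame starts at the top-left corner of the screen so that absolute cursor positioning can be used.

```python
from typing import List


def diff_lines(old: List[str], new: List[str]) -> str:
    """Return ANSI output that rewrites only the rows that differ (hypothetical helper)."""
    out = []
    for row in range(max(len(old), len(new))):
        before = old[row] if row < len(old) else None
        after = new[row] if row < len(new) else ""
        if before != after:
            # CSI <row>;1H moves the cursor (1-based), CSI 2K erases that line.
            out.append(f"\x1b[{row + 1};1H\x1b[2K{after}")
    return "".join(out)
```

Identical frames produce an empty string, so a steady-state update writes nothing to the terminal.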

**Live Session Updates**: Optimized terminal updates with cursor movement and selective line rewriting. Maintains steady-state performance for real-time interfaces.

**Threading**: The GUI backend runs on a dedicated UI thread and synchronizes with the main rendering thread, so GUI work does not block rendering.
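
One common way to get this separation is a thread-safe queue between the two threads. The sketch below is an assumption about the general pattern, not a description of the GUI backend's internals; `UiThreadSketch` and its methods are hypothetical.

```python
import queue
import threading
from typing import List, Optional


class UiThreadSketch:
    """Hypothetical worker: the renderer posts frames, a UI thread presents them."""

    def __init__(self) -> None:
        self._frames: "queue.Queue[Optional[str]]" = queue.Queue()
        self.presented: List[str] = []
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def post(self, frame: str) -> None:
        self._frames.put(frame)  # returns immediately; the caller is not blocked

    def stop(self) -> None:
        self._frames.put(None)  # sentinel asks the UI thread to exit
        self._thread.join()

    def _run(self) -> None:
        while True:
            frame = self._frames.get()
            if frame is None:
                break
            self.presented.append(frame)  # stand-in for drawing to a window
```

The queue is the only shared state, so no explicit locks are needed in this sketch.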

### Measurement Tools

Benchmarks and metrics are built into the CLI and utils.

#### Quick start

- Import time (cold): `textforge bench --metrics --python python`
- First render and live update metrics: `textforge bench --metrics`
- Representative rendering suite: `textforge bench`
- Micro-benchmarks (markup/layout/diff/gradient): `textforge bench --micro`

## Performance Characteristics

### Import Time
- Measured by importing in a fresh interpreter subprocess, so warm caches do not skew the result
- Target: under roughly 50 ms on modern hardware
- Optional dependencies (Jupyter, legacy Windows) excluded from base import
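
The cold-import measurement described above can be approximated as follows. The helper `cold_import_ms` is hypothetical, and its `python` parameter only mirrors the idea of choosing an interpreter; it is not the implementation behind `textforge bench`.

```python
import subprocess
import sys


def cold_import_ms(module: str, python: str = sys.executable) -> float:
    """Time `import <module>` in a fresh interpreter and return milliseconds."""
    code = (
        "import time; t0 = time.perf_counter(); "
        f"import {module}; "
        "print((time.perf_counter() - t0) * 1000.0)"
    )
    # A new process starts with an empty sys.modules, so nothing is pre-imported.
    result = subprocess.run(
        [python, "-c", code], capture_output=True, text=True, check=True
    )
    return float(result.stdout.strip())
```

For example, `cold_import_ms("json")` times a standard-library import. The sketch assumes the module prints nothing to stdout while importing. For a per-module breakdown, CPython's `-X importtime` option is an alternative.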

### Rendering Performance
- **First render latency**: Time for `Console.print` to write a short string to an in-memory stream
- **Live update cost**: Average milliseconds per frame for `LiveSession.update` with small diffs, measured over 120 frames
- **Layout solving**: Flex/grid layout computation, including constraint resolution

### Micro-benchmarks
- **markup-parse**: Nested tags, color resets, and custom tag handlers (500 iterations)
- **gradient-gen**: Foreground gradient across long strings (200 iterations)
- **layout-solve**: Flex row with wrapping and gaps (300 iterations)
- **vdom-diff**: Line diffing over small edits (500 iterations)
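
A harness of this shape can be built on `timeit`. The sketch below is only an illustration: `run_micro` is a hypothetical name, and the two workloads are trivial stand-ins rather than Textforge's real markup parser or differ. The iteration counts match the list above.

```python
import timeit
from typing import Callable, Dict, Tuple

Case = Tuple[Callable[[], object], int]  # (workload, iteration count)


def run_micro(cases: Dict[str, Case], warmup: int = 5) -> Dict[str, float]:
    """Return average milliseconds per iteration for each named case."""
    results: Dict[str, float] = {}
    for name, (fn, iterations) in cases.items():
        for _ in range(warmup):
            fn()  # warm-up runs are executed but not timed
        total_s = timeit.timeit(fn, number=iterations)
        results[name] = total_s * 1000.0 / iterations
    return results


# Stand-in workloads; replace with real calls when measuring the library.
cases: Dict[str, Case] = {
    "markup-parse": (lambda: "[b]hi[/b]".replace("[b]", "").replace("[/b]", ""), 500),
    "vdom-diff": (lambda: [a != b for a, b in zip("abc", "abd")], 500),
}
```

Calling `run_micro(cases)` returns a dictionary keyed by benchmark name, which is convenient for comparing two runs.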

### Profiling Utilities

**`@time_callable` decorator**: Measures function execution time with automatic reporting.

**`time_block` context manager**: Times code blocks with labeled output.

**Built-in profiling**: Integrated into benchmark suite for consistent measurement.
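
To show the two patterns without guessing at the library's signatures, the sketch below uses differently named stand-ins, `timed` and `timed_block`. Their `report` parameter and output format are assumptions; consult the utilities themselves for the real interface.

```python
import functools
import time
from contextlib import contextmanager


def timed(fn, report=print):
    """Hypothetical decorator: report how long each call to fn takes."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            report(f"{fn.__name__}: {elapsed_ms:.3f} ms")
    return wrapper


@contextmanager
def timed_block(label, report=print):
    """Hypothetical context manager: report how long the with-body takes."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        report(f"{label}: {elapsed_ms:.3f} ms")
```

Both helpers report from a `finally` clause, so a timing line is produced even when the measured code raises.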

## Optimization Strategies

### Text Processing
- Grapheme-aware Unicode handling with caching for width calculations
- Bidirectional text reordering for complex scripts
- ANSI sequence optimization to minimize escape code generation

### Layout Engine
- Constraint-based flex layout with efficient constraint resolution
- Grid layout computation optimized for typical UI patterns
- Padding, margin, and gap calculations with clamping
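
The wrapping, gap, and clamping steps can be illustrated with a greedy row packer. This is a deliberately simplified sketch, not the constraint-based solver described above, and `wrap_row` is a hypothetical name.

```python
from typing import List


def wrap_row(widths: List[int], container: int, gap: int = 1) -> List[List[int]]:
    """Greedily pack item widths into rows no wider than `container`."""
    gap = max(0, gap)  # clamp: a negative gap makes no sense
    rows: List[List[int]] = []
    current: List[int] = []
    used = 0
    for width in widths:
        width = max(0, min(width, container))  # clamp each item to the container
        needed = width if not current else used + gap + width
        if current and needed > container:
            rows.append(current)  # item does not fit, start a new row
            current, needed = [], width
        current.append(width)
        used = needed
    if current:
        rows.append(current)
    return rows
```

For instance, three items of width 4 with a gap of 1 in a container of width 10 fit two to a row, because a third would need 14 cells.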

### Memory Management
- LRU caches with bounded sizes (4096 entries for grapheme widths)
- Lazy evaluation with optional caching for expensive operations
- Theme-based cache invalidation to prevent stale color data

### Terminal I/O
- Selective line updates for live rendering instead of full redraws
- ANSI escape sequence optimization for cursor movement
- Encoding-aware output with fallback handling
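
Encoding fallback can be as simple as the sketch below, which assumes the target encoding is known (for example from the stream's `encoding` attribute). The helper `encode_for_stream` is hypothetical and shows one possible policy, replacing unencodable characters rather than failing.

```python
def encode_for_stream(text: str, encoding: str = "utf-8") -> bytes:
    """Encode text for output, degrading gracefully on unencodable characters."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        # Fall back to '?' substitution instead of aborting the whole write.
        return text.encode(encoding, errors="replace")
```

Other policies, such as `errors="backslashreplace"` or transliterating box-drawing characters to ASCII, follow the same structure.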

## Performance Notes

- All numbers are approximate and hardware dependent
- Use deltas to track regressions, not absolute values
- Export/Jupyter stacks are optional extras and don't impact base import time
- GUI backend performance depends on the platform's windowing system (the Win32 path is optimized)
- Benchmark results include warm-up iterations for stable measurements
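
Tracking deltas rather than absolute values could look like the following sketch. The function `regressions`, the dictionary format, and the 10% tolerance are assumptions made for illustration; no such helper is claimed to ship with Textforge.

```python
from typing import Dict


def regressions(
    baseline: Dict[str, float], current: Dict[str, float], tolerance: float = 0.10
) -> Dict[str, float]:
    """Return benchmarks whose time grew by more than `tolerance` (as a fraction)."""
    flagged: Dict[str, float] = {}
    for name, base_ms in baseline.items():
        now_ms = current.get(name)
        if now_ms is None or base_ms <= 0:
            continue  # nothing to compare against
        delta = (now_ms - base_ms) / base_ms  # relative change versus baseline
        if delta > tolerance:
            flagged[name] = delta
    return flagged
```

Because the comparison is relative, the same threshold works on fast and slow machines as long as both runs come from the same hardware.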

## Architecture Performance Considerations

### Core Rendering
- Console abstraction provides consistent performance across backends
- Renderable protocol enables component-specific optimizations
- Stream abstraction allows for different output targets (TTY, file, memory)

### Component System
- Composable components with measurement caching
- Lazy evaluation prevents unnecessary computation
- Theme integration with efficient color resolution

### Export Pipeline
- Multi-format export (HTML/SVG/PDF) optimized for different use cases
- Plain text export uses simple ANSI stripping for maximum speed
- Render tree utilities enable structured output processing
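
Simple ANSI stripping is typically a single regular-expression pass. The sketch below handles CSI sequences only (colors, cursor movement, line erasure) and is an assumption about the approach, not the exporter's actual code; `strip_ansi` is a hypothetical name.

```python
import re

# ESC [ , then parameter bytes (0x30-0x3F), intermediate bytes (0x20-0x2F),
# and one final byte (0x40-0x7E).
_CSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_ansi(text: str) -> str:
    """Remove CSI escape sequences, leaving only the visible text."""
    return _CSI_RE.sub("", text)
```

Other escape families, such as OSC sequences used for hyperlinks and window titles, would need additional patterns.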

