# pylogsentinel

A lightweight, zero-dependency (standard library only) log monitoring sentinel intended
to be invoked periodically (e.g. once per minute via `cron`). It incrementally scans
configured log files for user-defined regular expression rules and triggers shell
actions with rich contextual environment variables.

`pylogsentinel` is intentionally simple: no daemons, no background threads, no
long-running processes. Each run processes just the _new_ bytes appended since the
previous run (tracked per-file, per-inode) and then exits.

## Key Features

- INI-style configuration (`pylogsentinel.conf`).
- Two log discovery modes: static paths or shell command
- Per-rule regular expressions with optional flags (`/pattern/i` style).
- Per-rule or actions executed as shell commands with rich contextual environment variables.
- Reliable state tracking via inode-based files (`<inode>` in `state_dir`).
- Safe handling of truncation & rotation (offset reset if file shrinks).
- Strict lock file (`LOCK`) prevents concurrent overlapping runs.
- Dry-run mode (`--dry-run`) to test rule matching without executing actions.
- Human-friendly size option for `max_block_size` (e.g. `512K`, `10M`).
- Automatic configuration file discovery when -c/--config is not supplied or the given path is missing: `~/.pylogsentinel.conf`, `~/.config/pylogsentinel.conf`, `/etc/pylogsentinel.conf`, `/usr/local/etc/pylogsentinel.conf`.
- Directory path monitoring uses the system 'file' command; only paths whose description contains 'text' are monitored (explicit file paths are always processed).
- Python 3.8+ (only standard library imports).

## Installation

```sh
pip install pylogsentinel
```

## Quick Start

1. Copy the example configuration to a standard location:
   ```
   cp pylogsentinel.conf.example /etc/pylogsentinel.conf
   chmod 600 /etc/pylogsentinel.conf
   ```
2. Edit paths, rules, and actions to your needs.
3. Add a cron entry (run every minute):
   ```
   * * * * * /usr/bin/env python3 -m pylogsentinel -c /etc/pylogsentinel.conf
   ```
   If you omit the `-c` option (or the specified file does not exist), pylogsentinel will look for a configuration file in this order:
   `~/.pylogsentinel.conf`, `~/.config/pylogsentinel.conf`, `/etc/pylogsentinel.conf`, `/usr/local/etc/pylogsentinel.conf`.
4. Verify operation (first in dry-run mode):
   ```
   python -m pylogsentinel -c /etc/pylogsentinel.conf --dry-run
   ```

## Configuration File Reference

The configuration is an INI-style file. An example:

```
state_dir = /var/run/pylogsentinel
max_block_size = 10M

[logs]
# Choose exactly one, `paths` or `cmd`:
paths = /var/log /custom/app/logs/app.log
cmd = find /var/log -type f -name '*.log'

[action.default]
cmd = echo "Matched $RULE_ID in $FILE at line $LINE" | mail -s "Sentinel alert" root

[action.another]
cmd = echo "Another action"

[rule.error]
description = Error-like conditions
pattern = /(error|fatal|exception|kill|crash)/i
action = another # if omitted, use default action
```

### `[system]` Section

| Option           | Required | Description                                                                                             | Default |
| ---------------- | -------- | ------------------------------------------------------------------------------------------------------- | ------- |
| `state_dir`      | Yes      | Directory storing lock + per-inode state files. Must be writable. (Defined under `[system]`.)           | (none)  |
| `max_block_size` | No       | Maximum newly appended bytes to read per file per run. Supports suffixes `K`, `M`, `G` (in `[system]`). | `10M`   |

### `[logs]` Section

Supply **either**:

- `paths`: Space-separated list of file and/or directory paths. Directories are traversed recursively; only files whose `file -b` output contains the substring "text" are monitored (others are skipped). Explicit file paths are always processed.
- `cmd`: A shell command producing **one path per line** on stdout.

Exactly one must be present. If using `cmd`, ensure it returns relatively quickly (recommended under a second) as it executes on every run.

### `[rule.<rule_id>]` Sections

| Field         | Required | Description                                                                                                    |
| ------------- | -------- | -------------------------------------------------------------------------------------------------------------- |
| `pattern`     | Yes      | Regular expression in the mandatory form `/pattern/flags` (flags optional). Supported flags: `i` (IGNORECASE). |
| `description` | No       | Human-readable description for environment variable `RULE_DESCRIPTION`.                                        |
| `action`      | No       | The `action_id` to invoke. Defaults to `default` if omitted.                                                   |

At least one rule is required.

#### Pattern Forms Examples

| Pattern                 | Meaning                                             |
| ----------------------- | --------------------------------------------------- |
| `/error/i`              | Case-insensitive match of "error"                   |
| `/^panic:.+/`           | Match lines starting with "panic:" (case-sensitive) |
| `/^panic:.+/i`          | Same, case-insensitive                              |
| `/\b(?:CRIT\|FATAL)\b/` | Word boundary critical terms                        |

If you supply an invalid regex or an unsupported flag, startup will fail with a configuration error.

### `[action.<action_id>]` Sections

Each action defines a shell `cmd` executed when a matching rule triggers. The special action id `default` **must** exist; it is used as a fallback when a rule references it explicitly or omits `action`.

Environment variables available to the command:

| Variable           | Description                                                            |
| ------------------ | ---------------------------------------------------------------------- |
| `RULE_ID`          | The rule identifier (e.g. `error`).                                    |
| `RULE_PATTERN`     | The normalized pattern string (e.g. `/error/i`).                       |
| `RULE_DESCRIPTION` | Description text or placeholder string if not provided.                |
| `FILE`             | Absolute path of the log file where the match occurred.                |
| `LINE`             | 1-based absolute line number within the file at time of scan.          |
| `CONTEXT`          | Concatenated lines around the match (default radius 2 before & after). |

Your `cmd` can reference these with typical shell expansion, for example:

```
[action.notify]
cmd = printf "%s\n%s\n" "$RULE_DESCRIPTION" "$CONTEXT" | mail -s "Alert $RULE_ID: $FILE:$LINE" ops@example.com
```

## State Tracking

For each processed file:

1. The file's inode (`st_ino`) is determined.
2. A state file named `<inode>` inside `state_dir` stores:
   ```
   <byte_offset> <line_number>
   ```
3. On the next run only new bytes (up to `max_block_size`) beyond `<byte_offset>` are read.
4. If the file shrinks (truncation), the stored offset is treated as invalid and scanning restarts from the beginning with line numbers reset.

This scheme tolerates typical log rotation patterns.

## Lock File

A lock file named `LOCK` in `state_dir` prevents overlapping runs. If it exists the process exits immediately. Remove a stale lock only after verifying no instance is active.

## Dry Run

`--dry-run` scans and shows which actions would run without executing them.

## Exit Codes

| Code | Meaning              |
| ---- | -------------------- |
| 0    | Success              |
| 2    | Configuration error  |
| 130  | Interrupted (Ctrl+C) |
| 1    | Other runtime error  |

## License

MIT License (see LICENSE).
