Metadata-Version: 2.4
Name: insta-rag
Version: 0.1.0b2
Summary: A RAG (Retrieval-Augmented Generation) library for document processing and retrieval.
Keywords: rag,retrieval-augmented-generation,llm,ai,nlp
Author: Aukik Aurnab, Tahmidul Islam, MD Ikramul Kayes
Author-email: Aukik Aurnab <aukikaurnabx@gmail.com>, Tahmidul Islam <me@tahmidul612.com>, MD Ikramul Kayes <ikramul.kayesgg@gmail.com>
License-Expression: MIT
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Text Processing :: Indexing
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Natural Language :: English
Requires-Dist: openai>=1.12.0
Requires-Dist: qdrant-client>=1.7.0
Requires-Dist: pdfplumber>=0.10.3
Requires-Dist: pypdf2>=3.0.1
Requires-Dist: tiktoken>=0.5.2
Requires-Dist: numpy>=1.24.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: cohere>=4.47.0
Requires-Dist: pydantic>=2.5.0
Requires-Dist: requests>=2.32.5
Requires-Dist: rank-bm25>=0.2.2
Requires-Python: >=3.9
Project-URL: Bug Tracker, https://github.com/AI-Buddy-Catalyst-Labs/insta_rag/issues
Project-URL: Documentation, https://github.com/AI-Buddy-Catalyst-Labs/insta_rag/wiki
Project-URL: Homepage, https://github.com/AI-Buddy-Catalyst-Labs/insta_rag
Description-Content-Type: text/markdown

# insta_rag

`insta_rag` is a modular, plug-and-play Python library for building advanced Retrieval-Augmented Generation (RAG) pipelines. It abstracts the complexity of document processing, embedding, and hybrid retrieval into a simple, configuration-driven client.

## Core Features

- **Semantic Chunking**: Splits documents at natural topic boundaries to preserve context.
- **Hybrid Retrieval**: Combines semantic vector search with BM25 keyword search, so results can match on both meaning and exact terms.
- **Query Transformation (HyDE)**: Uses an LLM to generate a hypothetical answer to the query and retrieves with it, which can improve retrieval relevance.
- **Reranking**: Integrates with rerankers such as Cohere to re-order retrieved results by relevance.
- **Pluggable Architecture**: Easily extend the library by adding new chunkers, embedders, or vector databases.
- **Hybrid Storage**: Optional integration with MongoDB for cost-effective content storage, keeping Qdrant lean for vector search.
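
To make the hybrid retrieval idea concrete, here is a minimal, library-independent sketch of merging a vector-search ranking with a BM25 ranking. It uses reciprocal rank fusion (RRF), one common fusion technique; this README does not state which fusion method `insta_rag` uses, so treat the function name `reciprocal_rank_fusion`, the constant `k=60`, and the document IDs as illustrative assumptions rather than the library's API.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Hypothetical helper for illustration only; not part of insta_rag.
    Each document scores 1 / (k + rank) per list it appears in.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Made-up result lists standing in for the two retrievers.
vector_hits = ["doc_a", "doc_b", "doc_c"]  # semantic (embedding) search
bm25_hits = ["doc_b", "doc_d", "doc_a"]    # keyword (BM25) search

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents that rank well in both lists (`doc_b`, `doc_a`) rise to the top, while documents found by only one retriever are still kept further down.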

## Quick Start

### 1. Installation

```bash
# Recommended: using uv
uv pip install insta-rag

# Or with pip
pip install insta-rag
```

### 2. Basic Usage

```python
from insta_rag import RAGClient, RAGConfig, DocumentInput

# Load configuration from environment variables (.env file)
config = RAGConfig.from_env()
client = RAGClient(config)

# 1. Add documents to a collection
documents = [DocumentInput.from_text("Your first document content.")]
client.add_documents(documents, collection_name="my_docs")

# 2. Retrieve relevant information
response = client.retrieve(
    query="What is this document about?", collection_name="my_docs"
)

# Print the most relevant chunk
if response.chunks:
    print(response.chunks[0].content)
```

## Documentation

For detailed guides on installation, configuration, and advanced features, please see the **[Full Documentation](https://github.com/AI-Buddy-Catalyst-Labs/insta_rag/wiki)**.

Key sections include:

- **[Installation Guide](https://github.com/AI-Buddy-Catalyst-Labs/insta_rag/wiki/installation)**
- **[Quickstart Guide](https://github.com/AI-Buddy-Catalyst-Labs/insta_rag/wiki/quickstart)**
- **Guides**
  - [Document Management](https://github.com/AI-Buddy-Catalyst-Labs/insta_rag/wiki/guides/document-management)
  - [Advanced Retrieval](https://github.com/AI-Buddy-Catalyst-Labs/insta_rag/wiki/guides/retrieval)
  - [Storage Backends](https://github.com/AI-Buddy-Catalyst-Labs/insta_rag/wiki/guides/storage-backends)

## License

This project is licensed under the [MIT License](https://github.com/AI-Buddy-Catalyst-Labs/insta_rag/blob/main/LICENSE).
