Metadata-Version: 2.1
Name: holo-search-sdk
Version: 0.2.2
Summary: A Python SDK for database search operations with vector and full-text search capabilities
Author-email: Tiancheng YANG <yangtiancheng.ytc@alibaba-inc.com>
License: MIT
Project-URL: Homepage, https://github.com/hologram/holo-search-sdk
Project-URL: Documentation, https://holo-search-sdk.readthedocs.io
Project-URL: Repository, https://github.com/hologram/holo-search-sdk
Project-URL: Issues, https://github.com/hologram/holo-search-sdk/issues
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE.txt
Requires-Dist: numpy>=1.20.0
Requires-Dist: typing-extensions>=4.0.0
Requires-Dist: psycopg>=3.0.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: isort>=5.12.0; extra == "dev"
Requires-Dist: flake8>=6.0.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Requires-Dist: pre-commit>=3.0.0; extra == "dev"

# Holo Search SDK

A Python SDK for search operations on the Hologres database, with support for vector search and full-text search.

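## ✨ Features

- **🔍 Vector search**: search based on semantic similarity
- **📝 Full-text search**: traditional keyword-based search
- **🛡️ Type safety**: uses type hints and data validation
- **🧩 Modular design**: a clear layered architecture that is easy to extend and maintain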

## 📦 Installation

### Install from PyPI

```bash
pip install holo-search-sdk
```

## 🚀 Quick Start

### Basic usage

```python
import holo_search_sdk as holo

# Create a client for the database
client = holo.connect(
    host="your-host",
    port=80,
    database="your-database",
    access_key_id="your-access-key-id",
    access_key_secret="your-access-key-secret",
    schema="public"
)

# Establish the connection
client.connect()

# Column layout the example table is expected to have
# (shown for reference only; it is not passed to open_table here)
columns = {
    "id": ("INTEGER", "PRIMARY KEY"),
    "content": "TEXT",
    "vector": "FLOAT8[]",
    "metadata": "JSONB"
}

# Open the existing table
table = client.open_table("table_name")

# Insert data
data = [
    [1, "Hello world", [0.1, 0.2, 0.3], {"category": "greeting"}],
    [2, "Python SDK", [0.4, 0.5, 0.6], {"category": "tech"}],
    [3, "Vector search", [0.7, 0.8, 0.9], {"category": "search"}]
]
table.insert_multi(data, ["id", "content", "vector", "metadata"])

# Set the vector index
table.set_vector_index(
    column="vector",
    distance_method="Cosine",
    max_degree=64,
    ef_construction=400
)

# Vector search
query_vector = [0.1, 0.2, 0.3]
results = table.search_vector(query_vector, "vector").limit(10)

# Close the connection
client.disconnect()
```

### Using a context manager

```python
import holo_search_sdk as holo

with holo.connect(
    host="your-host",
    port=80,
    database="your-database",
    access_key_id="your-access-key-id",
    access_key_secret="your-access-key-secret"
) as client:
    client.connect()
    
    # Run database operations
    table = client.open_table("table_name")
    results = table.search_vector([0.1, 0.2, 0.3], "vector_column")
    
    # The connection is closed automatically when the block exits
```
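
The pattern above still calls `client.connect()` inside the `with` block. As a minimal sketch, a small wrapper can pair `connect()` and `disconnect()` for any object that exposes those two methods. `connected` is a hypothetical helper and `FakeClient` is a stand-in used only so the snippet runs without a database; neither is part of the SDK.

```python
from contextlib import contextmanager


@contextmanager
def connected(client):
    """Call client.connect() on entry and client.disconnect() on exit.

    Hypothetical helper: it only assumes the object has connect() and
    disconnect() methods, as the client in the examples above does.
    """
    client.connect()
    try:
        yield client
    finally:
        # Runs even if the body raises, so the connection is not leaked.
        client.disconnect()


class FakeClient:
    """Stand-in used only to demonstrate the helper without a database."""

    def __init__(self):
        self.events = []

    def connect(self):
        self.events.append("connect")

    def disconnect(self):
        self.events.append("disconnect")


fake = FakeClient()
with connected(fake) as c:
    c.events.append("work")
```

With a real client the call would look like `with connected(holo.connect(...)) as client:`, assuming the client can be connected and disconnected as shown in the basic usage example.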

## 📚 Detailed Documentation

### Core concepts

#### 1. Client

The client is the main interface for interacting with the database:

```python
from holo_search_sdk import connect

# Create the client
client = connect(
    host="localhost",
    port=80,
    database="test_db",
    access_key_id="your_key",
    access_key_secret="your_secret"
)

# Establish the connection
client.connect()

# Execute SQL
result = client.execute("SELECT COUNT(*) FROM users", fetch_result=True)

# Table operations
table = client.open_table("table_name")
```

#### 2. Table Operations

A table is the basic unit of data storage and search:

```python
# Open an existing table
table = client.open_table("table_name")

# Check whether a table exists
exists = client.check_table_exist("table_name")

# Drop a table
client.drop_table("table_name")
```

#### 3. Inserting Data

Both single-row and batch inserts are supported:

```python
# Insert a single record
table.insert_one(
    [1, "Title", "Content", [0.1, 0.2, 0.3]],
    ["id", "title", "content", "vector"]
)

# Batch insert
data = [
    [1, "Document 1", "Content 1", [0.1, 0.2, 0.3]],
    [2, "Document 2", "Content 2", [0.4, 0.5, 0.6]],
    [3, "Document 3", "Content 3", [0.7, 0.8, 0.9]]
]
table.insert_multi(data, ["id", "title", "content", "vector"])
```
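
Both insert calls pair a list of values with a list of column names, so a mismatch in length is an easy mistake to make. The sketch below checks rows before they are handed to `insert_multi`. `check_rows` is a hypothetical helper, not an SDK function, and it checks only the shape of the data, not the column types.

```python
from typing import Any, List, Sequence


def check_rows(rows: Sequence[Sequence[Any]], columns: Sequence[str]) -> List[List[Any]]:
    """Return the rows as lists after checking each has one value per column."""
    if len(set(columns)) != len(columns):
        raise ValueError("duplicate column names")
    checked = []
    for index, row in enumerate(rows):
        if len(row) != len(columns):
            raise ValueError(
                f"row {index} has {len(row)} values, expected {len(columns)}"
            )
        checked.append(list(row))
    return checked


# Intended use (requires a real table object from the SDK):
#   columns = ["id", "title", "content", "vector"]
#   table.insert_multi(check_rows(data, columns), columns)
```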

#### 4. Vector Indexes

Create an efficient search index on a vector column:

```python
# Set a single vector index
table.set_vector_index(
    column="vector",
    distance_method="Cosine",  # options: "Euclidean", "InnerProduct", "Cosine"
    base_quantization_type="rabitq",  # options: "sq8", "sq8_uniform", "fp16", "fp32", "rabitq"
    max_degree=64,
    ef_construction=400,
    use_reorder=False,
    precise_quantization_type="fp32"
)

# Set multiple vector indexes
table.set_vector_indexes({
    "content_vector": {
        "distance_method": "Cosine",
        "max_degree": 64,
        "ef_construction": 400
    },
    "title_vector": {
        "distance_method": "Euclidean",
        "max_degree": 32,
        "ef_construction": 200
    }
})

# Delete all vector indexes
table.delete_vector_indexes()
```
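
The per-column dictionaries passed to `set_vector_indexes` use plain strings for their options, so a typo such as `"cosine"` would only surface once the call reaches the database. As a rough sketch, the values can be checked locally first. `make_index_config` is a hypothetical helper; the allowed values are the ones listed in this README, and the defaults of 64 and 400 are the documented defaults for `max_degree` and `ef_construction`.

```python
ALLOWED_DISTANCE_METHODS = {"Euclidean", "InnerProduct", "Cosine"}
ALLOWED_QUANTIZATION_TYPES = {"sq8", "sq8_uniform", "fp16", "fp32", "rabitq"}


def make_index_config(distance_method, max_degree=64, ef_construction=400,
                      base_quantization_type=None):
    """Build one per-column index config dict after validating its values."""
    if distance_method not in ALLOWED_DISTANCE_METHODS:
        raise ValueError(f"unknown distance_method: {distance_method!r}")
    if max_degree <= 0 or ef_construction <= 0:
        raise ValueError("max_degree and ef_construction must be positive")
    config = {
        "distance_method": distance_method,
        "max_degree": max_degree,
        "ef_construction": ef_construction,
    }
    # Only include the quantization type when the caller sets it explicitly.
    if base_quantization_type is not None:
        if base_quantization_type not in ALLOWED_QUANTIZATION_TYPES:
            raise ValueError(
                f"unknown base_quantization_type: {base_quantization_type!r}"
            )
        config["base_quantization_type"] = base_quantization_type
    return config


index_configs = {
    "content_vector": make_index_config("Cosine"),
    "title_vector": make_index_config("Euclidean", max_degree=32, ef_construction=200),
}
```

The resulting `index_configs` has the same shape as the dictionary passed to `table.set_vector_indexes(...)` above.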

#### 5. Vector Search

Run a semantic similarity search:

```python
# Basic vector search
query_vector = [0.1, 0.2, 0.3]
results = table.search_vector(
    vector=query_vector,
    column="vector",
    distance_method="Cosine"
)

# Search with an alias for the output column
results = table.search_vector(
    vector=query_vector,
    column="vector",
    output_name="similarity_score",
    distance_method="Cosine"
)
```
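
When checking that an index returns sensible neighbors, it can help to compare against an exact ranking computed in plain Python on a small sample. The sketch below does that for the three sample vectors from the quick start. It is a local reference only and does not call the SDK; it assumes cosine distance is defined as one minus cosine similarity, which may differ from the value the database reports.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def brute_force_top_k(query, rows, k):
    """rows is a list of (id, vector) pairs; return the ids of the k closest."""
    ranked = sorted(rows, key=lambda row: cosine_distance(query, row[1]))
    return [row_id for row_id, _ in ranked[:k]]


rows = [
    (1, [0.1, 0.2, 0.3]),
    (2, [0.4, 0.5, 0.6]),
    (3, [0.7, 0.8, 0.9]),
]
nearest = brute_force_top_k([0.1, 0.2, 0.3], rows, k=2)
print(nearest)  # [1, 2]
```

A brute-force scan like this costs time proportional to the number of rows, so it is only suitable for spot checks on small samples.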

### Configuration Options

#### Connection configuration

```python
from holo_search_sdk.types import ConnectionConfig

config = ConnectionConfig(
    host="your-host.com",
    port=80,
    database="production_db",
    access_key_id="user...",
    access_key_secret="secret...",
    schema="analytics"  # defaults to "public"
)
```
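
Access keys are usually better kept out of source code. One possible approach, sketched below, reads the connection fields from environment variables. The `HOLO_*` variable names and the `connection_kwargs_from_env` helper are assumptions made for this example, not something the SDK defines; the port default of 80 mirrors the examples in this README, and the `"public"` schema default is the documented one.

```python
import os

REQUIRED_VARS = (
    "HOLO_HOST",
    "HOLO_DATABASE",
    "HOLO_ACCESS_KEY_ID",
    "HOLO_ACCESS_KEY_SECRET",
)


def connection_kwargs_from_env(env=None):
    """Build connection keyword arguments from (hypothetical) HOLO_* variables."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if name not in env]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "host": env["HOLO_HOST"],
        "port": int(env.get("HOLO_PORT", "80")),
        "database": env["HOLO_DATABASE"],
        "access_key_id": env["HOLO_ACCESS_KEY_ID"],
        "access_key_secret": env["HOLO_ACCESS_KEY_SECRET"],
        "schema": env.get("HOLO_SCHEMA", "public"),
    }


# Intended use (requires the SDK to be installed):
#   config = ConnectionConfig(**connection_kwargs_from_env())
```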

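#### Vector index configuration

- **distance_method**: distance calculation method
  - `"Euclidean"`: Euclidean distance
  - `"InnerProduct"`: inner product distance
  - `"Cosine"`: cosine distance

- **base_quantization_type**: base quantization type
  - `"sq8"`, `"sq8_uniform"`, `"fp16"`, `"fp32"`, `"rabitq"`
- **max_degree**: number of nearest neighbors each vertex tries to connect to during graph construction (default: 64)
- **ef_construction**: controls the search depth during graph construction (default: 400)
- **use_reorder**: whether to use the HGraph high-precision index (default: False)
- **precise_quantization_type**: precise quantization type (default: "fp32")
- **precise_io_type**: precise IO type (default: "block_memory_io")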
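
The three `distance_method` options can rank the same candidates differently when vectors are not normalized. The following sketch uses the textbook definitions to show this on two candidates. It is illustrative only: the exact conventions the database applies (for example, whether Euclidean distance is squared, or how inner product is turned into a distance) are not stated here and are not assumed.

```python
import math


def euclidean(a, b):
    """Straight-line distance; smaller means closer."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Dot product; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product of the normalized vectors; larger means more similar."""
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )


query = [1.0, 0.0]
long_same_direction = [2.0, 0.0]   # same direction as the query, twice as long
close_but_tilted = [1.0, 0.1]      # near the query, slightly different direction

# Euclidean prefers the nearby point.
assert euclidean(query, close_but_tilted) < euclidean(query, long_same_direction)
# Inner product and cosine both prefer the vector pointing the same way.
assert inner_product(query, long_same_direction) > inner_product(query, close_but_tilted)
assert cosine_similarity(query, long_same_direction) > cosine_similarity(query, close_but_tilted)
```

If all vectors are normalized to unit length before insertion, the three measures produce the same ordering, which is one reason normalization is a common preprocessing step.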

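## 🔧 API Reference

### Main classes

- **`Client`**: database client that manages the connection and table operations
- **`HoloTable`**: table interface that supports data insertion and vector search
- **`ConnectionConfig`**: connection configuration data class

### Main functions and methods

- **`connect()`**: creates a database client; call `client.connect()` to establish the connection
- **`open_table()`**: opens an existing table
- **`insert_one()`**: inserts a single record
- **`insert_multi()`**: inserts records in a batch
- **`set_vector_index()`**: sets a vector index
- **`search_vector()`**: runs a vector search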

### Exception classes

- **`HoloSearchError`**: base exception class
- **`ConnectionError`**: connection-related errors
- **`QueryError`**: query execution errors
- **`SqlError`**: SQL generation errors
- **`TableError`**: table operation errors
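
Transient failures such as dropped connections are often handled by retrying the operation a few times. The sketch below is a generic retry wrapper that takes the exception types to retry on as a parameter. `call_with_retries` is a hypothetical helper, not part of the SDK. Note that the SDK's `ConnectionError` shares its name with Python's built-in `ConnectionError`, so it should be imported explicitly from the SDK before being passed in; the import path is not shown in this README and is left to the reader.

```python
import time


def call_with_retries(operation, retry_on, attempts=3, delay_seconds=0.0):
    """Call operation() until it succeeds or attempts are exhausted.

    retry_on is an exception class or tuple of classes; any other
    exception propagates immediately.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on as exc:
            last_error = exc
            # Wait between attempts, but not after the final failure.
            if attempt < attempts - 1:
                time.sleep(delay_seconds)
    raise last_error


# Intended use (requires the SDK and a configured client):
#   call_with_retries(client.connect, retry_on=(SdkConnectionError,))
# where SdkConnectionError stands for the SDK's ConnectionError class.
```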

## 📄 License

This project is released under the MIT License. See [LICENSE.txt](LICENSE.txt) for details.

---

**Holo Search SDK** - simple, efficient vector and full-text search on Hologres 🚀
