Metadata-Version: 2.3
Name: vis3
Version: 1.2.0
Summary: Visualize s3 data
License: Apache 2.0
Author: shenguanlin
Author-email: shenguanlin@pjlab.org.cn
Requires-Python: >=3.9.2
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Dist: alembic (>=1.15.2,<2.0.0)
Requires-Dist: appdirs (>=1.4.4,<2.0.0)
Requires-Dist: beautifulsoup4 (>=4.13.4,<5.0.0)
Requires-Dist: boto3 (>=1.38.14,<2.0.0)
Requires-Dist: cchardet (==2.2.0a2)
Requires-Dist: cryptography (>=44.0.3,<45.0.0)
Requires-Dist: ebooklib (>=0.19,<0.20)
Requires-Dist: fastapi (>=0.115.12,<0.116.0)
Requires-Dist: fastwarc (>=0.15.2,<0.16.0)
Requires-Dist: httpx (>=0.28.1,<0.29.0)
Requires-Dist: loguru (>=0.6.0,<0.7.0)
Requires-Dist: mobi (>=0.3.3,<0.4.0)
Requires-Dist: orjson (>=3.10.18,<4.0.0)
Requires-Dist: passlib[bcrypt] (>=1.7.4,<2.0.0)
Requires-Dist: pydantic (>=2.11.4,<3.0.0)
Requires-Dist: pydantic-settings (>=2.9.1,<3.0.0)
Requires-Dist: python-jose (>=3.4.0,<4.0.0)
Requires-Dist: python-multipart (>=0.0.20,<0.0.21)
Requires-Dist: sqlalchemy (>=2.0.40,<3.0.0)
Requires-Dist: typer[all] (>=0.15.3,<0.16.0)
Requires-Dist: uvicorn (>=0.34.2,<0.35.0)
Requires-Dist: warcio (>=1.7.5,<2.0.0)
Project-URL: Repository, https://github.com/OpenDataLab/Vis3
Description-Content-Type: text/markdown

<div align="center">
  <article style="display: flex; flex-direction: column; align-items: center; justify-content: center;">
    <p align="center"><img width="300" src="./web/app/src/assets/logo.svg" /></p>
    <h1 style="width: 100%; text-align: center;">Vis3</h1>
    <p align="center">
        English | <a href="./README_zh-CN.md" >简体中文</a>
    </p>
  </article>
</div>

> A data browser for S3-compatible storage

Vis3 is a visualization tool for large language model (LLM) and machine learning data. It works with cloud storage platforms that support the S3 protocol (AWS, Aliyun OSS, Tencent Cloud) and with a range of data formats (json, jsonl.gz, warc.gz, md, mobi, epub, etc.). Data can be explored interactively through JSON, HTML, Markdown, and image views.

## Features

- Supports JSON, JSONL, WARC, and more, automatically recognizing data structures and rendering them in a clear visual form.
- One-click field previews, with seamless switching between HTML, Markdown, and image views.
- Integrates with S3-compatible cloud storage (Aliyun OSS, AWS, Tencent Cloud) and parses local files for easy data access.

https://github.com/user-attachments/assets/aa8ee5e8-c6d3-4b20-ae9d-2ceeb2eb2c41


## Getting Started

```bash
# Requires Python >= 3.9.2
pip install vis3
```

Or create a Python environment using conda:

> Install [miniconda](https://docs.conda.io/en/latest/miniconda.html)

```bash
# 1. Create Python 3.11 environment using conda
conda create -n vis3 python=3.11

# 2. Activate environment
conda activate vis3

# 3. Install vis3
pip install vis3

# 4. Launch (no authentication)
vis3 --open
```
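
To have something to browse right after launching, you can generate a small sample file in one of the supported formats. The following is a minimal sketch using only the Python standard library; the file name `sample.jsonl.gz`, the helper names, and the record fields are made up for illustration and are not part of Vis3.

```python
import gzip
import json
from pathlib import Path


def write_sample_jsonl_gz(path, records):
    """Write one JSON object per line into a gzip-compressed file."""
    path = Path(path)
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path


def read_jsonl_gz(path):
    """Read the file back as a list of dicts, skipping blank lines."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Hypothetical records; the field names are illustrative only.
SAMPLE_RECORDS = [
    {"id": 1, "content": "# Hello\n\nA *Markdown* field."},
    {"id": 2, "content": "<p>An <b>HTML</b> field.</p>"},
]

if __name__ == "__main__":
    out = write_sample_jsonl_gz("sample.jsonl.gz", SAMPLE_RECORDS)
    print(f"wrote {len(read_jsonl_gz(out))} records to {out}")
```

The resulting file can be uploaded to an S3-compatible bucket or kept on disk, depending on how you plan to access it.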

### Upgrade to the latest version

```bash
pip install vis3 -U
```

## Variables

### `ENABLE_AUTH`

Enable authentication (disabled by default).

```bash
ENABLE_AUTH=1 vis3
```

### `BASE_DATA_DIR`

Specify the directory for the SQLite database.

```bash
BASE_DATA_DIR=your/database/path vis3
```

### `BASE_URL`

Specify the base URL for API calls.

```bash
BASE_URL=/a/b/c vis3
```
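
The variables above can also be set from a script instead of an interactive shell. The sketch below assumes that `vis3` is on your `PATH` and that the variables can be combined freely; the helper names `build_vis3_env` and `launch_vis3` are hypothetical and not part of the Vis3 package.

```python
import os
import subprocess


def build_vis3_env(enable_auth=False, base_data_dir=None, base_url=None, base_env=None):
    """Return a copy of the environment with the Vis3 variables applied."""
    env = dict(os.environ if base_env is None else base_env)
    if enable_auth:
        env["ENABLE_AUTH"] = "1"
    if base_data_dir is not None:
        env["BASE_DATA_DIR"] = str(base_data_dir)
    if base_url is not None:
        env["BASE_URL"] = base_url
    return env


def launch_vis3(**options):
    """Run the `vis3` command with the chosen variables (blocks until it exits)."""
    return subprocess.run(["vis3"], env=build_vis3_env(**options), check=True)
```

For example, `launch_vis3(enable_auth=True, base_data_dir="your/database/path")` would correspond to running `ENABLE_AUTH=1 BASE_DATA_DIR=your/database/path vis3`.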

## Local Development

```bash
conda create -n vis3-dev python=3.11

# Activate virtual environment
conda activate vis3-dev

# Install poetry
# https://python-poetry.org/docs/#installing-with-the-official-installer

# Install Python dependencies
poetry install

# Install frontend dependencies (install pnpm: https://pnpm.io/installation)
cd web && pnpm install

# Build frontend assets (in web directory)
pnpm build

# Start vis3
uvicorn vis3.main:app --reload
```

## React Component [![npm](https://img.shields.io/npm/v/%40vis3/kit.svg)](https://www.npmjs.com/package/@vis3/kit)

We provide a [React component](./web/packages/vis3-kit/) via npm for customizing your data preview UI.

![](./web/packages/vis3-kit/example/screenshot.png)

```bash
npm i @vis3/kit
```

## Community

Join the official OpenDataLab WeChat group!

<p align="center">
<img style="width: 400px" src="https://user-images.githubusercontent.com/25022954/208374419-2dffb701-321a-4091-944d-5d913de79a15.jpg">
</p>

## Related Projects

- [LabelU](https://github.com/opendatalab/labelU) Image / Video / Audio annotation tool  
- [LabelU-kit](https://github.com/opendatalab/labelU-Kit) Web frontend annotation kit (LabelU is developed based on this kit)
- [LabelLLM](https://github.com/opendatalab/LabelLLM) Open-source LLM dialogue annotation platform
- [MinerU](https://github.com/opendatalab/MinerU) One-stop high-quality data extraction tool

## License

This project is licensed under the [Apache 2.0 license](./LICENSE).

