Metadata-Version: 2.1
Name: parsehub
Version: 1.1.4
Summary: 支持AI总结的社交媒体聚合解析器
Author-email: 梓澪 <zilingmio@gmail.com>
License: MIT
Project-URL: Repository, https://github.com/z-mio/parsehub
Project-URL: Issues, https://github.com/z-mio/parsehub/issues
Requires-Python: >=3.12.0
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: aiocache>=0.12.3
Requires-Dist: aiofiles<24.dev0,>=23.2
Requires-Dist: beautifulsoup4==4.12.3
Requires-Dist: dynamicadaptor>=0.5.2
Requires-Dist: dynrender-skia>=0.2.5
Requires-Dist: langchain>=0.2.5
Requires-Dist: langchain-core>=0.2.8
Requires-Dist: langchain-openai>=0.1.14
Requires-Dist: loguru>=0.6.0
Requires-Dist: openai>=1.54.5
Requires-Dist: opencv-python>=4.10.0.84
Requires-Dist: playwright>=1.48.0
Requires-Dist: pydub>=0.25.1
Requires-Dist: python-dotenv>=1.0.1
Requires-Dist: tenacity>=8.5.0
Requires-Dist: urlextract>=1.9.0
Requires-Dist: yt-dlp
Requires-Dist: textual<=0.63.0
Requires-Dist: pyperclip>=1.9.0
Requires-Dist: lxml>=5.3.0
Requires-Dist: pyyaml>=6.0.2
Requires-Dist: aiosqlite>=0.20.0
Requires-Dist: click>=8.1.7
Requires-Dist: rookiepy>=0.5.6
Requires-Dist: fastapi>=0.112.1
Requires-Dist: uvicorn>=0.30.6
Requires-Dist: emoji>=2.14.0
Requires-Dist: instaloader>=4.14
Requires-Dist: pydantic>=1.10.19
Requires-Dist: markdownify>=0.14.1
Requires-Dist: markdown>=3.7
Requires-Dist: requests

# ParseHub

**支持AI总结的社交媒体聚合解析器**
> 视频总结会调用 `whisper-1` 模型

**基于该项目开发的 Tg Bot:**   
[@ParsehubBot](https://t.me/ParsehubBot) | https://github.com/z-mio/parse_hub_bot

**支持的平台:**  
`Twitter 视频|图文`  
`Instagram 视频|图文`  
`微博 视频|图文`  
`贴吧 视频|图文`  
`小红书 视频|图文`  
`Youtube 视频|音乐`  
`Facebook 视频`  
`Bilibili 视频|动态`  
`抖音|TikTok 视频|图文`  
`......`

## 安装

`pip install parsehub`

---

> [!IMPORTANT]
><details>
><summary>注意</summary>
>
>Linux用户在导入skia-python包时可能会遇到以下报错
>
>```bash
>libGL.so.1: cannot open shared object file: No such file or directory
>```
>
>Windows用户在缺少Microsoft Visual C++ Runtime时可能会遇到以下报错
>
>```commandline
>ImportError: DLL load failed while importing skia: The specified module could not be found.
>```
>
>## 解决方法
>
>> ubuntu用户
>
>```bash
># Ubuntu 22 安装
>apt install libgl1-mesa-glx
># Ubuntu 24 安装
>apt install libgl1 libglx-mesa0
>```
>
>> ArchLinux用户
>
>```bash
>pacman -S libgl
>```
>
>> centos用户
>
>```bash
>yum install mesa-libGL -y
>```
>
>> Windows用户
>
>下载链接[Microsoft Visual C++ 2015 Redistributable Update 3 RC](microsoft.com/en-US/download/details.aspx?id=52685)
>
>
></details>

## 使用

```python
from parsehub import ParseHub
from parsehub.config import ParseConfig, DownloadConfig
import asyncio


async def main():
    ph = ParseHub(config=ParseConfig())
    result = await ph.parse('https://twitter.com/aobuta_anime/status/1827284717848424696')
    print(result)

    # dr = await result.download()
    # print(dr.media)
    # sr = await dr.summary()
    # await dr.delete()

    sr = await result.summary(download_config=DownloadConfig())
    print(sr.content)


if __name__ == '__main__':
    asyncio.run(main())
```

## 环境变量

| 名称       | 描述                | 默认值                                                                                                                                                                        |
|----------|-------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PROVIDER | 模型提供商, 暂只支持openai | openai                                                                                                                                                                     |
| API_KEY  | API Key           |                                                                                                                                                                            |
| BASE_URL | API 地址            | https://api.openai.com/v1                                                                                                                                                  |
| MODEL    | AI总结使用的模型         | gpt-4o-mini                                                                                                                                                                |
| PROMPT   | AI总结提示词           | You are a useful assistant to summarize the main points of articles and video captions. Summarize 3 to 8 points in "Simplified Chinese" and summarize them all at the end. ||                       |                                                                          |                                                                                                                                                                            |

## 参考项目

- [Evil0ctal/Douyin_TikTok_Download_API](https://github.com/Evil0ctal/Douyin_TikTok_Download_API)
- [BalconyJH/DynRender-skia](https://github.com/BalconyJH/DynRender-skia)
- [langchain-ai/langchain](https://github.com/langchain-ai/langchain)
- [yt-dlp/yt-dlp](https://github.com/yt-dlp/yt-dlp)
- [instaloader/instaloader](https://github.com/instaloader/instaloader)
- [JoeanAmier/XHS-Downloader](https://github.com/JoeanAmier/XHS-Downloader)
