# scrapeMM: Multimodal Web Retrieval
Simple web scraper to asynchronously retrieve webpages and access social media contents, fetching text along with media, i.e., images and videos.

This library aims to help developers and researchers to easily access multimodal data from the web and use it for LLM processing.

## Usage
```python
from scrapemm import retrieve

url = "https://example.com"
result = retrieve(url)
result.render()
```

## How it works
```
Input:                                  Output:
URL (string)   -->   retrieve()   -->   MultimodalSequence
```
The `MultimodalSequence` is a sequence of Markdown-formatted text and media provided by the [ezMM](https://github.com/multimodal-ai-lab/ezmm) library.

Web scraping is done with [Firecrawl](https://github.com/mendableai/firecrawl).

## Supported Proprietary APIs
- ✅ X/Twitter
- ✅ Telegram
- ⏳ Facebook
- ⏳ Instagram
- ⏳ Threads
- ⏳ TikTok
