Metadata-Version: 2.1
Name: crawlio
Version: 1.0.0
Summary: Simple website crawler built with Python's asyncio
Home-page: https://github.com/maximiliancw/crawlio
Author: Maximilian Wolf
Author-email: max@w0lf.me
Maintainer: Maximilian Wolf
Maintainer-email: max@w0lf.me
License: UNKNOWN
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Natural Language :: English
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Framework :: AsyncIO
Description-Content-Type: text/markdown
License-File: COPYING

<img width="300" src="https://raw.githubusercontent.com/maximiliancw/crawlio/master/static/logo.png" alt="crawlio">

# crawlio
Simple website crawler built with Python's `asyncio`


## Features

- Asynchronous "deep" crawling using `asyncio`, `aiohttp` and `Parsel` (by the Scrapy authors)
- Zero-configuration
- Customizable XPath selectors

## Setup
```bash
pip install crawlio
```

## Usage

### Synchronous
```python
import asyncio
from crawlio import Crawler

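# Map each output field name to the XPath expression that extracts it.
fields = {
    'title': '/html/head/title/text()',
    # ...
}
crawler = Crawler('https://quotes.toscrape.com/', selectors=fields)
# asyncio.run() (Python 3.7+) starts an event loop and blocks until the crawl finishes.
results = asyncio.run(crawler.run(), debug=True)
for item in results:
    print(item)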
```

### Asynchronous
```python
import asyncio
from crawlio import Crawler

async def some_coroutine():
    # Map each output field name to the XPath expression that extracts it.
    fields = {
        'title': '/html/head/title/text()',
        # ...
    }
    crawler = Crawler('https://quotes.toscrape.com/', selectors=fields)
    # Already inside a running event loop, so await the crawl directly.
    results = await crawler.run()
    return results
```
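
Because `crawler.run()` is awaitable, several crawls can in principle be awaited together with `asyncio.gather`. The following is a minimal sketch of that pattern, not part of crawlio itself: `StubCrawler` is a hypothetical stand-in that only mimics the `Crawler(url, selectors=...)` / `run()` interface shown above and returns canned items, so the example runs without network access. The `crawl_many` helper, the `example.com` URL and the assumption that `run()` returns a list of items are illustrative as well.

```python
import asyncio


class StubCrawler:
    """Hypothetical stand-in for crawlio.Crawler; returns canned data, no network."""

    def __init__(self, url, selectors=None):
        self.url = url
        self.selectors = selectors or {}

    async def run(self):
        # Simulate I/O latency so the event loop can interleave the crawls.
        await asyncio.sleep(0.01)
        # One fake item per crawl: the URL plus a placeholder for each field.
        return [{'url': self.url, **{name: None for name in self.selectors}}]


async def crawl_many(urls, selectors):
    """Run one crawler per URL concurrently (hypothetical helper)."""
    crawlers = [StubCrawler(url, selectors=selectors) for url in urls]
    # gather() awaits all coroutines concurrently and preserves input order.
    return await asyncio.gather(*(crawler.run() for crawler in crawlers))


if __name__ == '__main__':
    fields = {'title': '/html/head/title/text()'}
    urls = ['https://quotes.toscrape.com/', 'https://example.com/']
    for items in asyncio.run(crawl_many(urls, fields)):
        print(items)
```

Replacing `StubCrawler` with the real `Crawler` should follow the same shape, assuming separate `Crawler` instances can safely run in the same event loop.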


## Contribute
...


## License
...

