Metadata-Version: 2.1
Name: scrawler
Version: 0.3.1
Summary: Tool for General Purpose Web Scraping and Crawling
Home-page: https://github.com/dglttr/scrawler
Author: Daniel Glatter
Author-email: d.glatter@outlook.com
License: MIT
Project-URL: Bug Tracker, https://github.com/dglttr/scrawler/issues
Project-URL: Documentation, https://scrawler.readthedocs.io/
Project-URL: Source Code, https://github.com/dglttr/scrawler
Keywords: Web Scraping,Crawling,asyncio,multithreading
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
License-File: LICENSE

scrawler
========

*"scrawler" = "scraper" + "crawler"*

scrawler provides functionality for the automatic collection of website data
(`web scraping <https://en.wikipedia.org/wiki/Web_scraping>`__) and for
following links to map an entire domain
(`crawling <https://en.wikipedia.org/wiki/Web_crawler>`__). It can
handle these tasks individually or process several websites or domains in
parallel using ``asyncio`` and multithreading.

This project was initially developed while the author was working at the
`Fraunhofer Institute for Systems and Innovation
Research <https://www.isi.fraunhofer.de/en.html>`__. Many thanks for the
opportunity and support!

Installation
------------

You can install scrawler from PyPI:

::

    pip install scrawler

.. note::
    Alternatively, you can download the ``.whl`` and ``.tar.gz`` files for
    each `release <https://github.com/dglttr/scrawler/releases>`__ from GitHub.

Getting Started
---------------

Check out the `Getting Started Guide <https://scrawler.readthedocs.io/en/latest/getting_started.html>`__.

Documentation
-------------

Documentation is available at `Read the Docs <https://scrawler.readthedocs.io/en/latest/>`__.

