Metadata-Version: 2.1
Name: arxivabscraper
Version: 0.2
Summary: Get arXiv.org abstracts within a date range and category
Home-page: https://github.com/MohamedElashri/Arxiv-Aabstract-scraper
Author: Mohamed Elashri
Author-email: muhammadelashri@gmail.com
License: MIT
Download-URL: https://github.com/MohamedElashri/Arxiv-Aabstract-scraper/archive/0.2.tar.gz
Description: 
        # arxivabscraper
        An arXiv scraper that retrieves abstracts for a given category and date range.
        
        ## Install
        
        Use `pip` (or `pip3` for Python 3):
        
        ```bash
        $ pip install arxivabscraper
        ```
        
        or download the source and use `setup.py`:
        
        ```bash
        $ python setup.py install
        ```
        
        or, if you do not want to install the module, copy `arxivabscraper.py` into your working
        directory.
        
        To update the module using `pip`:
        ```bash
        $ pip install arxivabscraper --upgrade
        ```
        
        ## Examples
        
        
        You can use `arxivabscraper` directly in your scripts. Let's import `arxivabscraper`
        and create a scraper to fetch all preprints in the condensed matter physics category
        from 2 May 2018 until 2 June 2020 (for other categories, see below):
        
        ```python
        import arxivabscraper
        scraper = arxivabscraper.Scraper(
            category='physics:cond-mat',
            date_from='2018-05-02',
            date_until='2020-06-02',
        )
        ```
        Once we have built an instance of the scraper, we can start scraping:
        
        ```python
        output = scraper.scrape()
        ```
        While the scraper is running, it prints its status:
        
        ```
        fetching up to  1000 records...
        fetching up to  2000 records...
        Got 503. Retrying after 30 seconds.
        fetching up to  3000 records...
        fetching is complete.
        ```
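        
        The `Got 503` line suggests a wait-and-retry loop around each request. A minimal sketch of that pattern is shown here; it is not the package's actual code. The function name `fetch_with_retry` and its `opener`, `sleep`, `wait` and `max_retries` parameters are hypothetical and exist only for illustration.
        
        ```python
        import time
        from urllib.error import HTTPError
        from urllib.request import urlopen
        
        
        def fetch_with_retry(url, opener=urlopen, sleep=time.sleep, wait=30, max_retries=5):
            """Fetch `url`, pausing and retrying while the server answers 503.
        
            Hypothetical helper: `opener` and `sleep` are injectable so the
            retry logic can be exercised without network access.
            """
            for _ in range(max_retries):
                try:
                    with opener(url) as response:
                        return response.read()
                except HTTPError as err:
                    if err.code != 503:
                        raise  # any other HTTP error is not retried
                    print("Got 503. Retrying after {0:d} seconds.".format(wait))
                    sleep(wait)
            raise RuntimeError("still getting 503 after {0:d} attempts".format(max_retries))
        ```
        
        Capping the number of attempts is a design choice of this sketch: a server that stays unavailable then raises an error instead of looping forever.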
        
        Finally, you can save the output in your favorite format or convert it directly into a pandas DataFrame:
        ```python
        import pandas as pd
        cols = ('categories', 'abstract')
        df = pd.DataFrame(output, columns=cols)
        ```
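        
        If you would rather not depend on pandas, the output can also be written with the standard library. The sketch below assumes that `output` is a list of dictionaries keyed by `categories` and `abstract`, which matches the `cols` tuple above but is not stated explicitly here. The sample records, the file names and the helpers `save_json` and `save_csv` are made up for illustration.
        
        ```python
        import csv
        import json
        
        # Hypothetical stand-in for the list returned by scraper.scrape();
        # the keys mirror the `cols` tuple above, the values are invented.
        output = [
            {"categories": "cond-mat.str-el", "abstract": "First sample abstract."},
            {"categories": "cond-mat.supr-con", "abstract": "Second sample abstract."},
        ]
        
        
        def save_json(records, path):
            """Write all records to one JSON file."""
            with open(path, "w", encoding="utf-8") as f:
                json.dump(records, f, ensure_ascii=False, indent=2)
        
        
        def save_csv(records, path, fieldnames=("categories", "abstract")):
            """Write the selected fields to a CSV file with a header row."""
            with open(path, "w", encoding="utf-8", newline="") as f:
                # extrasaction="ignore" skips keys that are not in `fieldnames`.
                writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction="ignore")
                writer.writeheader()
                writer.writerows(records)
        
        
        save_json(output, "abstracts.json")
        save_csv(output, "abstracts.csv")
        ```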
        
        
        ## Categories
        Here is a list of all categories available on arXiv.
        
        | Category | Code |
        | --- | --- |
        | Computer Science | `cs` |
        | Economics | `econ` |
        | Electrical Engineering and Systems Science | `eess` |
        | Mathematics | `math` |
        | Physics | `physics` |
        | Astrophysics | `physics:astro-ph` |
        | Condensed Matter | `physics:cond-mat` |
        | General Relativity and Quantum Cosmology | `physics:gr-qc` |
        | High Energy Physics - Experiment | `physics:hep-ex` |
        | High Energy Physics - Lattice | `physics:hep-lat` |
        | High Energy Physics - Phenomenology | `physics:hep-ph` |
        | High Energy Physics - Theory | `physics:hep-th` |
        | Mathematical Physics | `physics:math-ph` |
        | Nonlinear Sciences | `physics:nlin` |
        | Nuclear Experiment | `physics:nucl-ex` |
        | Nuclear Theory | `physics:nucl-th` |
        | Physics (Other) | `physics:physics` |
        | Quantum Physics | `physics:quant-ph` |
        | Quantitative Biology | `q-bio` |
        | Quantitative Finance | `q-fin` |
        | Statistics | `stat` |
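        
        A typo in a category code can be caught before starting a long scrape by checking it against the table above. The helper below is only a sketch: `is_known_category` and the two sets are hypothetical names and are not part of `arxivabscraper`.
        
        ```python
        # Codes copied from the table above.
        TOP_LEVEL = {"cs", "econ", "eess", "math", "physics", "q-bio", "q-fin", "stat"}
        PHYSICS_SUB = {
            "astro-ph", "cond-mat", "gr-qc", "hep-ex", "hep-lat", "hep-ph", "hep-th",
            "math-ph", "nlin", "nucl-ex", "nucl-th", "physics", "quant-ph",
        }
        
        
        def is_known_category(code):
            """Return True if `code` appears in the category table."""
            archive, _, sub = code.partition(":")
            if not sub:
                # Plain top-level code such as "cs"; reject a dangling colon.
                return archive in TOP_LEVEL and ":" not in code
            # In the table, only the physics archive has `archive:sub` codes.
            return archive == "physics" and sub in PHYSICS_SUB
        
        
        print(is_known_category("physics:cond-mat"))
        print(is_known_category("physics:condmat"))
        ```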
        
        ## Contributing
        Ideas/bugs/comments? Please open an issue or submit a pull request on GitHub.
        
        ## License
        This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
        
        ## Acknowledgments
        This work is based on arxivscraper:
        Mahdi Sadjadi (2017). arxivscraper: Zenodo. http://doi.org/10.5281/zenodo.889853
        
Keywords: arxiv,scraper,api,citation
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Text Processing :: Markup :: LaTeX
Description-Content-Type: text/markdown
