# -*- coding: utf-8 -*-
from setuptools import setup

packages = \
['findpapers',
 'findpapers.models',
 'findpapers.searchers',
 'findpapers.tools',
 'findpapers.utils']

package_data = \
{'': ['*']}

install_requires = \
['colorama>=0.4.3,<0.5.0',
 'edlib>=1.3.8,<2.0.0',
 'inquirer>=2.7.0,<3.0.0',
 'lxml>=4.5.2,<5.0.0',
 'requests>=2.24.0,<3.0.0',
 'typer>=0.3.2,<0.4.0',
 'xmltodict>=0.12.0,<0.13.0']

extras_require = \
{':python_version < "3.8"': ['importlib-metadata>=1.0,<2.0']}

entry_points = \
{'console_scripts': ['findpapers = findpapers.cli:main']}

setup_kwargs = {
    'name': 'findpapers',
    'version': '0.4.1',
    'description': 'Findpapers is an application that helps researchers who are looking for references for their research.',
    'long_description': '# Findpapers\n\n[![PyPI - License](https://img.shields.io/pypi/l/findpapers)](https://gitlab.com/jonatasgrosman/findpapers/-/blob/master/LICENSE)\n[![PyPI](https://img.shields.io/pypi/v/findpapers)](https://pypi.org/project/findpapers)\n[![pipeline status](https://gitlab.com/jonatasgrosman/findpapers/badges/master/pipeline.svg)](https://gitlab.com/jonatasgrosman/findpapers/-/commits/master)\n<!--[![coverage report](https://gitlab.com/jonatasgrosman/findpapers/badges/master/coverage.svg)](https://gitlab.com/jonatasgrosman/findpapers/-/commits/master)-->\n\n\n\nFindpapers is an application that helps researchers who are looking for references for their research. The application will perform searches in several databases (currently ACM, arXiv, IEEE, PubMed, and Scopus) from a user-defined search query.\n\nIn summary, this tool will help you to perform the process below:\n\n![Workflow](docs/workflow.png)\n\n# Requirements\n\n- Python 3.7+\n\n# Installation\n\n```console\n$ pip install findpapers\n```\n\nYou can check your findpapers version by running:\n\n```console\n$ findpapers version\n```\n\nIf you have an old version of the tool and want to upgrade it, run the following command:\n\n```console\n$ pip install findpapers --upgrade\n```\n\n# How to use it?\n\nAll application actions are command-line based. The available commands are:\n\n- ```findpapers search```: Search for paper metadata using a query. The search is performed by matching the query against each paper\'s title, abstract, and keywords.\n\n- ```findpapers refine```: Refine the search results by selecting/classifying the papers\n\n- ```findpapers download```: Download full-text papers using the search results\n\n- ```findpapers bibtex```: Generate a BibTeX file from the search results\n\nYou can control the commands\' logging verbosity with the **-v** (or **--verbose**) argument.\n\nIn the following sections, we will show how to use the findpapers commands. 
However, every command accepts the **--help** argument, which displays a summary of its usage, e.g., ```findpapers search --help```.\n\n## Search query construction\n\nFirst of all, we need to know how to build the search queries. The search queries must follow these rules:\n\n- All query terms must be non-empty and enclosed in square brackets. E.g., **[term a]**\n\n- The query can contain boolean operators, but they must be uppercase. The allowed operators are AND, OR, and NOT. E.g., **[term a] AND [term b]**\n\n- All the operators must have one and only one whitespace before and after them. E.g., **[term a] OR [term b] OR [term c]**\n\n- The NOT operator must always be preceded by an AND operator. E.g., **[term a] AND NOT [term b]**\n\n- A subquery needs to be enclosed by parentheses. E.g., **[term a] AND ([term b] OR [term c])**\n\n- The composition of terms is only allowed through boolean operators. Queries like "**[term a] [term b]**" are invalid.\n\nYou can use some wildcards in the query too. Use a question mark (?) to replace exactly one character, and use an asterisk (*) to replace zero or more characters:\n\n- **[son?]** will match song, sons, ... (but won\'t match "son")\n\n- **[son\\*]** will match son, song, sons, sonic, songwriting, ...\n\nThere are some rules that you\'ll need to follow when using wildcards:\n\n- Wildcards cannot be used at the start of a search term;\n- A minimum of 3 characters preceding the asterisk wildcard (*) is required;\n- The asterisk wildcard (*) can only be used at the end of a search term;\n- Wildcards can be used only in single terms;\n- Only one wildcard can be included in a search term.\n\nNote: The IEEE and PubMed databases don\'t support the "?" wildcard.\n\nLet\'s see some examples of valid and invalid queries:\n\n| Query  | Valid? 
|\n| ------------- | ------------- |\n| [term a]   |  Yes  |\n| ([term a] OR [term b])   |  Yes  |\n| [term a] OR [term b]  |  Yes  |\n| [term a] AND [term b]   |  Yes  |\n| [term a] AND NOT ([term b] OR [term c])  |  Yes  |\n| [term a] OR ([term b] AND ([term\\*] OR [t?rm]))  |  Yes |\n| [term a]OR[term b]   |  **No** (no whitespace between terms and boolean operator)  |\n| [term a] &nbsp;&nbsp;OR&nbsp;&nbsp; [term b]  |  **No** (more than 1 whitespace between terms and boolean operator)  |\n| ([term a] OR [term b]  |  **No** (missing parentheses)  |\n| [term a] or [term b]  |  **No** (lowercase boolean operator)  |\n| term a OR [term b]  |  **No** (missing square brackets)  |\n| [term a] [term b]  |  **No** (missing boolean operator)  |\n| [term a] XOR [term b] |  **No** (invalid boolean operator)   |\n| [term a] OR NOT [term b] |  **No** (NOT boolean operator must be preceded by AND)   |\n| [] AND [term b]  |  **No** (empty term)  |\n|[some term\\*]  |  **No** (wildcards can be used only in single terms)  |\n|[?erm]  |  **No** (wildcards cannot be used at the start of a search term)  |\n|[te*]  |  **No** (a minimum of 3 characters preceding the asterisk wildcard is required)  |\n|[ter*s]  |  **No** (the asterisk wildcard can only be used at the end of a search term)  |\n|[t?rm?]  
|  **No** (only one wildcard can be included in a search term)  |\n\n## Basic example (TL;DR)\n\n- Searching for papers:\n\n```console\n$ findpapers search /some/path/search.json -q "[happiness] AND ([joy] OR [peace of mind]) AND NOT [stressful]"\n```\n\n- Refining search results:\n\n```console\n$ findpapers refine /some/path/search.json\n```\n\n- Downloading full-text from selected papers:\n\n```console\n$ findpapers download /some/path/search.json /some/path/papers/ -s\n```\n\n- Generating BibTeX file from selected papers:\n\n```console\n$ findpapers bibtex /some/path/search.json /some/path/mybib.bib -s\n```\n\n## Advanced example\n\nThis advanced usage documentation can be a bit boring to read (and write), so I think it\'s better to go for a storytelling approach here.\n\n*Let\'s take a look at Dr. McCartney\'s research. He\'s a computer scientist interested in AI and music, so he created a search query to collect papers that can help with his research and exported this query to an environment variable.*\n\n```console\n$ export QUERY="([artificial intelligence] OR [AI] OR [machine learning] OR [ML] OR [deep learning] OR [DL]) AND ([music] OR [s?ng])"\n```\n\n*Dr. 
McCartney is interested in testing his query, so he decides to collect only 20 papers to test whether the query is suitable for his research (the Findpapers results are sorted by publication date in descending order).*\n\n```console\n$ findpapers search /some/path/search_paul.json --query "$QUERY" --limit 20\n```\n\n*But after taking a look at the results contained in the ```search_paul.json``` file, he notices two problems:*\n - *Only one database was used to collect all 20 papers*\n - *Some collected papers were about drums, but he doesn\'t like drums or drummers*\n\n*So he decides to solve these problems by:*\n- *Reformulating his query, and also placing it inside a file to make his life easier.*\n\n```/some/path/query.txt```\n```\n([artificial intelligence] OR [AI] OR [machine learning] OR [ML] OR [deep learning] OR [DL]) AND ([music] OR [s?ng]) AND NOT [drum*]\n```\n\n- *Performing the search while limiting the number of papers that can be collected from each database.*\n\n```console\n$ findpapers search /some/path/search_paul.json --query-file /some/path/query.txt --limit-db 4\n```\n\n*Now his query returns the papers he wanted, but he realizes one thing: no papers were collected from the Scopus or IEEE databases. Then he notices that he needs to pass his Scopus and IEEE API access keys when calling the search command. So he goes to https://dev.elsevier.com and https://developer.ieee.org, generates the access keys, and uses them in the search.*\n\n```console\n$ export IEEE_TOKEN=SOME_SUPER_SECRET_TOKEN\n\n$ export SCOPUS_TOKEN=SOME_SUPER_SECRET_TOKEN\n\n$ findpapers search /some/path/search_paul.json --query-file /some/path/query.txt --limit-db 4 --token-ieee "$IEEE_TOKEN" --token-scopus "$SCOPUS_TOKEN"\n```\n\n*Now everything is working as he expected, so it\'s time to do the final paper search. He decides to collect only works published between 2000 and 2020. He also decides that he only wants papers collected from ACM, IEEE, and Scopus. 
And he only wants papers published in journals or conference proceedings (Tip: The available publication types in Findpapers are: journal, conference proceedings, book, other. When a particular publication does not fit into any of the other types it is classified as "other", e.g., magazines, newsletters, unpublished manuscripts)*\n\n```console\n$ findpapers search /some/path/search_paul.json --query-file /some/path/query.txt --token-ieee "$IEEE_TOKEN" --token-scopus "$SCOPUS_TOKEN" --since 2000-01-01 --until 2020-12-31 --databases "acm,ieee,scopus" --publication-types "journal,conference proceedings"\n```\n\n*The searching process took a long time, but after many cups of coffee, Dr. McCartney finally has a good list of papers with the potential to help in his research. All the information collected is in the ```search_paul.json``` file. He could access this file now and manually filter which works are most interesting for him, but he prefers to use the Findpapers ```refine``` command for this.*\n\n*First, he wants to filter the papers looking only at their basic information.*\n\n```console\n$ findpapers refine /some/path/search_paul.json\n```\n\n![refine-01](docs/refine-01.jpeg)\n\n*After completing the first round of filtering of the collected papers, he decides to filter the selected ones again, this time looking at each paper\'s extra info (citations, DOI, publication name, etc.) and abstract. He also chooses to perform some classification while doing this further filtering (tip: he\'ll need to use the spacebar for category selection). 
And to help in this process, he also decides to highlight some keywords contained in the abstract.*\n\n```console\n$ export CATEGORIES_CONTRIBUTION="Contribution:Metric,Tool,Model,Method"\n\n$ export CATEGORIES_RESEARCH_TYPE="Research Type:Validation Research,Solution Proposal,Philosophical,Opinion,Experience,Other"\n\n$ export HIGHLIGHTS="propose, achiev, accuracy, method, metric, result, limitation, state of the art"\n\n$ findpapers refine /some/path/search_paul.json --selected --abstract --extra-info --categories "$CATEGORIES_CONTRIBUTION" --categories "$CATEGORIES_RESEARCH_TYPE" --highlights "$HIGHLIGHTS"\n```\n\n![refine-02](docs/refine-02.jpeg)\n\nA notable feature of the tool is that it automatically prevents duplication of papers, merging their information when the same paper is found in different databases. You can see this in the image above, where Findpapers found the same work in the IEEE and Scopus databases (see the "Paper found in" value) and merged the paper information into a single record.\n\n\n*Now that Dr. 
McCartney has selected all the papers he wants, he would like to see all of them.*\n\n```console\n$ findpapers refine /some/path/search_paul.json --selected --abstract --extra-info --list\n```\n\n*He wants to see all the removed papers too.*\n\n```console\n$ findpapers refine /some/path/search_paul.json --removed --abstract --extra-info --list\n```\n\n*Then, he decides to download the full text of all the selected papers that have "Model" or "Tool" as a contribution.*\n\n```console\n$ findpapers download /some/path/search_paul.json /some/path/papers --selected --categories "Contribution:Tool,Model"\n```\n\n*He also wants to generate the BibTeX file from these papers.*\n\n```console\n$ findpapers bibtex /some/path/search_paul.json /some/path/mybib.bib --selected --categories "Contribution:Tool,Model"\n```\n\n*But when he compared the papers\' data in the ```/some/path/mybib.bib``` file with the PDF files in the ```/some/path/papers``` folder, he noticed that many papers had not been downloaded.*\n\n*So when he opened the ```/some/path/papers/download.log``` file, he could see the URLs of all papers that weren\'t downloaded correctly. After accessing these links, he noticed that some of them weren\'t downloaded due to some limitations of Findpapers (currently, the tool has a set of heuristics to perform the download that may not work in all cases). However, the vast majority of papers weren\'t downloaded because they were behind a paywall. Dr. McCartney has access to these papers when he\'s connected to the network at the university where he works but, unfortunately, he is at home right now.*\n\n*But he discovers two things that could save him from this mess. First, the university provides a proxy for tunneling requests. Second, Findpapers accepts the configuration of a proxy URL. 
And of course, he\'ll use this feature.*\n\n```console\n$ findpapers download /some/path/search_paul.json /some/path/papers --selected --categories "Contribution:Tool,Model" --proxy "https://mccartney:super_secret_pass@liverpool.ac.uk:1234"\n```\n\n*Now the vast majority of the papers he has access to have been downloaded correctly.*\n\n*And at the end of it, he decides to download the full text of all the selected works (regardless of their categorization) and generate their BibTeX file too.*\n\n```console\n$ findpapers download /some/path/search_paul.json /some/path/papers --selected\n\n$ findpapers bibtex /some/path/search_paul.json /some/path/mybib.bib --selected\n```\n\n*That\'s all, folks! We have reached the end of our journey. I hope Dr. McCartney can continue his research and publish his work without any major problems now.*\n\nAs you can see, all the information collected and enriched by Findpapers is placed in a single JSON file. From this file, it is possible to create interesting visualizations about the collected data ...\n\n![charts](docs/charts.png)\n\n... so, use your imagination! (the above visualization was made by the [samples/charts.py](https://gitlab.com/jonatasgrosman/findpapers/-/blob/master/samples/charts.py) script)\n\nWith the story above, we cover all the commands available in Findpapers. I know this documentation is unconventional, but I haven\'t had time to write a more formal version of it. You can help us improve this, though: take a look at the next section to see how.\n\n\n# Want to help?\n\nSee the [contribution guidelines](https://gitlab.com/jonatasgrosman/findpapers/-/blob/master/CONTRIBUTING.md)\nif you\'d like to contribute to the Findpapers project.\n\nYou don\'t even need to know how to code to contribute to the project. Even the improvement of our documentation is an outstanding contribution.\n\nIf this project has been useful for you, please share it with your friends. 
This project could be helpful for them too.\n\nIf you like this project and want to motivate the maintainers, give us a :star:. This kind of recognition will make us very happy with the work that we\'ve done :heart:\n\n\n---\n\n**Note**: If you\'re seeing this project on GitHub, this is just a mirror; \nthe official project source code is hosted [here](https://gitlab.com/jonatasgrosman/findpapers) on GitLab.',
    'author': 'Jonatas Grosman',
    'author_email': 'jonatasgrosman@gmail.com',
    'maintainer': 'Jonatas Grosman',
    'maintainer_email': 'jonatasgrosman@gmail.com',
    'url': 'https://gitlab.com/jonatasgrosman/findpapers',
    'packages': packages,
    'package_data': package_data,
    'install_requires': install_requires,
    'extras_require': extras_require,
    'entry_points': entry_points,
    'python_requires': '>=3.7,<4.0',
}


setup(**setup_kwargs)
