Metadata-Version: 2.1
Name: pdf2df
Version: 0.0.3
Summary: Extract data from pdf to a dataframe
Home-page: UNKNOWN
Author: Rohit Thakur
Author-email: fordatascience12@gmail.com
License: UNKNOWN
Description: ## Pdf2Df
        
        A simple Python package that creates a dataframe from the text extracted from PDF files.
        
        To install:
        
        ```
        $ pip install pdf2df
        $ pip install PyMuPDF==1.16.14
        ```
        
        ### Get Started
        
        To use the package, import `Pdf2df`, create an instance, and call `get_text()`:
        
        ```
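        from pdf2df import Pdf2df
        
        # Placeholder: replace with the path to a folder of PDF files
        path = "path/to/pdf_folder"
        
        sfd = Pdf2df(path, page=True, single_file=False)
        df = sfd.get_text()  # dataframe with the extracted text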
        ```
        
        ### Arguments
        
         - **path** (str) : Location of the input. It can be a single PDF file or a folder containing multiple PDF files.
         - **page** (bool) : If True, each page of a PDF goes in its own row of the dataframe. If False, all the text of a PDF goes in a single row.
         - **single_file** (bool) : Tells the method whether `path` points to a single PDF file (True) or a folder containing multiple PDF files (False).
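        
        The effect of the `page` argument on the row layout can be illustrated with a small standalone sketch. It does not use `pdf2df` or show its implementation: the `build_rows` helper, the sample strings, and the `file`, `page` and `text` keys are all hypothetical, and the actual column names in the returned dataframe may differ.
        
        ```python
        def build_rows(file_name, page_texts, page=True):
            """Arrange extracted page texts into rows (hypothetical layout)."""
            if page:
                # One row per page, numbered from 1.
                return [
                    {"file": file_name, "page": number, "text": text}
                    for number, text in enumerate(page_texts, start=1)
                ]
            # One row per document: join the text of all pages.
            return [{"file": file_name, "text": "\n".join(page_texts)}]
        
        
        # Hypothetical page texts standing in for text extracted from a PDF.
        pages = ["First page text", "Second page text"]
        per_page = build_rows("report.pdf", pages, page=True)   # two rows
        per_file = build_rows("report.pdf", pages, page=False)  # one row
        ```
        
        A list of dictionaries like this can be passed to `pandas.DataFrame` to obtain a table with one row per dictionary.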
Keywords: python,pymupdf,extract pdf,extract text,dataframe
Platform: UNKNOWN
Classifier: Development Status :: 1 - Planning
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: Unix
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: Microsoft :: Windows
Description-Content-Type: text/markdown
