Metadata-Version: 2.1
Name: chatting-with-pdfs
Version: 0.0.11
Summary: Load a PDF file and ask questions via llama_index and GPT.
Author: Morne
License: MIT
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.10
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Provides-Extra: dev
License-File: LICENSE

# About Python Chat PDF (GPT Index) Project

Load a folder of PDF files and ask questions about their content via llama_index and GPT.

---

## What is LlamaIndex
LlamaIndex (GPT Index) is a <b>data framework</b> for your LLM application.

### Context
- LLMs are a phenomenal piece of technology for knowledge generation and reasoning. They 
are pre-trained on large amounts of publicly available data.
- To augment LLMs with our own private data, we need a comprehensive toolkit 
that performs this data augmentation.

---

### Proposed Solution
That’s where LlamaIndex comes in. LlamaIndex is a “data framework” to help you build LLM apps. 
It provides the following tools:

- Offers data connectors to ingest your existing data sources and data formats (APIs, PDFs, docs, SQL, etc.)

- Provides ways to structure your data (indices, graphs) so that this data can be easily used with LLMs.

- Provides an advanced retrieval/query interface over your data: Feed in any LLM input prompt, get 
back retrieved context and knowledge-augmented output.

- Allows easy integrations with your outer application framework (e.g. with LangChain, Flask, Docker, 
ChatGPT, anything else).

LlamaIndex provides tools for both beginner and advanced users. Its high-level API allows 
beginners to ingest and query their data in 5 lines of code. Its lower-level 
APIs allow advanced users to customize and extend any module (data connectors, indices, retrievers, 
query engines, reranking modules) to fit their needs.
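
The ingest, structure, and query steps listed above can be illustrated with a toy sketch. This is not LlamaIndex code: it swaps in plain word-overlap scoring where a real setup would typically use embeddings, and every name in it (`build_index`, `retrieve`, `build_prompt`, the sample chunks) is a hypothetical stand-in chosen for illustration.

```python
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and strip basic punctuation from each word."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return [w for w in words if w]


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """'Structure' step: store each chunk alongside its word counts."""
    return [(chunk, Counter(tokenize(chunk))) for chunk in chunks]


def retrieve(index: list[tuple[str, Counter]], question: str, top_k: int = 1) -> list[str]:
    """'Retrieval' step: rank chunks by how many words they share with the question."""
    q = Counter(tokenize(question))
    ranked = sorted(index, key=lambda item: sum((item[1] & q).values()), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


def build_prompt(question: str, context: list[str]) -> str:
    """'Augmentation' step: place the retrieved context in front of the question."""
    joined = "\n".join(context)
    return f"Context:\n{joined}\n\nQuestion: {question}\nAnswer:"


# Hypothetical text chunks, standing in for text extracted from PDFs.
chunks = [
    "Invoices are due within 30 days of receipt.",
    "The warranty covers parts and labour for two years.",
]
index = build_index(chunks)
question = "How long is the warranty?"
context = retrieve(index, question)
prompt = build_prompt(question, context)
print(prompt)
```

The resulting prompt is what would be sent to the LLM; the model call itself is left out of the sketch on purpose.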

---

### What does load_index_from_storage do and how does it work? 
`load_index_from_storage` is a function that loads an index from a 
`StorageContext` object. It takes a `StorageContext` object and an optional 
`index_id` as parameters. If `index_id` is not specified, it assumes there is 
only one index in the index store and loads it. It passes the resulting index ids 
and any additional keyword arguments to the `load_indices_from_storage` 
function. That function retrieves the index structs from the index store 
and creates a list of `BaseGPTIndex` objects. If index ids are specified, it 
loads only the indices with those ids, and returns the list. 
`load_index_from_storage` then returns the single index from that list.
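
The lookup flow described above can be sketched as a toy model. This is not the llama_index source: the index store is reduced to a plain dictionary, `load_indices` and `load_index` are hypothetical stand-ins for `load_indices_from_storage` and `load_index_from_storage`, and the "exactly one index" check is an assumption of this sketch.

```python
from typing import Optional, Sequence

# Toy stand-in for an index store: maps index_id -> index struct.
IndexStore = dict[str, dict]


def load_indices(store: IndexStore, index_ids: Optional[Sequence[str]] = None) -> list[dict]:
    """Return every index in the store, or only those with the given ids."""
    if index_ids is None:
        return list(store.values())
    missing = [i for i in index_ids if i not in store]
    if missing:
        raise ValueError(f"Unknown index ids: {missing}")
    return [store[i] for i in index_ids]


def load_index(store: IndexStore, index_id: Optional[str] = None) -> dict:
    """Load one index; without an id, the store must hold exactly one."""
    index_ids = None if index_id is None else [index_id]
    indices = load_indices(store, index_ids)
    if len(indices) != 1:
        raise ValueError(
            f"Expected exactly one index, found {len(indices)}; pass index_id to choose one."
        )
    return indices[0]


# A store with a single index needs no id.
single_store = {"pdfs": {"kind": "vector"}}
only_index = load_index(single_store)

# A store with several indices needs an explicit id.
multi_store = {"pdfs": {"kind": "vector"}, "notes": {"kind": "list"}}
notes_index = load_index(multi_store, index_id="notes")
```

Calling `load_index(multi_store)` without an id raises `ValueError` in this sketch, which mirrors the assumption that an unspecified id only makes sense when the store holds one index.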

---

## Getting Started

### Instructions

- Install the requirements

```bash
pip install -r requirements.txt
```

- Get an OpenAI API key from [OpenAI](https://platform.openai.com/account/api-keys) if you don't have one already.

- Run the script.

```bash
python3 chat_with_pdfs.py "<data_folder_path>" "<openai_api_key>"
```

- Ask questions about the content of the PDFs.
