This project is a document retrieval application that uses Retrieval-Augmented Generation (RAG) to let users ask questions about uploaded PDF documents. Relevant passages are retrieved from the uploaded documents and passed to a Large Language Model (LLM), which answers each question based on that retrieved content.
- PDF Upload: Users can upload PDF files for processing.
- AI Interaction: Ask questions about the content of the uploaded PDFs.
- Machine Learning Integration: Uses LangChain and Hugging Face Transformers models for document processing and question answering.
- Backend: FastAPI
- Frontend: Streamlit
- Machine Learning: LangChain, Hugging Face Transformers
- Vector Store: FAISS for efficient similarity search
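
The retrieval step behind the stack above can be illustrated with a minimal, dependency-free sketch. This is not the project's implementation: word counts stand in for a Hugging Face embedding model, a sorted list stands in for a FAISS index, and every function name here (`chunk_text`, `embed`, `retrieve`, `build_prompt`) is hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 40) -> list[str]:
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question (stand-in for FAISS)."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble the text that would be sent to the LLM."""
    context = "\n\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

In a real pipeline the chunks would come from the extracted PDF text, and the prompt returned by `build_prompt` would be passed to the LLM.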
- Clone the repository: `git clone https://github.com/yourusername/chatpdf.git`, then `cd chatpdf`
- Create a virtual environment and activate it: `python -m venv .venv`, then `source .venv/bin/activate` (on Windows, use `.venv\Scripts\activate`)
- Install the required packages: `pip install -r requirements.txt`
- Start the FastAPI server: `uvicorn app.main:app --reload`
- Open the Streamlit app in another terminal: `streamlit run app/streamlit_app.py`
- Navigate to http://localhost:8501 in your web browser to access the application.
- GET /: Returns a welcome message.
- POST /upload_pdf/: Uploads a PDF file for processing.
  - Request: Multipart form data with the PDF file.
  - Response: Success message upon successful upload and processing.
- POST /ask/: Asks a question about the uploaded PDF.
  - Request: JSON body with the question.
  - Response: The answer to the question based on the PDF content.
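
The endpoints above can be exercised from a small standard-library client. The sketch below rests on assumptions that the endpoint descriptions do not confirm: the server is taken to listen on uvicorn's default port 8000, the JSON field is assumed to be named `question`, and the multipart field is assumed to be named `file`. Check `app/main.py` for the actual names. The helper functions are hypothetical, not part of the project.

```python
import json
import urllib.request
import uuid

BASE_URL = "http://localhost:8000"  # assumption: uvicorn's default port


def build_ask_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST /ask/ request with a JSON body."""
    # Assumption: the API expects the question under the key "question".
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/ask/",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_upload_request(
    filename: str,
    pdf_bytes: bytes,
    base_url: str = BASE_URL,
    field: str = "file",  # assumption: name of the multipart form field
) -> urllib.request.Request:
    """Build a POST /upload_pdf/ request with a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/upload_pdf/",
        data=head + pdf_bytes + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send a request and return the response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the FastAPI server running, a PDF would be uploaded first with `send(build_upload_request("paper.pdf", open("paper.pdf", "rb").read()))`, followed by `send(build_ask_request("What is the main finding?"))`.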
 
- To run the tests, use: `streamlit run app/streamlit_app.py`