# Project Document: MCQ Generator from PDF
## 1. Introduction
- **Project Overview**: The MCQ Generator from PDF is a web application that extracts multiple-choice
questions (MCQs) from uploaded PDF files and exports them in a structured CSV format.
- **Objectives**:
  - Provide a user-friendly interface for uploading PDF files.
  - Automatically extract MCQs from the PDF content.
  - Generate MCQs in a structured format suitable for educational purposes.
- **Audience**: Teachers, educators, and educational institutions seeking to automate the process of creating
quizzes and tests.
## 2. Technologies Used
- **Programming Languages**: Python
- **Frameworks and Libraries**: Streamlit, pandas, PyMuPDF, LangChain, python-dotenv
- **APIs**: OpenAI API (for text generation)
## 3. Directory Structure
- **File Structure**:
  - `app.py`: Main Streamlit application script handling the user interface and interaction.
  - `mcq_extractor.py`: Module for PDF text extraction and MCQ generation.
  - `requirements.txt`: List of project dependencies.
## 4. Installation Instructions
- **Prerequisites**: Python 3.x, pip package manager
- **Setup Instructions**:
  1. Clone the repository: `git clone https://github.com/your-repo/mcq_generator.git`
  2. Navigate to the project directory: `cd mcq_generator`
  3. Install dependencies: `pip install -r requirements.txt`
  4. Create a `.env` file and add your OpenAI API key.
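
Step 4 can be sketched as follows. This is a minimal illustration, not the project's actual code: the variable name `OPENAI_API_KEY` is an assumption (the document does not name it), and `load_api_key` is a hypothetical helper. With that assumption, the `.env` file would contain a single line such as `OPENAI_API_KEY=<your key>`.

```python
import os


def load_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, reading .env first if possible.

    The default variable name is an assumption, not taken from the project.
    """
    try:
        # python-dotenv copies KEY=value pairs from a .env file into os.environ
        # without overriding variables that are already set.
        from dotenv import load_dotenv

        load_dotenv()
    except ImportError:
        # Fall back to the plain process environment if python-dotenv is absent.
        pass

    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; add it to your .env file")
    return key
```

Failing early with a clear message is a design choice here: a missing key is reported at startup rather than on the first API call.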
## 5. Usage
- **Running the Application**: Execute `streamlit run app.py` in the terminal.
- **Functionality**:
  - Upload a PDF file containing educational content.
  - Specify the number of MCQs, subject, and tone for quiz generation.
  - View generated MCQs in a table format.
  - Download the generated MCQs as a CSV file.
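
The CSV download step might look roughly like the sketch below. It uses Python's standard `csv` module rather than pandas to stay dependency-free, and both the column names and the shape of each MCQ record (`question`, `options`, `answer`) are assumptions made for illustration; the real application may structure its data differently.

```python
import csv
import io

# Hypothetical column layout; the real application may use different names.
FIELDNAMES = ["question", "option_a", "option_b", "option_c", "option_d", "answer"]


def mcqs_to_csv(mcqs: list[dict]) -> str:
    """Serialize MCQ records to CSV text with one question per row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    for mcq in mcqs:
        row = {"question": mcq["question"], "answer": mcq["answer"]}
        # Flatten the four options into one column each.
        for letter in "abcd":
            row[f"option_{letter}"] = mcq["options"][letter]
        writer.writerow(row)
    return buffer.getvalue()
```

In a Streamlit script, the returned text could then be offered to the user with `st.download_button`, passing it as `data` together with a `file_name` and `mime="text/csv"`.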
## 6. Detailed Components
- **app.py**: Handles the Streamlit user interface and integrates with `mcq_extractor.py` for MCQ generation.
- **mcq_extractor.py**: Uses PyMuPDF for PDF text extraction and LangChain with the OpenAI API for MCQ
generation based on the extracted content.
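
As a rough illustration of the text-extraction half of `mcq_extractor.py`, the sketch below reads an in-memory PDF with PyMuPDF and tidies the result. The function names `extract_pdf_text` and `normalize_whitespace` are hypothetical, and the whitespace cleanup is an assumed extra step rather than something the document describes.

```python
def normalize_whitespace(text: str) -> str:
    """Collapse runs of spaces and drop blank lines left behind by PDF layout."""
    lines = (" ".join(line.split()) for line in text.splitlines())
    return "\n".join(line for line in lines if line)


def extract_pdf_text(pdf_bytes: bytes) -> str:
    """Concatenate the plain text of every page of a PDF held in memory."""
    import fitz  # PyMuPDF; imported lazily so the helper above works without it

    # Open from bytes, as an uploaded file would typically be received.
    with fitz.open(stream=pdf_bytes, filetype="pdf") as doc:
        pages = [page.get_text() for page in doc]
    return normalize_whitespace("\n".join(pages))
```

Opening the document from a byte stream avoids writing the upload to disk, which suits a web application that receives files in memory.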
## 7. Code Snippets
- **Example Code Snippet**: (Insert relevant code snippets from `app.py` and `mcq_extractor.py`)
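
Until the project's own snippets are inserted here, the following sketch shows one way the generation step could be framed. It is not taken from `app.py` or `mcq_extractor.py`: `build_prompt` and `parse_mcqs` are hypothetical helpers, the prompt is a plain string standing in for whatever LangChain prompt the project uses, and the JSON reply format is an assumption.

```python
import json


def build_prompt(text: str, number: int, subject: str, tone: str) -> str:
    """Compose an instruction asking the model for MCQs in a fixed JSON shape."""
    return (
        f"Create {number} multiple-choice questions on {subject} in a {tone} tone, "
        "based only on the text below. Respond with a JSON list of objects with the "
        'keys "question", "options" (an object with keys "a" to "d") and "answer".\n\n'
        f"Text:\n{text}"
    )


def parse_mcqs(response_text: str) -> list[dict]:
    """Parse the model reply and reject records that do not match the shape."""
    data = json.loads(response_text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON list of questions")
    mcqs = []
    for item in data:
        options = item["options"]
        if sorted(options) != ["a", "b", "c", "d"]:
            raise ValueError("each question needs options a, b, c and d")
        if item["answer"] not in options:
            raise ValueError("answer must be one of the option keys")
        mcqs.append(
            {"question": item["question"], "options": options, "answer": item["answer"]}
        )
    return mcqs
```

Validating the reply before displaying it guards against malformed model output, which is a common failure mode when a language model is asked for structured data.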
## 8. Project Dependencies
- **List of Dependencies**: Refer to `requirements.txt` for a complete list of Python packages and versions.
## 9. Testing
- **Testing Strategy**: User interface interactions are tested manually; critical functions are covered by
automated unit tests.
- **Tools Used**: Python's built-in testing frameworks (for example, `unittest`).
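
A unit test for a critical function might look like the sketch below, written with the standard `unittest` module. The function under test, `is_valid_mcq`, and the record shape it checks are hypothetical and stand in for whichever functions the project actually tests.

```python
import unittest


def is_valid_mcq(mcq: dict) -> bool:
    """Check that a record has a question, four options and a matching answer."""
    options = mcq.get("options", {})
    return (
        bool(str(mcq.get("question", "")).strip())
        and len(options) == 4
        and mcq.get("answer") in options
    )


class IsValidMcqTests(unittest.TestCase):
    def setUp(self):
        # A fresh, well-formed record for every test.
        self.mcq = {
            "question": "Which planet is closest to the Sun?",
            "options": {"a": "Venus", "b": "Mercury", "c": "Earth", "d": "Mars"},
            "answer": "b",
        }

    def test_accepts_well_formed_question(self):
        self.assertTrue(is_valid_mcq(self.mcq))

    def test_rejects_answer_missing_from_options(self):
        self.mcq["answer"] = "e"
        self.assertFalse(is_valid_mcq(self.mcq))

    def test_rejects_blank_question(self):
        self.mcq["question"] = "   "
        self.assertFalse(is_valid_mcq(self.mcq))
```

Saved under a name such as `test_mcq.py` (a hypothetical file name), the tests can be run with `python -m unittest test_mcq`.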
## 10. Limitations and Future Enhancements
- **Current Limitations**: PDFs with complex layouts may not extract MCQs accurately.
- **Future Enhancements**: Implement machine learning models for improved MCQ extraction and natural
language understanding.
## 11. Contributors
- **Project Team**: Ganesh Jagadeesan (Developer)
## 12. Conclusion
- **Summary**: The MCQ Generator from PDF simplifies the process of creating educational quizzes and
assessments from PDF documents, enhancing efficiency for educators and learners alike.
- **Acknowledgments**: Special thanks to OpenAI for their API support and the Streamlit community for their
user-friendly framework.
### Appendices
- **Additional Resources**: Project repository: https://github.com/Ganlak/mcqgenerator.git