mdfy is a FastAPI-based web service that converts various document formats to Markdown.
This repository contains the source code for the PDF to Markdown conversion service, which uses code from the RAG project.
- Convert PDF, DOCX, XLSX, PPTX, HWP, OXPS, EPUB, and MOBI files to Markdown
- Process files from URLs or direct uploads
- Redis caching for improved performance
- OAuth2 authentication for secure access
- Python 3.11+
- Poetry
- Redis server
- Docker (optional)
- Clone the repository:
    git clone https://github.com/jaigouk/mdfy.git
    cd mdfy
- Create a virtual environment and activate it:
    conda create -n mdfy python=3.11
    conda activate mdfy
- Install the required packages:
    poetry install
- Copy the .env.example file to .env and fill in the required values:
    cp .env.example .env
- Start the server:
    poetry run uvicorn mdfy:app --reload

The server will be available at http://127.0.0.1:8000.
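Once the server is running, a quick way to confirm it is reachable is to call the health check endpoint. The following is a minimal sketch using only the Python standard library; the helper name `check_health` and its `opener` parameter are illustrative and not part of mdfy, and it assumes the endpoint answers with HTTP 200 when healthy.

```python
import urllib.request
from urllib.parse import urljoin


def check_health(base_url="http://127.0.0.1:8000",
                 opener=urllib.request.urlopen,
                 timeout=5.0):
    """Return True if GET <base_url>/health answers with HTTP 200.

    `opener` is injectable so the function can be exercised without a
    running server (hypothetical helper, not part of mdfy).
    """
    url = urljoin(base_url.rstrip("/") + "/", "health")
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError and HTTPError (connection refused, timeout, 4xx/5xx)
        # are subclasses of OSError.
        return False
```

Calling `check_health()` with no arguments targets the local development server started above.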
- Build the Docker image:
    docker build -t mdfy .
- Run the container:
    docker run -p 8000:8000 --env-file .env mdfy

The server will be available at http://localhost:8000.
- GET /: Welcome message
- GET /health: Health check endpoint
- POST /process_url/: Convert a document from a URL to Markdown
- POST /process_upload/: Convert an uploaded document to Markdown
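As a rough illustration of calling the URL conversion endpoint from Python, the sketch below builds a POST request with the standard library. Several details are assumptions, since this README does not specify them: that the body is JSON, that the field is named `url`, and that the OAuth2 token is sent as a Bearer header. The generated API documentation is the place to check the actual request schema.

```python
import json
import urllib.request


def build_process_url_request(base_url, document_url, token):
    """Build (but do not send) a POST request to /process_url/.

    Assumptions, not confirmed by the README: JSON body, a field
    named "url", and a Bearer token in the Authorization header.
    """
    payload = json.dumps({"url": document_url}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/process_url/",
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


# Sending it against a running server would look like:
#   request = build_process_url_request(
#       "http://127.0.0.1:8000", "https://example.com/paper.pdf", "<token>")
#   with urllib.request.urlopen(request) as response:
#       print(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it easy to inspect the URL, headers, and body before any network call is made.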
FastAPI automatically generates interactive API documentation:
- Swagger UI: http://127.0.0.1:8000/docs
- ReDoc: http://127.0.0.1:8000/redoc
You can use these interfaces to explore and test the API endpoints.
The OpenAPI (Swagger) specification is available at: http://127.0.0.1:8000/openapi.json
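The specification is a plain JSON document, so it can also be inspected programmatically. The sketch below lists the operations found under the standard OpenAPI `paths` object; the function name `list_operations` is illustrative, and the sample spec in the usage comment is made up for demonstration rather than copied from mdfy.

```python
HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options", "trace"}


def list_operations(spec):
    """Return (METHOD, path, summary) tuples from an OpenAPI document.

    `spec` is the parsed JSON (a dict). Keys under a path that are not
    HTTP methods (for example "parameters") are skipped.
    """
    operations = []
    for path, item in sorted(spec.get("paths", {}).items()):
        for method, operation in item.items():
            if method in HTTP_METHODS:
                operations.append(
                    (method.upper(), path, operation.get("summary", ""))
                )
    return operations


# Usage against a running server (requires network access to it):
#   import json, urllib.request
#   with urllib.request.urlopen("http://127.0.0.1:8000/openapi.json") as r:
#       for method, path, summary in list_operations(json.load(r)):
#           print(method, path, summary)
```

Each operation object in the specification also carries its request body and parameter schemas, which is where the exact input format of the conversion endpoints can be read.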
Run the test suite with:
    poetry run pytest
This project is licensed under the GNU AGPL v3.0. See the LICENSE file for details.