Doctly is a Python client library for the Doctly backend API. With it, you can upload PDF documents, convert them to Markdown, and retrieve the converted content in a few lines of code.
Doctly can be installed with pip. Run the following command in your terminal:

pip install doctly

Here's a quick example to get you started with Doctly:

import doctly

# Initialize the Doctly client with your API key
client = doctly.Client(api_key='YOUR_API_KEY')

# Process a PDF file
try:
    content = client.process('path/to/your/file.pdf')

    # Save the processed content to a file (UTF-8 so non-ASCII text is written safely)
    with open('output.md', 'w', encoding='utf-8') as f:
        f.write(content)

    print("Processing successful! Content saved as 'output.md'")
except doctly.DoctlyError as e:
    print(f"An error occurred: {e}")

To begin using Doctly, initialize the Client class with your API key:

import doctly

# Replace 'YOUR_API_KEY' with your actual API key
client = doctly.Client(api_key='YOUR_API_KEY')

The primary functionality of Doctly is to upload a PDF file, process it, and retrieve the converted content. Here's how to do it:

try:
    content = client.process('path/to/your/file.pdf')

    # Optional: Save the content to a file
    with open('output.md', 'w', encoding='utf-8') as f:
        f.write(content)

    print("Processing successful!")
except doctly.DoctlyError as e:
    print(f"An error occurred: {e}")

The backend API processes documents asynchronously, so Doctly polls the document status until processing finishes. You can customize the polling interval (wait_time) and the maximum waiting duration (timeout) as needed:

content = client.process(
    'path/to/your/file.pdf',
    wait_time=10,  # Time in seconds between each status check
    timeout=600    # Maximum time in seconds to wait for processing
)

Doctly supports different accuracy levels for processing documents. You can specify the accuracy level using the accuracy parameter:

from doctly import Accuracy

# Process with LITE accuracy (faster, default)
content_lite = client.process('path/to/your/file.pdf', accuracy=Accuracy.LITE)

# Process with Precision Ultra (ULTRA) accuracy (more accurate but slower)
content_ultra = client.process('path/to/your/file.pdf', accuracy=Accuracy.ULTRA)

Errors are reported through the DoctlyError exception. Catch it to handle any issue that arises during upload, processing, or download:

try:
    content = client.process('file.pdf')
except doctly.DoctlyError as e:
    print(f"Error: {e}")
    # Additional error handling logic

The Client class encapsulates all interactions with the Doctly backend API.

- Description: Initializes the Doctly client with the provided API key and optional base URL.
- Parameters:
  - api_key (str): Your Doctly API key.
  - base_url (str, optional): The base URL for the Doctly API. Defaults to "https://api.doctly.ai".
- Example:

client = doctly.Client(api_key='YOUR_API_KEY')
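
A key written directly into source code is easy to leak, so one option is to read it from the environment before creating the client. The sketch below is only an illustration: the environment variable name DOCTLY_API_KEY and the load_api_key helper are assumptions made for this example and are not part of Doctly.

```python
import os


def load_api_key(env_var: str = "DOCTLY_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The variable name is an assumption made for this example; Doctly
    itself only needs a key string passed to Client.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


# Usage with the client shown above (requires the doctly package):
#   import doctly
#   client = doctly.Client(api_key=load_api_key())
```

Failing early with a clear message keeps a missing key from surfacing later as a less obvious API error.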
process(file_path: str, accuracy: Accuracy = Accuracy.LITE, wait_time: int = 5, timeout: int = 300, **kwargs) -> str

- Description: Uploads a PDF file to the backend, polls for processing status, and returns the processed content.
- Parameters:
  - file_path (str): Path to the PDF file to upload.
  - accuracy (Accuracy, optional): Processing accuracy level (LITE or ULTRA). Defaults to Accuracy.LITE.
  - wait_time (int, optional): Time in seconds between each status check. Defaults to 5 seconds.
  - timeout (int, optional): Maximum time in seconds to wait for processing. Defaults to 300 seconds (5 minutes).
  - **kwargs: Additional parameters for future extensions.
- Returns:
  - content (str): The processed content.
- Raises:
  - DoctlyError: If there's an error during upload, processing, or download.
- Example:

content = client.process('document.pdf', accuracy=Accuracy.ULTRA)
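
To picture how wait_time and timeout interact, here is a minimal polling loop. It is a sketch of the general technique and not Doctly's actual implementation: the get_status callable, the status strings, and the PollTimeout exception are all hypothetical stand-ins.

```python
import time
from typing import Callable


class PollTimeout(Exception):
    """Raised by this sketch when the deadline passes (stand-in for DoctlyError)."""


def poll_until_done(
    get_status: Callable[[], str],
    wait_time: float = 5,
    timeout: float = 300,
) -> None:
    """Call get_status() every wait_time seconds until it reports completion.

    get_status is a hypothetical callable returning "PENDING", "COMPLETED",
    or "FAILED"; the real backend's status values may differ.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status == "COMPLETED":
            return
        if status == "FAILED":
            raise RuntimeError("processing failed")
        # Give up if the next check would land after the deadline.
        if time.monotonic() + wait_time > deadline:
            raise PollTimeout(f"not finished after {timeout} seconds")
        time.sleep(wait_time)


# Simulated backend that finishes on the third status check.
responses = iter(["PENDING", "PENDING", "COMPLETED"])
poll_until_done(lambda: next(responses), wait_time=0.01, timeout=1)
```

In a loop like this, a larger wait_time means fewer status requests but a longer delay before completion is noticed, while timeout caps the total wait regardless of the interval.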
An enumeration that defines the available accuracy levels for document processing.

- Accuracy.LITE: Precision. Faster processing with great accuracy.
- Accuracy.ULTRA: Precision Ultra. Extremely good accuracy, but may take longer. This mode generates multiple versions of each page and picks the most accurate one.
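
If the accuracy level comes from user input, such as a command-line flag or a config file, it may help to map the string to an enum member by name. The sketch below defines a stand-in Accuracy enum so that it runs without the package: the member names come from this README, the values are placeholders, and parse_accuracy is a hypothetical helper. Lookup by name works this way for any standard Python Enum, which doctly.Accuracy is assumed to be.

```python
from enum import Enum


class Accuracy(Enum):
    """Stand-in for doctly.Accuracy so the example runs without the package.

    Member names are taken from the README; the values are placeholders.
    """
    LITE = "lite"
    ULTRA = "ultra"


def parse_accuracy(name: str) -> Accuracy:
    """Map a user-supplied string such as "ultra" to an Accuracy member."""
    try:
        return Accuracy[name.strip().upper()]
    except KeyError:
        valid = ", ".join(member.name for member in Accuracy)
        raise ValueError(
            f"Unknown accuracy {name!r}; expected one of: {valid}"
        ) from None


print(parse_accuracy("ultra"))  # Accuracy.ULTRA
```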

from doctly import Accuracy

# Process with ULTRA accuracy
content = client.process('file.pdf', accuracy=Accuracy.ULTRA)

A custom exception class for handling errors specific to the Doctly library.

try:
    content = client.process('file.pdf')
except doctly.DoctlyError as e:
    print(f"Doctly encountered an error: {e}")

Contributions are welcome! Please ensure that your code follows the project's coding standards and includes relevant tests.

For any questions, issues, or feature requests, please open an issue on GitHub.