FileMaster PDF Tools
A comprehensive PDF manipulation platform that provides all essential PDF processing tools online. From basic operations like merging and splitting to advanced features like watermarking and password protection, FileMaster offers a complete solution for all your PDF needs.
FileMaster PDF Tools is a full-stack web application that provides PDF manipulation through a web interface. Built with React, Express.js, MongoDB, and Redis, it offers a suite of PDF processing tools for both individual users and businesses.
The platform features a microservices architecture with separate client and server components, ensuring scalability and maintainability. The application supports real-time file processing, secure user authentication, and provides a seamless user experience across all devices.
Key highlights of the platform include:
- Secure Processing: All files are processed securely and automatically deleted after completion
- High Performance: Optimized cloud infrastructure ensures fast processing times
- User-Friendly Interface: Modern, responsive design with intuitive navigation
- Comprehensive Toolset: Covers all essential PDF operations from basic to advanced features
The major frameworks and libraries used to build the project are listed under Technology Stack. The platform provides the following features:
- PDF Merging - Combine multiple PDF files into a single document
- PDF Splitting - Split PDFs by pages or custom ranges
- PDF Compression - Reduce file size while maintaining quality
- PDF Conversion - Convert various formats (DOCX, PPTX, images) to PDF and vice-versa
- PDF Protection - Add password protection and encryption
- PDF Unlocking - Remove password protection from PDFs
- PDF Rotation - Rotate pages by 90, 180, or 270 degrees
- Watermarking - Add text or image watermarks to PDFs
- Page Numbering - Add page numbers with custom formatting
- User Authentication - Secure login with email/password and Google OAuth
- File Preview - Preview PDFs before processing
- Batch Processing - Process multiple files simultaneously
- Real-time Progress - Track processing status in real-time
- Secure File Handling - Automatic file cleanup after processing
- Responsive Design - Works seamlessly on desktop and mobile devices
The following prerequisites must be met before proceeding with the installation. Note that an AWS account with an S3 bucket is required for both installation methods, because S3 is used for file storage.
- Docker & Docker Compose: Ensure both are correctly installed and the Docker daemon is active.
- Node.js: Version 20 or higher is required.
- MongoDB: A running instance of MongoDB, accessible from the local machine.
- Redis: A running instance of Redis.
- LibreOffice: Required for document conversion functionalities.
  - Ubuntu/Debian: Execute sudo apt-get install libreoffice
  - macOS (via Homebrew): Execute brew install --cask libreoffice
  - Windows: Download the installer from the official website.
- qpdf: Required for core PDF manipulation tasks.
  - Ubuntu/Debian: Execute sudo apt-get install qpdf
  - macOS (via Homebrew): Execute brew install qpdf
  - Windows: Install using a package manager such as Chocolatey (choco install qpdf) or obtain it from the official source repository.
The Docker-based installation is recommended for its simplicity and consistent environment setup.
- Clone the Repository
    git clone https://github.com/taovuzu/file-master.git
    cd file-master
- Create Environment Files
    # Copy the templates for the client, server, and Docker Compose
    cp client/.env.example client/.env
    cp server/.env.example server/.env
    cp .env.example .env
- Configure Environment Variables
  - Open the newly created .env, client/.env, and server/.env files.
  - Populate the files with the necessary credentials and configuration values, particularly for the AWS S3 account. The database and Redis variables in the root .env file are used by Docker Compose to provision the respective services.
- Build and run the Docker containers
    docker compose up --build
- Accessing the Application
  - The application will be accessible at http://localhost:5173.
For a local installation without Docker, follow these steps:
- Start Dependent Services
  - Ensure that your local instances of MongoDB and Redis are running before proceeding.
- Clone the Repository
    git clone https://github.com/taovuzu/file-master.git
    cd file-master
- Set Up Environment Variables
    # Copy the environment templates for the client and server
    cp client/.env.example client/.env
    cp server/.env.example server/.env
    cp .env.example .env
  - Open client/.env and server/.env to apply your configuration.
  - Important: In server/.env, the MONGO_URI and REDIS_URL variables must be updated to point to your running local services. AWS S3 credentials must also be provided.
- Install Project Dependencies
    # Install server-side dependencies
    cd server
    npm install
    # Install client-side dependencies (continuing from the server directory)
    cd ../client
    npm install
- Run the Application
  - Three separate terminal sessions are required to run the application components.
    # In Terminal 1 (from the /server directory): Start the API server
    npm run dev:server
    # In Terminal 2 (from the /server directory): Start the background worker
    npm run dev:worker
    # In Terminal 3 (from the /client directory): Start the frontend client
    npm run dev
- Accessing the Application
  - The application will be accessible at http://localhost:5173.
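A missing or blank environment variable is a common cause of startup failures, so it can help to check the configuration before running the services. The following is a minimal sketch and not part of the project: findMissingEnv is a hypothetical helper, only MONGO_URI and REDIS_URL are names taken from this README, and the AWS variable names and sample values are assumptions that may differ from those in server/.env.example.

```javascript
// Hypothetical helper: returns the names of required variables that are
// missing or blank in the given environment object.
function findMissingEnv(env, requiredNames) {
  return requiredNames.filter((name) => {
    const value = env[name];
    return typeof value !== "string" || value.trim() === "";
  });
}

// MONGO_URI and REDIS_URL come from the README; the AWS names are assumed.
const REQUIRED = ["MONGO_URI", "REDIS_URL", "AWS_REGION", "AWS_S3_BUCKET"];

// Sample values for illustration only.
const sampleEnv = {
  MONGO_URI: "mongodb://localhost:27017/filemaster",
  REDIS_URL: "redis://localhost:6379",
  AWS_REGION: "",
};

const missing = findMissingEnv(sampleEnv, REQUIRED);
console.log(missing); // lists AWS_REGION and AWS_S3_BUCKET
```

In a real setup the same function could be called with process.env at server startup, exiting early if the returned list is not empty.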
FileMaster PDF Tools provides an intuitive web interface for all PDF operations. Here's how to use the platform:
- Choose Tool: Select the desired PDF operation from the available tools
- Upload Files: Drag and drop PDF files or click to browse and select files
- Configure Options: Set specific parameters for the operation (compression level, page ranges, etc.)
- Process: Click the process button to start the operation
- Download: Once processing is complete, download the result
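The page ranges mentioned in the Configure Options step need to be turned into concrete page numbers before a split can run. The sketch below shows one way to do that; parsePageRanges is a hypothetical helper, and the "1-3,5" syntax is an assumption, since the README does not specify the format the application accepts.

```javascript
// Hypothetical helper: expands a range expression such as "1-3,5" into a
// sorted list of unique 1-based page numbers, rejecting out-of-bounds input.
function parsePageRanges(expression, pageCount) {
  const pages = new Set();
  for (const part of expression.split(",")) {
    // Accept either a single page ("5") or an inclusive range ("1-3").
    const match = /^\s*(\d+)\s*(?:-\s*(\d+)\s*)?$/.exec(part);
    if (!match) throw new Error(`Invalid range: "${part}"`);
    const start = Number(match[1]);
    const end = match[2] === undefined ? start : Number(match[2]);
    if (start < 1 || end > pageCount || start > end) {
      throw new Error(`Range out of bounds: "${part.trim()}"`);
    }
    for (let page = start; page <= end; page += 1) pages.add(page);
  }
  return [...pages].sort((a, b) => a - b);
}

console.log(parsePageRanges("1-3, 5", 10)); // pages 1, 2, 3 and 5
```

Validating against the document's page count up front lets the interface report a bad range before a job is queued.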
Example Workflows
Merging PDFs:
- Upload multiple PDF files
- Select "Merge PDF" tool
- Arrange files in desired order
- Click "Merge" to combine files
Compressing PDFs:
- Upload a PDF file
- Select "Compress PDF" tool
- Choose compression level (Low, Medium, High)
- Click "Compress" to reduce file size
Adding Watermarks:
- Upload a PDF file
- Select "Watermark PDF" tool
- Enter watermark text or upload image
- Configure position, transparency, and rotation
- Click "Add Watermark" to process
Architecture Overview
FileMaster follows a microservices architecture with the following components:
- Client Application: React-based frontend with Vite build system
- API Server: Express.js REST API with JWT authentication
- Worker Service: Background job processor for PDF operations
- Database: MongoDB for user data and job management
- Cache: Redis for job queues, rate limiting, and caching of job data and usage metrics
- Storage: AWS S3 for storing uploaded files and processed results
Security Architecture: The application uses containerized microservices with strict isolation between the API server and worker processes. This separation provides enhanced security by:
- Process Isolation: Worker containers run with limited permissions and restricted access
- Fault Containment: If a worker crashes or executes malicious code, it cannot affect the main server
- Resource Isolation: Each container has its own filesystem, network, and process space
- Minimal Attack Surface: Workers only have access to necessary resources for PDF processing
Technology Stack
Frontend:
- React 19.1.0 with functional components and hooks
- Redux Toolkit for state management
- Ant Design for UI components
- React Router for navigation
- PDF.js for PDF preview functionality
- Vite for build tooling and development server
Backend:
- Node.js 20 with Express.js framework
- MongoDB with Mongoose ODM
- Redis with BullMQ for job queues
- JWT for authentication
- Passport.js for OAuth integration
- AWS SDK for S3 integration
PDF Processing:
- pdf-lib for PDF manipulation
- LibreOffice for document conversion
- QPDF for PDF operations
- Poppler utilities for PDF processing
Infrastructure:
- Docker for containerization
- Docker Compose for orchestration
- Nginx for reverse proxy
- AWS S3 for file storage
Database Schema
User Model:
- email: String (unique, required)
- fullName: String (required)
- password: String (hashed with bcrypt)
- loginType: Array (email, google)
- refreshToken: String
- createdAt: Date
- updatedAt: Date
Job Model:
- jobId: String (unique)
- userId: ObjectId (reference to User)
- operation: String (merge, split, compress, etc.)
- status: String (pending, processing, completed, failed)
- progress: Number (0-100)
- inputFiles: Array of S3 keys
- outputFile: String (S3 key)
- createdAt: Date
- updatedAt: Date
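To make the Job model concrete, the sketch below builds a plain object with the same fields and guards status changes. The field names and status values come from the model above; createJob, updateJob, and the allowed transitions are assumptions made for illustration, and the project's actual Mongoose schema may behave differently.

```javascript
// Status values come from the Job model above; the allowed transitions
// are an assumption made for this sketch.
const TRANSITIONS = {
  pending: ["processing", "failed"],
  processing: ["completed", "failed"],
  completed: [],
  failed: [],
};

// Hypothetical factory for a job record shaped like the Job model.
function createJob(jobId, userId, operation, inputFiles) {
  const now = new Date();
  return {
    jobId,
    userId,
    operation,
    status: "pending",
    progress: 0,
    inputFiles: [...inputFiles],
    outputFile: null,
    createdAt: now,
    updatedAt: now,
  };
}

// Returns an updated copy of the job, or throws on an illegal transition.
// Progress is clamped to the 0-100 range used by the model.
function updateJob(job, status, progress, outputFile = job.outputFile) {
  if (status !== job.status && !TRANSITIONS[job.status].includes(status)) {
    throw new Error(`Cannot move job from ${job.status} to ${status}`);
  }
  const clamped = Math.min(100, Math.max(0, progress));
  return { ...job, status, progress: clamped, outputFile, updatedAt: new Date() };
}

// Example keys and identifiers are placeholders.
let job = createJob("job-1", "user-1", "merge", ["uploads/a.pdf", "uploads/b.pdf"]);
job = updateJob(job, "processing", 40);
job = updateJob(job, "completed", 100, "results/merged.pdf");
console.log(job.status, job.progress); // completed 100
```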
API Endpoints
Authentication:
- POST /api/v1/users/register-email - Register with email
- POST /api/v1/users/register-user - Complete registration
- GET /api/v1/users/verify-email-link - Verify email by link
- POST /api/v1/users/verify-email-otp - Verify email by OTP
- POST /api/v1/users/login - User login
- POST /api/v1/users/resend-verification - Resend email verification
- POST /api/v1/users/request-password-reset - Request password reset
- POST /api/v1/users/reset-forgot-password - Reset forgotten password
- POST /api/v1/users/logout - User logout
- POST /api/v1/users/change-password - Change current password
- GET /api/v1/users/current-user - Get current user
- GET /api/v1/users/refresh-access-token - Refresh access token
- GET /api/v1/users/google - Google OAuth login
- GET /api/v1/users/google/callback - Google OAuth callback
PDF Operations:
- POST /api/v1/pdf-tools/compress - Compress PDF
- POST /api/v1/pdf-tools/merge - Merge PDFs
- POST /api/v1/pdf-tools/split - Split PDF
- POST /api/v1/pdf-tools/rotate - Rotate PDF
- POST /api/v1/pdf-tools/protect - Protect PDF
- POST /api/v1/pdf-tools/unlock - Unlock PDF
- POST /api/v1/pdf-tools/watermark/text - Add text watermark
- POST /api/v1/pdf-tools/page-numbers - Add page numbers
- POST /api/v1/pdf-tools/convert/doc-to-pdf - Convert document to PDF
- POST /api/v1/pdf-tools/convert/images-to-pdf - Convert images to PDF
- POST /api/v1/pdf-tools/convert/pdf-to-doc - Convert PDF to document
- POST /api/v1/pdf-tools/convert/pdf-to-ppt - Convert PDF to PowerPoint
File Management:
- POST /api/v1/upload/presign - Get presigned URL for file upload
- GET /api/v1/download/status/:jobId - Check job status
- GET /api/v1/download/:jobId - Download processed files
Health Check:
- GET /api/v1/health - General health check
- GET /api/v1/health/redis - Redis health check
- GET /api/v1/health/mongodb - MongoDB health check
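As an illustration of how a client might use the job status endpoint, the sketch below polls until a job finishes. The endpoint path is taken from the list above; buildStatusUrl and pollJobStatus are hypothetical helpers, the base URL is a placeholder, and the { status, progress } response shape is an assumption, so the real API response should be checked before relying on it. The status fetcher is injected so that the example runs without a server.

```javascript
// Builds the status URL from the endpoint listed above.
function buildStatusUrl(baseUrl, jobId) {
  const trimmed = baseUrl.replace(/\/+$/, "");
  return `${trimmed}/api/v1/download/status/${encodeURIComponent(jobId)}`;
}

// Polls until the job reaches a terminal status or attempts run out.
// fetchStatus(jobId) must resolve to an object with a status field.
async function pollJobStatus(jobId, fetchStatus, options = {}) {
  const { intervalMs = 1000, maxAttempts = 60 } = options;
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    const result = await fetchStatus(jobId);
    if (result.status === "completed" || result.status === "failed") {
      return result;
    }
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Job ${jobId} did not finish after ${maxAttempts} attempts`);
}

// Stubbed responses stand in for the server in this example.
const responses = [
  { status: "pending", progress: 0 },
  { status: "processing", progress: 50 },
  { status: "completed", progress: 100 },
];
const fakeFetchStatus = async () => responses.shift();

pollJobStatus("job-1", fakeFetchStatus, { intervalMs: 10 }).then((result) => {
  console.log(result.status); // completed
});
```

Against a running server, fetchStatus could wrap a fetch call to buildStatusUrl(baseUrl, jobId) and parse the JSON body.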
Processing Flow
- File Upload: User uploads files through the web interface
- Validation: Files are validated for type and size, and checked against security and abuse-prevention rules
- S3 Storage: Files are uploaded to AWS S3 with unique keys
- Job Creation: A processing job is created and added to Redis queue
- Worker Processing: A background worker picks up the job and processes the files, updating the job's progress in the Redis cache as it goes
- Progress Updates: The client polls the server for the current progress
- Result Storage: Processed files are stored in S3
- Download: User can download the processed files
- Cleanup: Temporary files are automatically deleted
Security Features
- JWT Authentication: Secure token-based authentication
- CSRF Protection: Cross-site request forgery protection
- Rate Limiting: API rate limiting to prevent abuse
- File Validation: Comprehensive file type and size validation
- Secure File Storage: Files stored securely in AWS S3
- Automatic Cleanup: Files are automatically deleted after processing
- Input Sanitization: All user inputs are sanitized and validated
- Container Isolation: Separate worker and server containers with limited permissions
- Fault Containment: Worker crashes or malicious code execution cannot affect the main server
- Process Isolation: Each container runs with restricted access and minimal attack surface
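The File Validation item above can be illustrated with a content-based check. The sketch below verifies the size and the leading "%PDF-" signature rather than trusting a file extension; validatePdfUpload is a hypothetical helper, the 50 MB limit is an assumed value not stated in this README, and the project's actual validation may apply different or additional rules.

```javascript
// Assumed limit for this sketch; the real limit is not stated in the README.
const MAX_SIZE_BYTES = 50 * 1024 * 1024;

// ASCII bytes for "%PDF-", the signature at the start of a PDF file.
const PDF_MAGIC = [0x25, 0x50, 0x44, 0x46, 0x2d];

// Hypothetical validator: checks size and the leading signature instead of
// trusting the file extension or the declared MIME type.
function validatePdfUpload(bytes, maxSizeBytes = MAX_SIZE_BYTES) {
  if (bytes.length === 0) return { ok: false, reason: "empty file" };
  if (bytes.length > maxSizeBytes) return { ok: false, reason: "file too large" };
  const hasMagic = PDF_MAGIC.every((byte, index) => bytes[index] === byte);
  if (!hasMagic) return { ok: false, reason: "not a PDF" };
  return { ok: true, reason: null };
}

const encoder = new TextEncoder();
console.log(validatePdfUpload(encoder.encode("%PDF-1.7\n")).ok); // true
console.log(validatePdfUpload(encoder.encode("<html>")).reason); // not a PDF
```

A signature check only confirms that the file claims to be a PDF; it does not replace the process isolation described above for handling malformed or malicious documents.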
Performance Optimizations
Currently Implemented:
- Background Processing: PDF operations run in isolated background workers
- Redis Caching: Job data and usage metrics cached in Redis
- Rate Limiting: Multi-tier rate limiting (global slowdown, sensitive endpoints, upload limits)
- Queue Management: BullMQ job queue with retry mechanisms and job reservation system
- Secure Process Spawning: Isolated PDF processing with QPDF and LibreOffice
- Progress Tracking: Real-time job status updates with detailed progress reporting
- Resource Cleanup: Automatic temporary file cleanup after processing
- Usage Limits: Per-user and anonymous usage tracking with daily limits
- S3 Streaming: Stream-based file upload/download for large files
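The Usage Limits item above can be sketched as a per-key daily counter. This is an illustration under stated assumptions, not the project's implementation: createDailyLimiter is a hypothetical helper, the README describes usage metrics held in Redis while a Map stands in for it here, and the limit value and key format are invented for the example.

```javascript
// Hypothetical in-memory tracker; a Map stands in for Redis in this sketch.
// The clock is injectable so the day rollover can be exercised in tests.
function createDailyLimiter(limitPerDay, now = () => new Date()) {
  const counts = new Map();
  return function tryConsume(userKey) {
    const day = now().toISOString().slice(0, 10); // UTC date, YYYY-MM-DD
    const key = `${userKey}:${day}`;
    const used = counts.get(key) || 0;
    if (used >= limitPerDay) return { allowed: false, remaining: 0 };
    counts.set(key, used + 1);
    return { allowed: true, remaining: limitPerDay - used - 1 };
  };
}

// Example: an anonymous user identified by a placeholder key, limit of 2.
const tryConsume = createDailyLimiter(2);
console.log(tryConsume("anonymous:203.0.113.7").allowed); // true
console.log(tryConsume("anonymous:203.0.113.7").allowed); // true
console.log(tryConsume("anonymous:203.0.113.7").allowed); // false
```

Keying the counter by date means old entries are simply never read again; with Redis, an expiry on each key would keep them from accumulating.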
Planned Optimizations:
- Stream Processing: Convert all processors to stream data instead of file downloads
- Enhanced S3 Validation: Implement better S3 stream-based file validation
- Response Compression: Add gzip compression for API responses
- Connection Pooling: Implement database connection pooling
- CDN Integration: Serve static assets through CDN
Performance Enhancements
- Implement streaming data processing to replace file-based operations
- Add comprehensive S3 stream validation for enhanced security
- Integrate response compression and connection pooling
Business Features
- Develop payment integration for premium services
- Create subscription-based usage tiers
- Implement advanced analytics and reporting
Infrastructure & Security
- Deploy to production environment with full monitoring
- Enhance security measures and vulnerability assessments
- Implement comprehensive logging and audit trails
Development
- Expand API documentation and developer resources
- Create comprehensive testing suite
- Establish CI/CD pipeline for automated deployments
For detailed feature requests and bug reports, visit our GitHub Issues.
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Distributed under the MIT License. See LICENSE for more information.
taovuzu - @taovuzu
Project Link: https://github.com/taovuzu/file-master
Acknowledgments
We extend our gratitude to the open-source projects and their maintainers that make this application possible:
Core Technologies
- React - Frontend framework
- Node.js - Backend runtime
- Express.js - Web framework
- MongoDB - Database
- Redis - Caching
PDF Processing
- PDF-lib - PDF manipulation
- QPDF - PDF toolkit
- LibreOffice - Document conversion
- Poppler - PDF utilities
Infrastructure
- AWS S3 - Cloud storage
- Docker - Containerization
- BullMQ - Job queue
- Ant Design - UI components