This project demonstrates how to upload large files (20GB+) directly to Amazon S3 using signed URLs — with a Python + FastAPI backend, MongoDB for tracking, and a React + Vite frontend.
📖 Read the full article on my blog: How to Upload Large Files (20GB+) to S3 using Signed URLs
This repo focuses on showcasing the complete flow — from multipart upload creation to chunk uploads, progress tracking, and final completion.
Feel free to explore, fork, and adapt it to your use case!
This project is structured as a monorepo for several key reasons:
- Unified Development Experience: The frontend and backend components are tightly coupled in their functionality. A monorepo allows for easier coordination between these components during development.
- Simplified Dependency Management: Shared configurations and dependencies can be managed more efficiently, reducing duplication and ensuring consistency.
- Atomic Changes: Changes that span both frontend and backend can be committed together, ensuring the system remains in a consistent state.
- Streamlined CI/CD: Deployment pipelines can be configured to understand the relationships between components, allowing for more intelligent build and deployment processes.
- React: Provides a component-based architecture ideal for building the interactive UI elements needed for file upload tracking.
- TypeScript: Adds type safety to prevent common errors and improve developer experience.
- Vite: Offers lightning-fast HMR (Hot Module Replacement) and optimized builds, significantly improving development speed.
- FastAPI: Chosen for its high performance, automatic API documentation, and built-in validation through Pydantic.
- Async Support: Efficiently handles concurrent S3 operations, crucial for multipart uploads.
- Type Hints: Provides automatic validation and editor support, reducing bugs.
- Document Model: Perfect match for the variable metadata of upload sessions.
- Flexible Schema: Adapts to changing requirements without migrations.
- Performance: Document-level concurrency supports high-throughput operations for tracking many simultaneous uploads.
- Scalability: Handles very large files with consistent performance (S3 supports objects up to 5 TB).
- Multipart Upload API: Native support for chunked uploads of large files.
- Direct Upload: Pre-signed URLs allow clients to upload directly to S3, reducing server load (see the sketch below).
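A minimal sketch of that last point, using boto3 (the AWS SDK for Python): the backend creates the multipart upload and presigns one `upload_part` URL per chunk. The bucket name, expiry, and function name here are illustrative, not the repo's exact code.

```python
import boto3

s3 = boto3.client("s3")
BUCKET = "my-upload-bucket"  # assumption: replace with your bucket name

def start_multipart_upload(key: str, num_parts: int) -> dict:
    """Create a multipart upload and presign one URL per part."""
    upload = s3.create_multipart_upload(Bucket=BUCKET, Key=key)
    upload_id = upload["UploadId"]
    urls = [
        s3.generate_presigned_url(
            "upload_part",
            Params={
                "Bucket": BUCKET,
                "Key": key,
                "UploadId": upload_id,
                "PartNumber": part_number,  # S3 part numbers start at 1
            },
            ExpiresIn=3600,  # URLs valid for one hour; tune to your chunk size
        )
        for part_number in range(1, num_parts + 1)
    ]
    return {"upload_id": upload_id, "urls": urls}
```

Each presigned URL authorizes exactly one part number, so the client can upload chunks in parallel without ever holding AWS credentials.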
The large file upload process follows these steps:
- Initialization: The frontend requests a new multipart upload from the backend
- Chunking: The file is split into manageable chunks (typically 5-10 MB each; S3 requires at least 5 MB per part, except the last)
- Parallel Uploads: Chunks are uploaded directly to S3 using pre-signed URLs
- Progress Tracking: The backend tracks upload progress in MongoDB
- Completion: Once all chunks are uploaded, the backend completes the multipart upload (a backend sketch follows this list)
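To make the flow concrete, here is a hedged sketch of the initialization and completion endpoints. It assumes the async Motor driver for MongoDB and Pydantic v2; the route paths, session schema, and names are illustrative rather than the repo's actual API.

```python
import boto3
from fastapi import FastAPI
from motor.motor_asyncio import AsyncIOMotorClient
from pydantic import BaseModel

app = FastAPI()
s3 = boto3.client("s3")  # note: boto3 is synchronous; offload it in production
db = AsyncIOMotorClient("mongodb://localhost:27017")["uploads"]
BUCKET = "my-upload-bucket"  # assumption: replace with your bucket name

class InitRequest(BaseModel):
    filename: str
    num_parts: int

class CompletedPart(BaseModel):
    PartNumber: int
    ETag: str  # S3 returns this header from each part upload

@app.post("/uploads")
async def init_upload(req: InitRequest):
    """Step 1: create the multipart upload and record the session."""
    upload = s3.create_multipart_upload(Bucket=BUCKET, Key=req.filename)
    await db.sessions.insert_one({
        "upload_id": upload["UploadId"],
        "key": req.filename,
        "num_parts": req.num_parts,
        "status": "in_progress",  # progress tracking lives in this document
    })
    return {"upload_id": upload["UploadId"]}

@app.post("/uploads/{upload_id}/complete")
async def complete_upload(upload_id: str, parts: list[CompletedPart]):
    """Final step: stitch the uploaded parts into one S3 object."""
    session = await db.sessions.find_one({"upload_id": upload_id})
    ordered = sorted(parts, key=lambda p: p.PartNumber)  # S3 requires ascending order
    s3.complete_multipart_upload(
        Bucket=BUCKET,
        Key=session["key"],
        UploadId=upload_id,
        MultipartUpload={"Parts": [p.model_dump() for p in ordered]},
    )
    await db.sessions.update_one(
        {"upload_id": upload_id}, {"$set": {"status": "complete"}}
    )
    return {"status": "complete"}
```

The client must collect the `ETag` response header from every part upload and send the pairs back at completion; S3 uses them to verify and assemble the final object.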
This approach offers several advantages:
- Bypasses API Gateway's 10 MB payload limit and server-side upload limits
- Reduces server load by having clients upload directly to S3
- Provides resilience through resumable uploads (sketched after this list)
- Enables real-time progress tracking
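Resumability falls out of the multipart API itself: S3 remembers which parts it has already received for a given upload ID. A small sketch (the function name and signature are illustrative) that asks S3 for the remaining work:

```python
import boto3

s3 = boto3.client("s3")

def missing_parts(bucket: str, key: str, upload_id: str, num_parts: int) -> list[int]:
    """Return the part numbers that still need uploading."""
    uploaded: set[int] = set()
    paginator = s3.get_paginator("list_parts")
    for page in paginator.paginate(Bucket=bucket, Key=key, UploadId=upload_id):
        uploaded.update(p["PartNumber"] for p in page.get("Parts", []))
    return [n for n in range(1, num_parts + 1) if n not in uploaded]
```

After a crash or network failure, the client can call an endpoint backed by this and request fresh presigned URLs only for the missing parts instead of restarting a 20GB transfer.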
See the README files in the frontend/ and backend/ directories for specific setup instructions.
This project is licensed under the MIT License - see the LICENSE file for details.