feat: add GCP Cloud Run deployment scripts and optimize service startup #365
emondarock wants to merge 1 commit into presenton:main
Pull request overview
This pull request adds GCP Cloud Run deployment capabilities and optimizes the service startup sequence for cloud environments. The changes introduce PowerShell deployment scripts for GCP, modify the startup orchestration to handle service readiness checks before nginx initialization, add an optional custom title parameter to the presentation generation API, and optimize the Docker image build process.
Key Changes:
- Added two GCP deployment scripts (full and simplified) for Cloud Run with configurable parameters
- Refactored service startup to use a wait-for-services script that ensures backend services are ready before nginx starts
- Enhanced the presentation API to support optional custom titles with automatic fallback to generated titles
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 14 comments.
| File | Description |
|---|---|
| wait-for-services.sh | New bash script that waits for FastAPI and Next.js to be ready before starting nginx |
| start.js | Refactored startup sequence to return process handles and coordinate nginx startup with backend readiness |
| servers/fastapi/models/generate_presentation_request.py | Added optional title field to allow custom presentation titles |
| servers/fastapi/api/v1/ppt/endpoints/presentation.py | Updated to use custom title with fallback to auto-generated title |
| deploy-gcp.ps1 | New full-featured GCP Cloud Run deployment script with artifact registry and configurable parameters |
| deploy-gcp-simple.ps1 | New simplified GCP deployment script using Cloud Build |
| Dockerfile | Added netcat-openbsd package and wait-for-services script, plus ChromaDB model pre-download optimization |
| .gcloudignore | New file to exclude unnecessary files from GCP deployments |
| --port 80 ` | ||
| --memory $Memory ` | ||
| --cpu $Cpu ` | ||
| --timeout $Timeout ` |
The timeout is set to 300 seconds (5 minutes), but this is the maximum time Cloud Run will wait for an HTTP request to complete. For a service that generates presentations and may involve AI processing, this might be too short for complex operations. Consider increasing this value or documenting that users should adjust it based on their workload requirements.
| --region $Region ` | ||
| --allow-unauthenticated ` | ||
| --port 80 ` | ||
| --memory 2Gi ` |
The memory is set to 2Gi which is half of what the full deployment script uses (4Gi). This inconsistency could lead to confusion or out-of-memory issues. Consider documenting why this "simple" deployment uses less memory or making the default values consistent across both deployment scripts.
| --memory 2Gi ` | |
| --memory 4Gi ` |
| --memory 2Gi ` | ||
| --cpu 2 ` | ||
| --timeout 300 ` | ||
| --max-instances 10 ` |
The max-instances is set to 10, which is significantly higher than the full deployment script's default of 2. This could lead to unexpected costs in production. Consider making these values consistent or documenting the reasoning for the higher limit in the "simple" deployment.
| --max-instances 10 ` | |
| --max-instances 2 ` |
| --memory 2Gi ` | ||
| --cpu 2 ` | ||
| --timeout 300 ` | ||
| --max-instances 10 ` |
The simple deployment script doesn't include the --min-instances flag while the full script sets it to 1. This means Cloud Run will scale to zero, potentially causing cold start delays. Consider adding --min-instances 0 or --min-instances 1 explicitly to document the intended behavior.
| --max-instances 10 ` | |
| --max-instances 10 ` | |
| --min-instances 1 ` |
| --timeout $Timeout ` | ||
| --min-instances $MinInstances ` | ||
| --max-instances $MaxInstances ` | ||
| --cpu-boost ` |
The --cpu-boost flag enables CPU boost for startup but is only available in certain regions and may incur additional costs. Consider adding a comment to document this or making it an optional parameter for users who want to optimize costs.
| console.error(`Nginx failed to start with exit code: ${code}`); | ||
| } | ||
| console.log(`Nginx exited with code: ${code}`); | ||
| process.exit(code); |
The nginx exit handler calls process.exit(code) immediately, but the same exit event is also being listened to in the main() function (lines 218-220). This creates a race condition where the exit handler on line 193 will execute first, preventing the Promise.race from resolving and bypassing the cleanup logic in main(). Consider removing this exit handler and letting only the main() function handle process termination.
| # Wait for FastAPI to be ready | ||
| echo "Waiting for FastAPI to start on port 8000..." | ||
| timeout=300 |
The timeout variable is hardcoded to 300 seconds. Consider making this configurable via an environment variable (e.g., STARTUP_TIMEOUT) to allow flexibility in different deployment environments where startup times may vary.
| timeout=300 | |
| timeout="${STARTUP_TIMEOUT:-300}" |
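One way the configurable timeout could be wired into a readiness loop is sketched below. This is a minimal illustration and not the script from this PR: `wait_for` is a hypothetical helper name, and the `nc -z` probe in the trailing comment is an assumption based on the netcat-openbsd package the Dockerfile installs.

```shell
#!/usr/bin/env bash
# Sketch: poll a readiness command until it succeeds or a deadline passes.
# wait_for is a hypothetical helper; STARTUP_TIMEOUT is the variable name
# proposed in the review suggestion.

wait_for() {
  local timeout="$1"
  shift
  local elapsed=0
  # Re-run the probe command once per second until it exits with status 0.
  until "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "Timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}

# Example (not executed here), assuming FastAPI listens on port 8000:
#   wait_for "${STARTUP_TIMEOUT:-300}" nc -z localhost 8000
```

Keeping the probe as an argument means the same helper could cover both FastAPI and Next.js, with the default of 300 seconds applying only when `STARTUP_TIMEOUT` is unset or empty.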
| --cpu 2 ` | ||
| --timeout 300 ` | ||
| --max-instances 10 ` | ||
| --set-env-vars "LLM=google,GOOGLE_API_KEY=$GoogleApiKey,IMAGE_PROVIDER=gemini_flash,CAN_CHANGE_KEYS=false" |
The Google API key is passed directly as a plaintext environment variable in the deployment command. This exposes the sensitive API key in command history, logs, and the Cloud Run console. Consider using Google Secret Manager to securely store and reference the API key instead:
--set-secrets="GOOGLE_API_KEY=google-api-key:latest"
This requires creating a secret first with gcloud secrets create and granting the Cloud Run service account access to it.
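The two prerequisite steps mentioned in this comment might look roughly like the following sketch. The secret name `google-api-key` comes from the comment; the function names and the service account address in the usage comment are hypothetical and would need to match the actual project.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the example service account are
# hypothetical. Nothing runs until the functions are called.

# Create the secret, reading the key from stdin so it does not appear
# in the command line or shell history.
create_api_key_secret() {
  gcloud secrets create google-api-key \
    --replication-policy=automatic \
    --data-file=-
}

# Allow the Cloud Run runtime service account to read the secret.
grant_secret_access() {
  local service_account="$1"
  gcloud secrets add-iam-policy-binding google-api-key \
    --member="serviceAccount:${service_account}" \
    --role="roles/secretmanager.secretAccessor"
}

# Example (not executed here):
#   printf '%s' "$GOOGLE_API_KEY" | create_api_key_secret
#   grant_secret_access "runtime-sa@my-project.iam.gserviceaccount.com"
```

With both steps done, the deploy command would reference the secret through `--set-secrets` and leave `GOOGLE_API_KEY` out of `--set-env-vars`.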
| --image $ImageTag ` | ||
| --platform managed ` | ||
| --region $Region ` | ||
| --allow-unauthenticated ` |
The deployment command sets --allow-unauthenticated, exposing the Cloud Run service publicly without authentication. If the service handles sensitive operations or data, an attacker can directly invoke endpoints over the internet. Remove --allow-unauthenticated and enable IAM or an authentication layer (e.g., --no-allow-unauthenticated with Cloud Run IAM, or configure Cloud Run with IAP/OIDC).
| --allow-unauthenticated ` |
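If the service were deployed with `--no-allow-unauthenticated`, access could be granted per principal through Cloud Run IAM. The sketch below assumes that setup; the function names and all argument values are hypothetical and are not part of this PR.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helpers for a service deployed with
# --no-allow-unauthenticated. Nothing runs until they are called.

# Grant one principal permission to invoke the service.
grant_invoker() {
  local service="$1" region="$2" member="$3"
  gcloud run services add-iam-policy-binding "$service" \
    --region="$region" \
    --member="$member" \
    --role="roles/run.invoker"
}

# Call the service using the active gcloud account's identity token.
call_private_service() {
  local url="$1"
  curl -H "Authorization: Bearer $(gcloud auth print-identity-token)" "$url"
}

# Example (not executed here):
#   grant_invoker "my-service" "us-central1" "user:someone@example.com"
#   call_private_service "https://my-service-example.a.run.app/"
```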
| --source . ` | ||
| --platform managed ` | ||
| --region $Region ` | ||
| --allow-unauthenticated ` |
The deployment uses --allow-unauthenticated, making the Cloud Run service publicly accessible without any auth controls. Attackers can invoke service endpoints directly if exposed, risking unauthorized access and data exposure. Omit --allow-unauthenticated (use --no-allow-unauthenticated) and enforce Cloud Run IAM or integrate IAP/OIDC for authenticated access.
| --allow-unauthenticated ` |
No description provided.