Tags: Zipstack/unstract
UN-2901 [FIX] Prevent invalid status updates (EXECUTING/ERROR) from duplicate file processing runs (#1606)

* UN-2901 [FIX] Prevent invalid status updates (EXECUTING/ERROR) from duplicate file processing runs

  Fixes a race condition where late-arriving workers overwrite COMPLETED status with invalid EXECUTING or ERROR states, causing files to appear failed or stuck even though processing succeeded.

  Changes:
  - FileAPIClient: fixed URL construction and method call bugs
  - Fresh DB validation: check the current status before updating to EXECUTING (sketched after this entry)
  - Grace period optimization: early exit when a duplicate is detected during tool polling
  - File count accuracy: include skipped files in the total_files calculation

  Impact: files now correctly maintain COMPLETED status; no duplicate processing.

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <[email protected]>

* coderabbit comments addressed

---------

Co-authored-by: Claude <[email protected]>
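A minimal sketch of the fresh-DB-validation idea this entry describes, assuming a Django-style model; `ExecutionStatus`, `TERMINAL_STATES`, and `mark_executing` are illustrative names, not the repository's actual API:

```python
from enum import Enum


class ExecutionStatus(str, Enum):
    PENDING = "PENDING"
    EXECUTING = "EXECUTING"
    COMPLETED = "COMPLETED"
    ERROR = "ERROR"


# States a late-arriving duplicate worker must never overwrite.
TERMINAL_STATES = {ExecutionStatus.COMPLETED, ExecutionStatus.ERROR}


def mark_executing(file_execution) -> bool:
    """Re-validate against the database before moving to EXECUTING.

    Returns True if the transition was applied, False if another worker
    already drove the record to a terminal state.
    """
    file_execution.refresh_from_db(fields=["status"])  # fresh read, not a stale cache
    if ExecutionStatus(file_execution.status) in TERMINAL_STATES:
        return False  # late worker: keep COMPLETED/ERROR intact, skip the update
    file_execution.status = ExecutionStatus.EXECUTING
    file_execution.save(update_fields=["status"])
    return True
```

The re-read just before the write is what closes the window: a status cached at task start can be minutes stale by the time the duplicate worker acts on it.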
UN-2901 [FIX] Container startup race condition with polling grace period (#1602)

* UN-2901 [FIX] Container startup race condition with polling grace period

* UN-2901 [FIX] Add Redis retry resilience and fix container failure detection

  - Add a configurable Redis retry decorator with exponential backoff (sketched after this entry)
  - Fix a critical bug where containers that never start are marked as SUCCESS
  - Add robust env var validation for retry configuration
  - Apply retry logic to FileExecutionStatusTracker and ToolExecutionTracker
  - Document the REDIS_RETRY_MAX_ATTEMPTS and REDIS_RETRY_BACKOFF_FACTOR env vars

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <[email protected]>

* UN-2901 [FIX] Address CodeRabbitAI review feedback for race condition fix

  This commit addresses all valid CodeRabbitAI review comments on PR #1602:

  1. **Fix retry loop semantics**: Changed the retry loop to use range(max_retries + 1), where max_retries means "retries after the initial attempt", not total attempts. Updated the default from 5 to 4 (5 total attempts) for clarity.
  2. **Fix TypeError in file_execution_tracker.py**: Fixed json.loads() receiving a dict instead of a string by using string fallback values.
  3. **Fix unsafe env var parsing**: Added _safe_get_env_int/_safe_get_env_float helpers with validation and fallback to defaults with warning logs.
  4. **Fix status None check**: Added a defensive None check before calling .get() on the status dict in the grace period reset logic.
  5. **Update sample.env defaults**: Changed REDIS_RETRY_MAX_ATTEMPTS from 5 to 4 and updated comments to clarify retry semantics.
  6. **Improve transient failure handling**: Changed logger.error to logger.warning for transient status fetch failures, and added a sleep before continue to respect the polling interval and avoid hammering the API.

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <[email protected]>

---------

Co-authored-by: Claude <[email protected]>
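A hedged sketch of the retry semantics spelled out above. The helper and decorator bodies are assumptions (only the env var names, the range(max_retries + 1) convention, and the 4-retry default come from the entry):

```python
import functools
import logging
import os
import time

import redis

logger = logging.getLogger(__name__)


def _safe_get_env_int(name: str, default: int) -> int:
    """Parse an int env var, falling back to the default with a warning."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    try:
        value = int(raw)
        if value < 0:
            raise ValueError("must be non-negative")
        return value
    except ValueError:
        logger.warning("Invalid %s=%r; using default %s", name, raw, default)
        return default


def _safe_get_env_float(name: str, default: float) -> float:
    """Parse a float env var, falling back to the default with a warning."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    try:
        value = float(raw)
        if value <= 0:
            raise ValueError("must be positive")
        return value
    except ValueError:
        logger.warning("Invalid %s=%r; using default %s", name, raw, default)
        return default


def redis_retry(func):
    """Retry transient Redis errors with exponential backoff.

    max_retries counts retries *after* the initial attempt, so the loop
    runs max_retries + 1 times in total (default: 4 retries, 5 attempts).
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        max_retries = _safe_get_env_int("REDIS_RETRY_MAX_ATTEMPTS", 4)
        backoff = _safe_get_env_float("REDIS_RETRY_BACKOFF_FACTOR", 0.5)
        for attempt in range(max_retries + 1):
            try:
                return func(*args, **kwargs)
            except redis.ConnectionError:
                if attempt == max_retries:
                    raise  # out of retries: surface the error to the caller
                time.sleep(backoff * (2**attempt))  # 0.5s, 1s, 2s, ... for factor 0.5
    return wrapper
```

Validating the env vars inside the helpers (item 3 above) means a typo in deployment config degrades to a logged warning plus sane defaults instead of a crash at import time.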
UN-2897 [FIX] Google Drive connector SIGSEGV crashes in Celery ForkPoolWorker processes (#1597)

UN-2897 [FIX] Google Drive connector SIGSEGV crashes in Celery ForkPoolWorker

Implements lazy initialization for the Google Drive API client to prevent segmentation faults when Celery forks worker processes (see the sketch after this entry).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <[email protected]>
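A minimal sketch of the lazy-initialization pattern referenced here; `GoogleDriveConnector` and `_build_client` are hypothetical names, and the real connector's construction details are not shown in the entry:

```python
class GoogleDriveConnector:
    """Illustrative connector that defers API client construction.

    Building network clients before Celery forks can leave inherited
    state (open sockets, SSL/native-library internals) that crashes the
    child with SIGSEGV; deferring construction keeps all of that state
    inside the forked worker process.
    """

    def __init__(self, settings: dict):
        self._settings = settings
        self._client = None  # nothing network-bound created pre-fork

    @property
    def client(self):
        # Built on first access, i.e. inside the forked worker.
        if self._client is None:
            self._client = self._build_client()
        return self._client

    def _build_client(self):
        # Placeholder: construct the real Google Drive API client here
        # from self._settings (credentials, scopes, etc.).
        raise NotImplementedError
```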
UN-2893 [FIX] Fix duplicate process handling status updates and UI error logs (#1594)

* UN-2893 [FIX] Fix duplicate process handling status updates and UI error logs

  Prevent duplicate worker processes from updating file execution status and showing UI error logs during GKE race conditions.

  - Added an is_duplicate_skip flag to the FileProcessingResult dataclass (sketched after this entry)
  - Fixed the destination_processed default value for correct duplicate detection
  - Skip status updates and UI logs when a duplicate is detected
  - Only the first worker updates status; the second worker silently exits

* logger.error converted to logger.exception

* error to exception in logs
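A sketch of how such a flag might gate reporting; the fields besides is_duplicate_skip and destination_processed are guesses, and `finalize` is a hypothetical consumer, not the repository's code:

```python
from dataclasses import dataclass


@dataclass
class FileProcessingResult:
    file_name: str
    success: bool = False
    error: str | None = None
    # Defaults matter here: destination_processed must start False so a
    # worker that never reached the destination is not mistaken for one
    # that completed delivery.
    destination_processed: bool = False
    is_duplicate_skip: bool = False


def finalize(result: FileProcessingResult) -> None:
    """Only the first worker publishes status; duplicates exit silently."""
    if result.is_duplicate_skip:
        return  # second worker: no status update, no UI error log
    print(f"status update for {result.file_name}: success={result.success}")
```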
UN-2889 [FIX] Handle Celery logger with empty request_id to prevent SIGSEGV crashes (#1591)

* UN-2889 [FIX] Handle Celery logger with empty request_id to prevent SIGSEGV crashes

  - Simplified logging filters into RequestIDFilter and OTelFieldFilter (see the sketch after this entry)
  - Removed the custom DjangoStyleFormatter and StructuredFormatter classes
  - Removed Celery's worker_log_format config that created formatters without filters
  - Removed the LOG_FORMAT environment variable and all format options
  - All workers now use a single standardized format with filters always applied

* addressed coderabbit comment

* addressed coderabbit comment
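A sketch of what a filter like RequestIDFilter could look like; the class name comes from the entry, but the body and the "-" placeholder are assumptions:

```python
import logging


class RequestIDFilter(logging.Filter):
    """Guarantee record.request_id exists and is non-empty.

    A formatter that references %(request_id)s fails when the attribute
    is missing or empty; attaching this filter to every handler (rather
    than creating formatters without filters, as the removed
    worker_log_format config did) backfills a placeholder first.
    """

    def filter(self, record: logging.LogRecord) -> bool:
        if not getattr(record, "request_id", None):
            record.request_id = "-"  # assumed placeholder for empty/missing IDs
        return True  # never drop the record, only normalize it


handler = logging.StreamHandler()
handler.addFilter(RequestIDFilter())
handler.setFormatter(logging.Formatter("%(asctime)s %(request_id)s %(message)s"))
```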
UN-2882 [FIX] Fix BigQuery float precision issue in metadata serialization (#1589)

* Fix BigQuery float precision issue by normalizing floats before JSON serialization

  - Added a _sanitize_floats_for_database() helper function to recursively normalize float values to 6 decimal places using string formatting (sketched after this entry)
  - Modified _add_processing_columns() to sanitize metadata before json.dumps()
  - Fixes BigQuery insertion failures caused by floats that can't round-trip through their string representation (e.g., 22.770092)
  - The solution normalizes the internal binary representation via float(f"{x:.6f}")
  - Handles edge cases: NaN and Infinity are converted to None
  - Works recursively on nested dicts/lists
  - Backward compatible, preserves meaningful precision
  - Protects all database types (BigQuery, PostgreSQL, MySQL, Snowflake)

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <[email protected]>

* Addressed PR comments: applied the sanitize method over the data field too

* Moved the math import to the top of the file

---------

Co-authored-by: Claude <[email protected]>
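A sketch following the description above (the helper name and the float(f"{x:.6f}") normalization are stated in the entry; the exact structure of the real implementation is not):

```python
import math


def _sanitize_floats_for_database(value):
    """Recursively normalize floats so they round-trip through JSON/DB.

    Reformatting through the 6-decimal string form normalizes the binary
    representation of values like 22.770092 that otherwise fail to
    round-trip; NaN and +/-Infinity become None, since JSON and most
    databases reject them.
    """
    if isinstance(value, float):
        if math.isnan(value) or math.isinf(value):
            return None
        return float(f"{value:.6f}")
    if isinstance(value, dict):
        return {k: _sanitize_floats_for_database(v) for k, v in value.items()}
    if isinstance(value, list):
        return [_sanitize_floats_for_database(v) for v in value]
    return value


# Example: the normalized value serializes cleanly before json.dumps().
assert _sanitize_floats_for_database({"score": 22.770092})["score"] == 22.770092
assert _sanitize_floats_for_database([float("nan")]) == [None]
```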
Fix organization context pollution in shared HTTP sessions

- Remove X-Organization-ID from session headers in _setup_session()
- Remove X-Organization-ID from the set_organization_context() method
- Update clear_organization_context() to only clear instance variables
- Use per-request headers in _make_request() to prevent pollution (sketched after this entry)

This prevents callback workers from inheriting the wrong organization context when using shared HTTP sessions with a singleton pattern.

Fixes: UN-2877

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
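A minimal sketch of the per-request-header approach; `InternalAPIClient` is a hypothetical name, and only `_make_request` and the X-Organization-ID header come from the entry:

```python
import requests


class InternalAPIClient:
    """Client sharing one session; org context stays off the session.

    Because the session is a process-wide singleton, anything placed in
    session.headers leaks into every other caller's requests. Keeping
    the organization ID as instance state and injecting it per request
    prevents callback workers from inheriting the wrong context.
    """

    _session = requests.Session()  # shared across all clients in the process

    def __init__(self, organization_id: str):
        self._organization_id = organization_id  # instance state only

    def _make_request(self, method: str, url: str, **kwargs):
        headers = kwargs.pop("headers", {}) or {}
        # Per-request header: scoped to this call, never stored on the session.
        headers["X-Organization-ID"] = self._organization_id
        return self._session.request(method, url, headers=headers, **kwargs)
```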
UN-2866 [FIX] Fix duplicate detection parameter name mismatch causing false positives on worker retry

Fixed the parameter name from 'exclude_execution_id' to 'current_execution_id' in the worker API client to match backend endpoint expectations. This allows worker retries after pod crashes to properly exclude the current execution from duplicate detection.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
UN-2871 [FEATURE] Log sharing across shared workflows/deployments (#1580)

* UN-2871 [FEATURE] Add shared workflow executions filter to enable multi-user access

  Update WorkflowExecutionManager.for_user() to include executions from workflows shared with users, ensuring consistent access control across workflow and execution models.

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <[email protected]>

* Update backend/workflow_manager/workflow_v2/models/execution.py

  Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
  Signed-off-by: Rahul Johny <[email protected]>

* UN-2871 [FIX] Move Q import to top-level for PEP8 compliance

  Move the django.db.models.Q import from function level to module level to comply with linting standards and improve code organization.

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <[email protected]>

* UN-2871 [SECURITY] Fix execution filtering to respect independent workflow and deployment sharing

  Update WorkflowExecutionManager.for_user() to properly handle independent sharing between workflows and API deployments/pipelines. The previous implementation only checked workflow sharing, allowing users to see executions for unshared deployments.

  Key changes:
  - Add separate filters for API deployment and pipeline access
  - Implement proper logic for independent sharing scenarios:
    * Workflow shared + no pipeline -> user sees workflow-level executions
    * API/pipeline shared (regardless of workflow) -> user sees those executions
    * Both shared -> user sees all related executions
    * Neither shared -> user cannot see executions

  This ensures users can only view executions for resources they have explicit access to, preventing unauthorized data exposure.

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <[email protected]>

* UN-2871 [PERF] Optimize ExecutionFilter to use EXISTS instead of values_list for large datasets

  Replace inefficient values_list() queries with EXISTS subqueries in filter_execution_entity(). This significantly improves performance when filtering by entity type on large datasets (see the sketch after this entry).

  Performance improvements:
  - API filter: uses an EXISTS check instead of fetching all API deployment IDs
  - ETL filter: uses an EXISTS check instead of fetching all ETL pipeline IDs
  - TASK filter: uses an EXISTS check instead of fetching all TASK pipeline IDs
  - Workflow filter: simplified to use an isnull check (removed the redundant workflow_id filter)

  EXISTS is more efficient because it:
  1. Stops at the first match (short-circuits)
  2. Doesn't transfer data from the database to the application
  3. Gives the database's query optimizer better hints
  4. Reduces memory usage

  The queryset is already filtered by user permissions via get_queryset(), so this change only optimizes the entity type filtering step.

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <[email protected]>

* UN-2871 [FIX] Move Exists and OuterRef imports to module level for PEP8 compliance

  Move the django.db.models.Exists and OuterRef imports from function level to module level to comply with linting standards and improve code organization.

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <[email protected]>

---------

Signed-off-by: Rahul Johny <[email protected]>
Co-authored-by: Claude <[email protected]>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
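A sketch of the values_list-to-EXISTS swap described in the [PERF] commit, assuming generic Django models (the model and relation names here are illustrative, not the Unstract schema):

```python
from django.db.models import Exists, OuterRef


def filter_api_executions(executions_qs, api_deployment_model):
    """Keep only executions whose workflow backs an API deployment.

    A correlated EXISTS subquery short-circuits at the first match and
    transfers no data to the application, unlike the replaced pattern
    of values_list("workflow_id", flat=True) followed by __in, which
    materializes every matching ID before filtering.
    """
    has_api_deployment = Exists(
        api_deployment_model.objects.filter(workflow=OuterRef("workflow"))
    )
    return executions_qs.filter(has_api_deployment)
```

The same shape would apply per entity type (API, ETL, TASK), with only the correlated model swapped out.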
UN-2869 [FIX] Add broker heartbeat configuration to prevent RabbitMQ connection timeouts causing false duplicate detection (#1578)

UN-2869 [FIX] Add broker heartbeat configuration to prevent RabbitMQ connection timeouts

This fix addresses false duplicate file detection caused by stale IN_PROGRESS records when RabbitMQ disconnects idle workers after 60 seconds.

Changes:
- Added broker_heartbeat=30s to WorkerCeleryConfig in workers/shared/models/worker_models.py (see the sketch after this entry)
- Configurable via the CELERY_BROKER_HEARTBEAT env var (default: 30s)
- Prevents RabbitMQ connection drops during long-running tasks
- Eliminates stale cache/DB entries that cause false duplicate detection

Technical details:
- RabbitMQ default timeout: 60 seconds
- Recommended heartbeat: 30 seconds (half the timeout)
- Uses get_celery_setting() for hierarchical config: worker-specific -> global -> default

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <[email protected]>
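A plain-Celery sketch of the setting; the real code reads it through the hierarchical get_celery_setting() helper rather than os.environ directly, and the app/broker names below are placeholders:

```python
import os

from celery import Celery

# 30s heartbeat: half of RabbitMQ's 60s default timeout, so a missed
# beat is detected before the broker drops the idle connection and
# leaves a stale IN_PROGRESS record behind.
broker_heartbeat = int(os.environ.get("CELERY_BROKER_HEARTBEAT", "30"))

app = Celery("worker", broker=os.environ.get("CELERY_BROKER_URL", "amqp://localhost"))
app.conf.broker_heartbeat = broker_heartbeat
```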