[COR-11] Rewrite generate_config #22090

mpoeter · 2025-11-11T15:51:06Z

Scope & Purpose

Rewrote the generate_config script with a clear separation of concerns and unit tests. The code in src is shared with Oskar.

🔨 Refactoring/simplification

Note

Rewrites the CircleCI config generator into a modular, tested library and updates the pipeline to use new parameters and CLI options.

CI generator overhaul:
- Replace legacy .circleci/generate_config.py with a Click-based, modular implementation using src/ library (typed models, filters, CircleCI output, sizing, branch overrides).
- Add comprehensive unit tests under .circleci/tests_unit and pytest.ini for parsing, filtering, sizing, and CircleCI generation.
CircleCI pipeline updates:
- Extend/rename parameters and flags (e.g., ui-deployments, --driver-branch-overrides), add descriptions, and print selected params.
- Update generate step to pip install pyyaml click and invoke new generator options; adjust arg names (--ui-deployments, --driver-branch-overrides).
- Minor cleanup in cancel-pipelines logging.

Written by Cursor Bugbot for commit 04981d7. This will update automatically on new commits.

- Introduce config_lib.py with type-safe data models for test definitions
- Support both single and multi-suite test jobs
- Add validation for deployment types, buckets, and option overrides
- Implement YAML parsing for all test definition formats
- Add 79 passing unit and integration tests
- Remove unused requires_2 and requires_dcl fields

This library will replace the parsing logic in generate_config.py and test_launch_controller.py with a shared, well-tested foundation.

- Create base OutputGenerator abstract class with GeneratorArgs
- Implement filters module for job filtering based on deployment type, full/nightly runs, and platform exclusions
- Add comprehensive CircleCI generator with support for:
  - Multiple architectures (x64, aarch64)
  - Sanitizer builds (tsan, alubsan)
  - Replication version 2 tests
  - UI tests with RTA
  - Docker image creation
  - Resource sizing based on sanitizer and cluster mode
  - Special cases for chaos and replication_sync tests

This provides a clean separation between parsing, filtering, and output generation, making it easier to maintain and extend.

Major refactoring to improve code organization and testability:

**Refactored modules:**
- filters.py: Removed hardcoded platform detection, introduced PlatformFlags and FilterCriteria dataclasses for explicit configuration
- base.py: Split GeneratorArgs into focused dataclasses (TestExecutionConfig, CircleCIConfig, GeneratorConfig) following single responsibility principle
- circleci.py: Restructured into smaller focused methods, removed sys.path hacks, now uses proper relative imports
- sizing.py: New module extracting resource sizing logic with sanitizer overhead calculations

**Improvements:**
- Fixed all imports to use proper relative imports (..module pattern)
- Removed mutable default arguments with field(default_factory)
- Better separation of concerns across modules
- All code formatted with black

**Testing:**
- Added 32 new tests for filters.py (16 tests) and sizing.py (16 tests)
- All 111 tests passing (79 original + 32 new)
- Tests cover platform flags, filter criteria, deployment type filtering, gtest filtering, resource class mappings, and sanitizer overhead logic

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
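The `field(default_factory)` change mentioned above can be illustrated with a small sketch. The class and field names here are hypothetical and are not taken from filters.py; the point is only that each instance gets its own list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FilterCriteriaSketch:
    """Hypothetical stand-in for a filter dataclass; field names are illustrative."""

    # A bare `= []` default is rejected by dataclasses with a ValueError at
    # class creation time; default_factory builds a fresh list per instance.
    excluded_suites: List[str] = field(default_factory=list)


first = FilterCriteriaSketch()
second = FilterCriteriaSketch()
first.excluded_suites.append("gtest")  # must not leak into `second`
```

Because the factory runs once per instance, mutating `first` leaves `second` untouched.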

**Refactoring:**
- sizing.py: Use ResourceSize enum instead of hardcoded strings
- sizing.py: Add architecture alias support (amd64, x86_64, arm64)
- sizing.py: Extract _apply_sanitizer_overhead() for better separation
- filters.py: Extract is_gtest_suite() and matches_deployment_filter()
- filters.py: Add GTEST_PREFIX constant, use field(default_factory)
- filters.py: Merge nested if statement

**Tests:**
- Update all tests to use ResourceSize enum
- Add test coverage for architecture aliases

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

Add dependency injection for environment and date access, support pre-loaded config dict, fix hardcoded size strings to use ResourceSize enum, and add 25 unit tests. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
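A minimal sketch of what dependency injection for environment and date access might look like. The function name and the label format are assumptions made for illustration; only the idea of passing the environment mapping and a date provider as parameters comes from the commit message.

```python
import os
from datetime import date
from typing import Callable, Mapping, Optional


def build_run_label(
    env: Optional[Mapping[str, str]] = None,
    today: Callable[[], date] = date.today,
) -> str:
    """Hypothetical helper: combine a branch name and the current date.

    Both inputs are injectable, so tests never touch os.environ or the clock.
    """
    env = os.environ if env is None else env
    branch = env.get("CIRCLE_BRANCH", "unknown")
    return f"{branch}-{today().isoformat()}"


# In a test, fixed values replace the real environment and clock.
label = build_run_label({"CIRCLE_BRANCH": "devel"}, lambda: date(2025, 11, 11))
```

Production callers can use the defaults, while tests supply a plain dict and a lambda.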

Create generate_config_new.py that uses the refactored architecture to generate CircleCI configuration from test definitions. Supports all key features: test filtering, sanitizers, UI tests, and multi-file definitions. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

- Add init_driver_repo_command field
- Preserve extraArgs: prefix in argument keys
- Add optionsJson for multi-suite jobs with nested structure
- Handle moreArgv and suffix support
- Support duplicate jobs with unique keys
- Add filtering for full flag at job and suite level
- Fix workflow naming (with_ui_tests)
- Add non-maintainer build job for x64 enterprise
- Update tests for unique job keys

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

Old generator defaulted to medium for cluster/mixed and small for single. New generator was always defaulting to small, causing size mismatches for cluster tests without explicit size specification. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
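The default described above could be sketched as follows. `ResourceSize` is named elsewhere in this PR, but its members and the function name `default_size` are assumptions made for this example.

```python
from enum import Enum
from typing import Optional


class ResourceSize(Enum):
    # Member names are assumed; the PR only states that the enum exists.
    SMALL = "small"
    MEDIUM = "medium"
    LARGE = "large"


def default_size(
    deployment_type: str, explicit: Optional[ResourceSize] = None
) -> ResourceSize:
    """Pick a size: an explicit size wins, otherwise depend on deployment type."""
    if explicit is not None:
        return explicit
    if deployment_type in ("cluster", "mixed"):
        return ResourceSize.MEDIUM
    return ResourceSize.SMALL
```

With this rule, a cluster test without an explicit size gets `medium` again instead of `small`.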

- Add init_command field to RepositoryConfig for driver init scripts
- Support 'alubsan' as sanitizer alias (treated as 'asan')
- Fix comparison script to continue testing all configs even on failures

All test configurations now generate successfully with only cosmetic differences (argument ordering) compared to old generator.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

Updated generate_config_new.py to only accept 'tsan' and 'alubsan' as valid sanitizers, matching the old generator's behavior. Individual sanitizers (asan, ubsan, lsan) are not accepted as they are not used in production.

alubsan = asan + lsan + ubsan combined. Internally normalized to 'asan' for sizing logic.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
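A rough sketch of the validation and normalization described above, using plain strings. The function names are hypothetical; only the accepted values and the mapping of 'alubsan' to 'asan' for sizing come from the commit message.

```python
from typing import Optional

VALID_SANITIZERS = ("tsan", "alubsan")


def parse_sanitizer(value: Optional[str]) -> Optional[str]:
    """Accept only the sanitizer names used in production, or no sanitizer."""
    if value is None or value == "":
        return None
    if value not in VALID_SANITIZERS:
        raise ValueError(
            f"invalid sanitizer {value!r}; expected one of {VALID_SANITIZERS}"
        )
    return value


def sanitizer_for_sizing(value: Optional[str]) -> Optional[str]:
    """Normalize the combined 'alubsan' build to 'asan' for sizing purposes."""
    return "asan" if value == "alubsan" else value
```

Keeping the CLI value and the sizing value separate means 'asan' never has to be accepted as user input.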

Added Sanitizer enum (TSAN, ALUBSAN) to replace string literals:
- Sanitizer.from_string() validates and parses CLI input
- BuildConfig.sanitizer now uses Optional[Sanitizer]
- FilterCriteria.full now uses Optional[Sanitizer] instead of bool
- Updated all code to use sanitizer.value when converting to strings
- Removed test for invalid sanitizer strings (now handled by enum)
- Updated tests to use Sanitizer.TSAN and Sanitizer.ALUBSAN

This provides type safety and prevents typos in sanitizer names.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

Added Architecture enum (X64, AARCH64) to replace string literals:
- Architecture.from_string() validates and parses input with alias support (x64, x86_64, amd64 → X64; aarch64, arm64 → AARCH64)
- BuildConfig.architecture now uses Architecture enum
- Updated CircleCI generator to use Architecture.X64 and Architecture.AARCH64
- Updated ResourceSizer to accept Architecture enum or string
- Updated all code to use architecture.value when converting to strings
- Updated tests to use Architecture.X64 and Architecture.AARCH64

This provides type safety and prevents typos in architecture names.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
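One way the alias handling could look is sketched below. The enum and method names follow the commit message, but the body is an assumption; in particular, the case-insensitive lookup is added for illustration and is not confirmed by the PR.

```python
from enum import Enum


class Architecture(Enum):
    X64 = "x64"
    AARCH64 = "aarch64"

    @classmethod
    def from_string(cls, value: str) -> "Architecture":
        """Map a user-supplied name or alias to an enum member."""
        # Built inside the method so the dict does not become an enum member.
        aliases = {
            "x64": cls.X64,
            "x86_64": cls.X64,
            "amd64": cls.X64,
            "aarch64": cls.AARCH64,
            "arm64": cls.AARCH64,
        }
        try:
            # Lower-casing is an assumption of this sketch.
            return aliases[value.strip().lower()]
        except KeyError:
            raise ValueError(f"unknown architecture: {value!r}") from None
```

Callers then convert back with `.value` when a string is needed, for example `Architecture.from_string("arm64").value`.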

- Separated FilterCriteria.full (bool) from sanitizer (enum) - they are independent concepts
- Fixed workflow generation to use correct sanitizer field
- Added container_suffix and init_command support for driver tests
- Fixed job-level arangosh_args parsing
- Added timeLimit for nightly chaos tests
- Extracted _generate_unique_key() helper to reduce duplication
- Added remove_extra_args() to normalize script for cosmetic diffs

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

After merge with devel:
- Added job_type field to TestJob for custom CircleCI job templates
- Parse 'job' field from test definitions (defaults to 'run-linux-tests')
- Updated test file paths (arangojs-test-definitions.yml → arangojs.yml, etc.)
- Fixed FilterCriteria to properly separate full and sanitizer fields
- All tests passing (139/139)

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

Adapted new generator to match changes from merge commit ec7f7ef:

Repository/Driver Config:
- Changed init_driver_repo_command to init_command to match old generator
- Only add init_command if field exists (even if empty string)
- Don't add empty driver fields when there's no repository config

ARM64 Workflow Generation:
- Removed sanitizer restriction for ARM64 workflows (now generates for all sanitizer configs)
- This matches the removal of the condition in the old generator

Argument Ordering:
- Fixed extraArgs order: optionsJson must come BEFORE replicationVersion and skipNightly
- This matches old generator behavior where optionsJson was added during parsing

UI Test Configuration:
- Changed rta-branch from empty string to None for proper YAML null serialization
- Added trailing space to UI filter strings to match old generator format

These changes bring the new generator output into alignment with the old generator after the recent merge from devel branch.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

The old generator was deleting test["arangosh_args"] after first use, which caused these settings to be lost when generating the aarch64 workflow. This meant that memory/GC settings specified in test definitions only applied to x64 builds, not aarch64 builds. Removed the deletion to ensure arangosh_args are preserved and applied consistently across all architectures. This brings the old generator into alignment with the new generator's correct behavior, eliminating the remaining test failures. All 5 comparison tests now pass. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
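The failure mode described above can be reproduced in a few lines. This is a simplified illustration, not the old generator's code: the function names and the placeholder flag are invented, and only the pattern of deleting `arangosh_args` from a shared test definition is taken from the commit message.

```python
from typing import Dict, List


def args_for_workflow_buggy(test: Dict) -> List[str]:
    """Read the args, then delete them from the shared definition."""
    args = list(test.get("arangosh_args", []))
    test.pop("arangosh_args", None)  # later architectures see nothing
    return args


def args_for_workflow_fixed(test: Dict) -> List[str]:
    """Read the args without mutating the shared definition."""
    return list(test.get("arangosh_args", []))


architectures = ("x64", "aarch64")

shared = {"arangosh_args": ["--example-flag"]}  # placeholder flag
buggy = {arch: args_for_workflow_buggy(shared) for arch in architectures}

shared = {"arangosh_args": ["--example-flag"]}
fixed = {arch: args_for_workflow_fixed(shared) for arch in architectures}
```

In the buggy variant the second architecture receives an empty list, which mirrors how the settings were lost for aarch64.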

- Extract generic enum parsing helper to eliminate duplication across 4 enums
- Add merge_field helper to TestOptions.merge_with() to reduce repetition
- Move parse_args_string to TestArguments class for better cohesion
- Consolidate CircleCI special case overrides into documented section
- Fix missing fields (suffix, full, coverage) in TestOptions.merge_with()

All 139 tests pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

Add support for overriding git branches in driver test definitions via CLI.

Example: --test-branches=go=feature-branch:js=main

When a test definition filename matches a prefix in test_branches (e.g., "go.yml" matches "go"), the specified branch overrides the repository's default branch for all jobs in that file.

This completes feature parity with the old generator as specified in the implementation plan (line 108). All 139 tests pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
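A sketch of how the override string from the example might be parsed and applied. Both function names are hypothetical, and the matching rule (the file stem starts with the prefix) is an assumption based on the "go.yml" matches "go" example; the real implementation may match differently.

```python
from pathlib import Path
from typing import Dict, Optional


def parse_branch_overrides(spec: str) -> Dict[str, str]:
    """Parse 'go=feature-branch:js=main' into {'go': 'feature-branch', 'js': 'main'}."""
    overrides: Dict[str, str] = {}
    for entry in spec.split(":"):
        if not entry:
            continue  # tolerate an empty spec or stray separators
        prefix, sep, branch = entry.partition("=")
        if not sep or not prefix or not branch:
            raise ValueError(f"malformed override entry: {entry!r}")
        overrides[prefix] = branch
    return overrides


def branch_for_file(filename: str, overrides: Dict[str, str]) -> Optional[str]:
    """Return the override branch for a definition file, or None if no prefix matches."""
    stem = Path(filename).stem
    for prefix, branch in overrides.items():
        if stem.startswith(prefix):
            return branch
    return None


overrides = parse_branch_overrides("go=feature-branch:js=main")
```

A result of `None` means the repository's default branch from the test definition stays in effect.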

- Add time_limit field to TestOptions
- Parse timeLimit from YAML options (using camelCase key)
- Remove hardcoded _get_time_limit_override method
- Use job.options.time_limit directly in generator

This allows any job to specify a custom timeLimit in the YAML, not just chaos. The chaos job already has timeLimit: 9600 in test-definitions.yml (line 703).

All comparison tests pass.
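The mapping from the camelCase YAML key to a snake_case field could look roughly like this. The class below is a reduced, hypothetical stand-in for TestOptions with only the one field; the `from_dict` name is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class OptionsSketch:
    """Reduced stand-in for TestOptions; only time_limit is modeled."""

    time_limit: Optional[int] = None

    @classmethod
    def from_dict(cls, data: Mapping[str, Any]) -> "OptionsSketch":
        # The YAML key is camelCase ('timeLimit'); the field is snake_case.
        raw = data.get("timeLimit")
        return cls(time_limit=int(raw) if raw is not None else None)


chaos_options = OptionsSketch.from_dict({"timeLimit": 9600})
plain_options = OptionsSketch.from_dict({})
```

A generator can then read `job.options.time_limit` directly and emit the value only when it is not `None`.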

When a test definition filename doesn't contain a directory separator, automatically prefix it with "tests/". This matches the behavior of the old generator and allows users to specify just the filename instead of the full path.
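The prefixing rule is simple enough to sketch directly. The function name is hypothetical, and treating both "/" and the platform separator as directory separators is an assumption of this example.

```python
import os


def resolve_definition_path(filename: str) -> str:
    """Prefix bare filenames with 'tests/'; leave anything with a directory part alone."""
    if "/" in filename or os.sep in filename:
        return filename
    return "tests/" + filename
```

With this rule, `go.yml` resolves to `tests/go.yml`, while `tests/go.yml` and `./custom/go.yml` pass through unchanged.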

Copilot

Pull Request Overview

This PR rewrites the generate_config.py script with improved architecture, clear separation of concerns, and comprehensive unit test coverage. The new implementation uses shared code between CircleCI and Jenkins (Oskar) pipelines through a common src library.

Key Changes:

- Complete rewrite of test configuration parsing with type-safe data models using dataclasses
- Separation of concerns: parsing, filtering, validation, and output generation logic
- Comprehensive unit test suite with >80% coverage target
- Shared generic code in src/ directory for use across both CircleCI and Jenkins

Reviewed Changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 1 comment.


| File | Description |
| --- | --- |
| `implementation_plan.md` | Detailed implementation plan documenting the rewrite architecture, types, functions, and testing strategy |
| `.circleci/generate_config.py` | Completely rewritten main script with cleaner structure and separation of concerns |
| `.circleci/src/config_lib.py` | Core data models and parsing logic shared between CircleCI and Jenkins |
| `.circleci/src/filters.py` | Test job filtering logic based on deployment type, platform, and other criteria |
| `.circleci/src/output_generators/base.py` | Abstract base classes for output generators with shared configuration types |
| `.circleci/src/output_generators/circleci.py` | CircleCI-specific workflow generation logic |
| `.circleci/src/output_generators/sizing.py` | Resource sizing logic accounting for architecture and sanitizer overhead |
| `.circleci/tests_unit/test_config_lib.py` | Comprehensive unit tests for core data models and parsing |
| `.circleci/tests_unit/test_filters.py` | Unit tests for filtering logic |
| `.circleci/tests_unit/test_sizing.py` | Unit tests for resource sizing logic |
| `.circleci/tests_unit/test_circleci_generator.py` | Unit tests for CircleCI workflow generation |
| `.circleci/tests_unit/test_parsing_integration.py` | Integration tests for parsing real YAML files |
| `.circleci/pytest.ini` | Pytest configuration for test discovery and execution |


.circleci/src/config_lib.py

Use Path iteration instead of hardcoded list to automatically discover and test all YAML files in tests directory
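A minimal sketch of discovery via `Path.glob`, run against a temporary directory so it is self-contained. The `*.yml` pattern and the helper name are assumptions; the real test may use a different pattern or directory.

```python
import tempfile
from pathlib import Path
from typing import List


def discover_definition_files(tests_dir: Path) -> List[Path]:
    """Return all YAML definition files in a directory, in a stable order."""
    return sorted(tests_dir.glob("*.yml"))


# Demonstrate on a throwaway directory instead of the real tests/ folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("go.yml", "arangojs.yml", "README.md"):
        (root / name).write_text("")
    found = [path.name for path in discover_definition_files(root)]
```

New definition files are then picked up automatically, with no list of filenames to maintain in the test.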

.circleci/src/output_generators/circleci.py

… min/max_replication_factor

- Add TypeVar for generic enum helper function
- Add type annotations for dict variables to fix mypy inference issues
- Assert buckets type in get_bucket_count to satisfy mypy
- Replace string architecture values with Architecture enum in test_sizing.py
- Install types-PyYAML for proper type checking

Remove backwards compatibility for string architecture values. The method now only accepts Architecture enum, which provides proper type safety. This change removes the ARCH_ALIASES dict and _normalize_arch() helper method that were defeating the purpose of using enums. Tests updated to exclusively use Architecture enum values instead of strings.

.circleci/src/output_generators/circleci.py

Copilot

Pull Request Overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.


.circleci/src/config_lib.py

.circleci/tests_unit/test_parse_yaml.py

Replace argparse-based CLI with Click decorators for better developer and user experience.

Click provides:
- More declarative syntax with decorators
- Better help text formatting
- Cleaner option definitions
- More maintainable code

All tests pass and comparison script validates identical output.

.circleci/src/output_generators/circleci.py

- Remove test_parse_yaml.py: redundant with existing integration tests
- Remove empty BuildConfig.__post_init__: validation is handled by enum types

The empty __post_init__ with only a comment was misleading. Since Architecture and Sanitizer enums already validate their values on construction, no additional validation is needed.

- Fix _dict_to_options_json to strip 'extraArgs:' prefix
  - Keys like 'extraArgs:log.level' now produce {'log.level': value}
  - Previously created {'extraArgs': {'log.level': value}}
- Fix job creation to skip jobs when all suites are filtered out
  - _create_test_job now returns None when filtered_suites is empty
  - Caller filters out None values to avoid invalid job configurations
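The prefix stripping can be sketched as a small standalone function. This is not the body of `_dict_to_options_json`; the function name and the flat-dict input are assumptions, and only the key transformation follows the commit message.

```python
from typing import Any, Dict

EXTRA_ARGS_PREFIX = "extraArgs:"


def strip_extra_args_prefix(args: Dict[str, Any]) -> Dict[str, Any]:
    """Drop the 'extraArgs:' prefix from keys; leave other keys unchanged."""
    return {
        key[len(EXTRA_ARGS_PREFIX):] if key.startswith(EXTRA_ARGS_PREFIX) else key: value
        for key, value in args.items()
    }


options = strip_extra_args_prefix({"extraArgs:log.level": "debug", "cluster": True})
```

The result is a flat mapping with `log.level` as a top-level key, rather than a nested `extraArgs` dictionary.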

dothebart

LGTM.

I think we no longer need `if enterprise`.
We should not have hard-coded filenames in tests.
Overriding the driver branch from the command-line args is probably gone.

dothebart · 2025-11-14T08:41:59Z

.circleci/src/output_generators/circleci.py

+    def _get_size_override(
+        self, job_name: str, build_config: BuildConfig, is_cluster: bool
+    ) -> Optional[ResourceSize]:
+        """Get size override for specific jobs based on runtime conditions."""


should we rather have a nightly_size in the yaml for this?

Can discuss this separately, but I would prefer to not include this in this PR.

.circleci/src/output_generators/circleci.py

dothebart · 2025-11-14T08:53:26Z

.circleci/src/config_lib.py

+
+        return cls(
+            git_repo=data["git_repo"],
+            git_branch=data.get("git_branch"),


This could previously be overridden from a CLI arg. Is this still possible?

Yes, this is still possible.

dothebart · 2025-11-14T08:55:23Z

.circleci/src/config_lib.py

+
+            # Convert jobProperties to repository config if present
+            if job_properties and "second_repo" in job_properties:
+                repo_config = {


why doesn't job_properties have a serializer or toArray() that does this?

.circleci/tests_unit/test_parsing_integration.py

- Reduce test_parsing_integration.py from 280 to 80 lines (71% reduction)
  - Remove redundant per-file parsing tests (covered by test_all_yaml_files_parse_successfully)
  - Remove trivial property tests (not integration test concerns)
  - Remove filtering tests (belong in unit tests)
  - Keep essential integration tests:
    - All files parse successfully
    - Main test structure validation
    - Multi-suite auto bucket behavior
    - Driver repository configuration
- Fix pylint errors in generate_config.py:
  - Move dataclasses.replace import to top level
  - Add pylint disable for Click decorator behavior
- Tests reduced from 147 to 132 (redundant tests removed)
- All remaining tests pass

Consolidate repetitive test patterns across unit tests using pytest.mark.parametrize to reduce duplication while maintaining coverage.

Changes:
- test_filters.py: Consolidate deployment type and suite filtering tests
- test_sizing.py: Parametrize architecture mapping tests
- test_config_lib.py: Consolidate enum conversion and validation tests
- test_parsing_integration.py: Add MAIN_TEST_DEFINITIONS constant

Impact:
- Reduced total lines from 1976 to 1484 (25% reduction)
- Test count remains 157 (no coverage loss)
- Improved maintainability and extensibility

Create conftest.py with reusable fixtures for common test objects:
- BuildConfig fixtures (x64/aarch64, enterprise/community, with/without sanitizers)
- mixed_suite_job fixture for filter testing
- generator_factory fixture for CircleCI generator creation

Changes:
- tests_unit/conftest.py: New file with 104 lines of shared fixtures
- test_filters.py: Use mixed_suite_job fixture (517 -> 487 lines, -30 lines)
- test_sizing.py: Use BuildConfig fixtures (297 -> 295 lines, -2 lines)

Impact:
- Centralizes test object creation
- Reduces duplication across test files
- Makes tests more readable and maintainable
- Easier to extend with new fixtures in the future

We no longer support community builds - only enterprise builds are maintained. This commit removes all community/enterprise distinction:
- Remove enterprise field from BuildConfig dataclass
- Simplify CircleCI generator to always use enterprise edition
- Remove community-related test fixtures and tests
- Update all BuildConfig instantiations

All 153 tests pass.

.circleci/config.yml

cursor · 2025-11-14T13:35:45Z

.circleci/config.yml

            else
              set -x
-              pip install pyyaml
+              pip install pyyaml click


Bug: RTA UI tests use incorrect branch.

The --rta-branch parameter is missing from the generate_config.py invocation. The pipeline parameter rta-branch exists (line 40-42) and the Python script expects it, but it's not being passed when calling the script. This causes RTA UI tests to always use None instead of the configured branch, likely defaulting to an unintended branch.

.circleci/generate_config.py

cursor · 2025-11-14T14:06:49Z

.circleci/config.yml

                <<# pipeline.parameters.create-docker-images >> --create-docker-images <</ pipeline.parameters.create-docker-images >> \
                <<# pipeline.parameters.replication-two >> -rt <</ pipeline.parameters.replication-two >> \
-                --test-branches "<<parameters.config-definitions-branches>>" \
+                --driver-branch-overrides "<<parameters.config-definitions-branches>>" \


Bug: Configuration prevents RTA branch override.

Missing --rta-branch argument in generate_config.py invocation. The pipeline parameter rta-branch is defined at line 40-43 and the script accepts this option at generate_config.py:258, but it's not passed in the command invocation. This causes UI tests to always use the default value (None/main) instead of the pipeline parameter value, breaking the ability to override the RTA repository branch for testing.

mpoeter and others added 25 commits November 4, 2025 10:54

Merge branch 'devel' into chore/generate-config-cleanup

ec7f7ef

Remove test scripts used during reimplementation

f53537f

Merge branch 'devel' into chore/generate-config-cleanup

8f31e76

Merge branch 'devel' into chore/generate-config-cleanup

a70cb1f

Add automatic tests/ prefix for YAML files without path

b25a7a8


Replace generate_config script with new version

63fa36b

mpoeter requested review from Copilot and dothebart November 11, 2025 15:51

cla-bot bot added the cla-signed label Nov 11, 2025

Copilot AI reviewed Nov 11, 2025

.circleci/src/config_lib.py

Remove implementation plan

23c0344

Refactor test_all_files_parse_without_errors to use Path.glob for DRY

42cc25c


cursor bot reviewed Nov 13, 2025

.circleci/src/output_generators/circleci.py

mpoeter added 3 commits November 13, 2025 10:19

Remove unused fields from TestOptions: storage_engine, test_data_dir,…

33f0321

… min/max_replication_factor

cursor bot reviewed Nov 13, 2025

.circleci/src/output_generators/circleci.py

mpoeter requested a review from Copilot November 13, 2025 13:57

Copilot AI reviewed Nov 13, 2025

.circleci/src/config_lib.py

.circleci/tests_unit/test_parse_yaml.py

cursor bot reviewed Nov 13, 2025

.circleci/src/output_generators/circleci.py

.circleci/src/output_generators/circleci.py

mpoeter added 3 commits November 13, 2025 16:22

Merge branch 'devel' into chore/generate-config-cleanup

d91269f

mpoeter requested a review from dothebart November 13, 2025 15:44

dothebart approved these changes Nov 14, 2025


mpoeter added 8 commits November 14, 2025 10:37

Fix pylint issues

e1d37e3

Install click dependency

f43cb6d

Add tests for git_branch override

5c316cc

Rearrange pipeline parameters and add descriptions

252ea99

mpoeter force-pushed the chore/generate-config-cleanup branch from d55fab9 to 252ea99 Compare November 14, 2025 13:33

cursor bot reviewed Nov 14, 2025


Rename test-branches parameter to driver-branch-overrides

d64f644

cursor bot reviewed Nov 14, 2025

.circleci/generate_config.py

mpoeter added 2 commits November 14, 2025 15:01

Fix passing of ui-deployments

4e7dab9

Fix one more usage

04981d7

cursor bot reviewed Nov 14, 2025



3 participants
