fix(tags): widen prerelease and devrelease tag regexes for SemVer2 #1972
The tag regex helpers in `commitizen/defaults.py::get_tag_regexes` only matched PEP-440-style prereleases (`\w+\d+`, no dot). When users configure `version_scheme = "semver2"` and a custom `tag_format` that includes `${prerelease}`, commitizen itself produces tags like `0.0-2rc.0` (with a literal dot in the SemVer2 prerelease segment), but those tags are then rejected by `TagRules.is_version_tag`. As a result, the next `cz bump --prerelease` fails with `No tag found to do an incremental changelog` and `cz changelog` warns `Invalid version tag: '0.0-2rc.0' does not match any configured tag format`.

Widen the prerelease regex to `\w+(?:\.\w+)*` so it accepts both `rc0` (PEP-440) and `rc.0` / `alpha.beta.1` (SemVer2).

Also widen the devrelease regex to `\.?dev\d+` so users substituting `${devrelease}` directly in a `tag_format` (without the leading dot) are recognised on the way back. This is a companion to the substitution fix in commitizen-tools#1615.

Closes commitizen-tools#1614

Co-authored-by: Copilot <[email protected]>
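As a quick illustration of the prerelease change described in this commit message, the sketch below compares the old and new fragments on a few prerelease segments using only Python's `re` module. The constant names, the `accepts` helper, and the `results` dict are made up for this example; the patterns themselves are the ones quoted in this PR, wrapped in the optional named group form shown in the PR description.

```python
import re

# Old and new prerelease fragments as quoted in this PR.
# The constant names are illustrative, not commitizen identifiers.
OLD_PRERELEASE = re.compile(r"(?P<prerelease>\w+\d+)?")
NEW_PRERELEASE = re.compile(r"(?P<prerelease>\w+(?:\.\w+)*)?")


def accepts(pattern: re.Pattern, segment: str) -> bool:
    """Return True when the pattern matches the whole segment."""
    return pattern.fullmatch(segment) is not None


# Map each candidate segment to (old result, new result).
results = {
    segment: (accepts(OLD_PRERELEASE, segment), accepts(NEW_PRERELEASE, segment))
    for segment in ["", "rc0", "alpha1", "rc.0", "alpha.beta.1"]
}

for segment, (old, new) in results.items():
    print(f"{segment!r:16} old={old!s:5} new={new}")
```

The empty segment and the PEP-440 forms are accepted by both patterns, while the dotted SemVer2 forms are accepted only by the new one.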
Codecov Report

✅ All modified and coverable lines are covered by tests.

Coverage Diff

| | master | #1972 | +/- |
|---|---|---|---|
| Coverage | 98.23% | 98.23% | |
| Files | 61 | 61 | |
| Lines | 2779 | 2779 | |
| Hits | 2730 | 2730 | |
| Misses | 49 | 49 | |
Pull request overview
This PR fixes tag re-parsing for SemVer2 projects that use custom tag_format strings containing ${prerelease} and/or ${devrelease}, ensuring tags that Commitizen creates can be recognized on subsequent runs (e.g., chained prerelease bumps).
Changes:
- Widen the `${prerelease}` tag-regex fragment to support SemVer2-style dot-separated prerelease identifiers (e.g. `rc.0`, `alpha.beta.1`).
- Relax the `${devrelease}` tag-regex fragment so the leading dot is optional (e.g. `dev1` as well as `.dev1`).
- Add a regression test covering SemVer2 prerelease tags with a custom `tag_format`.
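The devrelease relaxation in the list above can be checked in isolation. This is a minimal sketch using only Python's `re` module; the constant names, the `matches_whole` helper, and the `dev_results` dict are hypothetical, and only the two patterns come from this PR.

```python
import re

# Old and new devrelease patterns as quoted in this PR.
# Names are illustrative only.
OLD_DEVRELEASE = re.compile(r"\.dev\d+")
NEW_DEVRELEASE = re.compile(r"\.?dev\d+")


def matches_whole(pattern: re.Pattern, segment: str) -> bool:
    """Return True when the pattern matches the entire segment."""
    return pattern.fullmatch(segment) is not None


# Map each candidate segment to (old result, new result).
dev_results = {
    segment: (matches_whole(OLD_DEVRELEASE, segment), matches_whole(NEW_DEVRELEASE, segment))
    for segment in ["dev1", ".dev1", ".1"]
}

for segment, (old, new) in dev_results.items():
    print(f"{segment!r:8} old={old!s:5} new={new}")
```

Only the dot-less `dev1` changes outcome. The segment `.1` is rejected by both patterns because the literal `dev` is still required.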
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `commitizen/defaults.py` | Updates tag placeholder regex fragments for prerelease and devrelease to better support SemVer2/custom formats. |
| `tests/test_tags.py` | Adds a regression test ensuring SemVer2 prerelease tags are recognized and can be extracted under a custom tag format. |
…#1614 Cover the optional-leading-dot behavior of the widened devrelease regex by exercising TagRules.is_version_tag / extract_version on both '...dev1' and '....dev1' tag forms with a custom tag_format that includes ${devrelease}. Co-authored-by: Copilot <[email protected]>
Description
Closes #1614.
Why
When a project uses `version_scheme = "semver2"` together with a custom `tag_format` that includes `${prerelease}` (for example `"${major}.${minor}-${patch}${prerelease}"`), the tags that commitizen creates (e.g. `0.0-2rc.0`) cannot be recognised on the next invocation. Every subsequent `cz bump --prerelease rc` prints `Invalid version tag: '0.0-2rc.0' does not match any configured tag format` and then exits with code 16, making it impossible to chain prerelease bumps without manually running `cz changelog` in between.

The culprit is the `prerelease` entry in `get_tag_regexes` (`commitizen/defaults.py:159`). The old pattern `r"(?P<prerelease>\w+\d+)?"` requires the prerelease segment to end with a decimal digit. That matches PEP-440 forms like `rc0` or `alpha1`, but not SemVer2 forms like `rc.0` or `alpha.beta.1`: the dot terminates `\w+`, and the mandatory `\d+` then fails to match `.0`. Commitizen itself generates the SemVer2 form when `version_scheme = "semver2"` is active, so the tag it just wrote is immediately unreadable by the regex that should recognise it.

Triage in #1964 reproduced the exact failure against master (v4.15.1): after `cz bump --prerelease rc --yes`, `git tag --list` shows `0.0-2rc.0`; the following `cz bump --prerelease rc --yes` fails as described. A related defect in the `devrelease` regex is fixed in the same change, as both regexes live side-by-side in `get_tag_regexes`: the regex required a leading dot (`\.dev\d+`), while the `${devrelease}` substitution can produce the dot-less form `dev1` in some tag formats.

What changed

- `commitizen/defaults.py`: widened the `prerelease` regex from `\w+\d+` to `\w+(?:\.\w+)*` and made the leading dot in `devrelease` optional (`\.?dev\d+`) inside `get_tag_regexes` (lines 159–160).
- `tests/test_tags.py`: added `test_is_version_tag_accepts_semver2_prerelease_in_custom_tag_format`, a regression test asserting that `is_version_tag` accepts `0.0-2rc.0`, `0.0-2`, and `0.0-2alpha.beta.1` with the custom tag format from the issue.

How it works

`get_tag_regexes` (`commitizen/defaults.py:151–165`) returns a dict that maps tag-format placeholders like `${prerelease}` to named-capture-group regex fragments. Those fragments are assembled into a full tag-matching regex elsewhere in the tag-parsing pipeline.

- `prerelease` change: `\w+(?:\.\w+)*` replaces `\w+\d+`. The old suffix `\d+` forced the token to end with a digit: `rc0` ✓, `rc.0` ✗. The new pattern is an initial `\w+` segment (matching `rc`, `alpha`, `dev`) followed by zero or more `\.\w+` repetitions (matching `.0`, `.beta`, `.1`), so it covers multi-segment SemVer2 prereleases (`alpha.beta.1`) while still stopping at any character that is neither a word character nor a dot. The whole group remains optional (`?`) so plain releases continue to match.
- `devrelease` change: `\.?dev\d+` replaces `\.dev\d+`. The leading dot is optional so that `${devrelease}` substitutions that omit the dot separator (see #1615, "Tag not set correctly using devreleases with semver2 and custom tag_format") still round-trip correctly. The literal `dev` prefix is preserved, because a bare `\d+` suffix without it would let the regex match arbitrary numeric noise.
- Why not `[^\s+]*` for prerelease? That would be too permissive: it would consume any character other than whitespace or `+`, including literal characters from adjacent placeholders and structural separators in the `tag_format` string, corrupting the overall tag regex.
- Why not change `normalize_tag` instead? The tag format chosen in the issue (`${major}.${minor}-${patch}${prerelease}`) is valid; the tag string `0.0-2rc.0` is what commitizen correctly produces for version `0.0.2-rc.0` under that format. The bug is purely in the regex used to re-read those tags, so widening the regex is the minimal, safe fix.

Backward compatibility

- Every string matched by `\w+\d+` is also matched by `\w+(?:\.\w+)*`, so no previously valid tag is rejected.
- The `?` quantifier is preserved for both `prerelease` and `devrelease`, so plain version tags (no prerelease, no devrelease) continue to match without changes.
- Existing tests in `test_tags.py`, `test_bump_normalize_tag.py`, `test_changelog.py`, and `test_bump_command.py` still pass.
- The change is confined to `get_tag_regexes`; no public API or config key is altered.

Checklist
Was generative AI tooling used to co-author this PR?
Generated-by: Claude following the guidelines
Code Changes
Run `uv run poe all` locally to ensure this change passes linter check and tests

Expected Behavior

- `cz bump --prerelease rc` with `version_scheme = "semver2"` and `tag_format` containing `${prerelease}`
- […] `Invalid version tag` warnings on subsequent bumps
- `cz changelog --dry-run` after creating a SemVer2 prerelease tag
- […] `alpha.beta.1` in a custom `tag_format`
- […] `?` quantifier is preserved
- […] `rc0`, `alpha1`)
- `\w+` with no dot segments covers these

Steps to Test This Pull Request
Additional Context
This is one of three bugs surfaced by the triage audit in #1964. The `devrelease` half of this fix is closely related to #1615, which addresses the companion issue where `${devrelease}` substitution itself produced the wrong form; both defects share the same `get_tag_regexes` function (`commitizen/defaults.py:151–165`) as their root.
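
To make the round trip concrete, here is a minimal sketch of how placeholder fragments could be assembled into a full tag regex and used to re-read a tag. It is not commitizen's actual implementation: `build_tag_regex`, `parse_tag`, and the two fragment tables are hypothetical, and the `major`/`minor`/`patch` fragments are assumed for illustration. Only the `prerelease` patterns come from this PR.

```python
import re

# Hypothetical fragment tables loosely modelled on the get_tag_regexes description.
# major/minor/patch are assumptions; the prerelease patterns are quoted from this PR.
NEW_FRAGMENTS = {
    "major": r"(?P<major>\d+)",
    "minor": r"(?P<minor>\d+)",
    "patch": r"(?P<patch>\d+)",
    "prerelease": r"(?P<prerelease>\w+(?:\.\w+)*)?",
}
OLD_FRAGMENTS = {**NEW_FRAGMENTS, "prerelease": r"(?P<prerelease>\w+\d+)?"}

PLACEHOLDER = re.compile(r"\$\{(\w+)\}")
TAG_FORMAT = "${major}.${minor}-${patch}${prerelease}"  # format from the issue


def build_tag_regex(tag_format: str, fragments: dict) -> re.Pattern:
    """Hypothetical helper: escape literal text, swap placeholders for fragments."""
    parts = []
    position = 0
    for match in PLACEHOLDER.finditer(tag_format):
        parts.append(re.escape(tag_format[position:match.start()]))
        parts.append(fragments[match.group(1)])
        position = match.end()
    parts.append(re.escape(tag_format[position:]))
    return re.compile("".join(parts))


def parse_tag(tag: str, fragments: dict):
    """Return the named groups if the whole tag matches, else None."""
    match = build_tag_regex(TAG_FORMAT, fragments).fullmatch(tag)
    return match.groupdict() if match else None


print(parse_tag("0.0-2rc.0", OLD_FRAGMENTS))  # None: old fragment rejects the dot
print(parse_tag("0.0-2rc.0", NEW_FRAGMENTS))  # prerelease captured as 'rc.0'
print(parse_tag("0.0-2", NEW_FRAGMENTS))      # plain release, prerelease is None
```

Under these assumptions, the tag `0.0-2rc.0` fails to parse with the old fragment and parses with the new one, which mirrors the failure and fix described in this PR.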