
Conversation

youknowone
Member

@youknowone youknowone commented Aug 26, 2025

@ShaharNaveh Now the str(x) matching is also fixed

Summary by CodeRabbit

  • New Features

    • Structural pattern matching: mapping rest patterns (**rest), true wildcard ("_"), evaluated guards, and improved class-pattern behavior including self-matching for built-ins and robust __match_args__ handling.
  • Bug Fixes

    • Clearer, stricter error messages (key/length mismatches, overflow), duplicate-key detection, and more consistent non-match behavior across mapping/sequence/class patterns.
  • Tests

    • Expanded tests for rest bindings, wildcards, mapping/class match scenarios and reduced noisy test output.

Contributor

coderabbitai bot commented Aug 26, 2025

Walkthrough

Comprehensive pattern-matching changes: compiler adjustments for wildcards, mapping keys/rest and guards; VM refactor to use type flags for MatchMapping/MatchSequence/MatchKeys and a redesigned MatchClass supporting __match_args__ and a new _MATCH_SELF flag; builtin class flags updated and tests extended.

Changes

| Cohort / File(s) | Summary of changes |
| --- | --- |
| Compiler: pattern matching flow (compiler/codegen/src/compile.rs) | Adjust GetLen index; detect true wildcards and short-circuit them; reorder on_top handling; overhaul mapping-pattern compilation (strict key validation, rest binding, keys tuple construction, MATCH_KEYS use, clearer errors); switch singleton/class comparisons to Is/TestOperation; compile and jump on guard failures; stack/BuildTuple/unpack tweaks. |
| VM: frame execution and op semantics (vm/src/frame.rs) | Replace runtime mapping/sequence checks with PyTypeFlags tests; strengthen MatchKeys to probe mapping/get/get_item and push tuple or None; redesign MatchClass to use nargs, inspect __match_args__, support _MATCH_SELF for builtins, validate kwd attrs, push extracted tuple or None, and propagate errors; add SWAP debug asserts and import cleanups; update bytecode::Instruction::MatchClass payload handling. |
| VM: new pattern-matching type flag (vm/src/types/slot.rs) | Add PyTypeFlags::_MATCH_SELF (1 << 22) to indicate builtins that match the subject itself in class patterns. |
| VM: type inheritance for pattern flags (vm/src/builtins/type.rs) | Change inherit_patma_flags to accept multiple bases and inherit first collection flag (SEQUENCE/MAPPING) from bases unless already set; update call sites and early-return behavior to avoid overrides. |
| Builtins: enable _MATCH_SELF on core types (vm/src/builtins/{bool.rs,bytearray.rs,bytes.rs,dict.rs,float.rs,int.rs,list.rs,set.rs,str.rs,tuple.rs}) | Add _MATCH_SELF to #[pyclass(...)] flags for core builtins (bool, int, float, str, bytes, bytearray, list, tuple (two spots), dict, set, frozenset); metadata-only updates. |
| Tests: mapping/rest and wildcard coverage (extra_tests/snippets/syntax_match.py) | Expand mapping/rest tests with explicit assertions and multiple rest cases; add wildcard/rest cases and no-match branches; remove obsolete comments/prints. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant Compiler
  participant VM
  participant Subject as Subject(Value)
  participant Type as Subject.Type
  participant Class as Pattern.Class

  User->>Compiler: compile pattern: Class(positional?, kwd?, guard?)
  Compiler->>VM: emit MatchClass(nargs) + key/guard ops

  VM->>Subject: load subject
  VM->>Type: check instance of Class
  alt is instance
    opt nargs > 0
      VM->>Class: get __match_args__
      alt __match_args__ is tuple[str]
        VM->>Subject: get attrs by names
        alt any missing
          VM-->>VM: non-match (push None)
        else all found
          VM-->>VM: push tuple(values)
        end
      else no __match_args__
        alt Type has _MATCH_SELF
          alt nargs == 1
            VM-->>VM: push tuple(subject)
          else nargs > 1
            VM-->>User: TypeError
          end
        else
          VM-->>VM: non-match (push None)
        end
      end
    end
    VM->>VM: extract kwd attrs (if any)
    alt any missing
      VM-->>VM: non-match (push None)
    else success
      VM-->>VM: push combined tuple
    end
  else not instance
    VM-->>VM: non-match (push None)
  end
  opt guard present
    VM->>VM: eval guard
    alt guard false
      VM-->>VM: Jump to failure target
    end
  end
sequenceDiagram
  autonumber
  actor User
  participant Compiler
  participant VM
  participant Subject as Mapping Subject

  User->>Compiler: compile mapping pattern {keys..., **rest?}
  Compiler->>VM: emit MATCH_KEYS and rest-handling ops

  VM->>Subject: check PyTypeFlags::MAPPING
  alt is mapping
    VM->>Subject: attempt .get or get_item for each key
    alt all keys present
      VM-->>VM: push tuple(values)
      opt rest binding exists
        VM->>VM: build rest dict = subject - matched keys
        VM-->>VM: bind rest name
      end
    else missing key
      VM-->>VM: non-match (push None)
    end
  else not mapping
    VM-->>VM: non-match (push None)
  end
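The mapping flow above can be approximated the same way. The helper below is a hypothetical model, not the VM code: `match_mapping` is an invented name, `isinstance(..., Mapping)` stands in for the `PyTypeFlags::MAPPING` test, and the length pre-check reflects the compiler behavior mentioned elsewhere in this review.

```python
from collections.abc import Mapping


def match_mapping(subject, keys, capture_rest=False):
    """Return (values, rest) on a match, or None on a non-match."""
    if not isinstance(subject, Mapping):
        return None  # not a mapping
    if len(subject) < len(keys):
        return None  # cannot contain every requested key
    missing = object()  # sentinel so that stored None values still match
    values = []
    for key in keys:
        value = subject.get(key, missing)
        if value is missing:
            return None  # missing key: non-match
        values.append(value)
    rest = None
    if capture_rest:
        # rest = subject minus the matched keys
        rest = dict(subject)
        for key in keys:
            del rest[key]
    return tuple(values), rest
```

Calling `match_mapping({"a": 1, "b": 2, "c": 3}, ("a",), capture_rest=True)` returns the matched values together with the leftover items, which is the shape a `{"a": x, **rest}` pattern binds.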

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60–90 minutes

Possibly related PRs

Suggested reviewers

  • arihant2math

Poem

A rabbit taps patterns with careful delight,
Keys in a bundle, rest tucked just right.
MATCH_SELF hums softly, “I am the whole,”
match_args maps pieces to every role.
Guards nod, keys hush — the matcher hops through night. 🐇✨


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (8)
extra_tests/snippets/syntax_match.py (3)

72-79: Nice coverage for mapping with rest; consider adding a negative duplicate-keys case.

This verifies {"a": x, **rest} binds correctly. To harden against regressions in the compiler’s duplicate-key detection, please also add a test that a mapping pattern with duplicate literal keys is a SyntaxError at compile time.

If you want, I can draft a small snippet that asserts compile-time failure for something like:

  • case {"a": _, "a": _}: ...
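One possible shape for such a check, offered as a sketch rather than the project's test style: the source is compiled from a string so that the enclosing test file still parses, and `rejects_duplicate_keys` is a hypothetical helper name.

```python
def rejects_duplicate_keys():
    """Return True if the duplicate-key pattern fails to compile."""
    source = (
        "match {'a': 1}:\n"
        "    case {'a': _, 'a': _}:\n"
        "        pass\n"
    )
    try:
        compile(source, "<duplicate-key-test>", "exec")
    except SyntaxError:
        return True
    return False


assert rejects_duplicate_keys()
```

Compiling from a string keeps the invalid pattern out of the module's own syntax tree, so a failure shows up as a failed assertion instead of an import-time error.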

89-96: Multi-key rest works; add an attribute-key case too.

Confirms rest == {"c": 3, "d": 4}. It would be useful to include a case where the key is an attribute (e.g., {C.K: v, **rest}) to exercise the compiler’s “attribute key” path.


99-106: Three-key head + rest is covered; add a class-pattern “str(x)” test given the PR goal.

Since this PR also claims “str(x) matching is fixed,” please add a quick class-pattern test:

  • match "abc": case str(s): assert s == "abc"
    This will guard the new _MATCH_SELF semantics for str.

Proposed snippet to append near here:

s = "abc"
match s:
    case str(x):
        assert x == "abc"
    case _:
        assert False
compiler/codegen/src/compile.rs (2)

3487-3506: Skip compiling true wildcards to reduce bytecode and stack churn.

Recognizing a “true underscore” (MatchAs with both pattern and name None) and just popping it avoids unnecessary work. Consider extracting the “is_true_wildcard” check into a helper to reuse wherever wildcards are handled.


3667-3682: Rest dict deletion loop is correct; consider a small micro-optimization.

The loop deletes keys from rest_dict in LIFO order using CopyItem/Swap/DeleteSubscript. It’s correct. If you’re open to a micro-opt: iterating forward (k1..kn) with a constant CopyItem index (always 2) by rotating the stack before each delete can cut the index computation, but it’s not essential.

vm/src/frame.rs (3)

705-716: SWAP index lower bound: guard against 0 to avoid out-of-bounds swap

index_val == 0 would compute j = len - 0 and panic in swap(i, j). Even if the compiler never emits SWAP(0), this assert is cheap and prevents undefined behavior with malformed bytecode or future regressions.

Apply:

-                debug_assert!(
-                    index_val <= len,
-                    "SWAP index {} exceeds stack size {}",
-                    index_val,
-                    len
-                );
+                debug_assert!(
+                    (1..=len).contains(&index_val),
+                    "SWAP index {} out of range 1..={}",
+                    index_val,
+                    len
+                );
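To see why the lower bound matters, here is a small Python model of the index arithmetic. It is an illustration only: `swap` is a hypothetical function, and it raises where the Rust code would use a debug assertion.

```python
def swap(stack, index_val):
    """Swap the top of the stack with the entry index_val slots from the top."""
    length = len(stack)
    if not 1 <= index_val <= length:
        # index_val == 0 would give j == length, one past the last slot.
        raise IndexError(f"SWAP index {index_val} out of range 1..={length}")
    i, j = length - 1, length - index_val
    stack[i], stack[j] = stack[j], stack[i]
```

With a three-element stack, `swap(stack, 2)` exchanges the top two entries, `swap(stack, 1)` is a no-op, and `swap(stack, 0)` is rejected before any indexing happens.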

1319-1358: MatchKeys should treat only KeyError as non-match; propagate other exceptions

In the loop fetching subject[key], any error currently counts as “no match”. This risks hiding real errors from custom mappings. Recommend aligning behavior to consider only KeyError as “key missing”.

Apply:

-                    for key in keys {
-                        match subject.get_item(key.as_object(), vm) {
-                            Ok(value) => values.push(value),
-                            Err(_) => {
-                                all_match = false;
-                                break;
-                            }
-                        }
-                    }
+                    for key in keys {
+                        match subject.get_item(key.as_object(), vm) {
+                            Ok(value) => values.push(value),
+                            Err(e) if e.fast_isinstance(vm.ctx.exceptions.key_error) => {
+                                all_match = false;
+                                break;
+                            }
+                            Err(e) => return Err(e),
+                        }
+                    }
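The intended behavior of that diff, expressed as a Python sketch: only KeyError means "no match", and everything else reaches the caller. `lookup_keys` and `BrokenMapping` are hypothetical names used for illustration.

```python
def lookup_keys(subject, keys):
    """Return a tuple of values, or None if any key is missing."""
    values = []
    for key in keys:
        try:
            values.append(subject[key])
        except KeyError:
            return None  # missing key: the pattern does not match
        # Any other exception is deliberately left to propagate.
    return tuple(values)


class BrokenMapping(dict):
    """A mapping whose lookups fail for reasons unrelated to missing keys."""

    def __getitem__(self, key):
        raise RuntimeError("backend unavailable")
```

Here `lookup_keys({"a": 1}, ("b",))` returns None, whereas `lookup_keys(BrokenMapping(a=1), ("a",))` raises RuntimeError instead of being reported as a quiet non-match.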

1-4: Consistency note: centralize flag-based matching

You import PyTypeFlags here and use flags for MatchMapping/MatchSequence; consider following through by documenting this choice and ensuring new match-related checks prefer flags where available. This avoids protocol-based divergence and keeps performance predictable.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 711b1a6 and f6415fc.

⛔ Files ignored due to path filters (1)
  • Lib/test/test_patma.py is excluded by !Lib/**
📒 Files selected for processing (14)
  • compiler/codegen/src/compile.rs (7 hunks)
  • extra_tests/snippets/syntax_match.py (1 hunks)
  • vm/src/builtins/bool.rs (1 hunks)
  • vm/src/builtins/bytearray.rs (1 hunks)
  • vm/src/builtins/bytes.rs (1 hunks)
  • vm/src/builtins/dict.rs (1 hunks)
  • vm/src/builtins/float.rs (1 hunks)
  • vm/src/builtins/int.rs (1 hunks)
  • vm/src/builtins/list.rs (1 hunks)
  • vm/src/builtins/set.rs (2 hunks)
  • vm/src/builtins/str.rs (1 hunks)
  • vm/src/builtins/tuple.rs (1 hunks)
  • vm/src/frame.rs (4 hunks)
  • vm/src/types/slot.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style (cargo fmt to format)
Always run clippy to lint code (cargo clippy) before completing tasks. Fix any warnings or lints that are introduced by your changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • vm/src/types/slot.rs
  • vm/src/builtins/dict.rs
  • vm/src/builtins/int.rs
  • vm/src/builtins/str.rs
  • vm/src/builtins/bytearray.rs
  • vm/src/builtins/tuple.rs
  • vm/src/builtins/set.rs
  • vm/src/builtins/bool.rs
  • vm/src/builtins/bytes.rs
  • vm/src/builtins/list.rs
  • vm/src/builtins/float.rs
  • vm/src/frame.rs
  • compiler/codegen/src/compile.rs
**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.py: Follow PEP 8 style for custom Python code
Use ruff for linting Python code

Files:

  • extra_tests/snippets/syntax_match.py
🧬 Code graph analysis (1)
vm/src/builtins/bool.rs (1)
derive/src/lib.rs (1)
  • pyclass (124-128)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run tests under miri
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Ensure compilation on various targets
🔇 Additional comments (27)
extra_tests/snippets/syntax_match.py (2)

67-71: Good assertions for two-key mapping; keep the negative branch explicit.

Explicit assertions on x and y make failures easier to diagnose, and the catch-all case ensures the test fails loudly if the pattern doesn’t match.


80-88: Empty-rest case looks good.

Verifies that rest captures an empty dict when there are no extra keys. Solid.

compiler/codegen/src/compile.rs (9)

3289-3303: Correct index computation for post-star sequence elements.

Using len(subject) - (size - i) by loading GetLen and then (patterns.len() - i) is the right fix for indices after a starred element.


3521-3527: Good early validation of key/pattern length mismatch.

Clear error message; prevents subtle stack discipline bugs later in MATCH_KEYS handling.


3540-3546: Efficient fast-path for empty mapping pattern without rest.

Early-return and pop the subject in the {} case is correct and avoids extra work.


3548-3562: Length pre-check aligns with CPython semantics.

len(subject) >= size check before key lookup is consistent with expected behavior.


3564-3569: Overflow guard makes sense.

The i32::MAX + 1 bound matches CPython’s intent for “too many sub-patterns in mapping pattern.”


3611-3629: Potential stray boolean on stack after MATCH_KEYS None-check.

The sequence:

  • CopyItem(1) # duplicate values_or_none
  • LoadConst(None)
  • TestOperation(IsNot)
  • jump_to_fail_pop(... PopJumpIfFalse)

If JumpIfFalse doesn’t pop the tested value (it often does not, whereas JumpIfFalseOrPop does), the boolean result may remain on the stack, skewing on_top accounting. If your VM’s JumpIfFalse pops, ignore this; otherwise, add a Pop after the check passes.

Minimal patch:

         self.emit_load_const(ConstantData::None);
         emit!(
             self,
             Instruction::TestOperation {
                 op: bytecode::TestOperator::IsNot
             }
         );
-        // Stack: [subject, keys_tuple, values_tuple, bool]
-        self.jump_to_fail_pop(pc, JumpOp::PopJumpIfFalse)?;
+        // Stack: [subject, keys_tuple, values_tuple, bool]
+        self.jump_to_fail_pop(pc, JumpOp::PopJumpIfFalse)?;
+        // On success, drop the bool (keep original values_tuple)
+        emit!(self, Instruction::Pop);

Please verify the VM’s JumpIfFalse behavior and adjust accordingly.


3631-3635: Double-check on_top adjustment after unpacking values.

You do:

  • pc.on_top += size
  • pc.on_top -= 1

Sequence patterns do not subtract 1 right after unpack; they only decrement per subpattern. That extra -1 can throw off fail_pop accounting. If tests reveal mismatched pops on failure paths, consider dropping the -1:

-        pc.on_top += size; // Unpacked size values, tuple replaced by values
-        pc.on_top -= 1;
+        pc.on_top += size; // Unpacked 'size' values, tuple replaced by values

Flagging as a verification item due to subtlety.


3921-3924: Singleton patterns now use identity (is) instead of equality.

This aligns with CPython 3.10+ semantics for None/True/False.


3996-4007: Guards are compiled and short-circuited properly.

JumpIfFalseOrPop to the failure target matches CPython behavior; great to see guards wired in for both normal cases and the default case below.

vm/src/builtins/bytearray.rs (1)

173-187: Add _MATCH_SELF flag for bytearray is consistent with class-pattern semantics.

This makes patterns like case bytearray(x) bind x to the subject itself, per the new MATCH_SELF behavior. Looks good.

Please ensure the flag is consistently applied across all intended builtins (bool, bytes, dict, float, int, list, set, str, tuple) and that vm/src/frame.rs uses it for MatchClass extraction. I can provide a quick repo scan if helpful.

vm/src/builtins/bytes.rs (1)

135-149: _MATCH_SELF on bytes aligns with case bytes(x) matching “self”.

Matches the intended “str(x) fix” family of changes. No further issues spotted here.

vm/src/builtins/dict.rs (1)

179-181: _MATCH_SELF on dict is in line with new pattern-matching rules.

Enables case dict(x) to bind the subject as x. The rest of the file remains unaffected.

vm/src/builtins/set.rs (2)

523-533: Add _MATCH_SELF to set: correct for class-pattern semantics

The new flags(BASETYPE, _MATCH_SELF) on PySet is aligned with CPython’s MATCH_SELF behavior, enabling patterns like case set(x) to bind the subject itself. This doesn’t interfere with sequence/mapping detection (set is neither), so risk is low. LGTM.


948-951: Add _MATCH_SELF to frozenset: consistent and desirable

Mirrors the PySet change and enables case frozenset(x) to bind the subject. Matches CPython semantics; no concerns.

vm/src/builtins/list.rs (1)

102-114: List now participates in MATCH_SELF patterns

Including _MATCH_SELF (while retaining SEQUENCE) correctly enables case list(x) to bind the subject itself, without affecting existing sequence-pattern matching. Looks good.

vm/src/builtins/float.rs (1)

203-207: Float marked with _MATCH_SELF: enables case float(x) binding

This is consistent with CPython and the rest of this PR. No functional risks identified.

vm/src/builtins/int.rs (1)

319-323: pyclass flags for PyInt are correctly applied
Verified that this file contains three #[pyclass] attributes, with only the impl PyInt block carrying the intended flags:

  • vm/src/builtins/int.rs:27 – #[pyclass(module = false, name = "int")] on the pub struct PyInt (intentionally flag-less)
  • vm/src/builtins/int.rs:321–323 – #[pyclass(flags(BASETYPE, _MATCH_SELF), …)] on impl PyInt (correct, matches CPython behavior)
  • vm/src/builtins/int.rs:704 – #[pyclass] on impl PyRef<PyInt> (intentionally flag-less)

No other flags-bearing pyclass declarations remain. The summary should note a single flags-bearing update.

vm/src/types/slot.rs (1)

132-136: Introduce PyTypeFlags::_MATCH_SELF: matches CPython’s intent

Defining _MATCH_SELF (1 << 22) with a clear doc comment is the right foundation for built-ins that pattern-match the subject itself. Bit placement doesn’t collide with existing flags. LGTM. As a follow-up, ensure test coverage includes positional class patterns for these built-ins (positive and negative cases), e.g., str(x), int(x), list(x), set(x), frozenset(x), and that user-defined classes without __match_args__ don’t accept 1 positional arg.

Example tests to consider adding (Python):

def test_match_self_builtins():
    for obj in [42, 3.14, "hi", b"b", bytearray(b"a"), [1], (1,), {1:2}, {1}, frozenset({1})]:
        match obj:
            case int(x) | float(x) | str(x) | bytes(x) | bytearray(x) | list(x) | tuple(x) | dict(x) | set(x) | frozenset(x):
                assert x is obj
            case _:
                assert False, f"no match-self for {type(obj)}"

def test_no_match_self_user_class():
    class C: pass
    c = C()
    matched = False
    try:
        match c:
            case C(_):  # should be invalid without __match_args__ or MATCH_SELF
                matched = True
    except TypeError:
        pass
    else:
        assert not matched
vm/src/builtins/tuple.rs (1)

246-258: Adding _MATCH_SELF to tuple is correct and aligns with class-pattern semantics

Flagging tuple with _MATCH_SELF enables patterns like case tuple(x) to bind the subject itself when there are no __match_args__. This matches the intent of the PR and mirrors how other builtins are handled. No runtime behavior changes here, just metadata; looks good.

To double-check coverage, consider adding/expanding a snippet like:

  • match (1, 2): case tuple(x): assert x == (1, 2)
  • Ensure subclass behavior still follows spec (i.e., only exact tuple types should provide MATCH_SELF unless the subclass opts in).
vm/src/builtins/str.rs (1)

523-536: Enable MATCH_SELF for str to fix str(x) matching

Adding flags(BASETYPE, _MATCH_SELF) on PyStr is the right hook for case str(x) to capture the subject. The change is local to type metadata and should work in tandem with the Frame’s MatchClass extraction.

Please ensure tests cover:

  • match "hi": case str(x): assert x == "hi"
  • That multi-arg str(a, b) patterns raise TypeError, as in CPython, and are not treated as a silent non-match.
vm/src/builtins/bool.rs (1)

112-113: Add _MATCH_SELF to bool

Consistent with other builtins. Keeping bool non-BASETYPE is correct (bool isn’t a valid base), so only _MATCH_SELF here is appropriate.

Consider adding a quick check:

  • match True: case bool(x): assert x is True
vm/src/frame.rs (4)

1297-1305: MatchMapping: switch to PyTypeFlags::MAPPING — good move

Using PyTypeFlags::MAPPING for the fast-path mapping check matches the new flag-based approach. This also keeps behavior consistent with how CPython’s VM guards mapping-ness in matching.

Please confirm consistency with MatchKeys: given the flag check here, MatchKeys should assume a mapping and only treat KeyError as “no match” (see separate note below).


1308-1316: MatchSequence: flag-based sequence detection is fine

Aligns with the mapping change and simplifies the check. No issues spotted.


1359-1464: The above scripts will:

  • Dump the MatchClass handling code so we can confirm it currently swallows every Err(_) and never distinguishes AttributeError from other exceptions.
  • Show where else fast_isinstance is used in the VM to verify we can safely pattern-match on e.fast_isinstance(vm.ctx.exceptions.attribute_error).

Once we have those results, we can confirm whether the suggested diffs are applicable and correctly aligned with the rest of the codebase.


1297-1317: All _MATCH_SELF flags and MatchClass handling verified

  • Confirmed that every built-in type (bool, bytes, bytearray, dict, float, int, list, set, str, tuple) in vm/src/builtins/*.rs is declared with flags(..., _MATCH_SELF)
  • Verified that PyTypeFlags in vm/src/types/slot.rs includes const _MATCH_SELF = 1 << 22;
  • Spot-checked vm/src/frame.rs and found a complete Instruction::MatchClass arm (lines 1359–1464) with both the MATCH_SELF fallback and appropriate error handling

All checks pass – no further changes needed.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
extra_tests/snippets/syntax_match.py (1)

151-151: Add class-pattern tests for MATCH_SELF and str(x) to reflect the PR’s stated fix.

The PR mentions “Now the str(x) matching is also fixed,” but there’s no test here exercising class patterns with MATCH_SELF. Please add:

  • A positive test: match s: case str(x): assert x is s
  • A negative test for too many positional subpatterns on MATCH_SELF: case str(x, y) should raise TypeError at match time in CPython
  • A class with custom __match_args__ to verify positional extraction and TypeError when __match_args__ is not a tuple or contains non-strings

Here’s a minimal addition you can append after Line 151:

+#
+# class-pattern tests for MATCH_SELF and __match_args__
+#
+class Box:
+    __match_args__ = ("value",)
+    def __init__(self, value): self.value = value
+
+def test_class_pattern_match_self_and_match_args():
+    s = "hello"
+    match s:
+        case str(x):
+            assert x is s
+        case _:
+            assert False
+
+    b = Box(42)
+    match b:
+        case Box(x):
+            assert x == 42
+        case _:
+            assert False
+
+    # __match_args__ must be a tuple; non-tuple should raise TypeError
+    class Bad1:
+        __match_args__ = "not-a-tuple"
+    try:
+        match Bad1():
+            case Bad1(x):
+                assert False
+    except TypeError:
+        pass
+    else:
+        assert False, "__match_args__ non-tuple should raise TypeError"
+
+    # MATCH_SELF accepts exactly one positional subpattern; more should error
+    try:
+        match s:
+            case str(x, y):
+                assert False
+    except TypeError:
+        pass
+    else:
+        assert False, "str() with 2 positional subpatterns should raise TypeError"
vm/src/frame.rs (1)

1333-1341: Do not swallow non-KeyError exceptions in MatchKeys; only KeyError denotes “no match”.

PEP 634 semantics: missing key (KeyError) means no match; other exceptions should propagate. Current code treats all Err(_) as non-match.

Apply this diff:

-                    for key in keys {
-                        match subject.get_item(key.as_object(), vm) {
-                            Ok(value) => values.push(value),
-                            Err(_) => {
-                                all_match = false;
-                                break;
-                            }
-                        }
-                    }
+                    for key in keys {
+                        match subject.get_item(key.as_object(), vm) {
+                            Ok(value) => values.push(value),
+                            Err(e) if e.fast_isinstance(vm.ctx.exceptions.key_error) => {
+                                all_match = false;
+                                break;
+                            }
+                            Err(e) => return Err(e),
+                        }
+                    }
compiler/codegen/src/compile.rs (1)

3456-3458: MatchClass signature mismatch – critical fix required

The current codegen emits

emit!(self, Instruction::MatchClass(u32::try_from(nargs).unwrap()));

but the Instruction enum still defines

MatchClass(Arg<u32>),

in compiler/core/src/bytecode.rs. This will not compile and is inconsistent with the intended change to carry a raw u32 argument.

Please update accordingly:

  • In compiler/core/src/bytecode.rs, change the enum variant to:
    - MatchClass(Arg<u32>),
    + MatchClass(u32),
  • Update the instruction‐writer macro (w!(MatchClass, …)) to match the new signature (i.e. expect a u32 instead of an Arg<u32> marker).
  • Revise the VM/interpreter dispatch (e.g. in your frame execution code) to destructure MatchClass(n) and read the immediate u32 value, rather than handling an Arg<u32> marker.

Once these changes are in place, verify that both the bytecode reader/writer and the VM frame implementation are correctly handling the raw u32 argument.

♻️ Duplicate comments (1)
compiler/codegen/src/compile.rs (1)

3575-3596: Validate mapping pattern key types more strictly

The key validation now correctly accepts None and bytes literals in addition to numbers, strings, and booleans, aligning with PEP 634. Using unparse_expr for deduplication instead of Debug formatting is more robust.

🧹 Nitpick comments (2)
vm/src/frame.rs (1)

1359-1460: Add regression tests for class patterns, especially str(x).

Given the semantic fixes here (nargs handling, _MATCH_SELF), please ensure we have tests covering:

  • str(x) binds x to the subject string
  • str(x, y) raises TypeError
  • Class with __match_args__ not a tuple raises TypeError
  • __match_args__ tuple with non-str raises TypeError (already handled)

I can add tests under extra_tests/snippets mirroring the suggestions in the Python test file review. Do you want me to open a follow-up PR with those?

compiler/codegen/src/compile.rs (1)

3565-3569: Overflow check could be more precise

The overflow check size > (i32::MAX as usize + 1) might be off by one. Consider using size > i32::MAX as usize for clarity, as the check should prevent overflow when casting to i32.

-        // Check for overflow (INT_MAX < size - 1)
-        if size > (i32::MAX as usize + 1) {
+        // Check for overflow to ensure safe cast to u32
+        if size > i32::MAX as usize {
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between f6415fc and 0bb06ab.

⛔ Files ignored due to path filters (1)
  • Lib/test/test_patma.py is excluded by !Lib/**
📒 Files selected for processing (14)
  • compiler/codegen/src/compile.rs (7 hunks)
  • extra_tests/snippets/syntax_match.py (1 hunks)
  • vm/src/builtins/bool.rs (1 hunks)
  • vm/src/builtins/bytearray.rs (1 hunks)
  • vm/src/builtins/bytes.rs (1 hunks)
  • vm/src/builtins/dict.rs (1 hunks)
  • vm/src/builtins/float.rs (1 hunks)
  • vm/src/builtins/int.rs (1 hunks)
  • vm/src/builtins/list.rs (1 hunks)
  • vm/src/builtins/set.rs (2 hunks)
  • vm/src/builtins/str.rs (1 hunks)
  • vm/src/builtins/tuple.rs (1 hunks)
  • vm/src/frame.rs (4 hunks)
  • vm/src/types/slot.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (11)
  • vm/src/builtins/bool.rs
  • vm/src/builtins/bytes.rs
  • vm/src/builtins/dict.rs
  • vm/src/builtins/bytearray.rs
  • vm/src/builtins/set.rs
  • vm/src/builtins/str.rs
  • vm/src/builtins/tuple.rs
  • vm/src/types/slot.rs
  • vm/src/builtins/int.rs
  • vm/src/builtins/list.rs
  • vm/src/builtins/float.rs
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style (cargo fmt to format)
Always run clippy to lint code (cargo clippy) before completing tasks. Fix any warnings or lints that are introduced by your changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • vm/src/frame.rs
  • compiler/codegen/src/compile.rs
**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.py: Follow PEP 8 style for custom Python code
Use ruff for linting Python code

Files:

  • extra_tests/snippets/syntax_match.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run tests under miri
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Check Rust code with rustfmt and clippy
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Run rust tests (windows-latest)
🔇 Additional comments (16)
extra_tests/snippets/syntax_match.py (6)

65-71: Good strengthening of exact bindings in two-key mapping pattern.

Asserting concrete bound values for x and y plus an explicit failure path is correct and guards regressions in key lookup/binding.


72-79: Nice addition: rest-capture semantics validated.

This verifies that the unmatched keys are captured into rest as a dict, which aligns with PEP 634. Good negative branch, too.


80-88: Covers empty-rest case well.

Confirms rest is {} when no extra keys remain; this is an important edge case for mapping patterns with rest.


90-96: Multi-key rest coverage looks solid.

Validates that rest accumulates the correct remainder without disturbing bound x and y.


99-106: Three-key + rest case completes the grid.

This extends coverage to larger patterns and verifies rest size/contents after multiple exact key binds.


108-126: Wildcard fallback tests are helpful and realistic.

These mirror real-world patterns (e.g., wheelinfo). The success and failure paths are clear.

vm/src/frame.rs (3)

1452-1459: Overall MatchClass tuple assembly and non-instance path look correct.

On success, the extracted positional+keyword values are returned as a tuple; on non-instance, pushing None matches CPython’s behavior.


1297-1317: Minor consistency note: MatchMapping/MatchSequence use flags, while MatchKeys checks PyMapping protocol.

This is acceptable if the compiler guarantees MatchKeys only runs after a successful MatchMapping flag check. If not, consider aligning MatchKeys’ mapping check to flags for consistency.

Would you like me to align MatchKeys to use flags and fall back to protocol methods only when needed?


1300-1305: Audit mapping flags for all mapping types

Switching to PyTypeFlags::MAPPING in the pattern matcher is correct, but I only see it referenced in frame.rs and in the inherit_patma_flags helper (vm/src/builtins/type.rs). You need to confirm that every mapping-like type explicitly sets this flag during its registration. In particular, please verify and update as needed:

  • vm/src/builtins/dict.rs — ensure the PyTypeSlots.flags for PyDict includes PyTypeFlags::MAPPING.
  • vm/src/builtins/mappingproxy.rs — likewise, it must carry the mapping flag.
  • The implementations of collections.UserDict and OrderedDict (wherever they’re defined) — these custom types must also set PyTypeFlags::MAPPING.

If any of these are missing the mapping bit, they won’t be recognized as mappings by the VM’s pattern matcher.

compiler/codegen/src/compile.rs (7)

294-294: Fix incorrect index calculation for GetLen in sub-patterns

The change from (patterns.len() - 1) to (patterns.len() - i) appears to be correct for calculating the negative index from the end. However, ensure this aligns with CPython's behavior for sequence patterns with star elements.


3488-3502: Good fix for wildcard pattern handling

The improved wildcard detection correctly identifies true wildcards (underscore patterns without name binding) and skips compilation for them, which is an optimization. The distinction between Pattern::MatchAs with both pattern and name as None versus other wildcard patterns is correctly implemented.


3519-3708: Comprehensive mapping pattern implementation with rest support

The mapping pattern logic has been significantly improved with:

  1. Proper rest pattern handling (**rest)
  2. Robust key validation ensuring only literals and attributes
  3. Clear duplicate detection using unparse_expr for stable representation
  4. Correct stack manipulation for rest dict creation

The implementation correctly builds a rest dict, removes consumed keys, and stores the result. The stack operations are well-documented with comments showing stack state at each step.


3926-3932: Correct singleton pattern matching semantics

The change from CompareOperation::Equal to TestOperation::Is for singleton patterns is correct. PEP 634 specifies that singleton patterns (None, True, False) should use identity checks, not equality.


4002-4014: Guards implementation completed

The guard implementation correctly:

  1. Compiles the guard expression
  2. Uses JumpIfFalseOrPop to jump to the fail block if the guard is false
  3. Properly handles the stack state

This replaces the previous NotImplementedYet stub.


4032-4036: Default case guard handling looks correct

The guard handling for the default case in match statements is properly implemented, using JumpIfFalseOrPop to skip the body if the guard fails.


3653-3698: Tests for mapping-rest patterns are already in place

The extra_tests/snippets/syntax_match.py file includes comprehensive tests for mapping patterns with **rest, covering:

  • Single-key rest (case {"a": x, **rest}) and empty rest
  • Multiple-key rest scenarios
  • Edge cases (wildcard fallback, no-match)
  • A dedicated test_mapping_comprehensive() function exercising key captures and rest behavior

No additional test coverage is required.

Comment on lines +705 to 714
debug_assert!(len > 0, "stack underflow in SWAP");
let i = len - 1; // TOS index
let index_val = index.get(arg) as usize;
// CPython: SWAP(n) swaps TOS with PEEK(n) where PEEK(n) = stack_pointer[-n]
// This means swap TOS with the element at index (len - n)
debug_assert!(
index_val <= len,
"SWAP index {} exceeds stack size {}",
index_val,
len
);
let j = len - index_val;

🛠️ Refactor suggestion

SWAP index guard is incomplete; add lower-bound check and a release-mode safety path.

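SWAP(0) would compute j = len - 0 = len, one past the end of the stack, so stack.swap(i, j) would panic with an out-of-bounds index even in release builds, where debug_assert! is compiled out. The compiler should never emit SWAP(0), but a debug lower-bound assertion plus a release-mode guard turns that case into a clear, diagnosable failure.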

Apply this diff:

             let len = self.state.stack.len();
             debug_assert!(len > 0, "stack underflow in SWAP");
             let i = len - 1; // TOS index
             let index_val = index.get(arg) as usize;
             // CPython: SWAP(n) swaps TOS with PEEK(n) where PEEK(n) = stack_pointer[-n]
             // This means swap TOS with the element at index (len - n)
-            debug_assert!(
-                index_val <= len,
-                "SWAP index {} exceeds stack size {}",
-                index_val,
-                len
-            );
+            debug_assert!(index_val >= 1, "SWAP index must be >= 1");
+            debug_assert!(
+                index_val <= len,
+                "SWAP index {} exceeds stack size {}",
+                index_val,
+                len
+            );
+            if index_val == 0 || index_val > len {
+                self.fatal("SWAP index out of range");
+            }
             let j = len - index_val;
             self.state.stack.swap(i, j);
🤖 Prompt for AI Agents
In vm/src/frame.rs around lines 705–716, the SWAP index check only asserts the
upper bound, so SWAP(0) can still panic in release mode; add a lower-bound check
and an explicit runtime guard: insert debug_assert!(index_val > 0, "SWAP index
must be >= 1, got {}", index_val) and guard the subtraction with an
explicit runtime check that handles index_val == 0 safely (e.g. return a clear
error or panic with a descriptive message like "invalid SWAP(0)" instead of
letting j = len - 0 index one past the end of the stack), keeping the existing
upper-bound assertion.
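
The index arithmetic is easy to check with a small Python model of the stack. This is only a sketch of the SWAP(n) indexing described in the comment above, not the Rust implementation; the function name swap and the use of IndexError are assumptions made for illustration.

```python
def swap(stack, n):
    """Swap top-of-stack with the n-th element from the top (n == 1 is TOS itself)."""
    size = len(stack)
    # Both bounds are checked explicitly: n == 0 would index one past the end,
    # and n > size would silently wrap around with Python's negative indexing.
    if n < 1 or n > size:
        raise IndexError(f"SWAP index {n} out of range for stack size {size}")
    i = size - 1  # TOS index
    j = size - n  # PEEK(n) index
    stack[i], stack[j] = stack[j], stack[i]


stack = ["a", "b", "c"]
swap(stack, 2)
print(stack)  # ['a', 'c', 'b']
```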

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
vm/src/builtins/float.rs (1)

593-593: Fix inverted float truthiness in AsNumber::boolean slot

The boolean slot for floats currently returns true only when the value is zero, which is the opposite of Python’s semantics and inconsistent with PyFloat::__bool__ (which returns self.value != 0.0).

• Location:

  • File: vm/src/builtins/float.rs
  • In the AsNumber implementation, around line 593

• Change required:

-            boolean: Some(|num, _vm| Ok(PyFloat::number_downcast(num).value.is_zero())),
+            boolean: Some(|num, _vm| Ok(!PyFloat::number_downcast(num).value.is_zero())),

• Test suggestion: add a targeted test case to ensure correct truthiness, for example in extra_tests/snippets/float_bool.py or similar:

assert bool(0.0) is False
assert bool(-0.0) is False
assert bool(1.0) is True
assert bool(float('nan')) is True
♻️ Duplicate comments (1)
compiler/codegen/src/compile.rs (1)

3582-3615: Key validation now matches PEP 634 and avoids Debug-based dedupe.

  • Accepts numbers, strings, bytes, booleans, None, and attribute lookups.
  • Duplicate literal keys checked via unparse for stability.

This addresses prior review feedback and aligns with CPython.

🧹 Nitpick comments (7)
vm/src/builtins/bool.rs (1)

90-94: Avoid runtime panic: replace todo!() in Debug impl for PyBool

Leaving todo!() will panic if PyBool is ever formatted with {:?}. Implementing a trivial Debug avoids surprises and aids diagnostics.

Apply:

 impl Debug for PyBool {
-    fn fmt(&self, _f: &mut Formatter<'_>) -> std::fmt::Result {
-        todo!()
-    }
+    fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
+        f.write_str("PyBool")
+    }
 }
compiler/codegen/src/compile.rs (6)

3291-3303: Fix: post-star index calculation is now correct (no off-by-one).

Using len(subject) - (size - i) via the constant (patterns.len() - i) matches PEP 634 semantics for elements after the starred slot. Consider adding a short comment spelling out the formula to prevent regressions.


3520-3535: Validation: length parity check is good; underscore as rest target likely should be a SyntaxError, but message can be clearer.

  • keys.len() vs patterns.len() guard is correct.
  • Rejecting “**_” aligns with CPython behavior; tweak the error to be explicit for developers.

Apply this diff to improve the diagnostic:

-        if let Some(rest) = star_target {
-            if rest.as_str() == "_" {
-                return Err(self.error(CodegenErrorType::SyntaxError("invalid syntax".to_string())));
-            }
-        }
+        if let Some(rest) = star_target {
+            if rest.as_str() == "_" {
+                return Err(self.error(CodegenErrorType::SyntaxError(
+                    "cannot use '_' as mapping rest target".to_string(),
+                )));
+            }
+        }

Also please add a parser/compile-time test that match x: case {**_}: ... raises a SyntaxError.


3556-3569: Optional: drop the pre-len(subject) >= size check for mappings.

MATCH_KEYS already determines success; the extra GET_LEN adds a user-visible __len__ call (cost and potential side effects) and isn’t required for correctness. Note, however, that CPython’s compiler also emits a GET_LEN comparison before MATCH_KEYS for mapping patterns that have keys, so removing the check would make the observable __len__ calls diverge from CPython.

If you prefer simplicity over micro-optimization, remove the block below.

-        if size > 0 {
-            // Check if the mapping has at least 'size' keys
-            emit!(self, Instruction::GetLen);
-            self.emit_load_const(ConstantData::Integer { value: size.into() });
-            // Stack: [subject, len, size]
-            emit!(
-                self,
-                Instruction::CompareOperation {
-                    op: ComparisonOperator::GreaterOrEqual
-                }
-            );
-            self.jump_to_fail_pop(pc, JumpOp::PopJumpIfFalse)?;
-            // Stack: [subject]
-        }
+        // No upfront __len__ check; rely on MATCH_KEYS to determine presence of required keys.

3619-3660: MATCH_KEYS success path and on_top accounting look sound; add a clarifying comment.

  • Tuple-of-keys + MATCH_KEYS + “is not None” is the right shape.
  • After unpacking values, you increment on_top by size and then decrement per subpattern; finally you decrement by 2 for subject + keys_tuple. The arithmetic balances, but it’s subtle.

Add a short comment summarizing the on_top invariant before and after unpack to ease future maintenance.


3650-3654: Minor: iterate directly over patterns to avoid indexing cast.

This is purely stylistic; using for pat in patterns { ... } avoids i as usize.


3662-3712: Rest-dict construction is correct; consider an intrinsic for “dict minus keys” later.

Building {}, DICT_UPDATE from subject, then deleting matched keys yields the proper rest mapping. If a VM intrinsic (e.g., MapWithoutKeys) ever lands, this block would be a good candidate for a targeted speedup; until then, this is clear and correct.

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 0bb06ab and 5617d23.

⛔ Files ignored due to path filters (1)
  • Lib/test/test_patma.py is excluded by !Lib/**
📒 Files selected for processing (14)
  • compiler/codegen/src/compile.rs (7 hunks)
  • extra_tests/snippets/syntax_match.py (1 hunks)
  • vm/src/builtins/bool.rs (1 hunks)
  • vm/src/builtins/bytearray.rs (1 hunks)
  • vm/src/builtins/bytes.rs (1 hunks)
  • vm/src/builtins/dict.rs (1 hunks)
  • vm/src/builtins/float.rs (1 hunks)
  • vm/src/builtins/int.rs (1 hunks)
  • vm/src/builtins/list.rs (1 hunks)
  • vm/src/builtins/set.rs (2 hunks)
  • vm/src/builtins/str.rs (1 hunks)
  • vm/src/builtins/tuple.rs (1 hunks)
  • vm/src/frame.rs (5 hunks)
  • vm/src/types/slot.rs (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • vm/src/builtins/str.rs
🚧 Files skipped from review as they are similar to previous changes (10)
  • vm/src/builtins/dict.rs
  • vm/src/types/slot.rs
  • vm/src/builtins/bytes.rs
  • vm/src/builtins/list.rs
  • vm/src/builtins/bytearray.rs
  • vm/src/builtins/tuple.rs
  • vm/src/builtins/int.rs
  • vm/src/frame.rs
  • extra_tests/snippets/syntax_match.py
  • vm/src/builtins/set.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style (cargo fmt to format)
Always run clippy to lint code (cargo clippy) before completing tasks. Fix any warnings or lints that are introduced by your changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • vm/src/builtins/bool.rs
  • vm/src/builtins/float.rs
  • compiler/codegen/src/compile.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run tests under miri
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Check Rust code with rustfmt and clippy
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Ensure compilation on various targets
🔇 Additional comments (8)
vm/src/builtins/float.rs (1)

203-206: MATCH_SELF flag coverage verified across all builtins

Confirmed that the _MATCH_SELF flag is present on all intended builtins—bool, int, float, bytes, bytearray, str, list, tuple, set, and dict—so no further changes are required.

vm/src/builtins/bool.rs (1)

112-112: ACTION REQUIRED: Confirm MATCH_SELF flag rollout across builtins

The _MATCH_SELF flag on bool (vm/src/builtins/bool.rs:112) correctly enables class-pattern “match self” semantics for booleans. However, our quick grep across vm/src/builtins only finds this single instance:

• vm/src/builtins/bool.rs:112 – #[pyclass(..., flags(_MATCH_SELF))]

No other builtins currently include _MATCH_SELF. To ensure consistent class-pattern matching support, please:

  • Verify whether other builtins (e.g., int, float, str, list, dict, etc.) should also carry the MATCH_SELF flag.
  • If so, apply the same metadata change to those classes.
  • Otherwise, document that bool is intentionally the sole builtin requiring _MATCH_SELF.
compiler/codegen/src/compile.rs (6)

3487-3503: Skip-compiling true wildcards in class patterns is a nice win; double-check AST shape.

Early POP + continue for true wildcards avoids unnecessary work and binding checks. Ensure that ruff’s AST represents “_” as MatchAs { pattern: None, name: None } in this context; if so, this logic is solid. Add a small test covering both positional and keyword wildcards inside a class pattern.


3547-3554: Fast-path for empty mapping pattern is correct.

Special-casing {} to immediately POP the subject is faithful to CPython and reduces work.


3572-3579: Overflow guard tightened; cast is safe.

Using size > i32::MAX + 1 with the cast warning suppressed is appropriate here.


3936-3938: Use of ‘is’ for singletons is correct.

Switching to TestOperation::Is matches Python semantics for None/True/False.


4011-4021: Guards: compile and branch on false using JumpIfFalseOrPop to the fail label.

This integrates guard semantics cleanly with the existing fail_pop machinery.


4040-4043: Default-case guards compiled consistently.

Compiling the guard and jumping to end if false mirrors CPython behavior.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (3)
vm/src/frame.rs (3)

705-716: SWAP needs lower-bound check and a release-mode guard (SWAP(0) will panic).
Current code asserts upper bound only; with index 0, j = len - 0 == len causing out-of-bounds swap in release builds.

Apply this diff:

             let len = self.state.stack.len();
             debug_assert!(len > 0, "stack underflow in SWAP");
             let i = len - 1; // TOS index
             let index_val = index.get(arg) as usize;
             // CPython: SWAP(n) swaps TOS with PEEK(n) where PEEK(n) = stack_pointer[-n]
             // This means swap TOS with the element at index (len - n)
-            debug_assert!(
-                index_val <= len,
-                "SWAP index {} exceeds stack size {}",
-                index_val,
-                len
-            );
+            debug_assert!(index_val >= 1, "SWAP index must be >= 1");
+            debug_assert!(
+                index_val <= len,
+                "SWAP index {} exceeds stack size {}",
+                index_val,
+                len
+            );
+            if index_val == 0 || index_val > len {
+                self.fatal("SWAP index out of range");
+            }
             let j = len - index_val;
             self.state.stack.swap(i, j);

1387-1396: Non-tuple match_args must raise TypeError.
Returning non-match diverges from CPython.

-                            let match_args = match match_args.downcast_exact::<PyTuple>(vm) {
-                                Ok(tuple) => tuple,
-                                Err(_) => {
-                                    // __match_args__ must be a tuple
-                                    self.push_value(vm.ctx.none());
-                                    return Ok(None);
-                                }
-                            };
+                            let match_args = match match_args.downcast_exact::<PyTuple>(vm) {
+                                Ok(tuple) => tuple,
+                                Err(_) => {
+                                    return Err(vm.new_type_error(
+                                        "__match_args__ must be a tuple".to_string(),
+                                    ));
+                                }
+                            };

1432-1447: MATCH_SELF error semantics: >1 positional subpatterns must raise TypeError; and when match_args is absent (non-MATCH_SELF) with nargs > 0, raise TypeError.
Returning non-match here diverges from CPython and masks programmer errors.

                             if is_match_self_type {
                                 if nargs_val == 1 {
                                     // Match the subject itself as the single positional argument
                                     extracted.push(subject.clone());
                                 } else if nargs_val > 1 {
-                                    // Too many positional arguments for MATCH_SELF
-                                    self.push_value(vm.ctx.none());
-                                    return Ok(None);
+                                    return Err(vm.new_type_error(
+                                        "class pattern accepts at most 1 positional sub-pattern for MATCH_SELF types"
+                                            .to_string(),
+                                    ));
                                 }
                             } else {
                                 // No __match_args__ and not a MATCH_SELF type
-                                if nargs_val > 0 {
-                                    self.push_value(vm.ctx.none());
-                                    return Ok(None);
-                                }
+                                if nargs_val > 0 {
+                                    return Err(vm.new_type_error(
+                                        "class pattern defines no positional sub-patterns (__match_args__ missing)"
+                                            .to_string(),
+                                    ));
+                                }
                             }
🧹 Nitpick comments (5)
vm/src/frame.rs (5)

1335-1368: Unify mapping detection in MatchKeys with MatchMapping to avoid divergent behavior.
Currently MatchKeys uses PyMapping::check(&subject) while MatchMapping uses PyTypeFlags::MAPPING. This can yield inconsistent outcomes (e.g., flagged type passes MatchMapping but fails MatchKeys). Recommend using the same flag gate here and keep the good error behavior (KeyError → non-match; others propagate).

Apply this diff:

-                // Check if subject is a mapping and extract values for keys
-                if PyMapping::check(&subject) {
+                // Check mapping via type flag for consistency with MatchMapping
+                if subject.class().slots.flags.contains(PyTypeFlags::MAPPING) {
                     let keys = keys_tuple.downcast_ref::<PyTuple>().unwrap();
                     let mut values = Vec::new();
                     let mut all_match = true;

                     for key in keys {
                         match subject.get_item(key.as_object(), vm) {
                             Ok(value) => values.push(value),
                             Err(e) if e.fast_isinstance(vm.ctx.exceptions.key_error) => {
                                 all_match = false;
                                 break;
                             }
                             Err(e) => return Err(e),
                         }
                     }

If this becomes the only usage of PyMapping in this file, remove the unused import:

-use crate::protocol::PyMapping;

1384-1388: Don’t swallow exceptions when fetching match_args; use get_attribute_opt to distinguish “absent” from “error.”
cls.get_attr(...).ok() treats unexpected errors as “missing,” hiding bugs in descriptors. Prefer vm.get_attribute_opt(...) which returns PyResult<Option<...>>.

-                        let match_args = cls.get_attr(vm.ctx.intern_str("__match_args__"), vm).ok();
+                        let match_args = vm.get_attribute_opt(cls.clone(), vm.ctx.intern_str("__match_args__"))?;

1415-1423: Only treat AttributeError as a non-match; propagate other exceptions from attribute access.
Catching all errors as non-match hides real failures (e.g., descriptor raising TypeError).

-                                match subject.get_attr(attr_name_str, vm) {
-                                    Ok(value) => extracted.push(value),
-                                    Err(_) => {
-                                        // Attribute doesn't exist
-                                        self.push_value(vm.ctx.none());
-                                        return Ok(None);
-                                    }
-                                }
+                                match subject.get_attr(attr_name_str, vm) {
+                                    Ok(value) => extracted.push(value),
+                                    Err(e) if e.fast_isinstance(vm.ctx.exceptions.attribute_error) => {
+                                        // Missing attribute → non-match
+                                        self.push_value(vm.ctx.none());
+                                        return Ok(None);
+                                    }
+                                    Err(e) => return Err(e),
+                                }

1454-1461: Same note for keyword attributes: only AttributeError should lead to non-match.
Other exceptions from attribute access should propagate.

-                        match subject.get_attr(name_str, vm) {
-                            Ok(value) => extracted.push(value),
-                            Err(_) => {
-                                // Attribute doesn't exist
-                                self.push_value(vm.ctx.none());
-                                return Ok(None);
-                            }
-                        }
+                        match subject.get_attr(name_str, vm) {
+                            Ok(value) => extracted.push(value),
+                            Err(e) if e.fast_isinstance(vm.ctx.exceptions.attribute_error) => {
+                                self.push_value(vm.ctx.none());
+                                return Ok(None);
+                            }
+                            Err(e) => return Err(e),
+                        }

1309-1325: Behavioral consistency note (flags vs protocols).
You’ve moved MatchMapping/MatchSequence to flags. MatchKeys still relies on protocol checks (now only for non-flagged objects if you keep it). Decide explicitly:

  • Either make all three opcodes flag-driven (my recommendation for internal consistency and speed),
  • Or keep protocol checks across the board to support Python-level custom mappings/sequences.

If you want, I can open a follow-up PR to align MatchKeys and add tests covering:

  • dict vs custom mapping with only keys/getitem
  • str(x) success and str(x, y) TypeError
  • match_args not tuple and too many positional subpatterns.
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 5617d23 and 67bbe81.

⛔ Files ignored due to path filters (1)
  • Lib/test/test_patma.py is excluded by !Lib/**
📒 Files selected for processing (15)
  • compiler/codegen/src/compile.rs (7 hunks)
  • compiler/core/src/bytecode.rs (7 hunks)
  • extra_tests/snippets/syntax_match.py (1 hunks)
  • vm/src/builtins/bool.rs (1 hunks)
  • vm/src/builtins/bytearray.rs (1 hunks)
  • vm/src/builtins/bytes.rs (1 hunks)
  • vm/src/builtins/dict.rs (1 hunks)
  • vm/src/builtins/float.rs (1 hunks)
  • vm/src/builtins/int.rs (1 hunks)
  • vm/src/builtins/list.rs (1 hunks)
  • vm/src/builtins/set.rs (2 hunks)
  • vm/src/builtins/str.rs (1 hunks)
  • vm/src/builtins/tuple.rs (1 hunks)
  • vm/src/frame.rs (8 hunks)
  • vm/src/types/slot.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (11)
  • vm/src/builtins/set.rs
  • vm/src/builtins/bytes.rs
  • vm/src/builtins/bool.rs
  • vm/src/builtins/dict.rs
  • vm/src/builtins/float.rs
  • vm/src/builtins/list.rs
  • vm/src/types/slot.rs
  • vm/src/builtins/int.rs
  • vm/src/builtins/tuple.rs
  • vm/src/builtins/bytearray.rs
  • extra_tests/snippets/syntax_match.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style (cargo fmt to format)
Always run clippy to lint code (cargo clippy) before completing tasks. Fix any warnings or lints that are introduced by your changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • compiler/codegen/src/compile.rs
  • compiler/core/src/bytecode.rs
  • vm/src/builtins/str.rs
  • vm/src/frame.rs
🧠 Learnings (2)
📚 Learning: 2025-08-26T05:20:54.523Z
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.

Applied to files:

  • vm/src/frame.rs
📚 Learning: 2025-08-26T05:20:54.523Z
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In RustPython's pattern matching implementation, only certain builtin types should have the SEQUENCE flag: list and tuple are confirmed sequences. The user youknowone indicated that bytes, bytearray are not considered sequences in this context, even though they implement sequence-like protocols.

Applied to files:

  • vm/src/frame.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Check Rust code with rustfmt and clippy
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Run tests under miri
🔇 Additional comments (20)
vm/src/builtins/str.rs (1)

523-536: Verified _MATCH_SELF flag across builtins

I ran a repo-wide search and confirmed that all the intended built-ins opt into _MATCH_SELF:

  • bool (vm/src/builtins/bool.rs:112)
  • bytearray (vm/src/builtins/bytearray.rs:173)
  • bytes (vm/src/builtins/bytes.rs:135)
  • dict (vm/src/builtins/dict.rs:179)
  • float (vm/src/builtins/float.rs:204)
  • int (vm/src/builtins/int.rs:320)
  • list (vm/src/builtins/list.rs:112)
  • set (vm/src/builtins/set.rs:532, 949)
  • str (vm/src/builtins/str.rs:524)
  • tuple (vm/src/builtins/tuple.rs:247)

Remaining verification needed:

  • Confirm at runtime that exactly one positional subpattern is enforced (i.e. case str(a, b): against a str should raise TypeError).
  • It’d be good to add the proposed str_class_pattern_tests.py (or similar) to lock in that behavior.

Please run those tests manually (or add them to your test suite) and verify the spec-compliant error is raised.

compiler/core/src/bytecode.rs (5)

528-528: LGTM! Properly restored ToBool instruction.

The ToBool instruction has been appropriately restored from its previously commented state, which aligns with the runtime implementation in vm/src/frame.rs.


558-561: LGTM! Well-documented PopJumpIfFalse instruction.

The new PopJumpIfFalse instruction is properly documented and follows the existing instruction pattern. The doc comment clearly explains its behavior: "Pop the top of the stack, then pop the next value and jump if it is false."


1264-1264: LGTM! Correctly added PopJumpIfFalse to label handling.

The PopJumpIfFalse instruction has been properly integrated into the label_arg() method to ensure it's recognized as a jump instruction with a label target.


1338-1338: LGTM! Accurate stack effect calculations.

The stack effects are correctly specified:

  • ToBool: 0 (replaces TOS with its boolean value)
  • PopJumpIfFalse: -1 (pops one item from stack)

These values accurately reflect the operations performed by each instruction.

Also applies to: 1349-1349


1541-1541: LGTM! Display formatting properly implemented.

The new instructions are correctly integrated into the display/disassembly logic.

Also applies to: 1556-1556

compiler/codegen/src/compile.rs (9)

3488-3503: Good improvement: Proper wildcard pattern handling.

The addition of is_true_wildcard logic correctly distinguishes between actual wildcard patterns ("_" without name binding) and named captures, preventing unnecessary pattern compilation for true wildcards. This optimization aligns with CPython's behavior.


3521-3535: LGTM! Proper validation for mapping patterns.

Good error handling:

  1. Validates that keys and patterns array lengths match
  2. Correctly rejects "_" as a rest pattern target with appropriate error message

3571-3578: LGTM! Proper overflow check with safe casting.

The overflow check correctly ensures that size doesn't exceed i32::MAX + 1 before casting to u32. The #[allow(clippy::cast_possible_truncation)] is justified here since the check guarantees safety.


3582-3616: Good implementation of PEP 634 key validation.

The key validation logic correctly implements PEP 634 requirements:

  • Allows literals (number, string, bytes, boolean, None)
  • Allows attribute lookups
  • Properly detects and reports duplicate keys using stable string representation

3662-3705: Complex but correct rest pattern handling.

The rest pattern implementation properly:

  1. Creates an empty dict and updates it with the subject
  2. Unpacks keys and removes them from the rest dict
  3. Correctly manages the stack throughout the operation

The stack comments help track the state at each step, which aids maintainability.


3936-3938: Correct implementation: Using Is operator for singleton matching.

The change from CompareOperation::Equal to TestOperation::Is is correct for singleton pattern matching (None, True, False), as per PEP 634 specification.


4010-4020: LGTM! Guard implementation with proper boolean conversion.

The guard compilation correctly:

  1. Compiles the guard expression
  2. Converts to boolean with ToBool
  3. Uses PopJumpIfFalse to jump to failure if guard is false

This properly implements the guard semantics for match cases.


4039-4043: Guard handling for default case looks correct.

The guard compilation for the default (catch-all) case properly uses JumpIfFalseOrPop to either jump to end if guard is false or continue if true.


3294-3295: Verify GetLen Index Calculation Logic

I wasn’t able to find any existing tests in extra_tests/ covering the extraction of elements after a star pattern in sequence matching. Please confirm that the updated expression:

value: (patterns.len() - i).into(),

correctly computes the number of elements following position i (previously (patterns.len() - 1)). If this is intentional:

  • Ensure that (patterns.len() - i) aligns with the intended semantics of GetLen when compiling star-pattern sequences.
  • Consider adding targeted tests (e.g., sequence patterns with varying lengths and star positions) to validate both edge cases (star at start/end) and typical scenarios.

File: compiler/codegen/src/compile.rs
Lines: 3294–3295

vm/src/frame.rs (5)

3-3: Import of PyTypeFlags is appropriate and used correctly later.
No issues; the flag checks for MAPPING/SEQUENCE/_MATCH_SELF below rely on this.


1094-1099: ToBool opcode implementation looks correct.
Pops, converts via try_to_bool, pushes bool. Matches the expected semantics and propagates exceptions properly.


1248-1250: PopJumpIfFalse + helper extraction is solid.
Refactoring into pop_jump_if keeps semantics aligned with JumpIfFalse and reduces duplication.

Also applies to: 1970-1986


1320-1325: MatchSequence flag check is OK and aligns with the intended semantics.
This matches the retrieved learnings that only certain builtins (list/tuple) should be considered sequences for pattern matching.

Note: I used your prior clarification (learning) that bytes/bytearray/range are not sequences in this context.


1345-1349: Correct: KeyError → non-match; other errors propagate.
This mirrors CPython behavior for mapping lookups inside match.

@arihant2math
Collaborator

I'll take a look tomorrow.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (3)
compiler/codegen/src/compile.rs (1)

3584-3603: Correct key validation to include None and bytes literals.

The current implementation correctly identifies the allowed key types according to PEP 634, but the error message should be updated to reflect the full set of allowed types.

This issue was already addressed in a previous review comment that noted the need to support None and bytes as mapping keys. The current implementation correctly handles all required literal types.

vm/src/frame.rs (2)

702-716: SWAP(0) can panic in release; add lower-bound assert and a runtime guard

Current code only asserts the upper bound and will compute j = len - 0 => len, causing swap out-of-bounds. Guard index_val >= 1 in debug and handle invalid values in release.

Apply this diff:

             let len = self.state.stack.len();
             debug_assert!(len > 0, "stack underflow in SWAP");
             let i = len - 1; // TOS index
             let index_val = index.get(arg) as usize;
             // CPython: SWAP(n) swaps TOS with PEEK(n) where PEEK(n) = stack_pointer[-n]
             // This means swap TOS with the element at index (len - n)
-            debug_assert!(
-                index_val <= len,
-                "SWAP index {} exceeds stack size {}",
-                index_val,
-                len
-            );
+            debug_assert!(index_val >= 1, "SWAP index must be >= 1");
+            debug_assert!(
+                index_val <= len,
+                "SWAP index {} exceeds stack size {}",
+                index_val,
+                len
+            );
+            if index_val == 0 || index_val > len {
+                self.fatal("SWAP index out of range");
+            }
             let j = len - index_val;
             self.state.stack.swap(i, j);
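
The index arithmetic can be modelled in a few lines of Python. This is a sketch using a list as the stack; `swap` is a hypothetical helper, and raising `IndexError` stands in for whatever the VM chooses to do on an invalid index:

```python
def swap(stack, n):
    """Swap the top of `stack` with the n-th item from the top (1-based)."""
    if not 1 <= n <= len(stack):
        raise IndexError(f"SWAP index {n} out of range for stack of {len(stack)}")
    i = len(stack) - 1    # top of stack
    j = len(stack) - n    # n == 1 is the top itself; n == 0 would give len(stack)
    stack[i], stack[j] = stack[j], stack[i]


stack = [1, 2, 3]
swap(stack, 2)
assert stack == [1, 3, 2]
swap(stack, 3)
assert stack == [2, 3, 1]
```

With `n == 0` the unchecked computation yields `j == len(stack)`, which is exactly the out-of-bounds index the guard is meant to reject.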

1387-1394: Non-tuple __match_args__ must raise TypeError, not result in a silent non-match

Current code pushes None and returns Ok(None). CPython raises TypeError here.

Apply this diff:

-                            let match_args = match match_args.downcast_exact::<PyTuple>(vm) {
-                                Ok(tuple) => tuple,
-                                Err(_) => {
-                                    // __match_args__ must be a tuple
-                                    self.push_value(vm.ctx.none());
-                                    return Ok(None);
-                                }
-                            };
+                            let match_args = match match_args.downcast_exact::<PyTuple>(vm) {
+                                Ok(tuple) => tuple,
+                                Err(_) => {
+                                    return Err(vm.new_type_error(
+                                        "__match_args__ must be a tuple".to_string(),
+                                    ));
+                                }
+                            };
🧹 Nitpick comments (3)
vm/src/frame.rs (3)

1429-1455: Nit: Improve TypeError message when nargs > 0 but no __match_args__ and not _MATCH_SELF

You correctly raise TypeError. Consider matching CPython’s phrasing by including the class name and the number given, e.g., “C() accepts 0 positional sub-patterns (N given)”. This helps users diagnose the exact class that rejected positional subpatterns.

Would you like me to draft a small helper to derive the class display name safely for the error message?


1366-1478: MatchClass overall: semantics for MATCH_SELF and keyword extraction look good; add targeted tests

  • MATCH_SELF path enforces at most one positional subpattern and returns TypeError on >1 — good.
  • Missing attributes for positional/keyword parts result in a non-match — good.

Happy to draft tests covering:

  • str(x) matches and str(x, y) raises TypeError.
  • Class with __match_args__ of varying lengths (exact, too short → TypeError).
  • __match_args__ containing a non-str → TypeError.
  • Keyword attribute missing → non-match.

1306-1311: Consider protocol-based fallback for mapping patterns

The current implementation only treats types with the MAPPING flag (i.e. dict and its subclasses) as mapping patterns, which means:

  • Built-in mappings other than dict (for example mappingproxy) and user-defined classes that implement or register with collections.abc.Mapping won’t match a mapping pattern unless they also carry the flag.
  • This deviates from PEP 634, which defines mapping patterns in terms of collections.abc.Mapping rather than a fixed list of concrete types.

To preserve compatibility with both built-ins and third-party mappings, you can fall back to a protocol check:

In vm/src/frame.rs at lines 1306–1311, change:

-                // Check if the type has the MAPPING flag
-                let is_mapping = subject.class().slots.flags.contains(PyTypeFlags::MAPPING);
+                // Mapping flag or protocol-based fallback
+                let is_mapping = subject.class().slots.flags.contains(PyTypeFlags::MAPPING)
+                    || vm.get_method(subject.clone(), vm.ctx.intern_str("keys")).is_some();
 
                 self.push_value(subject);
                 self.push_value(vm.ctx.new_bool(is_mapping).into());

Similarly, in the mapping-extraction branch (around line 1331), replace the flag check:

-                if subject.class().slots.flags.contains(PyTypeFlags::MAPPING) {
+                if subject.class().slots.flags.contains(PyTypeFlags::MAPPING)
+                    || vm.get_method(subject.clone(), vm.ctx.intern_str("keys")).is_some() {

This hybrid approach keeps the fast path for built-ins (via flags) while ensuring any object with a keys method will also match mapping patterns.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 67bbe81 and ef1ed94.

⛔ Files ignored due to path filters (1)
  • Lib/test/test_patma.py is excluded by !Lib/**
📒 Files selected for processing (14)
  • compiler/codegen/src/compile.rs (7 hunks)
  • extra_tests/snippets/syntax_match.py (1 hunks)
  • vm/src/builtins/bool.rs (1 hunks)
  • vm/src/builtins/bytearray.rs (1 hunks)
  • vm/src/builtins/bytes.rs (1 hunks)
  • vm/src/builtins/dict.rs (1 hunks)
  • vm/src/builtins/float.rs (1 hunks)
  • vm/src/builtins/int.rs (1 hunks)
  • vm/src/builtins/list.rs (1 hunks)
  • vm/src/builtins/set.rs (2 hunks)
  • vm/src/builtins/str.rs (1 hunks)
  • vm/src/builtins/tuple.rs (1 hunks)
  • vm/src/frame.rs (5 hunks)
  • vm/src/types/slot.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (12)
  • vm/src/builtins/dict.rs
  • vm/src/builtins/bytes.rs
  • vm/src/builtins/str.rs
  • vm/src/builtins/int.rs
  • vm/src/builtins/bytearray.rs
  • vm/src/builtins/list.rs
  • vm/src/builtins/set.rs
  • vm/src/builtins/tuple.rs
  • vm/src/types/slot.rs
  • vm/src/builtins/float.rs
  • vm/src/builtins/bool.rs
  • extra_tests/snippets/syntax_match.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style (cargo fmt to format)
Always run clippy to lint code (cargo clippy) before completing tasks. Fix any warnings or lints that are introduced by your changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • compiler/codegen/src/compile.rs
  • vm/src/frame.rs
🧠 Learnings (2)
📚 Learning: 2025-08-26T05:20:54.523Z
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.

Applied to files:

  • vm/src/frame.rs
📚 Learning: 2025-08-26T05:20:54.523Z
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In RustPython's pattern matching implementation, only certain builtin types should have the SEQUENCE flag: list and tuple are confirmed sequences. The user youknowone indicated that bytes, bytearray are not considered sequences in this context, even though they implement sequence-like protocols.

Applied to files:

  • vm/src/frame.rs
🧬 Code graph analysis (1)
compiler/codegen/src/compile.rs (1)
compiler/codegen/src/unparse.rs (2)
  • unparse_expr (61-451)
  • unparse_expr (608-610)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run tests under miri
  • GitHub Check: Check Rust code with rustfmt and clippy
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Ensure compilation on various targets
🔇 Additional comments (8)
compiler/codegen/src/compile.rs (4)

3522-3578: Enhance mapping pattern validation and error handling.

The validation logic correctly implements PEP 634 requirements but could benefit from clearer error messages and more robust overflow checking.

The implementation properly validates:

  • Pattern count matches key count
  • Rest pattern cannot be _
  • Keys must be literals or attribute lookups
  • No duplicate keys

The overflow check and error messages are appropriate for the context.


3935-3939: Use TestOperation with Is operator for singleton matching.

The change from CompareOperation with Equal to TestOperation with Is is correct for singleton pattern matching, as singletons should be compared using identity (is) rather than equality (==).

This change aligns with Python's semantics where None, True, and False are compared using identity.


4010-4021: Implement guard expression compilation correctly.

The guard implementation is now complete and correct, properly compiling the guard expression and jumping to the failure target when the guard evaluates to false.

The implementation:

  • Compiles the guard expression
  • Converts to boolean using ToBool
  • Jumps to the appropriate failure block if false

4039-4043: Handle guards in default match cases.

The guard handling for default cases is implemented correctly, using JumpIfFalseOrPop to jump to the end if the guard fails.

This ensures that even default patterns with guards can fail and continue to the end of the match statement.

vm/src/frame.rs (4)

18-18: Importing PyTypeFlags for flag-driven pattern checks — LGTM

This aligns with the new flag-based recognition in MatchMapping/MatchSequence/MatchClass.


23-23: Wtf8Buf import for BUILD_STRING — LGTM

Matches the use in BuildString to collect utf-8 chunks efficiently.


1317-1322: MatchSequence flag check looks correct for intended semantics

Checks PyTypeFlags::SEQUENCE. Based on prior guidance, only list and tuple should carry this flag (bytes/bytearray/range are not sequences for pattern matching).


1332-1365: MatchKeys aligns with MatchMapping (flag-based) and handles KeyError vs other errors correctly — LGTM

  • Stack effect matches CPython: [subject, keys] -> [subject, keys, values_or_none].
  • On missing key (KeyError) returns non-match; other errors propagate. Good.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
compiler/codegen/src/compile.rs (1)

3294-3294: Fix the off-by-one error in index calculation

The index calculation (patterns.len() - i) is incorrect. When accessing elements from the end of a sequence, the correct formula should be (patterns.len() - 1 - i) to get the proper negative offset.

Apply this diff to fix the index:

-                    value: (patterns.len() - i).into(),
+                    value: (patterns.len() - 1 - i).into(),
vm/src/frame.rs (1)

703-716: SWAP(n): add lower-bound guard and release-mode safety to prevent out-of-bounds swap

Current code asserts len > 0 and only upper-bounds index_val. SWAP(0) computes j = len - 0 == len, which panics in release. Prior review already requested a lower-bound check and a runtime guard.

Apply this diff:

             let len = self.state.stack.len();
             debug_assert!(len > 0, "stack underflow in SWAP");
             let i = len - 1; // TOS index
             let index_val = index.get(arg) as usize;
             // CPython: SWAP(n) swaps TOS with PEEK(n) where PEEK(n) = stack_pointer[-n]
             // This means swap TOS with the element at index (len - n)
+            debug_assert!(index_val >= 1, "SWAP index must be >= 1");
             debug_assert!(
                 index_val <= len,
                 "SWAP index {} exceeds stack size {}",
                 index_val,
                 len
             );
+            if index_val == 0 || index_val > len {
+                self.fatal("SWAP index out of range");
+            }
             let j = len - index_val;
             self.state.stack.swap(i, j);
🧹 Nitpick comments (3)
compiler/codegen/src/compile.rs (1)

3662-3705: Complex but correct rest pattern implementation

The rest pattern handling correctly:

  1. Builds an empty dict
  2. Updates it with the subject
  3. Removes matched keys
  4. Stores the resulting rest dict

The stack manipulation with proper rotation and cleanup ensures values end up in the right place.

Consider adding a comment block explaining the stack state at each major step. For example:

# Stack states during rest pattern processing:
# Initial: [subject, keys_tuple]
# After BuildMap: [subject, keys_tuple, {}]
# After Swap: [{}, keys_tuple, subject]
# After DictUpdate: [rest_dict, keys_tuple]
# ... etc

This would help future maintainers understand the complex stack operations.

vm/src/frame.rs (2)

1325-1388: MatchKeys: semantics look right; small nits for perf and consistency

  • Preallocate values with capacity to avoid re-allocations.
  • Use interned "get" for the attribute lookup to avoid repeated string allocations and stay consistent with keys().

Apply this diff:

-                let keys = keys_tuple.downcast_ref::<PyTuple>().unwrap();
-                let mut values = Vec::new();
+                let keys = keys_tuple.downcast_ref::<PyTuple>().unwrap();
+                let mut values = Vec::with_capacity(keys.len());
@@
-                    if let Ok(get_method) = subject.get_attr("get", vm) {
+                    if let Ok(get_method) = subject.get_attr(vm.ctx.intern_str("get"), vm) {

1428-1434: MatchClass: align error messages with CPython; avoid unwrap on keyword names

The control flow and error/OK(None) split look correct. To match CPython’s error messages and improve robustness:

  • Include the class name and counts in TypeErrors for positional sub-pattern count mismatches.
  • Don’t unwrap keyword attribute names; raise TypeError if a non-str sneaks in (even if compiler-generated).

Apply this diff:

-                            if match_args.len() < nargs_val {
-                                return Err(vm.new_type_error(format!(
-                                    "class pattern accepts at most {} positional sub-patterns ({} given)",
-                                    match_args.len(),
-                                    nargs_val
-                                )));
-                            }
+                            if match_args.len() < nargs_val {
+                                let type_name = cls
+                                    .downcast_ref::<crate::builtins::PyType>()
+                                    .map(|t| t.__name__(vm).as_str().to_owned())
+                                    .unwrap_or_else(|| String::from("?"));
+                                return Err(vm.new_type_error(format!(
+                                    "{}() accepts at most {} positional sub-patterns ({} given)",
+                                    type_name,
+                                    match_args.len(),
+                                    nargs_val
+                                )));
+                            }
@@
-                                } else if nargs_val > 1 {
-                                    // Too many positional arguments for MATCH_SELF
-                                    return Err(vm.new_type_error(
-                                        "class pattern accepts at most 1 positional sub-pattern for MATCH_SELF types"
-                                            .to_string(),
-                                    ));
-                                }
+                                } else if nargs_val > 1 {
+                                    let type_name = cls
+                                        .downcast_ref::<crate::builtins::PyType>()
+                                        .map(|t| t.__name__(vm).as_str().to_owned())
+                                        .unwrap_or_else(|| String::from("?"));
+                                    return Err(vm.new_type_error(format!(
+                                        "{}() accepts at most 1 positional sub-pattern ({} given)",
+                                        type_name,
+                                        nargs_val
+                                    )));
+                                }
@@
-                                if nargs_val > 0 {
-                                    return Err(vm.new_type_error(
-                                        "class pattern defines no positional sub-patterns (__match_args__ missing)"
-                                            .to_string(),
-                                    ));
-                                }
+                                if nargs_val > 0 {
+                                    let type_name = cls
+                                        .downcast_ref::<crate::builtins::PyType>()
+                                        .map(|t| t.__name__(vm).as_str().to_owned())
+                                        .unwrap_or_else(|| String::from("?"));
+                                    return Err(vm.new_type_error(format!(
+                                        "{}() accepts 0 positional sub-patterns ({} given)",
+                                        type_name,
+                                        nargs_val
+                                    )));
+                                }

And for keyword attribute names:

-                    for name in kwd_attrs {
-                        let name_str = name.downcast_ref::<PyStr>().unwrap();
+                    for name in kwd_attrs {
+                        let name_str = name
+                            .downcast_ref::<PyStr>()
+                            .ok_or_else(|| vm.new_type_error("keyword patterns must be strings"))?;

Also applies to: 1472-1476, 1481-1485, 1492-1492

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between ef1ed94 and 1984856.

⛔ Files ignored due to path filters (1)
  • Lib/test/test_patma.py is excluded by !Lib/**
📒 Files selected for processing (15)
  • compiler/codegen/src/compile.rs (7 hunks)
  • extra_tests/snippets/syntax_match.py (1 hunks)
  • vm/src/builtins/bool.rs (1 hunks)
  • vm/src/builtins/bytearray.rs (1 hunks)
  • vm/src/builtins/bytes.rs (1 hunks)
  • vm/src/builtins/dict.rs (1 hunks)
  • vm/src/builtins/float.rs (1 hunks)
  • vm/src/builtins/int.rs (1 hunks)
  • vm/src/builtins/list.rs (1 hunks)
  • vm/src/builtins/set.rs (2 hunks)
  • vm/src/builtins/str.rs (1 hunks)
  • vm/src/builtins/tuple.rs (1 hunks)
  • vm/src/builtins/type.rs (3 hunks)
  • vm/src/frame.rs (3 hunks)
  • vm/src/types/slot.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (11)
  • vm/src/builtins/set.rs
  • vm/src/builtins/list.rs
  • vm/src/builtins/str.rs
  • vm/src/builtins/dict.rs
  • vm/src/builtins/float.rs
  • vm/src/builtins/bytes.rs
  • vm/src/builtins/tuple.rs
  • vm/src/builtins/bytearray.rs
  • vm/src/builtins/bool.rs
  • vm/src/types/slot.rs
  • vm/src/builtins/int.rs
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style (cargo fmt to format)
Always run clippy to lint code (cargo clippy) before completing tasks. Fix any warnings or lints that are introduced by your changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • vm/src/frame.rs
  • vm/src/builtins/type.rs
  • compiler/codegen/src/compile.rs
**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.py: Follow PEP 8 style for custom Python code
Use ruff for linting Python code

Files:

  • extra_tests/snippets/syntax_match.py
🧠 Learnings (2)
📚 Learning: 2025-08-26T05:20:54.523Z
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.

Applied to files:

  • vm/src/frame.rs
📚 Learning: 2025-08-26T05:20:54.523Z
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In RustPython's pattern matching implementation, only certain builtin types should have the SEQUENCE flag: list and tuple are confirmed sequences. The user youknowone indicated that bytes, bytearray are not considered sequences in this context, even though they implement sequence-like protocols.

Applied to files:

  • vm/src/frame.rs
🧬 Code graph analysis (1)
compiler/codegen/src/compile.rs (1)
compiler/codegen/src/unparse.rs (2)
  • unparse_expr (61-451)
  • unparse_expr (608-610)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Run tests under miri
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Check Rust code with rustfmt and clippy
  • GitHub Check: Check the WASM package and demo
🔇 Additional comments (17)
vm/src/builtins/type.rs (4)

235-255: Pattern matching flag inheritance logic looks correct

The updated inherit_patma_flags method properly handles multi-base inheritance by:

  1. Checking all bases in order
  2. Inheriting the first non-empty COLLECTION_FLAGS found
  3. Correctly preventing override of existing flags

The implementation follows Python's semantics where the first base with pattern matching flags wins.


269-271: Good defensive check for existing flags

The early return prevents accidental overwriting of collection flags that may have been set elsewhere, maintaining flag integrity.


315-316: LGTM: Proper base list passing for flag inheritance

The change correctly passes the full bases slice instead of a single base, enabling proper multi-base inheritance checking.


374-375: Static types handled correctly with single-base slice

Creating a single-element slice for static types preserves backward compatibility while using the new multi-base API consistently.

extra_tests/snippets/syntax_match.py (3)

67-70: Good addition of explicit assertions for mapping pattern

Adding explicit assertions for both x and y values improves test clarity and ensures the mapping pattern correctly extracts both values.


72-88: Excellent test coverage for mapping rest patterns

The new tests comprehensively cover:

  • Basic rest pattern with remaining items
  • Empty rest pattern edge case
  • Multiple key extraction with rest

This provides good coverage for the **rest pattern functionality.


89-106: Comprehensive rest pattern test with multiple keys

The test properly validates that rest patterns correctly collect unmatched keys while preserving matched ones. The progression from 2-key to 3-key matching demonstrates incremental consumption.

compiler/codegen/src/compile.rs (7)

3487-3507: Wildcard pattern handling logic is correct

The implementation properly:

  1. Identifies true wildcards (patterns with no name binding)
  2. Decrements on_top for all patterns (including wildcards)
  3. Pops wildcards from the stack without compiling them

This correctly handles the _ pattern in class matching.


3521-3535: Excellent validation for mapping patterns

The implementation includes proper checks for:

  1. Key-pattern count mismatch
  2. Invalid _ as a rest target (matching Python's syntax rules)

The error messages are clear and helpful.


3577-3578: Safe downcast after overflow check

The cast is safe here since the overflow check at line 3572 ensures size <= i32::MAX + 1.


3583-3616: Key validation and duplicate detection improvements

The implementation properly:

  1. Accepts literals and attribute lookups as required by PEP 634
  2. Uses unparse_expr for stable duplicate detection (avoiding Debug format issues)
  3. Includes None and bytes literals as valid keys

This is a significant improvement over the previous Debug-based approach.


3936-3939: Correct change to use Is operator for singleton patterns

Using TestOperation::Is instead of CompareOperation::Equal is the correct approach for singleton pattern matching, as it properly tests identity rather than equality.


4010-4021: Guards properly implemented for match cases

The guard compilation correctly:

  1. Evaluates the guard expression
  2. Converts to bool
  3. Jumps to fail block if false

This matches Python's guard semantics.


4039-4043: Default case guard handling looks good

The implementation correctly handles guards in default (catch-all) cases using JumpIfFalseOrPop to either continue or skip the case body.

vm/src/frame.rs (3)

18-24: Import changes and PyTypeFlags usage — LGTM

Consolidating imports and switching to flag-based checks via PyTypeFlags is consistent with the rest of the VM. No issues here.


1317-1323: MatchSequence: restrict via SEQUENCE flag — LGTM

Using PyTypeFlags::SEQUENCE here aligns with the intended semantics (list/tuple are sequences; bytes/bytearray/range are not) per prior discussion. Good.


1306-1311: Mapping checks in MatchMapping and MatchKeys are now consistent

  • Both opcodes in vm/src/frame.rs use the PyTypeFlags::MAPPING flag:
    • Line 1307: subject.class().slots.flags.contains(PyTypeFlags::MAPPING)
    • Line 1330: same flag check in the MatchKeys path
  • A repository-wide search for PyMapping::check in vm/src/** found no other references in the match opcode implementations.

All mapping-related checks in these opcodes now uniformly rely on the type flag.

Comment on lines +1306 to 1311
// Check if the type has the MAPPING flag
let is_mapping = subject.class().slots.flags.contains(PyTypeFlags::MAPPING);

self.push_value(subject);
self.push_value(vm.ctx.new_bool(is_mapping).into());
Ok(None)

💡 Verification agent

🧩 Analysis chain

MatchMapping: flag-based detection — verify MAPPING coverage on builtins

Using subject.class().slots.flags.contains(PyTypeFlags::MAPPING) is good. Please verify PyDict (and any mapping builtins intended to match) actually carry MAPPING, otherwise MatchMapping will never succeed on dicts.

Run:


🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo "Builtins with MAPPING flag:"
rg -n 'pyclass\(.*flags\([^)]*MAPPING' -g 'vm/src/builtins/*.rs' -C2
echo
echo "Dict pyclass line:"
rg -nP 'pyclass\([^)]*name\s*=\s*"dict"[^)]*\)' vm/src/builtins/dict.rs -n -C2 || true

Length of output: 155


🏁 Script executed:

#!/usr/bin/env bash
set -euo pipefail

echo "1) List all `#[pyclass]` definitions in vm/src/builtins:"
rg -n 'pyclass\(' -g 'vm/src/builtins/*.rs' -n -C2 || true

echo
echo "2) Search for `PyTypeFlags::MAPPING` usage in builtins:"
rg -n 'PyTypeFlags::MAPPING' -g 'vm/src/builtins/*.rs' -n -C2 || true

echo
echo "3) Inspect the `dict` builtin for its `#[pyclass]` flags:"
rg -n 'pyclass\([^)]*name\s*=\s*"dict"' -g 'vm/src/builtins/dict.rs' -n -C2 || true

echo
echo "4) Check any `flags` assignment in dict.rs:"
rg -n 'flags' -g 'vm/src/builtins/dict.rs' -n -C2 || true

Length of output: 39397


🏁 Script executed:

#!/usr/bin/env bash
set -euo pipefail

echo "Inspect `PyMappingProxy` struct annotation:"
rg -n 'pyclass\(.*name\s*=\s*"mappingproxy"' -g vm/src/builtins/mappingproxy.rs -n -C2 || true

echo
echo "Inspect `PyMappingProxy` impl annotation:"
rg -n 'impl PyMappingProxy' -g vm/src/builtins/mappingproxy.rs -n -C5 || true

echo
echo "Search for any `flags(` in mappingproxy.rs:"
rg -n 'flags' -g vm/src/builtins/mappingproxy.rs -n -C2 || true

echo
echo "Search for literal `MAPPING` in mappingproxy.rs:"
rg -n 'MAPPING' -g vm/src/builtins/mappingproxy.rs -n -C2 || true

Length of output: 3044


MatchMapping: Ensure all mapping types carry the MAPPING flag

  • PyDict is already annotated with MAPPING (see vm/src/builtins/dict.rs:179: flags(BASETYPE, MAPPING, _MATCH_SELF)), so mapping patterns will match plain dicts.
  • PyMappingProxy (vm/src/builtins/mappingproxy.rs:17) currently has no flags(...) entry, so its default flags omit MAPPING and the MatchMapping logic will not recognize it.

To correctly support PEP 634–style mapping patterns on mappingproxy (and any other builtins implementing AsMapping), add the MAPPING flag to its #[pyclass]:

--- a/vm/src/builtins/mappingproxy.rs
+++ b/vm/src/builtins/mappingproxy.rs
@@ -15,1 +15,1 @@
-#[pyclass(module = false, name = "mappingproxy", traverse)]
+#[pyclass(module = false, name = "mappingproxy", traverse, flags(BASETYPE, MAPPING))]

Please apply this change (and similarly verify any other AsMapping types) so that they participate in MatchMapping.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-// Check if the type has the MAPPING flag
-let is_mapping = subject.class().slots.flags.contains(PyTypeFlags::MAPPING);
-self.push_value(subject);
-self.push_value(vm.ctx.new_bool(is_mapping).into());
-Ok(None)
+#[pyclass(module = false, name = "mappingproxy", traverse, flags(BASETYPE, MAPPING))]
+#[derive(Debug)]
+pub struct PyMappingProxy {
+    mapping: MappingProxyInner,
+}
🤖 Prompt for AI Agents
In vm/src/frame.rs around lines 1306 to 1311, MatchMapping checks
PyTypeFlags::MAPPING on a subject to decide mapping-pattern behavior, but
mappingproxy (vm/src/builtins/mappingproxy.rs) lacks the MAPPING flag so it
won’t be recognized; update the #[pyclass] on mappingproxy to include the
MAPPING flag (e.g., flags(..., MAPPING, ...)) and then run a quick audit of
other builtins implementing AsMapping to add MAPPING where missing so they
participate in MatchMapping.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (2)
vm/src/frame.rs (2)

1317-1323: MatchSequence via SEQUENCE flag aligns with agreed semantics.

This is consistent with the project decision that only list/tuple should participate as sequences in pattern matching; bytes/bytearray/range should not. Good.


703-714: SWAP(0) can panic; add a lower-bound check and release-mode guard.

Only an upper-bound debug_assert is present. SWAP is 1-indexed, so SWAP(0) yields j = len - 0 = len, which is out of bounds for stack.swap(i, j) and panics. Add a lower-bound assertion and a runtime guard.

Apply this diff:

             let len = self.state.stack.len();
             debug_assert!(len > 0, "stack underflow in SWAP");
             let i = len - 1; // TOS index
             let index_val = index.get(arg) as usize;
             // CPython: SWAP(n) swaps TOS with PEEK(n) where PEEK(n) = stack_pointer[-n]
             // This means swap TOS with the element at index (len - n)
-            debug_assert!(
-                index_val <= len,
-                "SWAP index {} exceeds stack size {}",
-                index_val,
-                len
-            );
+            debug_assert!(index_val >= 1, "SWAP index must be >= 1");
+            debug_assert!(
+                index_val <= len,
+                "SWAP index {} exceeds stack size {}",
+                index_val,
+                len
+            );
+            if index_val == 0 || index_val > len {
+                self.fatal("SWAP index out of range");
+            }
             let j = len - index_val;
             self.state.stack.swap(i, j);
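The index arithmetic behind this suggestion can be modeled in a few lines. The sketch below is a plain-Python model of the 1-indexed SWAP semantics described in the comment, not the VM's code; the function `swap` and the use of IndexError are assumptions made for illustration.

```python
def swap(stack, n):
    """Swap the top of `stack` with the n-th item from the top (1-indexed)."""
    if not 1 <= n <= len(stack):
        # Both bounds are checked: n == 0 would give index len(stack).
        raise IndexError(f"SWAP index {n} out of range for stack of size {len(stack)}")
    top = len(stack) - 1
    other = len(stack) - n
    stack[top], stack[other] = stack[other], stack[top]


stack = ["a", "b", "c"]
swap(stack, 2)  # swaps the top two items
swap(stack, 3)  # swaps the top with the bottom of a 3-item stack
swap(stack, 1)  # swaps the top with itself: a no-op
print(stack)    # ['b', 'c', 'a']
```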
🧹 Nitpick comments (3)
vm/src/builtins/type.rs (2)

235-255: Multi-base inheritance for pattern-matching flags looks correct; consider guarding against conflicting flags on a single base.

The “first base with SEQUENCE or MAPPING” rule is implemented correctly and avoids overriding pre-set flags. As a defensive check (useful during development), add a debug assertion that a base doesn’t simultaneously carry both flags, which would be ill-formed.

Apply this diff to add a debug assertion inside the loop:

         for base in bases {
             let base_flags = base.slots.flags & COLLECTION_FLAGS;
+            debug_assert!(
+                !(base_flags.contains(PyTypeFlags::SEQUENCE) && base_flags.contains(PyTypeFlags::MAPPING)),
+                "Base type '{}' has conflicting SEQUENCE and MAPPING flags",
+                base.name()
+            );
             if !base_flags.is_empty() {
                 slots.flags |= base_flags;
                 return;
             }
         }

269-297: Avoid sign-extension pitfalls when parsing abc_tpflags; use u64 instead of i64.

Using to_i64() then casting to u64 can misinterpret negative values (e.g., -1 becomes all bits set before masking). Prefer parsing as u64 and ignore out-of-range/negative values to prevent accidental flag pollution.

Apply this diff in both places where flags_val is computed:

-                let flags_val = int_obj.as_bigint().to_i64().unwrap_or(0);
-                let abc_flags = PyTypeFlags::from_bits_truncate(flags_val as u64);
+                let flags_val = int_obj.as_bigint().to_u64().unwrap_or(0);
+                let abc_flags = PyTypeFlags::from_bits_truncate(flags_val);
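The sign-extension concern can be illustrated numerically. In this sketch the bit positions for SEQUENCE and MAPPING are assumptions borrowed from CPython's Py_TPFLAGS_SEQUENCE (1 << 5) and Py_TPFLAGS_MAPPING (1 << 6), and the two helper functions only mimic the Rust conversions named in the comment.

```python
SEQUENCE = 1 << 5  # assumed bit position, as in CPython
MAPPING = 1 << 6   # assumed bit position, as in CPython
U64_MASK = (1 << 64) - 1


def cast_i64_to_u64(value):
    """Mimic Rust's `value as u64` on an i64: keep the two's-complement bits."""
    return value & U64_MASK


def checked_u64(value):
    """Mimic `to_u64().unwrap_or(0)`: values outside 0..=u64::MAX become 0."""
    return value if 0 <= value <= U64_MASK else 0


def collection_flags(bits):
    """Keep only the SEQUENCE and MAPPING bits."""
    return bits & (SEQUENCE | MAPPING)


# -1 reinterpreted as u64 has every bit set, so both flags leak through.
print(collection_flags(cast_i64_to_u64(-1)) == SEQUENCE | MAPPING)  # True
# The checked conversion rejects the negative value instead.
print(collection_flags(checked_u64(-1)))                            # 0
print(collection_flags(checked_u64(MAPPING)) == MAPPING)            # True
```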
vm/src/frame.rs (1)

1389-1509: MatchClass: correctness improvements match CPython behavior.

  • Raises TypeError for non-tuple __match_args__ and for too many positional subpatterns.
  • Enforces _MATCH_SELF rules and raises an error when more than one positional subpattern is given.
  • Treats missing attributes/keys as non-match while propagating genuine errors.

Looks good. Please add tests that exercise:

  • __match_args__ is not a tuple (TypeError).
  • nargs > len(__match_args__) (TypeError).
  • _MATCH_SELF with nargs in {0, 1, 2} (0 and 1 succeed; 2 raises TypeError).
  • No __match_args__ and nargs > 0 (TypeError).
  • Keyword attribute missing → non-match, unrelated errors propagate.

I can draft targeted tests for these cases if helpful.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1984856 and c965f78.

⛔ Files ignored due to path filters (1)
  • Lib/test/test_patma.py is excluded by !Lib/**
📒 Files selected for processing (2)
  • vm/src/builtins/type.rs (3 hunks)
  • vm/src/frame.rs (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style (cargo fmt to format)
Always run clippy to lint code (cargo clippy) before completing tasks. Fix any warnings or lints that are introduced by your changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • vm/src/builtins/type.rs
  • vm/src/frame.rs
🧠 Learnings (2)
📚 Learning: 2025-08-26T05:20:54.523Z
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.

Applied to files:

  • vm/src/builtins/type.rs
  • vm/src/frame.rs
📚 Learning: 2025-08-26T05:20:54.523Z
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In RustPython's pattern matching implementation, only certain builtin types should have the SEQUENCE flag: list and tuple are confirmed sequences. The user youknowone indicated that bytes and bytearray are not considered sequences in this context, even though they implement sequence-like protocols.

Applied to files:

  • vm/src/builtins/type.rs
  • vm/src/frame.rs
🧬 Code graph analysis (1)
vm/src/frame.rs (2)
vm/src/protocol/mapping.rs (2)
  • keys (154-160)
  • values (162-168)
vm/src/vm/mod.rs (4)
  • new (116-219)
  • get_method (775-782)
  • value (660-660)
  • value (662-662)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run tests under miri
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Check Rust code with rustfmt and clippy
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
🔇 Additional comments (3)
vm/src/builtins/type.rs (2)

315-317: Good call-site integration.

Propagating flags from all bases in new_heap_inner aligns with the new API and the “first base wins” behavior.


373-376: Static types: confirm ABC-derived flags aren’t needed at creation.

new_static inherits flags from the single base but doesn’t consult abc_tpflags. That’s likely intentional (builtins set explicit flags). Please confirm no static type relies on ABCMeta-provided flags at init; otherwise consider calling check_abc_tpflags() here too.

Would you like a quick repo scan to list static builtins that set MAPPING/SEQUENCE explicitly to ensure coverage?

vm/src/frame.rs (1)

1306-1311: Add MAPPING flag to built-in mapping types

Our search across vm/src/builtins/*.rs found no #[pyclass(..., flags = PyTypeFlags::MAPPING, ...)] annotations (the rg for flags.*MAPPING returned zero matches). In particular:

  • vm/src/builtins/dict.rs:
    #[pyclass(module = false, name = "dict", unhashable = true, traverse)]
    (no flags = PyTypeFlags::MAPPING)
  • vm/src/builtins/mappingproxy.rs:
    #[pyclass(module = false, name = "mappingproxy", traverse)]
    (no flags = PyTypeFlags::MAPPING)

Because MatchMapping now simply checks the type’s PyTypeFlags::MAPPING bit, failing to set it means neither dict nor mappingproxy will be recognized as mappings at runtime.

Please update these two #[pyclass] attributes to include flags = PyTypeFlags::MAPPING. For example:

 #[pyclass(
-   module = false,
-   name = "dict",
-   unhashable = true,
-   traverse
+   module = false,
+   name = "dict",
+   unhashable = true,
+   traverse,
+   flags = PyTypeFlags::MAPPING
 )]
 #[pyclass(
-   module = false,
-   name = "mappingproxy",
-   traverse
+   module = false,
+   name = "mappingproxy",
+   traverse,
+   flags = PyTypeFlags::MAPPING
 )]

Optionally, review other types implementing the AsMapping trait (e.g. in tuple.rs, list.rs, range.rs, etc.) and decide whether they should likewise carry the mapping flag or remain sequence-only.

⛔ Skipped due to learnings
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In RustPython's pattern matching implementation, only certain builtin types should have the SEQUENCE flag: list and tuple are confirmed sequences. The user youknowone indicated that bytes and bytearray are not considered sequences in this context, even though they implement sequence-like protocols.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
vm/src/frame.rs (1)

702-716: SWAP index guard still allows SWAP(0) to panic in release; add lower-bound check and runtime guard

Without a lower-bound check, index_val == 0 computes j = len - 0 = len, so stack.swap(i, j) indexes out of bounds and panics. Because debug_assert! is compiled out of release builds, add both a debug_assert for >= 1 and a release-mode guard.

Apply this diff:

             let len = self.state.stack.len();
             debug_assert!(len > 0, "stack underflow in SWAP");
             let i = len - 1; // TOS index
             let index_val = index.get(arg) as usize;
             // CPython: SWAP(n) swaps TOS with PEEK(n) where PEEK(n) = stack_pointer[-n]
             // This means swap TOS with the element at index (len - n)
-            debug_assert!(
-                index_val <= len,
-                "SWAP index {} exceeds stack size {}",
-                index_val,
-                len
-            );
+            debug_assert!(index_val >= 1, "SWAP index must be >= 1");
+            debug_assert!(
+                index_val <= len,
+                "SWAP index {} exceeds stack size {}",
+                index_val,
+                len
+            );
+            if index_val == 0 || index_val > len {
+                self.fatal("SWAP index out of range");
+            }
             let j = len - index_val;
             self.state.stack.swap(i, j);
🧹 Nitpick comments (2)
compiler/codegen/src/compile.rs (2)

3286-3304: Confirm post-star index math for sequence patterns; add targeted tests

The nonnegative index computation for elements after the star slot uses:

  • index = len(subject) - (patterns.len() - i)

This looks correct (e.g., for [a, *b, c, d], i=2 -> len-2, i=3 -> len-1). A prior review suggested patterns.len() - 1 - i inside this expression; that would be off by one.

Recommendation:

  • Add a couple of tests to lock this in (both even/odd lengths):
    • case [a, *b, c, d]
    • case [a, *b, c]
    • case [*b, c, d]
    • Ensure indices selected map to the last k elements as expected.

I can draft minimal tests under extra_tests/snippets/syntax_match.py if you’d like.


3580-3616: Key validation now accepts all PEP 634 literal forms (including bytes/None) and attribute lookups

  • Allow-list of valid key ASTs looks correct.
  • Duplicate detection via unparse is much more stable than Debug and resilient to ranges.

Minor suggestion (optional):

  • Consider normalizing string/bytes literal prefixes/case in the unparse (if not already guaranteed) to avoid edge-case duplicates with different spellings but equal values. Not urgent.
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between c965f78 and f4543f5.

⛔ Files ignored due to path filters (1)
  • Lib/test/test_patma.py is excluded by !Lib/**
📒 Files selected for processing (15)
  • compiler/codegen/src/compile.rs (7 hunks)
  • extra_tests/snippets/syntax_match.py (1 hunks)
  • vm/src/builtins/bool.rs (1 hunks)
  • vm/src/builtins/bytearray.rs (1 hunks)
  • vm/src/builtins/bytes.rs (1 hunks)
  • vm/src/builtins/dict.rs (1 hunks)
  • vm/src/builtins/float.rs (1 hunks)
  • vm/src/builtins/int.rs (1 hunks)
  • vm/src/builtins/list.rs (1 hunks)
  • vm/src/builtins/set.rs (2 hunks)
  • vm/src/builtins/str.rs (1 hunks)
  • vm/src/builtins/tuple.rs (1 hunks)
  • vm/src/builtins/type.rs (3 hunks)
  • vm/src/frame.rs (3 hunks)
  • vm/src/types/slot.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (12)
  • vm/src/builtins/bool.rs
  • vm/src/builtins/dict.rs
  • vm/src/builtins/tuple.rs
  • vm/src/builtins/str.rs
  • vm/src/types/slot.rs
  • vm/src/builtins/list.rs
  • vm/src/builtins/set.rs
  • vm/src/builtins/bytearray.rs
  • vm/src/builtins/float.rs
  • vm/src/builtins/bytes.rs
  • extra_tests/snippets/syntax_match.py
  • vm/src/builtins/type.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style (cargo fmt to format)
Always run clippy to lint code (cargo clippy) before completing tasks. Fix any warnings or lints that are introduced by your changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • vm/src/builtins/int.rs
  • compiler/codegen/src/compile.rs
  • vm/src/frame.rs
🧠 Learnings (2)
📚 Learning: 2025-08-26T05:20:54.523Z
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.

Applied to files:

  • vm/src/frame.rs
📚 Learning: 2025-08-26T05:20:54.523Z
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In RustPython's pattern matching implementation, only certain builtin types should have the SEQUENCE flag: list and tuple are confirmed sequences. The user youknowone indicated that bytes and bytearray are not considered sequences in this context, even though they implement sequence-like protocols.

Applied to files:

  • vm/src/frame.rs
🧬 Code graph analysis (2)
compiler/codegen/src/compile.rs (1)
compiler/codegen/src/unparse.rs (2)
  • unparse_expr (61-451)
  • unparse_expr (608-610)
vm/src/frame.rs (2)
common/src/boxvec.rs (1)
  • len (49-51)
vm/src/vm/mod.rs (1)
  • get_method (775-782)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run tests under miri
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Check Rust code with rustfmt and clippy
  • GitHub Check: Ensure compilation on various targets
🔇 Additional comments (11)
vm/src/frame.rs (4)

18-18: Flag imports and common utilities consolidation — LGTM

Importing PyTypeFlags and consolidating BoxVec/PyMutex/Wtf8Buf is appropriate for the new flag-driven matching and BuildString path.

Also applies to: 23-23


1325-1391: MatchKeys logic is robust and CPython-aligned: it uses get with a sentinel and falls back to __getitem__

  • Using vm.get_method(..., "get").transpose()? correctly propagates attribute-access errors.
  • The dummy sentinel and identity check avoid raising KeyError and avoid triggering __missing__ side effects.
  • Falling back to __getitem__ when get is absent keeps behavior correct for non-dict mappings.
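The lookup strategy in these bullets can be sketched in plain Python. This is a simplified model and not the VM's implementation: `match_keys`, `_MISSING` and `ItemsOnly` are hypothetical names, and the `__getitem__` fallback follows the review comment's description rather than CPython, which requires a get method.

```python
from collections import defaultdict

_MISSING = object()  # sentinel that no real mapping value can be identical to


def match_keys(subject, keys):
    """Return a tuple of values for `keys`, or None if any key is absent."""
    get = getattr(subject, "get", None)
    values = []
    for key in keys:
        if get is not None:
            # get(key, default) never raises KeyError and never calls
            # __missing__, so defaultdict is not mutated by the probe.
            value = get(key, _MISSING)
            if value is _MISSING:
                return None
        else:
            try:
                value = subject[key]
            except KeyError:
                return None
        values.append(value)
    return tuple(values)


class ItemsOnly:
    """A container with __getitem__ but no get method."""

    def __init__(self, data):
        self._data = data

    def __getitem__(self, key):
        return self._data[key]


counts = defaultdict(int, {"a": 1})
print(match_keys(counts, ("a",)))             # (1,)
print(match_keys(counts, ("a", "b")))         # None
print("b" in counts)                          # False: no key was created
print(match_keys(ItemsOnly({"k": 2}), ("k",)))  # (2,)
```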

1393-1512: Verify built-in MATCH_SELF annotations before adding tests

The ripgrep output shows that only bool is defined with flags(_MATCH_SELF); the other built-in types (int, str, float, tuple, list, dict) aren’t annotated and will follow the “no __match_args__” error path rather than _MATCH_SELF semantics. Please confirm your intended behavior:

  • If you plan to support MATCH_SELF for those types, add flags(_MATCH_SELF) to their #[pyclass(...)] in:
    • vm/src/builtins/int.rs
    • vm/src/builtins/str.rs
    • vm/src/builtins/float.rs
    • vm/src/builtins/tuple.rs
    • vm/src/builtins/list.rs
    • vm/src/builtins/dict.rs
  • Otherwise, limit the new pytest cases to:
    • bool(x) extracting the subject itself (_MATCH_SELF happy path) and rejecting bool(x, y) with the corresponding TypeError.
    • For int(x)/str(x)/etc., asserting the “C() accepts 0 positional sub-patterns but 1 were given” error from the no-__match_args__ branch.

Once you’ve confirmed the desired flag coverage, you can proceed to draft tests that align with the actual implementation.


1306-1311: Flag coverage gaps in mappingproxy and list — action required

The mapping (MAPPING) and sequence (SEQUENCE) flags are correctly driving pattern‐matching in frame.rs, but our downstream inspection uncovered two missing flag annotations:

  • mappingproxy
    The PyMappingProxy class in vm/src/builtins/mappingproxy.rs currently reads:

    #[pyclass(module = false, name = "mappingproxy", traverse)]
    #[derive(Debug)]
    pub struct PyMappingProxy {}

    It must carry the MAPPING flag so that MatchMapping sees it. For consistency with PyDict, change it to:

    #[pyclass(
        module = false,
        name = "mappingproxy",
        traverse,
    -   // missing flags
    +   flags(BASETYPE, MAPPING, _MATCH_SELF),
    )]
    #[derive(Debug)]
    pub struct PyMappingProxy { … }

    This ensures subject.class().slots.flags.contains(PyTypeFlags::MAPPING) returns true for mappingproxy.

  • list
    The PyList class in vm/src/builtins/list.rs currently reads:

    #[pyclass(module = false, name = "list", unhashable = true, traverse)]
    #[derive(Default)]
    pub struct PyList {}

    It likewise needs the SEQUENCE flag so that MatchSequence recognizes lists. Update it to:

    #[pyclass(
        module = false,
        name = "list",
        unhashable = true,
        traverse,
    -   // missing flags
    +   flags(BASETYPE, SEQUENCE, _MATCH_SELF),
    )]
    #[derive(Default)]
    pub struct PyList { … }

    This makes subject.class().slots.flags.contains(PyTypeFlags::SEQUENCE) true for lists.

All other builtins—dict/mappingproxy for mappings and only tuple (and now list) for sequences—are correctly flagged, and bytes/bytearray/range remain unflagged as intended. Apply these changes to fully align flag coverage with CPython’s pattern-matching rules in both the mapping check (lines 1306–1311) and sequence check (lines 1317–1323) of vm/src/frame.rs.

⛔ Skipped due to learnings
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In RustPython's pattern matching implementation, only certain builtin types should have the SEQUENCE flag: list and tuple are confirmed sequences. The user youknowone indicated that bytes and bytearray are not considered sequences in this context, even though they implement sequence-like protocols.
vm/src/builtins/int.rs (1)

320-322: Add _MATCH_SELF to int: aligns with class-pattern “self” semantics

Good call adding the _MATCH_SELF flag so patterns like case int(x) bind the subject itself. This matches CPython’s _Py_TPFLAGS_MATCH_SELF behavior and is consistent with the wider PR adding this flag across builtins.

Follow-ups:

  • Please make sure bool also sets _MATCH_SELF (bool is a subclass of int and typically participates in these patterns).
  • Consider adding small runtime tests to cover:
    • match 42: case int(): ...
    • match 42: case int(x): assert x == 42
    • Negative: match 42: case int(x, y): ... should raise TypeError, since a self-matching type accepts at most one positional sub-pattern.
compiler/codegen/src/compile.rs (6)

3486-3506: on_top accounting with wildcard subpatterns is consistent

You decrement pc.on_top before checking for a true wildcard, and explicitly Pop the wildcard value. Given we just unpacked total items and bumped on_top accordingly, consuming one stack item per iteration (wildcard or not) keeps on_top correct. No change requested.


3556-3579: Early length check + overflow guard: good prefiltering

  • GetLen >= size is a reasonable fast-fail before the more precise MatchKeys.
  • The overflow guard (size > i32::MAX + 1) mirrors CPython’s intent; casting to u32 afterward is safe.

LGTM.


3623-3650: MatchKeys + None check pipeline reads cleanly

  • Building the keys tuple, calling MatchKeys, and testing against None with IsNot is straightforward and matches the VM changes described in the PR.
  • Unpacking the resulting values tuple and updating pc.on_top accordingly looks consistent.

LGTM.


3936-3940: Singleton patterns should use identity (is) — correct

Switching to TestOperation::Is for None/True/False matches CPython semantics. Good fix.


4010-4021: Guard compilation paths look correct for both non-default and default cases

  • Non-default cases: compile guard, ToBool, and jump to the pattern-failure target — matches the intended control flow.
  • Default case: JumpIfFalseOrPop to end is the right behavior.

No issues spotted.

Also applies to: 4039-4043


3519-3536: Keep Disallowing **_ in Mapping Patterns

According to the Python structural pattern matching specification, mapping patterns inherently ignore any extra keys not listed in the pattern, so a “rest wildcard” is already redundant. The PEPs and official documentation explicitly state that while **rest is supported to capture the remainder, using **_ as a wildcard is disallowed because it would have no semantic effect.

• PEP 636 (Tutorial) notes: “Mapping patterns… support a wildcard **rest. (But **_ would be redundant, so it is not allowed.)” (peps.python.org)
• PEP 622 (the original proposal, since superseded by PEP 634) likewise defines **capture_pattern for mappings and states that **_ is invalid due to its no-op nature. (peps.python.org)

Please retain the existing syntax-error behavior for **_ in mapping patterns. This enforces clarity (avoiding a no-op wildcard) and keeps consistency with the language design.

Likely an incorrect or invalid review comment.

@youknowone

@arihant2math do you have time to give a look?

@youknowone youknowone merged commit 1c992f8 into RustPython:main Aug 28, 2025
12 checks passed
@youknowone youknowone deleted the pattern-mapping branch August 28, 2025 03:59