More Pattern matching implementation mapping + class #6110
Walkthrough
Comprehensive pattern-matching changes: compiler adjustments for wildcards, mapping keys/rest and guards; VM refactor to use type flags for MatchMapping/MatchSequence/MatchKeys and a redesigned MatchClass supporting __match_args__ and a new _MATCH_SELF flag; builtin class flags updated and tests extended.
Sequence Diagram(s)
sequenceDiagram
autonumber
actor User
participant Compiler
participant VM
participant Subject as Subject(Value)
participant Type as Subject.Type
participant Class as Pattern.Class
User->>Compiler: compile pattern: Class(positional?, kwd?, guard?)
Compiler->>VM: emit MatchClass(nargs) + key/guard ops
VM->>Subject: load subject
VM->>Type: check instance of Class
alt is instance
opt nargs > 0
VM->>Class: get __match_args__
alt __match_args__ is tuple[str]
VM->>Subject: get attrs by names
alt any missing
VM-->>VM: non-match (push None)
else all found
VM-->>VM: push tuple(values)
end
else no __match_args__
alt Type has _MATCH_SELF
alt nargs == 1
VM-->>VM: push tuple(subject)
else nargs > 1
VM-->>User: TypeError
end
else
VM-->>VM: non-match (push None)
end
end
end
VM->>VM: extract kwd attrs (if any)
alt any missing
VM-->>VM: non-match (push None)
else success
VM-->>VM: push combined tuple
end
else not instance
VM-->>VM: non-match (push None)
end
opt guard present
VM->>VM: eval guard
alt guard false
VM-->>VM: Jump to failure target
end
end
sequenceDiagram
autonumber
actor User
participant Compiler
participant VM
participant Subject as Mapping Subject
User->>Compiler: compile mapping pattern {keys..., **rest?}
Compiler->>VM: emit MATCH_KEYS and rest-handling ops
VM->>Subject: check PyTypeFlags::MAPPING
alt is mapping
VM->>Subject: attempt .get or get_item for each key
alt all keys present
VM-->>VM: push tuple(values)
opt rest binding exists
VM->>VM: build rest dict = subject - matched keys
VM-->>VM: bind rest name
end
else missing key
VM-->>VM: non-match (push None)
end
else not mapping
VM-->>VM: non-match (push None)
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60–90 minutes
Actionable comments posted: 1
🧹 Nitpick comments (8)
extra_tests/snippets/syntax_match.py (3)
72-79: Nice coverage for mapping with rest; consider adding a negative duplicate-keys case.
This verifies {"a": x, **rest} binds correctly. To harden against regressions in the compiler’s duplicate-key detection, please also add a test that a mapping pattern with duplicate literal keys is a SyntaxError at compile time.
If you want, I can draft a small snippet that asserts compile-time failure for something like:
- case {"a": _, "a": _}: ...
89-96: Multi-key rest works; add an attribute-key case too.
Confirms rest == {"c": 3, "d": 4}. It would be useful to include a case where the key is an attribute (e.g., {C.K: v, **rest}) to exercise the compiler’s “attribute key” path.
99-106: Three-key head + rest is covered; add a class-pattern “str(x)” test given the PR goal.
Since this PR also claims “str(x) matching is fixed,” please add a quick class-pattern test:
- match "abc": case str(s): assert s == "abc"
This will guard the new _MATCH_SELF semantics for str.
Proposed snippet to append near here:
    s = "abc"
    match s:
        case str(x):
            assert x == "abc"
        case _:
            assert False
compiler/codegen/src/compile.rs (2)
3487-3506: Skip compiling true wildcards to reduce bytecode and stack churn.
Recognizing a “true underscore” (MatchAs with both pattern and name None) and just popping it avoids unnecessary work. Consider extracting the “is_true_wildcard” check into a helper to reuse wherever wildcards are handled.
3667-3682: Rest dict deletion loop is correct; consider a small micro-optimization.
The loop deletes keys from rest_dict in LIFO order using CopyItem/Swap/DeleteSubscript. It’s correct. If you’re open to a micro-opt: iterating forward (k1..kn) with a constant CopyItem index (always 2) by rotating the stack before each delete can cut the index computation, but it’s not essential.
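At the Python level, the effect of that deletion loop can be modeled as copying the subject and removing the matched keys. This is a simplified sketch of the semantics, not the emitted bytecode; `build_rest` is a hypothetical helper.

```python
def build_rest(subject, matched_keys):
    """Model of the rest binding: a fresh dict minus the explicitly matched keys."""
    rest = dict(subject)  # copy, so the subject itself is never mutated
    for key in reversed(matched_keys):  # LIFO order, as in the reviewed loop
        del rest[key]
    return rest
```

The deletion order does not change the result, which is why iterating forward would be an equally valid choice.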
vm/src/frame.rs (3)
705-716: SWAP index lower bound: guard against 0 to avoid out-of-bounds swap
index_val == 0 would compute j = len - 0 and panic in swap(i, j). Even if the compiler never emits SWAP(0), this assert is cheap and gives a clear diagnostic for malformed bytecode or future regressions instead of an out-of-bounds panic.
Apply:
- debug_assert!(
-     index_val <= len,
-     "SWAP index {} exceeds stack size {}",
-     index_val,
-     len
- );
+ debug_assert!(
+     (1..=len).contains(&index_val),
+     "SWAP index {} out of range 1..={}",
+     index_val,
+     len
+ );
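The index arithmetic behind this suggestion can be modeled in a few lines of Python. This is an illustrative sketch of SWAP(n) on a list-backed stack, assuming the valid range is 1..=len as described above; the `swap` function is hypothetical and is not the VM's implementation.

```python
def swap(stack, n):
    """Swap the top of stack with the n-th element counted from the top (1-based)."""
    size = len(stack)
    if not 1 <= n <= size:
        # n == 0 would give j == size, one past the last valid index.
        raise IndexError(f"SWAP index {n} out of range 1..={size}")
    i = size - 1  # top of stack
    j = size - n  # n-th element from the top
    stack[i], stack[j] = stack[j], stack[i]
```

With n == 1 the operation swaps the top element with itself, so it is a no-op rather than an error.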
1319-1358: MatchKeys should treat only KeyError as non-match; propagate other exceptions
In the loop fetching subject[key], any error currently counts as “no match”. This risks hiding real errors from custom mappings. Recommend aligning behavior to consider only KeyError as “key missing”.
Apply:
- for key in keys {
-     match subject.get_item(key.as_object(), vm) {
-         Ok(value) => values.push(value),
-         Err(_) => {
-             all_match = false;
-             break;
-         }
-     }
- }
+ for key in keys {
+     match subject.get_item(key.as_object(), vm) {
+         Ok(value) => values.push(value),
+         Err(e) if e.fast_isinstance(vm.ctx.exceptions.key_error) => {
+             all_match = false;
+             break;
+         }
+         Err(e) => return Err(e),
+     }
+ }
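The intended behavior of the suggested diff can be modeled in Python as follows. This is a sketch of the proposed semantics only; `match_keys` and `Flaky` are hypothetical names introduced for the example.

```python
def match_keys(subject, keys):
    """Return a tuple of values if every key is present, else None (no match)."""
    values = []
    for key in keys:
        try:
            values.append(subject[key])
        except KeyError:
            # Only a missing key counts as "no match".
            return None
        # Any other exception propagates to the caller unchanged.
    return tuple(values)


class Flaky(dict):
    """A mapping whose lookups fail for a reason unrelated to missing keys."""

    def __getitem__(self, key):
        raise RuntimeError("backend unavailable")
```

With the original `Err(_)` arm, the `Flaky` case would be reported as a plain non-match and the RuntimeError would be lost.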
1-4: Consistency note: centralize flag-based matching
You import PyTypeFlags here and use flags for MatchMapping/MatchSequence; consider following through by documenting this choice and ensuring new match-related checks prefer flags where available. This avoids protocol-based divergence and keeps performance predictable.
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
- Lib/test/test_patma.py is excluded by !Lib/**
📒 Files selected for processing (14)
- compiler/codegen/src/compile.rs (7 hunks)
- extra_tests/snippets/syntax_match.py (1 hunks)
- vm/src/builtins/bool.rs (1 hunks)
- vm/src/builtins/bytearray.rs (1 hunks)
- vm/src/builtins/bytes.rs (1 hunks)
- vm/src/builtins/dict.rs (1 hunks)
- vm/src/builtins/float.rs (1 hunks)
- vm/src/builtins/int.rs (1 hunks)
- vm/src/builtins/list.rs (1 hunks)
- vm/src/builtins/set.rs (2 hunks)
- vm/src/builtins/str.rs (1 hunks)
- vm/src/builtins/tuple.rs (1 hunks)
- vm/src/frame.rs (4 hunks)
- vm/src/types/slot.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs:
- Follow the default rustfmt code style (cargo fmt to format)
- Always run clippy to lint code (cargo clippy) before completing tasks. Fix any warnings or lints that are introduced by your changes
- Follow Rust best practices for error handling and memory management
- Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust
Files:
vm/src/types/slot.rs
vm/src/builtins/dict.rs
vm/src/builtins/int.rs
vm/src/builtins/str.rs
vm/src/builtins/bytearray.rs
vm/src/builtins/tuple.rs
vm/src/builtins/set.rs
vm/src/builtins/bool.rs
vm/src/builtins/bytes.rs
vm/src/builtins/list.rs
vm/src/builtins/float.rs
vm/src/frame.rs
compiler/codegen/src/compile.rs
**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.py:
- Follow PEP 8 style for custom Python code
- Use ruff for linting Python code
Files:
extra_tests/snippets/syntax_match.py
🧬 Code graph analysis (1)
vm/src/builtins/bool.rs (1)
- derive/src/lib.rs (1): pyclass (124-128)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run tests under miri
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Ensure compilation on various targets
🔇 Additional comments (27)
extra_tests/snippets/syntax_match.py (2)
67-71: Good assertions for two-key mapping; keep the negative branch explicit.
Explicit assertions on x and y make failures easier to diagnose, and the catch-all case ensures the test fails loudly if the pattern doesn’t match.
80-88: Empty-rest case looks good.
Verifies that rest captures an empty dict when there are no extra keys. Solid.
compiler/codegen/src/compile.rs (9)
3289-3303: Correct index computation for post-star sequence elements.
Using len(subject) - (size - i) by loading GetLen and then (patterns.len() - i) is the right fix for indices after a starred element.
3521-3527: Good early validation of key/pattern length mismatch.
Clear error message; prevents subtle stack discipline bugs later in MATCH_KEYS handling.
3540-3546: Efficient fast-path for empty mapping pattern without rest.
Early-return and pop the subject in the {} case is correct and avoids extra work.
3548-3562: Length pre-check aligns with CPython semantics.
len(subject) >= size check before key lookup is consistent with expected behavior.
3564-3569: Overflow guard makes sense.
The i32::MAX + 1 bound matches CPython’s intent for “too many sub-patterns in mapping pattern.”
3611-3629: Potential stray boolean on stack after MATCH_KEYS None-check.
The sequence:
- CopyItem(1) # duplicate values_or_none
- LoadConst(None)
- TestOperation(IsNot)
- jump_to_fail_pop(... PopJumpIfFalse)
If JumpIfFalse doesn’t pop the tested value (it often does not, whereas JumpIfFalseOrPop does), the boolean result may remain on the stack, skewing on_top accounting. If your VM’s JumpIfFalse pops, ignore this; otherwise, add a Pop after the check passes.
Minimal patch:
  self.emit_load_const(ConstantData::None);
  emit!(
      self,
      Instruction::TestOperation {
          op: bytecode::TestOperator::IsNot
      }
  );
- // Stack: [subject, keys_tuple, values_tuple, bool]
- self.jump_to_fail_pop(pc, JumpOp::PopJumpIfFalse)?;
+ // Stack: [subject, keys_tuple, values_tuple, bool]
+ self.jump_to_fail_pop(pc, JumpOp::PopJumpIfFalse)?;
+ // On success, drop the bool (keep original values_tuple)
+ emit!(self, Instruction::Pop);
Please verify the VM’s JumpIfFalse behavior and adjust accordingly.
3631-3635: Double-check on_top adjustment after unpacking values.
You do:
- pc.on_top += size
- pc.on_top -= 1
Sequence patterns do not subtract 1 right after unpack; they only decrement per subpattern. That extra -1 can throw off fail_pop accounting. If tests reveal mismatched pops on failure paths, consider dropping the -1:
- pc.on_top += size; // Unpacked size values, tuple replaced by values
- pc.on_top -= 1;
+ pc.on_top += size; // Unpacked 'size' values, tuple replaced by values
Flagging as a verification item due to subtlety.
3921-3924: Singleton patterns now use identity (is) instead of equality.
This aligns with CPython 3.10+ semantics for None/True/False.
3996-4007: Guards are compiled and short-circuited properly.
JumpIfFalseOrPop to the failure target matches CPython behavior; great to see guards wired in for both normal cases and the default case below.
vm/src/builtins/bytearray.rs (1)
173-187: Adding the _MATCH_SELF flag to bytearray is consistent with class-pattern semantics.
This makes patterns like case bytearray(x) bind x to the subject itself, per the new MATCH_SELF behavior. Looks good.
Please ensure the flag is consistently applied across all intended builtins (bool, bytes, dict, float, int, list, set, str, tuple) and that vm/src/frame.rs uses it for MatchClass extraction. I can provide a quick repo scan if helpful.
vm/src/builtins/bytes.rs (1)
135-149: _MATCH_SELF on bytes aligns with case bytes(x) matching “self”.
Matches the intended “str(x) fix” family of changes. No further issues spotted here.
vm/src/builtins/dict.rs (1)
179-181: _MATCH_SELF on dict is in line with new pattern-matching rules.
Enables case dict(x) to bind the subject as x. The rest of the file remains unaffected.
vm/src/builtins/set.rs (2)
523-533: Add _MATCH_SELF to set: correct for class-pattern semantics
The new flags(BASETYPE, _MATCH_SELF) on PySet is aligned with CPython’s MATCH_SELF behavior, enabling patterns like case set(x) to bind the subject itself. This doesn’t interfere with sequence/mapping detection (set is neither), so risk is low. LGTM.
948-951: Add _MATCH_SELF to frozenset: consistent and desirable
Mirrors the PySet change and enables case frozenset(x) to bind the subject. Matches CPython semantics; no concerns.
vm/src/builtins/list.rs (1)
102-114: List now participates in MATCH_SELF patterns
Including _MATCH_SELF (while retaining SEQUENCE) correctly enables case list(x) to bind the subject itself, without affecting existing sequence-pattern matching. Looks good.
vm/src/builtins/float.rs (1)
203-207: Float marked with _MATCH_SELF: enables case float(x) binding
This is consistent with CPython and the rest of this PR. No functional risks identified.
vm/src/builtins/int.rs (1)
319-323
: pyclass flags forPyInt
are correctly applied
Verified that this file contains three#[pyclass]
attributes, with only theimpl PyInt
block carrying the intended flags:
- vm/src/builtins/int.rs:27 –
#[pyclass(module = false, name = "int")]
on thepub struct PyInt
(intentionally flag-less)- vm/src/builtins/int.rs:321–323 –
#[pyclass(flags(BASETYPE, _MATCH_SELF), …)]
onimpl PyInt
(correct, matches CPython behavior)- vm/src/builtins/int.rs:704 –
#[pyclass]
onimpl PyRef<PyInt>
(intentionally flag-less)No other flags-bearing
pyclass
declarations remain. The summary should note a single flags-bearing update.vm/src/types/slot.rs (1)
132-136: Introduce PyTypeFlags::_MATCH_SELF: matches CPython’s intent
Defining _MATCH_SELF (1 << 22) with a clear doc comment is the right foundation for built-ins that pattern-match the subject itself. Bit placement doesn’t collide with existing flags. LGTM. As a follow-up, ensure test coverage includes positional class patterns for these built-ins (positive and negative cases), e.g., str(x), int(x), list(x), set(x), frozenset(x), and that user-defined classes without __match_args__ don’t accept 1 positional arg.
Example tests to consider adding (Python):
    def test_match_self_builtins():
        for obj in [42, 3.14, "hi", b"b", bytearray(b"a"), [1], (1,), {1: 2}, {1}, frozenset({1})]:
            match obj:
                case int(x) | float(x) | str(x) | bytes(x) | bytearray(x) | list(x) | tuple(x) | dict(x) | set(x) | frozenset(x):
                    assert x is obj
                case _:
                    assert False, f"no match-self for {type(obj)}"

    def test_no_match_self_user_class():
        class C:
            pass
        c = C()
        matched = False
        try:
            match c:
                case C(_):  # should be invalid without __match_args__ or MATCH_SELF
                    matched = True
        except TypeError:
            pass
        else:
            assert not matched
vm/src/builtins/tuple.rs (1)
246-258: Adding _MATCH_SELF to tuple is correct and aligns with class-pattern semantics
Flagging tuple with _MATCH_SELF enables patterns like case tuple(x) to bind the subject itself when there are no __match_args__. This matches the intent of the PR and mirrors how other builtins are handled. No runtime behavior changes here, just metadata; looks good.
To double-check coverage, consider adding/expanding a snippet like:
- match (1, 2): case tuple(x): assert x == (1, 2)
- Ensure subclass behavior still follows spec (i.e., only exact tuple types should provide MATCH_SELF unless the subclass opts in).
vm/src/builtins/str.rs (1)
523-536
: Enable MATCH_SELF for str to fixstr(x)
matchingAdding
flags(BASETYPE, _MATCH_SELF)
onPyStr
is the right hook forcase str(x)
to capture the subject. The change is local to type metadata and should work in tandem with the Frame’s MatchClass extraction.Please ensure tests cover:
match "hi": case str(x): assert x == "hi"
- That multi-arg
str(a, b)
patterns fail as non-match, not with TypeError.vm/src/builtins/bool.rs (1)
112-113: Add _MATCH_SELF to bool
Consistent with other builtins. Keeping bool non-BASETYPE is correct (bool isn’t a valid base), so only _MATCH_SELF here is appropriate.
Consider adding a quick check:
    match True:
        case bool(x):
            assert x is True
vm/src/frame.rs (4)
1297-1305: MatchMapping: switch to PyTypeFlags::MAPPING — good move
Using PyTypeFlags::MAPPING for the fast-path mapping check matches the new flag-based approach. This also keeps behavior consistent with how CPython’s VM guards mapping-ness in matching.
Please confirm consistency with MatchKeys: given the flag check here, MatchKeys should assume a mapping and only treat KeyError as “no match” (see separate note below).
1308-1316: MatchSequence: flag-based sequence detection is fine
Aligns with the mapping change and simplifies the check. No issues spotted.
1359-1464: The above scripts will:
- Dump the MatchClass handling code so we can confirm it currently swallows every Err(_) and never distinguishes AttributeError from other exceptions.
- Show where else fast_isinstance is used in the VM to verify we can safely pattern-match on e.fast_isinstance(vm.ctx.exceptions.attribute_error).
Once we have those results, we can confirm whether the suggested diffs are applicable and correctly aligned with the rest of the codebase.
1297-1317: All _MATCH_SELF flags and MatchClass handling verified
- Confirmed that every built-in type (bool, bytes, bytearray, dict, float, int, list, set, str, tuple) in vm/src/builtins/*.rs is declared with flags(..., _MATCH_SELF)
- Verified that PyTypeFlags in vm/src/types/slot.rs includes const _MATCH_SELF = 1 << 22;
- Spot-checked vm/src/frame.rs and found a complete Instruction::MatchClass arm (lines 1359–1464) with both the MATCH_SELF fallback and appropriate error handling
All checks pass – no further changes needed.
f6415fc to 0bb06ab
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
extra_tests/snippets/syntax_match.py (1)
151-151: Add class-pattern tests for MATCH_SELF and str(x) to reflect the PR’s stated fix.
The PR mentions “Now the str(x) matching is also fixed,” but there’s no test here exercising class patterns with MATCH_SELF. Please add:
- A positive test: match s: case str(x): assert x is s
- A negative test for too many positional subpatterns on MATCH_SELF: case str(x, y) should raise TypeError at match time in CPython
- A class with custom __match_args__ to verify positional extraction and TypeError when __match_args__ is not a tuple or contains non-strings
Here’s a minimal addition you can append after Line 151:
+#
+# class-pattern tests for MATCH_SELF and __match_args__
+#
+class Box:
+    __match_args__ = ("value",)
+    def __init__(self, value): self.value = value
+
+def test_class_pattern_match_self_and_match_args():
+    s = "hello"
+    match s:
+        case str(x):
+            assert x is s
+        case _:
+            assert False
+
+    b = Box(42)
+    match b:
+        case Box(x):
+            assert x == 42
+        case _:
+            assert False
+
+    # __match_args__ must be a tuple; non-tuple should raise TypeError
+    class Bad1:
+        __match_args__ = "not-a-tuple"
+    try:
+        match Bad1():
+            case Bad1(x):
+                assert False
+    except TypeError:
+        pass
+    else:
+        assert False, "__match_args__ non-tuple should raise TypeError"
+
+    # MATCH_SELF accepts exactly one positional subpattern; more should error
+    try:
+        match s:
+            case str(x, y):
+                assert False
+    except TypeError:
+        pass
+    else:
+        assert False, "str() with 2 positional subpatterns should raise TypeError"
vm/src/frame.rs (1)
1333-1341: Do not swallow non-KeyError exceptions in MatchKeys; only KeyError denotes “no match”.
PEP 634 semantics: missing key (KeyError) means no match; other exceptions should propagate. Current code treats all Err(_) as non-match.
Apply this diff:
- for key in keys {
-     match subject.get_item(key.as_object(), vm) {
-         Ok(value) => values.push(value),
-         Err(_) => {
-             all_match = false;
-             break;
-         }
-     }
- }
+ for key in keys {
+     match subject.get_item(key.as_object(), vm) {
+         Ok(value) => values.push(value),
+         Err(e) if e.fast_isinstance(vm.ctx.exceptions.key_error) => {
+             all_match = false;
+             break;
+         }
+         Err(e) => return Err(e),
+     }
+ }
compiler/codegen/src/compile.rs (1)
3456-3458: MatchClass signature mismatch – critical fix required
The current codegen emits
    emit!(self, Instruction::MatchClass(u32::try_from(nargs).unwrap()));
but the Instruction enum still defines
    MatchClass(Arg<u32>),
in compiler/core/src/bytecode.rs. This will not compile and is inconsistent with the intended change to carry a raw u32 argument.
Please update accordingly:
- In compiler/core/src/bytecode.rs, change the enum variant to:
    - MatchClass(Arg<u32>),
    + MatchClass(u32),
- Update the instruction-writer macro (w!(MatchClass, …)) to match the new signature (i.e. expect a u32 instead of an Arg<u32> marker).
- Revise the VM/interpreter dispatch (e.g. in your frame execution code) to destructure MatchClass(n) and read the immediate u32 value, rather than handling an Arg<u32> marker.
Once these changes are in place, verify that both the bytecode reader/writer and the VM frame implementation are correctly handling the raw u32 argument.
♻️ Duplicate comments (1)
compiler/codegen/src/compile.rs (1)
3575-3596: Validate mapping pattern key types more strictly
The key validation now correctly accepts None and bytes literals in addition to numbers, strings, and booleans, aligning with PEP 634. Using unparse_expr for deduplication instead of Debug formatting is more robust.
🧹 Nitpick comments (2)
vm/src/frame.rs (1)
1359-1460: Add regression tests for class patterns, especially str(x).
Given the semantic fixes here (nargs handling, _MATCH_SELF), please ensure we have tests covering:
- str(x) binds x to the subject string
- str(x, y) raises TypeError
- Class with __match_args__ not a tuple raises TypeError
- __match_args__ tuple with non-str raises TypeError (already handled)
I can add tests under extra_tests/snippets mirroring the suggestions in the Python test file review. Do you want me to open a follow-up PR with those?
compiler/codegen/src/compile.rs (1)
3565-3569: Overflow check could be more precise
The overflow check size > (i32::MAX as usize + 1) might be off by one. Consider using size > i32::MAX as usize for clarity, as the check should prevent overflow when casting to i32.
- // Check for overflow (INT_MAX < size - 1)
- if size > (i32::MAX as usize + 1) {
+ // Check for overflow to ensure safe cast to u32
+ if size > i32::MAX as usize {
⛔ Files ignored due to path filters (1)
- Lib/test/test_patma.py is excluded by !Lib/**
📒 Files selected for processing (14)
- compiler/codegen/src/compile.rs (7 hunks)
- extra_tests/snippets/syntax_match.py (1 hunks)
- vm/src/builtins/bool.rs (1 hunks)
- vm/src/builtins/bytearray.rs (1 hunks)
- vm/src/builtins/bytes.rs (1 hunks)
- vm/src/builtins/dict.rs (1 hunks)
- vm/src/builtins/float.rs (1 hunks)
- vm/src/builtins/int.rs (1 hunks)
- vm/src/builtins/list.rs (1 hunks)
- vm/src/builtins/set.rs (2 hunks)
- vm/src/builtins/str.rs (1 hunks)
- vm/src/builtins/tuple.rs (1 hunks)
- vm/src/frame.rs (4 hunks)
- vm/src/types/slot.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (11)
- vm/src/builtins/bool.rs
- vm/src/builtins/bytes.rs
- vm/src/builtins/dict.rs
- vm/src/builtins/bytearray.rs
- vm/src/builtins/set.rs
- vm/src/builtins/str.rs
- vm/src/builtins/tuple.rs
- vm/src/types/slot.rs
- vm/src/builtins/int.rs
- vm/src/builtins/list.rs
- vm/src/builtins/float.rs
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs:
- Follow the default rustfmt code style (cargo fmt to format)
- Always run clippy to lint code (cargo clippy) before completing tasks. Fix any warnings or lints that are introduced by your changes
- Follow Rust best practices for error handling and memory management
- Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust
Files:
vm/src/frame.rs
compiler/codegen/src/compile.rs
**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.py:
- Follow PEP 8 style for custom Python code
- Use ruff for linting Python code
Files:
extra_tests/snippets/syntax_match.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run tests under miri
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Check Rust code with rustfmt and clippy
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Run rust tests (windows-latest)
🔇 Additional comments (16)
extra_tests/snippets/syntax_match.py (6)
65-71: Good strengthening of exact bindings in two-key mapping pattern.
Asserting concrete bound values for x and y plus an explicit failure path is correct and guards regressions in key lookup/binding.
72-79: Nice addition: rest-capture semantics validated.
This verifies that the unmatched keys are captured into rest as a dict, which aligns with PEP 634. Good negative branch, too.
80-88: Covers empty-rest case well.
Confirms rest is {} when no extra keys remain; this is an important edge case for mapping patterns with rest.
90-96: Multi-key rest coverage looks solid.
Validates that rest accumulates the correct remainder without disturbing bound x and y.
99-106: Three-key + rest case completes the grid.
This extends coverage to larger patterns and verifies rest size/contents after multiple exact key binds.
108-126: Wildcard fallback tests are helpful and realistic.
These mirror real-world patterns (e.g., wheelinfo). The success and failure paths are clear.
vm/src/frame.rs (3)
1452-1459: Overall MatchClass tuple assembly and non-instance path look correct.
On success, the extracted positional+keyword values are returned as a tuple; on non-instance, pushing None matches CPython’s behavior.
1297-1317: Minor consistency note: MatchMapping/MatchSequence use flags, while MatchKeys checks PyMapping protocol.
This is acceptable if the compiler guarantees MatchKeys only runs after a successful MatchMapping flag check. If not, consider aligning MatchKeys’ mapping check to flags for consistency.
Would you like me to align MatchKeys to use flags and fall back to protocol methods only when needed?
1300-1305: Audit mapping flags for all mapping types
Switching to PyTypeFlags::MAPPING in the pattern matcher is correct, but I only see it referenced in frame.rs and in the inherit_patma_flags helper (vm/src/builtins/type.rs). You need to confirm that every mapping-like type explicitly sets this flag during its registration. In particular, please verify and update as needed:
- vm/src/builtins/dict.rs: ensure the PyTypeSlots.flags for PyDict includes PyTypeFlags::MAPPING.
- vm/src/builtins/mappingproxy.rs: likewise, it must carry the mapping flag.
- The implementations of collections.UserDict and OrderedDict (wherever they’re defined): these custom types must also set PyTypeFlags::MAPPING.
If any of these are missing the mapping bit, they won’t be recognized as mappings by the VM’s pattern matcher.
compiler/codegen/src/compile.rs (7)
294-294: Fix incorrect index calculation for GetLen in sub-patterns
The change from (patterns.len() - 1) to (patterns.len() - i) appears to be correct for calculating the negative index from the end. However, ensure this aligns with CPython's behavior for sequence patterns with star elements.
3488-3502: Good fix for wildcard pattern handling
The improved wildcard detection correctly identifies true wildcards (underscore patterns without name binding) and skips compilation for them, which is an optimization. The distinction between Pattern::MatchAs with both pattern and name as None versus other wildcard patterns is correctly implemented.
3519-3708: Comprehensive mapping pattern implementation with rest support
The mapping pattern logic has been significantly improved with:
- Proper rest pattern handling (**rest)
- Robust key validation ensuring only literals and attributes
- Clear duplicate detection using unparse_expr for stable representation
- Correct stack manipulation for rest dict creation
The implementation correctly builds a rest dict, removes consumed keys, and stores the result. The stack operations are well-documented with comments showing stack state at each step.
3926-3932: Correct singleton pattern matching semantics
The change from CompareOperation::Equal to TestOperation::Is for singleton patterns is correct. PEP 634 specifies that singleton patterns (None, True, False) should use identity checks, not equality.
4002-4014: Guards implementation completed
The guard implementation correctly:
- Compiles the guard expression
- Uses JumpIfFalseOrPop to jump to the fail block if the guard is false
- Properly handles the stack state
This replaces the previous NotImplementedYet stub.
4032-4036: Default case guard handling looks correct
The guard handling for the default case in match statements is properly implemented, using JumpIfFalseOrPop to skip the body if the guard fails.
3653-3698: Tests for mapping-rest patterns are already in place
The extra_tests/snippets/syntax_match.py file includes comprehensive tests for mapping patterns with **rest, covering:
- Single-key rest (case {"a": x, **rest}) and empty rest
- Multiple-key rest scenarios
- Edge cases (wildcard fallback, no-match)
- A dedicated test_mapping_comprehensive() function exercising key captures and rest behavior
No additional test coverage is required.
debug_assert!(len > 0, "stack underflow in SWAP");
let i = len - 1; // TOS index
let index_val = index.get(arg) as usize;
// CPython: SWAP(n) swaps TOS with PEEK(n) where PEEK(n) = stack_pointer[-n]
// This means swap TOS with the element at index (len - n)
debug_assert!(
    index_val <= len,
    "SWAP index {} exceeds stack size {}",
    index_val,
    len
);
let j = len - index_val;
🛠️ Refactor suggestion
SWAP index guard is incomplete; add lower-bound check and a release-mode safety path.
SWAP(0) would compute j = len - 0 and panic in release. Compiler should not emit 0, but add a debug lower-bound assertion and a release-mode guard to prevent hard-to-diagnose panics.
Apply this diff:
let len = self.state.stack.len();
- debug_assert!(len > 0, "stack underflow in SWAP");
+ debug_assert!(len > 0, "stack underflow in SWAP");
let i = len - 1; // TOS index
let index_val = index.get(arg) as usize;
// CPython: SWAP(n) swaps TOS with PEEK(n) where PEEK(n) = stack_pointer[-n]
// This means swap TOS with the element at index (len - n)
- debug_assert!(
- index_val <= len,
- "SWAP index {} exceeds stack size {}",
- index_val,
- len
- );
+ debug_assert!(index_val >= 1, "SWAP index must be >= 1");
+ debug_assert!(
+ index_val <= len,
+ "SWAP index {} exceeds stack size {}",
+ index_val,
+ len
+ );
+ if index_val == 0 || index_val > len {
+ self.fatal("SWAP index out of range");
+ }
let j = len - index_val;
self.state.stack.swap(i, j);
🤖 Prompt for AI Agents
In vm/src/frame.rs around lines 705–716, the SWAP index check only asserts upper
bound and can produce a release-mode panic for SWAP(0); add a lower-bound check
and an explicit runtime guard: insert debug_assert!(index_val > 0, "SWAP index
must be >= 1, got {}", index_val) and replace the unchecked subtraction with an
explicit runtime check that handles index_val == 0 safely (e.g. return a clear
error or panic with a descriptive message like "invalid SWAP(0)" instead of
allowing len - 0 to underflow), keeping the existing upper-bound assertion.
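A minimal Python sketch of the bounds-checked SWAP(n) semantics discussed in this comment. The function name swap and the use of IndexError are illustrative assumptions; this is not RustPython's implementation, only a model of the index arithmetic.

```python
def swap(stack, n):
    """Swap the top of stack with the n-th item from the top (1-based)."""
    size = len(stack)
    # n == 0 would give j == size, one past the end, so reject it explicitly.
    if n < 1 or n > size:
        raise IndexError(f"SWAP index {n} out of range for stack of size {size}")
    i = size - 1  # index of the top of stack
    j = size - n  # index of the n-th item from the top
    stack[i], stack[j] = stack[j], stack[i]
```

Checking both bounds before computing j keeps the failure mode a clear error rather than an out-of-range index.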
0bb06ab
to
5617d23
Compare
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
vm/src/builtins/float.rs (1)
593-593: Fix inverted float truthiness in AsNumber::boolean slot
The boolean slot for floats currently returns true only when the value is zero, which is the opposite of Python’s semantics and inconsistent with PyFloat::__bool__ (which returns self.value != 0.0).
• Location:
- File: vm/src/builtins/float.rs
- In the AsNumber implementation, around line 593
• Change required:
- boolean: Some(|num, _vm| Ok(PyFloat::number_downcast(num).value.is_zero())),
+ boolean: Some(|num, _vm| Ok(!PyFloat::number_downcast(num).value.is_zero())),
• Test suggestion: add a targeted test case to ensure correct truthiness, for example in extra_tests/snippets/float_bool.py or similar:
assert bool(0.0) is False
assert bool(-0.0) is False
assert bool(1.0) is True
assert bool(float('nan')) is True
♻️ Duplicate comments (1)
compiler/codegen/src/compile.rs (1)
3582-3615: Key validation now matches PEP 634 and avoids Debug-based dedupe.
- Accepts numbers, strings, bytes, booleans, None, and attribute lookups.
- Duplicate literal keys checked via unparse for stability.
This addresses prior review feedback and aligns with CPython.
🧹 Nitpick comments (7)
vm/src/builtins/bool.rs (1)
90-94: Avoid runtime panic: replace todo!() in Debug impl for PyBool
Leaving todo!() will panic if PyBool is ever formatted with {:?}. Implementing a trivial Debug avoids surprises and aids diagnostics.
Apply:
  impl Debug for PyBool {
-     fn fmt(&self, _f: &mut Formatter<'_>) -> std::fmt::Result {
-         todo!()
-     }
+     fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
+         f.write_str("PyBool")
+     }
  }
compiler/codegen/src/compile.rs (6)
3291-3303: Fix: post-star index calculation is now correct (no off-by-one).
Using len(subject) - (size - i) via the constant (patterns.len() - i) matches PEP 634 semantics for elements after the starred slot. Consider adding a short comment spelling out the formula to prevent regressions.
3520-3535: Validation: length parity check is good; underscore as rest target likely should be a SyntaxError, but message can be clearer.
- keys.len() vs patterns.len() guard is correct.
- Rejecting “**_” aligns with CPython behavior; tweak the error to be explicit for developers.
Apply this diff to improve the diagnostic:
- if let Some(rest) = star_target {
-     if rest.as_str() == "_" {
-         return Err(self.error(CodegenErrorType::SyntaxError("invalid syntax".to_string())));
-     }
- }
+ if let Some(rest) = star_target {
+     if rest.as_str() == "_" {
+         return Err(self.error(CodegenErrorType::SyntaxError(
+             "cannot use '_' as mapping rest target".to_string(),
+         )));
+     }
+ }
Also please add a parser/compile-time test that match x: case {**_}: ... raises a SyntaxError.
3556-3569: Optional: drop the pre-len(subject) >= size check for mappings.
MATCH_KEYS already determines success; the extra GET_LEN adds an additional user-visible __len__ call (cost and potential side effects) and isn’t required for correctness. CPython’s mapping pattern path relies on MATCH_MAPPING/MATCH_KEYS without an upfront length guard.
If you prefer simplicity over micro-optimization, remove the block below.
- if size > 0 {
-     // Check if the mapping has at least 'size' keys
-     emit!(self, Instruction::GetLen);
-     self.emit_load_const(ConstantData::Integer { value: size.into() });
-     // Stack: [subject, len, size]
-     emit!(
-         self,
-         Instruction::CompareOperation {
-             op: ComparisonOperator::GreaterOrEqual
-         }
-     );
-     self.jump_to_fail_pop(pc, JumpOp::PopJumpIfFalse)?;
-     // Stack: [subject]
- }
+ // No upfront __len__ check; rely on MATCH_KEYS to determine presence of required keys.
3619-3660
: MATCH_KEYS success path and on_top accounting look sound; add a clarifying comment.
- Tuple-of-keys + MATCH_KEYS + “is not None” is the right shape.
- After unpacking values, you increment on_top by size and then decrement per subpattern; finally you decrement by 2 for subject + keys_tuple. The arithmetic balances, but it’s subtle.
Add a short comment summarizing the on_top invariant before and after unpack to ease future maintenance.
3650-3654: Minor: iterate directly over patterns to avoid indexing cast.
This is purely stylistic; using for pat in patterns { ... } avoids i as usize.
3662-3712: Rest-dict construction is correct; consider an intrinsic for “dict minus keys” later.
Building {}, DICT_UPDATE from subject, then deleting matched keys yields the proper rest mapping. If a VM intrinsic (e.g., MapWithoutKeys) ever lands, this block would be a good candidate for a targeted speedup; until then, this is clear and correct.
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
- Lib/test/test_patma.py is excluded by !Lib/**
📒 Files selected for processing (14)
- compiler/codegen/src/compile.rs (7 hunks)
- extra_tests/snippets/syntax_match.py (1 hunks)
- vm/src/builtins/bool.rs (1 hunks)
- vm/src/builtins/bytearray.rs (1 hunks)
- vm/src/builtins/bytes.rs (1 hunks)
- vm/src/builtins/dict.rs (1 hunks)
- vm/src/builtins/float.rs (1 hunks)
- vm/src/builtins/int.rs (1 hunks)
- vm/src/builtins/list.rs (1 hunks)
- vm/src/builtins/set.rs (2 hunks)
- vm/src/builtins/str.rs (1 hunks)
- vm/src/builtins/tuple.rs (1 hunks)
- vm/src/frame.rs (5 hunks)
- vm/src/types/slot.rs (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- vm/src/builtins/str.rs
🚧 Files skipped from review as they are similar to previous changes (10)
- vm/src/builtins/dict.rs
- vm/src/types/slot.rs
- vm/src/builtins/bytes.rs
- vm/src/builtins/list.rs
- vm/src/builtins/bytearray.rs
- vm/src/builtins/tuple.rs
- vm/src/builtins/int.rs
- vm/src/frame.rs
- extra_tests/snippets/syntax_match.py
- vm/src/builtins/set.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style (cargo fmt to format)
Always run clippy to lint code (cargo clippy) before completing tasks. Fix any warnings or lints that are introduced by your changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust
Files:
vm/src/builtins/bool.rs
vm/src/builtins/float.rs
compiler/codegen/src/compile.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run tests under miri
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Check Rust code with rustfmt and clippy
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Ensure compilation on various targets
🔇 Additional comments (8)
vm/src/builtins/float.rs (1)
203-206: MATCH_SELF flag coverage verified across all builtins
Confirmed that the _MATCH_SELF flag is present on all intended builtins (bool, int, float, bytes, bytearray, str, list, tuple, set, and dict), so no further changes are required.
vm/src/builtins/bool.rs (1)
112-112: ACTION REQUIRED: Confirm MATCH_SELF flag rollout across builtins
The _MATCH_SELF flag on bool (vm/src/builtins/bool.rs:112) correctly enables class-pattern “match self” semantics for booleans. However, our quick grep across vm/src/builtins only finds this single instance:
• vm/src/builtins/bool.rs:112 – #[pyclass(..., flags(_MATCH_SELF))]
No other builtins currently include _MATCH_SELF. To ensure consistent class-pattern matching support, please:
- Verify whether other builtins (e.g., int, float, str, list, dict, etc.) should also carry the MATCH_SELF flag.
- If so, apply the same metadata change to those classes.
- Otherwise, document that bool is intentionally the sole builtin requiring _MATCH_SELF.
compiler/codegen/src/compile.rs (6)
3487-3503: Skip-compiling true wildcards in class patterns is a nice win; double-check AST shape.
Early POP + continue for true wildcards avoids unnecessary work and binding checks. Ensure that ruff’s AST represents “_” as MatchAs { pattern: None, name: None } in this context; if so, this logic is solid. Add a small test covering both positional and keyword wildcards inside a class pattern.
3547-3554: Fast-path for empty mapping pattern is correct.
Special-casing {} to immediately POP the subject is faithful to CPython and reduces work.
3572-3579: Overflow guard tightened; cast is safe.
Using size > i32::MAX + 1 with the cast warning suppressed is appropriate here.
3936-3938: Use of ‘is’ for singletons is correct.
Switching to TestOperation::Is matches Python semantics for None/True/False.
4011-4021: Guards: compile and branch on false using JumpIfFalseOrPop to the fail label.
This integrates guard semantics cleanly with the existing fail_pop machinery.
4040-4043: Default-case guards compiled consistently.
Compiling the guard and jumping to end if false mirrors CPython behavior.
5617d23
to
67bbe81
Compare
Actionable comments posted: 2
♻️ Duplicate comments (3)
vm/src/frame.rs (3)
705-716: SWAP needs lower-bound check and a release-mode guard (SWAP(0) will panic).
Current code asserts upper bound only; with index 0, j = len - 0 == len causing out-of-bounds swap in release builds.
Apply this diff:
  let len = self.state.stack.len();
  debug_assert!(len > 0, "stack underflow in SWAP");
  let i = len - 1; // TOS index
  let index_val = index.get(arg) as usize;
  // CPython: SWAP(n) swaps TOS with PEEK(n) where PEEK(n) = stack_pointer[-n]
  // This means swap TOS with the element at index (len - n)
- debug_assert!(
-     index_val <= len,
-     "SWAP index {} exceeds stack size {}",
-     index_val,
-     len
- );
+ debug_assert!(index_val >= 1, "SWAP index must be >= 1");
+ debug_assert!(
+     index_val <= len,
+     "SWAP index {} exceeds stack size {}",
+     index_val,
+     len
+ );
+ if index_val == 0 || index_val > len {
+     self.fatal("SWAP index out of range");
+ }
  let j = len - index_val;
  self.state.stack.swap(i, j);
1387-1396: Non-tuple __match_args__ must raise TypeError.
Returning non-match diverges from CPython.
- let match_args = match match_args.downcast_exact::<PyTuple>(vm) {
-     Ok(tuple) => tuple,
-     Err(_) => {
-         // __match_args__ must be a tuple
-         self.push_value(vm.ctx.none());
-         return Ok(None);
-     }
- };
+ let match_args = match match_args.downcast_exact::<PyTuple>(vm) {
+     Ok(tuple) => tuple,
+     Err(_) => {
+         return Err(vm.new_type_error(
+             "__match_args__ must be a tuple".to_string(),
+         ));
+     }
+ };
1432-1447: MATCH_SELF error semantics: >1 positional subpatterns must raise TypeError; and when __match_args__ is absent (non-MATCH_SELF) with nargs > 0, raise TypeError.
Returning non-match here diverges from CPython and masks programmer errors.
  if is_match_self_type {
      if nargs_val == 1 {
          // Match the subject itself as the single positional argument
          extracted.push(subject.clone());
      } else if nargs_val > 1 {
-         // Too many positional arguments for MATCH_SELF
-         self.push_value(vm.ctx.none());
-         return Ok(None);
+         return Err(vm.new_type_error(
+             "class pattern accepts at most 1 positional sub-pattern for MATCH_SELF types"
+                 .to_string(),
+         ));
      }
  } else {
      // No __match_args__ and not a MATCH_SELF type
-     if nargs_val > 0 {
-         self.push_value(vm.ctx.none());
-         return Ok(None);
-     }
+     if nargs_val > 0 {
+         return Err(vm.new_type_error(
+             "class pattern defines no positional sub-patterns (__match_args__ missing)"
+                 .to_string(),
+         ));
+     }
  }
🧹 Nitpick comments (5)
vm/src/frame.rs (5)
1335-1368
: Unify mapping detection in MatchKeys with MatchMapping to avoid divergent behavior.
CurrentlyMatchKeys
usesPyMapping::check(&subject)
whileMatchMapping
usesPyTypeFlags::MAPPING
. This can yield inconsistent outcomes (e.g., flagged type passes MatchMapping but fails MatchKeys). Recommend using the same flag gate here and keep the good error behavior (KeyError → non-match; others propagate).Apply this diff:
- // Check if subject is a mapping and extract values for keys - if PyMapping::check(&subject) { + // Check mapping via type flag for consistency with MatchMapping + if subject.class().slots.flags.contains(PyTypeFlags::MAPPING) { let keys = keys_tuple.downcast_ref::<PyTuple>().unwrap(); let mut values = Vec::new(); let mut all_match = true; for key in keys { match subject.get_item(key.as_object(), vm) { Ok(value) => values.push(value), Err(e) if e.fast_isinstance(vm.ctx.exceptions.key_error) => { all_match = false; break; } Err(e) => return Err(e), } }If this becomes the only usage of
PyMapping
in this file, remove the unused import:-use crate::protocol::PyMapping;
1384-1388: Don’t swallow exceptions when fetching __match_args__; use get_attribute_opt to distinguish “absent” from “error.”
cls.get_attr(...).ok() treats unexpected errors as “missing,” hiding bugs in descriptors. Prefer vm.get_attribute_opt(...) which returns PyResult<Option<...>>.
- let match_args = cls.get_attr(vm.ctx.intern_str("__match_args__"), vm).ok();
+ let match_args = vm.get_attribute_opt(cls.clone(), vm.ctx.intern_str("__match_args__"))?;
1415-1423: Only treat AttributeError as a non-match; propagate other exceptions from attribute access.
Catching all errors as non-match hides real failures (e.g., descriptor raising TypeError).
- match subject.get_attr(attr_name_str, vm) {
-     Ok(value) => extracted.push(value),
-     Err(_) => {
-         // Attribute doesn't exist
-         self.push_value(vm.ctx.none());
-         return Ok(None);
-     }
- }
+ match subject.get_attr(attr_name_str, vm) {
+     Ok(value) => extracted.push(value),
+     Err(e) if e.fast_isinstance(vm.ctx.exceptions.attribute_error) => {
+         // Missing attribute → non-match
+         self.push_value(vm.ctx.none());
+         return Ok(None);
+     }
+     Err(e) => return Err(e),
+ }
1454-1461: Same note for keyword attributes: only AttributeError should lead to non-match.
Other exceptions from attribute access should propagate.
- match subject.get_attr(name_str, vm) {
-     Ok(value) => extracted.push(value),
-     Err(_) => {
-         // Attribute doesn't exist
-         self.push_value(vm.ctx.none());
-         return Ok(None);
-     }
- }
+ match subject.get_attr(name_str, vm) {
+     Ok(value) => extracted.push(value),
+     Err(e) if e.fast_isinstance(vm.ctx.exceptions.attribute_error) => {
+         self.push_value(vm.ctx.none());
+         return Ok(None);
+     }
+     Err(e) => return Err(e),
+ }
1309-1325: Behavioral consistency note (flags vs protocols).
You’ve moved MatchMapping/MatchSequence to flags. MatchKeys still relies on protocol checks (now only for non-flagged objects if you keep it). Decide explicitly:
- Either make all three opcodes flag-driven (my recommendation for internal consistency and speed),
- Or keep protocol checks across the board to support Python-level custom mappings/sequences.
If you want, I can open a follow-up PR to align MatchKeys and add tests covering:
- dict vs custom mapping with only keys/__getitem__
- str(x) success and str(x, y) TypeError
- __match_args__ not tuple and too many positional subpatterns.
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- Lib/test/test_patma.py is excluded by !Lib/**
📒 Files selected for processing (15)
- compiler/codegen/src/compile.rs (7 hunks)
- compiler/core/src/bytecode.rs (7 hunks)
- extra_tests/snippets/syntax_match.py (1 hunks)
- vm/src/builtins/bool.rs (1 hunks)
- vm/src/builtins/bytearray.rs (1 hunks)
- vm/src/builtins/bytes.rs (1 hunks)
- vm/src/builtins/dict.rs (1 hunks)
- vm/src/builtins/float.rs (1 hunks)
- vm/src/builtins/int.rs (1 hunks)
- vm/src/builtins/list.rs (1 hunks)
- vm/src/builtins/set.rs (2 hunks)
- vm/src/builtins/str.rs (1 hunks)
- vm/src/builtins/tuple.rs (1 hunks)
- vm/src/frame.rs (8 hunks)
- vm/src/types/slot.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (11)
- vm/src/builtins/set.rs
- vm/src/builtins/bytes.rs
- vm/src/builtins/bool.rs
- vm/src/builtins/dict.rs
- vm/src/builtins/float.rs
- vm/src/builtins/list.rs
- vm/src/types/slot.rs
- vm/src/builtins/int.rs
- vm/src/builtins/tuple.rs
- vm/src/builtins/bytearray.rs
- extra_tests/snippets/syntax_match.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Files:
compiler/codegen/src/compile.rs
compiler/core/src/bytecode.rs
vm/src/builtins/str.rs
vm/src/frame.rs
🧠 Learnings (2)
📚 Learning: 2025-08-26T05:20:54.523Z
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Applied to files:
vm/src/frame.rs
📚 Learning: 2025-08-26T05:20:54.523Z
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In RustPython's pattern matching implementation, only certain builtin types should have the SEQUENCE flag: list and tuple are confirmed sequences. The user youknowone indicated that bytes, bytearray are not considered sequences in this context, even though they implement sequence-like protocols.
Applied to files:
vm/src/frame.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Check the WASM package and demo
- GitHub Check: Check Rust code with rustfmt and clippy
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Run tests under miri
🔇 Additional comments (20)
vm/src/builtins/str.rs (1)
523-536: Verified _MATCH_SELF flag across builtins
I ran a repo-wide search and confirmed that all the intended built-ins opt into _MATCH_SELF:
- bool (vm/src/builtins/bool.rs:112)
- bytearray (vm/src/builtins/bytearray.rs:173)
- bytes (vm/src/builtins/bytes.rs:135)
- dict (vm/src/builtins/dict.rs:179)
- float (vm/src/builtins/float.rs:204)
- int (vm/src/builtins/int.rs:320)
- list (vm/src/builtins/list.rs:112)
- set (vm/src/builtins/set.rs:532, 949)
- str (vm/src/builtins/str.rs:524)
- tuple (vm/src/builtins/tuple.rs:247)
Remaining verification needed:
- Confirm at runtime that exactly one positional subpattern is enforced (i.e. case str(a, b): against a str should raise TypeError).
- It’d be good to add the proposed str_class_pattern_tests.py (or similar) to lock in that behavior.
Please run those tests manually (or add them to your test suite) and verify the spec-compliant error is raised.
compiler/core/src/bytecode.rs (5)
528-528: LGTM! Properly restored ToBool instruction.
The ToBool instruction has been appropriately restored from its previously commented state, which aligns with the runtime implementation in vm/src/frame.rs.
558-561: LGTM! Well-documented PopJumpIfFalse instruction.
The new PopJumpIfFalse instruction is properly documented and follows the existing instruction pattern. The doc comment clearly explains its behavior: "Pop the top of the stack, then pop the next value and jump if it is false."
1264-1264: LGTM! Correctly added PopJumpIfFalse to label handling.
The PopJumpIfFalse instruction has been properly integrated into the label_arg() method to ensure it's recognized as a jump instruction with a label target.
1338-1338: LGTM! Accurate stack effect calculations.
The stack effects are correctly specified:
- ToBool: 0 (replaces TOS with its boolean value)
- PopJumpIfFalse: -1 (pops one item from stack)
These values accurately reflect the operations performed by each instruction.
Also applies to: 1349-1349
1541-1541: LGTM! Display formatting properly implemented.
The new instructions are correctly integrated into the display/disassembly logic.
Also applies to: 1556-1556
compiler/codegen/src/compile.rs (9)
3488-3503: Good improvement: Proper wildcard pattern handling.
The addition of is_true_wildcard logic correctly distinguishes between actual wildcard patterns ("_" without name binding) and named captures, preventing unnecessary pattern compilation for true wildcards. This optimization aligns with CPython's behavior.
3521-3535: LGTM! Proper validation for mapping patterns.
Good error handling:
- Validates that keys and patterns array lengths match
- Correctly rejects "_" as a rest pattern target with appropriate error message
3571-3578: LGTM! Proper overflow check with safe casting.
The overflow check correctly ensures that size doesn't exceed i32::MAX + 1 before casting to u32. The #[allow(clippy::cast_possible_truncation)] is justified here since the check guarantees safety.
3582-3616: Good implementation of PEP 634 key validation.
The key validation logic correctly implements PEP 634 requirements:
- Allows literals (number, string, bytes, boolean, None)
- Allows attribute lookups
- Properly detects and reports duplicate keys using stable string representation
3662-3705: Complex but correct rest pattern handling.
The rest pattern implementation properly:
- Creates an empty dict and updates it with the subject
- Unpacks keys and removes them from the rest dict
- Correctly manages the stack throughout the operation
The stack comments help track the state at each step, which aids maintainability.
3936-3938: Correct implementation: Using Is operator for singleton matching.
The change from CompareOperation::Equal to TestOperation::Is is correct for singleton pattern matching (None, True, False), as per PEP 634 specification.
4010-4020: LGTM! Guard implementation with proper boolean conversion.
The guard compilation correctly:
- Compiles the guard expression
- Converts to boolean with ToBool
- Uses PopJumpIfFalse to jump to failure if guard is false
This properly implements the guard semantics for match cases.
4039-4043: Guard handling for default case looks correct.
The guard compilation for the default (catch-all) case properly uses JumpIfFalseOrPop to either jump to end if guard is false or continue if true.
3294-3295: Verify GetLen Index Calculation Logic
I wasn’t able to find any existing tests in extra_tests/ covering the extraction of elements after a star pattern in sequence matching. Please confirm that the updated expression:
value: (patterns.len() - i).into(),
correctly computes the number of elements following position i (previously (patterns.len() - 1)). If this is intentional:
- Ensure that (patterns.len() - i) aligns with the intended semantics of GetLen when compiling star-pattern sequences.
- Consider adding targeted tests (e.g., sequence patterns with varying lengths and star positions) to validate both edge cases (star at start/end) and typical scenarios.
File: compiler/codegen/src/compile.rs
Lines: 3294–3295
vm/src/frame.rs (5)
3-3: Import of PyTypeFlags is appropriate and used correctly later.
No issues; the flag checks for MAPPING/SEQUENCE/_MATCH_SELF below rely on this.
1094-1099: ToBool opcode implementation looks correct.
Pops, converts via try_to_bool, pushes bool. Matches the expected semantics and propagates exceptions properly.
1248-1250: PopJumpIfFalse + helper extraction is solid.
Refactoring into pop_jump_if keeps semantics aligned with JumpIfFalse and reduces duplication.
Also applies to: 1970-1986
1320-1325: MatchSequence flag check is OK and aligns with the intended semantics.
This matches the retrieved learnings that only certain builtins (list/tuple) should be considered sequences for pattern matching.
Note: I used your prior clarification (learning) that bytes/bytearray/range are not sequences in this context.
1345-1349: Correct: KeyError → non-match; other errors propagate.
This mirrors CPython behavior for mapping lookups inside match.
I'll take a look tomorrow.
67bbe81
to
ef1ed94
Compare
Actionable comments posted: 2
♻️ Duplicate comments (3)
compiler/codegen/src/compile.rs (1)
3584-3603: Correct key validation to include None and bytes literals.
The current implementation correctly identifies the allowed key types according to PEP 634, but the error message should be updated to reflect the full set of allowed types.
This issue was already addressed in a previous review comment that noted the need to support None and bytes as mapping keys. The current implementation correctly handles all required literal types.
vm/src/frame.rs (2)
702-716: SWAP(0) can panic in release; add lower-bound assert and a runtime guard
Current code only asserts the upper bound and will compute j = len - 0 => len, causing swap out-of-bounds. Guard index_val >= 1 in debug and handle invalid values in release.
Apply this diff:
  let len = self.state.stack.len();
  debug_assert!(len > 0, "stack underflow in SWAP");
  let i = len - 1; // TOS index
  let index_val = index.get(arg) as usize;
  // CPython: SWAP(n) swaps TOS with PEEK(n) where PEEK(n) = stack_pointer[-n]
  // This means swap TOS with the element at index (len - n)
- debug_assert!(
-     index_val <= len,
-     "SWAP index {} exceeds stack size {}",
-     index_val,
-     len
- );
+ debug_assert!(index_val >= 1, "SWAP index must be >= 1");
+ debug_assert!(
+     index_val <= len,
+     "SWAP index {} exceeds stack size {}",
+     index_val,
+     len
+ );
+ if index_val == 0 || index_val > len {
+     self.fatal("SWAP index out of range");
+ }
  let j = len - index_val;
  self.state.stack.swap(i, j);
1387-1394: Non-tuple __match_args__ must raise TypeError, not result in a silent non-match
Current code pushes None and returns Ok(None). CPython raises TypeError here.
Apply this diff:
- let match_args = match match_args.downcast_exact::<PyTuple>(vm) {
-     Ok(tuple) => tuple,
-     Err(_) => {
-         // __match_args__ must be a tuple
-         self.push_value(vm.ctx.none());
-         return Ok(None);
-     }
- };
+ let match_args = match match_args.downcast_exact::<PyTuple>(vm) {
+     Ok(tuple) => tuple,
+     Err(_) => {
+         return Err(vm.new_type_error(
+             "__match_args__ must be a tuple".to_string(),
+         ));
+     }
+ };
🧹 Nitpick comments (3)
vm/src/frame.rs (3)
1429-1455: Nit: Improve TypeError message when nargs > 0 but no __match_args__ and not MATCH_SELF
You correctly raise TypeError. Consider matching CPython’s phrasing by including the class name and the number given, e.g., “C() accepts 0 positional sub-patterns (N given)”. This helps users diagnose the exact class that rejected positional subpatterns.
Would you like me to draft a small helper to derive the class display name safely for the error message?
1366-1478: MatchClass overall: semantics for MATCH_SELF and keyword extraction look good; add targeted tests
- MATCH_SELF path enforces at most one positional subpattern and returns TypeError on >1 (good).
- Missing attributes for positional/keyword parts result in a non-match (good).
Happy to draft tests covering:
- str(x) matches and str(x, y) raises TypeError.
- Class with __match_args__ of varying lengths (exact, too short → TypeError).
- __match_args__ containing a non-str → TypeError.
- Keyword attribute missing → non-match.
1306-1311: Consider protocol-based fallback for mapping patterns
The current implementation only treats types with the MAPPING flag (i.e. dict and its subclasses) as mapping patterns, which means:
- Built-in mapping views (mappingproxy), generic aliases, range, and any user-defined classes that implement the mapping protocol (e.g. define keys/__getitem__) won’t match a mapping pattern.
- This deviates from PEP 634, where mapping patterns are defined by the mapping protocol rather than hard-coded type checks.
To preserve compatibility with both built-ins and third-party mappings, you can fall back to a protocol check:
• In vm/src/frame.rs at lines 1306–1311, change:
- // Check if the type has the MAPPING flag
- let is_mapping = subject.class().slots.flags.contains(PyTypeFlags::MAPPING);
+ // Mapping flag or protocol-based fallback
+ let is_mapping = subject.class().slots.flags.contains(PyTypeFlags::MAPPING)
+     || vm.get_method(subject.clone(), vm.ctx.intern_str("keys")).is_some();
  self.push_value(subject);
  self.push_value(vm.ctx.new_bool(is_mapping).into());
• Similarly, in the mapping-extraction branch (around line 1331), replace the flag check:
- if subject.class().slots.flags.contains(PyTypeFlags::MAPPING) {
+ if subject.class().slots.flags.contains(PyTypeFlags::MAPPING)
+     || vm.get_method(subject.clone(), vm.ctx.intern_str("keys")).is_some() {
This hybrid approach keeps the fast path for built-ins (via flags) while ensuring any object with a keys method will also match mapping patterns.
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
Lib/test/test_patma.py is excluded by !Lib/**
📒 Files selected for processing (14)
- compiler/codegen/src/compile.rs (7 hunks)
- extra_tests/snippets/syntax_match.py (1 hunks)
- vm/src/builtins/bool.rs (1 hunks)
- vm/src/builtins/bytearray.rs (1 hunks)
- vm/src/builtins/bytes.rs (1 hunks)
- vm/src/builtins/dict.rs (1 hunks)
- vm/src/builtins/float.rs (1 hunks)
- vm/src/builtins/int.rs (1 hunks)
- vm/src/builtins/list.rs (1 hunks)
- vm/src/builtins/set.rs (2 hunks)
- vm/src/builtins/str.rs (1 hunks)
- vm/src/builtins/tuple.rs (1 hunks)
- vm/src/frame.rs (5 hunks)
- vm/src/types/slot.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (12)
- vm/src/builtins/dict.rs
- vm/src/builtins/bytes.rs
- vm/src/builtins/str.rs
- vm/src/builtins/int.rs
- vm/src/builtins/bytearray.rs
- vm/src/builtins/list.rs
- vm/src/builtins/set.rs
- vm/src/builtins/tuple.rs
- vm/src/types/slot.rs
- vm/src/builtins/float.rs
- vm/src/builtins/bool.rs
- extra_tests/snippets/syntax_match.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs
: Follow the default rustfmt code style (cargo fmt to format)
Always run clippy to lint code (cargo clippy) before completing tasks. Fix any warnings or lints that are introduced by your changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust
Files:
compiler/codegen/src/compile.rs
vm/src/frame.rs
🧠 Learnings (2)
📚 Learning: 2025-08-26T05:20:54.523Z
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Applied to files:
vm/src/frame.rs
📚 Learning: 2025-08-26T05:20:54.523Z
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In RustPython's pattern matching implementation, only certain builtin types should have the SEQUENCE flag: list and tuple are confirmed sequences. The user youknowone indicated that bytes, bytearray are not considered sequences in this context, even though they implement sequence-like protocols.
Applied to files:
vm/src/frame.rs
🧬 Code graph analysis (1)
compiler/codegen/src/compile.rs (1)
compiler/codegen/src/unparse.rs (2)
unparse_expr (61-451)
unparse_expr (608-610)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run tests under miri
- GitHub Check: Check Rust code with rustfmt and clippy
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Ensure compilation on various targets
🔇 Additional comments (8)
compiler/codegen/src/compile.rs (4)
3522-3578
: Enhance mapping pattern validation and error handling.
The validation logic correctly implements PEP 634 requirements but could benefit from clearer error messages and more robust overflow checking.
The implementation properly validates:
- Pattern count matches key count
- Rest pattern cannot be _
- Keys must be literals or attribute lookups
- No duplicate keys
The overflow check and error messages are appropriate for the context.
3935-3939
: Use TestOperation with Is operator for singleton matching.
The change from CompareOperation with Equal to TestOperation with Is is correct for singleton pattern matching, as singletons should be compared using identity (is) rather than equality (==).
This change aligns with Python's semantics where None, True, and False are compared using identity.
4010-4021
: Implement guard expression compilation correctly.
The guard implementation is now complete and correct, properly compiling the guard expression and jumping to the failure target when the guard evaluates to false.
The implementation:
- Compiles the guard expression
- Converts to boolean using ToBool
- Jumps to the appropriate failure block if false
4039-4043
: Handle guards in default match cases.
The guard handling for default cases is implemented correctly, using JumpIfFalseOrPop to jump to the end if the guard fails.
This ensures that even default patterns with guards can fail and continue to the end of the match statement.
vm/src/frame.rs (4)
18-18
: Importing PyTypeFlags for flag-driven pattern checks (LGTM)
This aligns with the new flag-based recognition in MatchMapping/MatchSequence/MatchClass.
23-23
: Wtf8Buf import for BUILD_STRING (LGTM)
Matches the use in BuildString to collect WTF-8 chunks efficiently.
1317-1322
: MatchSequence flag check looks correct for intended semantics
Checks PyTypeFlags::SEQUENCE. Based on prior guidance, only list and tuple should carry this flag (bytes/bytearray/range are not sequences for pattern matching).
1332-1365
: MatchKeys aligns with MatchMapping (flag-based) and handles KeyError vs other errors correctly — LGTM
- Stack effect matches CPython: [subject, keys] -> [subject, keys_or_none, values_or_none].
- On missing key (KeyError) returns non-match; other errors propagate. Good.
ef1ed94
to
1984856
Compare
Actionable comments posted: 1
♻️ Duplicate comments (2)
compiler/codegen/src/compile.rs (1)
3294-3294
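: Fix the off-by-one error in index calculation
The index calculation (patterns.len() - i) is incorrect. When accessing elements from the end of a sequence, the correct formula should be (patterns.len() - 1 - i) to get the proper negative offset. This holds only if the value is a zero-based offset from the last element; if it is subtracted from the subject's length to form a non-negative index (as CPython's compiler does with size - i), patterns.len() - i is already correct, so check the surrounding code before applying.
Apply this diff to fix the index:
- value: (patterns.len() - i).into(),
+ value: (patterns.len() - 1 - i).into(),
vm/src/frame.rs (1)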
703-716
: SWAP(n): add lower-bound guard and release-mode safety to prevent out-of-bounds swap
Current code asserts len > 0 and only upper-bounds index_val. SWAP(0) computes j = len - 0 == len, so stack.swap(i, j) indexes out of bounds and panics, including in release builds where debug_assert! is compiled out. Prior review already requested a lower-bound check and a runtime guard.
Apply this diff:
 let len = self.state.stack.len();
 debug_assert!(len > 0, "stack underflow in SWAP");
 let i = len - 1; // TOS index
 let index_val = index.get(arg) as usize;
 // CPython: SWAP(n) swaps TOS with PEEK(n) where PEEK(n) = stack_pointer[-n]
 // This means swap TOS with the element at index (len - n)
+debug_assert!(index_val >= 1, "SWAP index must be >= 1");
 debug_assert!(
     index_val <= len,
     "SWAP index {} exceeds stack size {}",
     index_val,
     len
 );
+if index_val == 0 || index_val > len {
+    self.fatal("SWAP index out of range");
+}
 let j = len - index_val;
 self.state.stack.swap(i, j);
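The bounds involved are easy to see in a toy model. The following is a sketch of the 1-indexed SWAP semantics on a plain Python list, not the VM's code; the function `swap` and its error type are assumptions made for illustration.

```python
def swap(stack, n):
    """Swap the top of stack with the n-th entry from the top (1-indexed).

    n == 1 is the top itself (a no-op); valid values are 1..len(stack).
    """
    if n < 1 or n > len(stack):
        raise IndexError(
            f"SWAP index {n} out of range for stack of size {len(stack)}"
        )
    top = len(stack) - 1
    other = len(stack) - n  # n == 0 would give len(stack): one past the end
    stack[top], stack[other] = stack[other], stack[top]
```

Both the lower and the upper bound need a check that survives in optimized builds, since either violation produces an index outside the stack.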
🧹 Nitpick comments (3)
compiler/codegen/src/compile.rs (1)
3662-3705
: Complex but correct rest pattern implementation
The rest pattern handling correctly:
- Builds an empty dict
- Updates it with the subject
- Removes matched keys
- Stores the resulting rest dict
The stack manipulation with proper rotation and cleanup ensures values end up in the right place.
Consider adding a comment block explaining the stack state at each major step. For example:
// Stack states during rest pattern processing:
// Initial: [subject, keys_tuple]
// After BuildMap: [subject, keys_tuple, {}]
// After Swap: [{}, keys_tuple, subject]
// After DictUpdate: [rest_dict, keys_tuple]
// ... etc
This would help future maintainers understand the complex stack operations.
vm/src/frame.rs (2)
1325-1388
: MatchKeys: semantics look right; small nits for perf and consistency
- Preallocate values with capacity to avoid re-allocations.
- Use interned "get" for the attribute lookup to avoid repeated string allocations and stay consistent with keys().
Apply this diff:
- let keys = keys_tuple.downcast_ref::<PyTuple>().unwrap();
- let mut values = Vec::new();
+ let keys = keys_tuple.downcast_ref::<PyTuple>().unwrap();
+ let mut values = Vec::with_capacity(keys.len());
@@
- if let Ok(get_method) = subject.get_attr("get", vm) {
+ if let Ok(get_method) = subject.get_attr(vm.ctx.intern_str("get"), vm) {
1428-1434
: MatchClass: align error messages with CPython; avoid unwrap on keyword names
The control flow and error/OK(None) split look correct. To match CPython’s error messages and improve robustness:
- Include the class name and counts in TypeErrors for positional sub-pattern count mismatches.
- Don’t unwrap keyword attribute names; raise TypeError if a non-str sneaks in (even if compiler-generated).
Apply this diff:
- if match_args.len() < nargs_val {
-     return Err(vm.new_type_error(format!(
-         "class pattern accepts at most {} positional sub-patterns ({} given)",
-         match_args.len(),
-         nargs_val
-     )));
- }
+ if match_args.len() < nargs_val {
+     let type_name = cls
+         .downcast::<crate::builtins::PyType>()
+         .map(|t| t.__name__(vm).as_str().to_owned())
+         .unwrap_or_else(|| String::from("?"));
+     return Err(vm.new_type_error(format!(
+         "{}() accepts at most {} positional sub-patterns ({} given)",
+         type_name,
+         match_args.len(),
+         nargs_val
+     )));
+ }
@@
- } else if nargs_val > 1 {
-     // Too many positional arguments for MATCH_SELF
-     return Err(vm.new_type_error(
-         "class pattern accepts at most 1 positional sub-pattern for MATCH_SELF types"
-             .to_string(),
-     ));
- }
+ } else if nargs_val > 1 {
+     let type_name = cls
+         .downcast::<crate::builtins::PyType>()
+         .map(|t| t.__name__(vm).as_str().to_owned())
+         .unwrap_or_else(|| String::from("?"));
+     return Err(vm.new_type_error(format!(
+         "{}() accepts at most 1 positional sub-pattern ({} given)",
+         type_name,
+         nargs_val
+     )));
+ }
@@
- if nargs_val > 0 {
-     return Err(vm.new_type_error(
-         "class pattern defines no positional sub-patterns (__match_args__ missing)"
-             .to_string(),
-     ));
- }
+ if nargs_val > 0 {
+     let type_name = cls
+         .downcast::<crate::builtins::PyType>()
+         .map(|t| t.__name__(vm).as_str().to_owned())
+         .unwrap_or_else(|| String::from("?"));
+     return Err(vm.new_type_error(format!(
+         "{}() accepts 0 positional sub-patterns but {} were given",
+         type_name,
+         nargs_val
+     )));
+ }
And for keyword attribute names:
- for name in kwd_attrs {
-     let name_str = name.downcast_ref::<PyStr>().unwrap();
+ for name in kwd_attrs {
+     let name_str = name
+         .downcast_ref::<PyStr>()
+         .ok_or_else(|| vm.new_type_error("keyword patterns must be strings"))?;
Also applies to: 1472-1476, 1481-1485, 1492-1492
📜 Review details
⛔ Files ignored due to path filters (1)
Lib/test/test_patma.py is excluded by !Lib/**
📒 Files selected for processing (15)
- compiler/codegen/src/compile.rs (7 hunks)
- extra_tests/snippets/syntax_match.py (1 hunks)
- vm/src/builtins/bool.rs (1 hunks)
- vm/src/builtins/bytearray.rs (1 hunks)
- vm/src/builtins/bytes.rs (1 hunks)
- vm/src/builtins/dict.rs (1 hunks)
- vm/src/builtins/float.rs (1 hunks)
- vm/src/builtins/int.rs (1 hunks)
- vm/src/builtins/list.rs (1 hunks)
- vm/src/builtins/set.rs (2 hunks)
- vm/src/builtins/str.rs (1 hunks)
- vm/src/builtins/tuple.rs (1 hunks)
- vm/src/builtins/type.rs (3 hunks)
- vm/src/frame.rs (3 hunks)
- vm/src/types/slot.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (11)
- vm/src/builtins/set.rs
- vm/src/builtins/list.rs
- vm/src/builtins/str.rs
- vm/src/builtins/dict.rs
- vm/src/builtins/float.rs
- vm/src/builtins/bytes.rs
- vm/src/builtins/tuple.rs
- vm/src/builtins/bytearray.rs
- vm/src/builtins/bool.rs
- vm/src/types/slot.rs
- vm/src/builtins/int.rs
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs
: Follow the default rustfmt code style (cargo fmt to format)
Always run clippy to lint code (cargo clippy) before completing tasks. Fix any warnings or lints that are introduced by your changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust
Files:
vm/src/frame.rs
vm/src/builtins/type.rs
compiler/codegen/src/compile.rs
**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.py
: Follow PEP 8 style for custom Python code
Use ruff for linting Python code
Files:
extra_tests/snippets/syntax_match.py
🧠 Learnings (2)
📚 Learning: 2025-08-26T05:20:54.523Z
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Applied to files:
vm/src/frame.rs
📚 Learning: 2025-08-26T05:20:54.523Z
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In RustPython's pattern matching implementation, only certain builtin types should have the SEQUENCE flag: list and tuple are confirmed sequences. The user youknowone indicated that bytes, bytearray are not considered sequences in this context, even though they implement sequence-like protocols.
Applied to files:
vm/src/frame.rs
🧬 Code graph analysis (1)
compiler/codegen/src/compile.rs (1)
compiler/codegen/src/unparse.rs (2)
unparse_expr (61-451)
unparse_expr (608-610)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Run tests under miri
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Check Rust code with rustfmt and clippy
- GitHub Check: Check the WASM package and demo
🔇 Additional comments (17)
vm/src/builtins/type.rs (4)
235-255
: Pattern matching flag inheritance logic looks correct
The updated inherit_patma_flags method properly handles multi-base inheritance by:
- Checking all bases in order
- Inheriting the first non-empty COLLECTION_FLAGS found
- Correctly preventing override of existing flags
The implementation follows Python's semantics where the first base with pattern matching flags wins.
269-271
: Good defensive check for existing flags
The early return prevents accidental overwriting of collection flags that may have been set elsewhere, maintaining flag integrity.
315-316
: LGTM: Proper base list passing for flag inheritance
The change correctly passes the full bases slice instead of a single base, enabling proper multi-base inheritance checking.
374-375
: Static types handled correctly with single-base slice
Creating a single-element slice for static types preserves backward compatibility while using the new multi-base API consistently.
extra_tests/snippets/syntax_match.py (3)
extra_tests/snippets/syntax_match.py (3)
67-70
: Good addition of explicit assertions for mapping pattern
Adding explicit assertions for both x and y values improves test clarity and ensures the mapping pattern correctly extracts both values.
72-88
: Excellent test coverage for mapping rest patterns
The new tests comprehensively cover:
- Basic rest pattern with remaining items
- Empty rest pattern edge case
- Multiple key extraction with rest
This provides good coverage for the **rest pattern functionality.
89-106
: Comprehensive rest pattern test with multiple keys
The test properly validates that rest patterns correctly collect unmatched keys while preserving matched ones. The progression from 2-key to 3-key matching demonstrates incremental consumption.
compiler/codegen/src/compile.rs (7)
3487-3507
: Wildcard pattern handling logic is correct
The implementation properly:
- Identifies true wildcards (patterns with no name binding)
- Decrements on_top for all patterns (including wildcards)
- Pops wildcards from the stack without compiling them
This correctly handles the _ pattern in class matching.
3521-3535
: Excellent validation for mapping patterns
The implementation includes proper checks for:
- Key-pattern count mismatch
- Invalid _ as a rest target (matching Python's syntax rules)
The error messages are clear and helpful.
3577-3578
: Safe downcast after overflow check
The cast is safe here since the overflow check at line 3572 ensures size <= i32::MAX + 1.
3583-3616
: Key validation and duplicate detection improvements
The implementation properly:
- Accepts literals and attribute lookups as required by PEP 634
- Uses unparse_expr for stable duplicate detection (avoiding Debug format issues)
- Includes None and bytes literals as valid keys
This is a significant improvement over the previous Debug-based approach.
3936-3939
: Correct change to use Is operator for singleton patterns
Using TestOperation::Is instead of CompareOperation::Equal is the correct approach for singleton pattern matching, as it properly tests identity rather than equality.
4010-4021
: Guards properly implemented for match cases
The guard compilation correctly:
- Evaluates the guard expression
- Converts to bool
- Jumps to fail block if false
This matches Python's guard semantics.
4039-4043
: Default case guard handling looks good
The implementation correctly handles guards in default (catch-all) cases using JumpIfFalseOrPop to either continue or skip the case body.
vm/src/frame.rs (3)
18-24
: Import changes and PyTypeFlags usage (LGTM)
Consolidating imports and switching to flag-based checks via PyTypeFlags is consistent with the rest of the VM. No issues here.
1317-1323
: MatchSequence: restrict via SEQUENCE flag (LGTM)
Using PyTypeFlags::SEQUENCE here aligns with the intended semantics (list/tuple are sequences; bytes/bytearray/range are not) per prior discussion. Good.
1306-1311
: Mapping checks in MatchMapping and MatchKeys are now consistent
- Both opcodes in vm/src/frame.rs use the PyTypeFlags::MAPPING flag:
  - Line 1307: subject.class().slots.flags.contains(PyTypeFlags::MAPPING)
  - Line 1330: same flag check in the MatchKeys path
- A repository-wide search for PyMapping::check in vm/src/** found no other references in the match opcode implementations.
All mapping-related checks in these opcodes now uniformly rely on the type flag.
// Check if the type has the MAPPING flag
let is_mapping = subject.class().slots.flags.contains(PyTypeFlags::MAPPING);

self.push_value(subject);
self.push_value(vm.ctx.new_bool(is_mapping).into());
Ok(None)
💡 Verification agent
🧩 Analysis chain
MatchMapping: flag-based detection — verify MAPPING coverage on builtins
Using subject.class().slots.flags.contains(PyTypeFlags::MAPPING) is good. Please verify PyDict (and any mapping builtins intended to match) actually carry MAPPING, otherwise MatchMapping will never succeed on dicts.
Run:
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Builtins with MAPPING flag:"
rg -n 'pyclass\(.*flags\([^)]*MAPPING' -g 'vm/src/builtins/*.rs' -C2
echo
echo "Dict pyclass line:"
rg -nP 'pyclass\([^)]*name\s*=\s*"dict"[^)]*\)' vm/src/builtins/dict.rs -n -C2 || true
Length of output: 155
🏁 Script executed:
#!/usr/bin/env bash
set -euo pipefail
echo "1) List all `#[pyclass]` definitions in vm/src/builtins:"
rg -n 'pyclass\(' -g 'vm/src/builtins/*.rs' -n -C2 || true
echo
echo "2) Search for `PyTypeFlags::MAPPING` usage in builtins:"
rg -n 'PyTypeFlags::MAPPING' -g 'vm/src/builtins/*.rs' -n -C2 || true
echo
echo "3) Inspect the `dict` builtin for its `#[pyclass]` flags:"
rg -n 'pyclass\([^)]*name\s*=\s*"dict"' -g 'vm/src/builtins/dict.rs' -n -C2 || true
echo
echo "4) Check any `flags` assignment in dict.rs:"
rg -n 'flags' -g 'vm/src/builtins/dict.rs' -n -C2 || true
Length of output: 39397
🏁 Script executed:
#!/usr/bin/env bash
set -euo pipefail
echo "Inspect `PyMappingProxy` struct annotation:"
rg -n 'pyclass\(.*name\s*=\s*"mappingproxy"' -g vm/src/builtins/mappingproxy.rs -n -C2 || true
echo
echo "Inspect `PyMappingProxy` impl annotation:"
rg -n 'impl PyMappingProxy' -g vm/src/builtins/mappingproxy.rs -n -C5 || true
echo
echo "Search for any `flags(` in mappingproxy.rs:"
rg -n 'flags' -g vm/src/builtins/mappingproxy.rs -n -C2 || true
echo
echo "Search for literal `MAPPING` in mappingproxy.rs:"
rg -n 'MAPPING' -g vm/src/builtins/mappingproxy.rs -n -C2 || true
Length of output: 3044
MatchMapping: Ensure all mapping types carry the MAPPING flag
- PyDict is already annotated with MAPPING (see vm/src/builtins/dict.rs:179: flags(BASETYPE, MAPPING, _MATCH_SELF)), so mapping patterns will match plain dicts.
- PyMappingProxy (vm/src/builtins/mappingproxy.rs:17) currently has no flags(...) entry, so its flags default omits MAPPING and it won’t be recognized by the MatchMapping logic.
To correctly support PEP 634–style mapping patterns on mappingproxy (and any other builtins implementing AsMapping), add the MAPPING flag to its #[pyclass]:
--- a/vm/src/builtins/mappingproxy.rs
+++ b/vm/src/builtins/mappingproxy.rs
@@ -15,1 +15,1 @@
-#[pyclass(module = false, name = "mappingproxy", traverse)]
+#[pyclass(module = false, name = "mappingproxy", traverse, flags(BASETYPE, MAPPING))]
Please apply this change (and similarly verify any other AsMapping types) so that they participate in MatchMapping.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
// Check if the type has the MAPPING flag
let is_mapping = subject.class().slots.flags.contains(PyTypeFlags::MAPPING);
self.push_value(subject);
self.push_value(vm.ctx.new_bool(is_mapping).into());
Ok(None)
#[pyclass(module = false, name = "mappingproxy", traverse, flags(BASETYPE, MAPPING))]
#[derive(Debug)]
pub struct PyMappingProxy {
    mapping: MappingProxyInner,
}
🤖 Prompt for AI Agents
In vm/src/frame.rs around lines 1306 to 1311, MatchMapping checks
PyTypeFlags::MAPPING on a subject to decide mapping-pattern behavior, but
mappingproxy (vm/src/builtins/mappingproxy.rs) lacks the MAPPING flag so it
won’t be recognized; update the #[pyclass] on mappingproxy to include the
MAPPING flag (e.g., flags(..., MAPPING, ...)) and then run a quick audit of
other builtins implementing AsMapping to add MAPPING where missing so they
participate in MatchMapping.
1984856
to
c965f78
Compare
Actionable comments posted: 1
♻️ Duplicate comments (2)
vm/src/frame.rs (2)
1317-1323
: MatchSequence via SEQUENCE flag aligns with agreed semantics.
This is consistent with the project decision that only list/tuple should participate as sequences in pattern matching; bytes/bytearray/range should not. Good.
703-714
: SWAP(0) can panic; add a lower-bound check and release-mode guard.
Only an upper-bound debug_assert is present. SWAP is 1-indexed; SWAP(0) yields j = len - 0 and panics. Add a lower-bound assertion and a runtime guard.
Apply this diff:
 let len = self.state.stack.len();
 debug_assert!(len > 0, "stack underflow in SWAP");
 let i = len - 1; // TOS index
 let index_val = index.get(arg) as usize;
 // CPython: SWAP(n) swaps TOS with PEEK(n) where PEEK(n) = stack_pointer[-n]
 // This means swap TOS with the element at index (len - n)
-debug_assert!(
-    index_val <= len,
-    "SWAP index {} exceeds stack size {}",
-    index_val,
-    len
-);
+debug_assert!(index_val >= 1, "SWAP index must be >= 1");
+debug_assert!(
+    index_val <= len,
+    "SWAP index {} exceeds stack size {}",
+    index_val,
+    len
+);
+if index_val == 0 || index_val > len {
+    self.fatal("SWAP index out of range");
+}
 let j = len - index_val;
 self.state.stack.swap(i, j);
🧹 Nitpick comments (3)
🧹 Nitpick comments (3)
vm/src/builtins/type.rs (2)
235-255
: Multi-base inheritance for pattern-matching flags looks correct; consider guarding against conflicting flags on a single base.
The “first base with SEQUENCE or MAPPING” rule is implemented correctly and avoids overriding pre-set flags. As a defensive check (useful during development), add a debug assertion that a base doesn’t simultaneously carry both flags, which would be ill-formed.
Apply this diff to add a debug assertion inside the loop:
 for base in bases {
     let base_flags = base.slots.flags & COLLECTION_FLAGS;
+    debug_assert!(
+        !(base_flags.contains(PyTypeFlags::SEQUENCE) && base_flags.contains(PyTypeFlags::MAPPING)),
+        "Base type '{}' has conflicting SEQUENCE and MAPPING flags",
+        base.name()
+    );
     if !base_flags.is_empty() {
         slots.flags |= base_flags;
         return;
     }
 }
269-297
: Avoid sign-extension pitfalls when parsing abc_tpflags; use u64 instead of i64.
Using to_i64() then casting to u64 can misinterpret negative values (e.g., -1 becomes all bits set before masking). Prefer parsing as u64 and ignore out-of-range/negative values to prevent accidental flag pollution.
Apply this diff in both places where flags_val is computed:
- let flags_val = int_obj.as_bigint().to_i64().unwrap_or(0);
- let abc_flags = PyTypeFlags::from_bits_truncate(flags_val as u64);
+ let flags_val = int_obj.as_bigint().to_u64().unwrap_or(0);
+ let abc_flags = PyTypeFlags::from_bits_truncate(flags_val);
vm/src/frame.rs (1)
1389-1509
: MatchClass: correctness improvements match CPython behavior.
- Raises TypeError for non-tuple __match_args__ and for too many positional subpatterns.
- Enforces MATCH_SELF rules and errors on >1 positional subpattern.
- Treats missing attributes/keys as non-match while propagating genuine errors.
Looks good. Please add tests that exercise:
- __match_args__ is not a tuple (TypeError).
- nargs > len(__match_args__) (TypeError).
- MATCH_SELF with nargs in {0,1,2} (ok/error).
- No __match_args__ and nargs > 0 (TypeError).
- Keyword attribute missing → non-match, unrelated errors propagate.
I can draft targeted tests for these cases if helpful.
📜 Review details
⛔ Files ignored due to path filters (1)
Lib/test/test_patma.py is excluded by !Lib/**
📒 Files selected for processing (2)
- vm/src/builtins/type.rs (3 hunks)
- vm/src/frame.rs (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs
: Follow the default rustfmt code style (cargo fmt to format)
Always run clippy to lint code (cargo clippy) before completing tasks. Fix any warnings or lints that are introduced by your changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust
Files:
vm/src/builtins/type.rs
vm/src/frame.rs
🧠 Learnings (2)
📚 Learning: 2025-08-26T05:20:54.523Z
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Applied to files:
vm/src/builtins/type.rs
vm/src/frame.rs
📚 Learning: 2025-08-26T05:20:54.523Z
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In RustPython's pattern matching implementation, only certain builtin types should have the SEQUENCE flag: list and tuple are confirmed sequences. The user youknowone indicated that bytes, bytearray are not considered sequences in this context, even though they implement sequence-like protocols.
Applied to files:
vm/src/builtins/type.rs
vm/src/frame.rs
🧬 Code graph analysis (1)
vm/src/frame.rs (2)
vm/src/protocol/mapping.rs (2)
keys (154-160)
values (162-168)
vm/src/vm/mod.rs (4)
new (116-219)
get_method (775-782)
value (660-660)
value (662-662)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run tests under miri
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Check Rust code with rustfmt and clippy
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Run rust tests (ubuntu-latest)
🔇 Additional comments (3)
vm/src/builtins/type.rs (2)
315-317
: Good call-site integration.
Propagating flags from all bases in new_heap_inner aligns with the new API and the “first base wins” behavior.
373-376
: Static types: confirm ABC-derived flags aren’t needed at creation.
new_static inherits flags from the single base but doesn’t consult abc_tpflags. That’s likely intentional (builtins set explicit flags). Please confirm no static type relies on ABCMeta-provided flags at init; otherwise consider calling check_abc_tpflags() here too.
Would you like a quick repo scan to list static builtins that set MAPPING/SEQUENCE explicitly to ensure coverage?
vm/src/frame.rs (1)
1306-1311
: Add MAPPING flag to built-in mapping types
Our search across vm/src/builtins/*.rs found no #[pyclass(..., flags = PyTypeFlags::MAPPING, ...)] annotations (the rg for flags.*MAPPING returned zero matches). In particular:
- vm/src/builtins/dict.rs: #[pyclass(module = false, name = "dict", unhashable = true, traverse)] (no flags = PyTypeFlags::MAPPING)
- vm/src/builtins/mappingproxy.rs: #[pyclass(module = false, name = "mappingproxy", traverse)] (no flags = PyTypeFlags::MAPPING)
Because MatchMapping now simply checks the type’s PyTypeFlags::MAPPING bit, failing to set it means neither dict nor mappingproxy will be recognized as mappings at runtime.
Please update these two #[pyclass] attributes to include flags = PyTypeFlags::MAPPING. For example:
 #[pyclass(
-    module = false,
-    name = "dict",
-    unhashable = true,
-    traverse
+    module = false,
+    name = "dict",
+    unhashable = true,
+    traverse,
+    flags = PyTypeFlags::MAPPING
 )]
 #[pyclass(
-    module = false,
-    name = "mappingproxy",
-    traverse
+    module = false,
+    name = "mappingproxy",
+    traverse,
+    flags = PyTypeFlags::MAPPING
 )]
Optionally, review other types implementing the AsMapping trait (e.g. in tuple.rs, list.rs, range.rs, etc.) and decide whether they should likewise carry the mapping flag or remain sequence-only.
⛔ Skipped due to learnings
Learnt from: youknowone PR: RustPython/RustPython#6110 File: vm/src/frame.rs:1311-1316 Timestamp: 2025-08-26T05:20:54.523Z Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Learnt from: youknowone PR: RustPython/RustPython#6110 File: vm/src/frame.rs:1311-1316 Timestamp: 2025-08-26T05:20:54.523Z Learning: In RustPython's pattern matching implementation, only certain builtin types should have the SEQUENCE flag: list and tuple are confirmed sequences. The user youknowone indicated that bytes, bytearray are not considered sequences in this context, even though they implement sequence-like protocols.
c965f78 to f4543f5
Actionable comments posted: 0
♻️ Duplicate comments (1)
vm/src/frame.rs (1)
702-716: SWAP index guard still allows SWAP(0) to panic in release; add lower-bound check and runtime guard
Without a lower-bound check, index_val == 0 computes j = len - 0, causing out-of-bounds in release. Add a debug_assert for >= 1 and a release-mode guard.
Apply this diff:
 let len = self.state.stack.len();
 debug_assert!(len > 0, "stack underflow in SWAP");
 let i = len - 1; // TOS index
 let index_val = index.get(arg) as usize;
 // CPython: SWAP(n) swaps TOS with PEEK(n) where PEEK(n) = stack_pointer[-n]
 // This means swap TOS with the element at index (len - n)
-debug_assert!(
-    index_val <= len,
-    "SWAP index {} exceeds stack size {}",
-    index_val,
-    len
-);
+debug_assert!(index_val >= 1, "SWAP index must be >= 1");
+debug_assert!(
+    index_val <= len,
+    "SWAP index {} exceeds stack size {}",
+    index_val,
+    len
+);
+if index_val == 0 || index_val > len {
+    self.fatal("SWAP index out of range");
+}
 let j = len - index_val;
 self.state.stack.swap(i, j);
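The index arithmetic in this suggestion can be modelled in a few lines of Python. This is a sketch of the SWAP(n) semantics described in the diff's comments (swap the top of stack with the element at `len - n`), not the VM's code; the function name `swap` and the use of `IndexError` are illustrative choices.

```python
def swap(stack: list, n: int) -> None:
    """Swap the top of stack with the element n positions from the top.

    n is 1-based: n == 1 refers to the top of stack itself, so n == 0 has no
    valid target (len - 0 is one past the end) and must be rejected.
    """
    length = len(stack)
    if n < 1 or n > length:
        raise IndexError(f"SWAP index {n} out of range for stack of size {length}")
    i = length - 1  # top-of-stack index
    j = length - n  # index of the element to swap with
    stack[i], stack[j] = stack[j], stack[i]
```

Checking both bounds in one place keeps the failure explicit instead of relying on the container's own out-of-bounds behavior.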
🧹 Nitpick comments (2)
compiler/codegen/src/compile.rs (2)
3286-3304: Confirm post-star index math for sequence patterns; add targeted tests
The nonnegative index computation for elements after the star slot uses:
- index = len(subject) - (patterns.len() - i)
This looks correct (e.g., for [a, *b, c, d], i=2 -> len-2, i=3 -> len-1). A prior review suggested patterns.len() - 1 - i inside this expression; that would be off by one.
Recommendation:
- Add a couple of tests to lock this in (both even/odd lengths):
  - case [a, *b, c, d]
  - case [a, *b, c]
  - case [*b, c, d]
- Ensure indices selected map to the last k elements as expected.
I can draft minimal tests under extra_tests/snippets/syntax_match.py if you’d like.
3580-3616: Key validation now accepts all PEP 634 literal forms (including bytes/None) and attribute lookups
- Allow-list of valid key ASTs looks correct.
- Duplicate detection via unparse is much more stable than Debug and resilient to ranges.
Minor suggestion (optional):
- Consider normalizing string/bytes literal prefixes/case in the unparse (if not already guaranteed) to avoid edge-case duplicates with different spellings but equal values. Not urgent.
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
Lib/test/test_patma.py is excluded by !Lib/**
📒 Files selected for processing (15)
- compiler/codegen/src/compile.rs (7 hunks)
- extra_tests/snippets/syntax_match.py (1 hunks)
- vm/src/builtins/bool.rs (1 hunks)
- vm/src/builtins/bytearray.rs (1 hunks)
- vm/src/builtins/bytes.rs (1 hunks)
- vm/src/builtins/dict.rs (1 hunks)
- vm/src/builtins/float.rs (1 hunks)
- vm/src/builtins/int.rs (1 hunks)
- vm/src/builtins/list.rs (1 hunks)
- vm/src/builtins/set.rs (2 hunks)
- vm/src/builtins/str.rs (1 hunks)
- vm/src/builtins/tuple.rs (1 hunks)
- vm/src/builtins/type.rs (3 hunks)
- vm/src/frame.rs (3 hunks)
- vm/src/types/slot.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (12)
- vm/src/builtins/bool.rs
- vm/src/builtins/dict.rs
- vm/src/builtins/tuple.rs
- vm/src/builtins/str.rs
- vm/src/types/slot.rs
- vm/src/builtins/list.rs
- vm/src/builtins/set.rs
- vm/src/builtins/bytearray.rs
- vm/src/builtins/float.rs
- vm/src/builtins/bytes.rs
- extra_tests/snippets/syntax_match.py
- vm/src/builtins/type.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs:
- Follow the default rustfmt code style (cargo fmt to format)
- Always run clippy to lint code (cargo clippy) before completing tasks. Fix any warnings or lints that are introduced by your changes
- Follow Rust best practices for error handling and memory management
- Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust
Files:
vm/src/builtins/int.rs
compiler/codegen/src/compile.rs
vm/src/frame.rs
🧠 Learnings (2)
📚 Learning: 2025-08-26T05:20:54.523Z
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Applied to files:
vm/src/frame.rs
📚 Learning: 2025-08-26T05:20:54.523Z
Learnt from: youknowone
PR: RustPython/RustPython#6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.523Z
Learning: In RustPython's pattern matching implementation, only certain builtin types should have the SEQUENCE flag: list and tuple are confirmed sequences. The user youknowone indicated that bytes, bytearray are not considered sequences in this context, even though they implement sequence-like protocols.
Applied to files:
vm/src/frame.rs
🧬 Code graph analysis (2)
compiler/codegen/src/compile.rs (1)
- compiler/codegen/src/unparse.rs (2): unparse_expr (61-451), unparse_expr (608-610)
vm/src/frame.rs (2)
- common/src/boxvec.rs (1): len (49-51)
- vm/src/vm/mod.rs (1): get_method (775-782)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run tests under miri
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Check Rust code with rustfmt and clippy
- GitHub Check: Ensure compilation on various targets
🔇 Additional comments (11)
vm/src/frame.rs (4)
18-18: Flag imports and common utilities consolidation; LGTM
Importing PyTypeFlags and consolidating BoxVec/PyMutex/Wtf8Buf is appropriate for the new flag-driven matching and BuildString path.
Also applies to: 23-23
1325-1391: MatchKeys logic is robust and CPython-aligned: it uses get with a sentinel, with a getitem fallback
- Using vm.get_method(..., "get").transpose()? correctly propagates attribute-access errors.
- The dummy sentinel and identity check avoid KeyError and missing side-effects.
- Falling back to getitem when get is absent keeps behavior correct for non-dict mappings.
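The lookup strategy described in these bullets can be sketched in Python. This is an illustrative model, not the VM implementation: `match_keys`, `_MISSING`, and the `counts` example are hypothetical names, and catching `KeyError` in the fallback is an assumption about how a missing key surfaces.

```python
from collections import defaultdict

_MISSING = object()  # unique sentinel, compared by identity only


def match_keys(subject, keys):
    """Return a tuple of values for `keys`, or None if any key is absent."""
    get = getattr(subject, "get", None)
    values = []
    for key in keys:
        if get is not None:
            # get() with a sentinel avoids KeyError and __missing__ side effects.
            value = get(key, _MISSING)
            if value is _MISSING:
                return None
        else:
            # Fallback for mappings that only implement __getitem__.
            try:
                value = subject[key]
            except KeyError:
                return None
        values.append(value)
    return tuple(values)


# A defaultdict would insert "z" if it were read with counts["z"].
counts = defaultdict(int, a=1)
```

Using `get` on the `defaultdict` leaves it unmodified when a key is absent, which is the side effect the sentinel approach is meant to avoid.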
1393-1512: Verify built-in MATCH_SELF annotations before adding tests
The ripgrep output shows that only bool is defined with flags(_MATCH_SELF); the other built-in types (int, str, float, tuple, list, dict) aren’t annotated and will follow the “no match_args” error path rather than MATCH_SELF semantics. Please confirm your intended behavior:
- If you plan to support MATCH_SELF for those types, add flags(_MATCH_SELF) to their #[pyclass(...)] in:
  - vm/src/builtins/int.rs
  - vm/src/builtins/str.rs
  - vm/src/builtins/float.rs
  - vm/src/builtins/tuple.rs
  - vm/src/builtins/list.rs
  - vm/src/builtins/dict.rs
- Otherwise, limit the new pytest cases to:
  - bool(x) extracting the subject itself (MATCH_SELF happy path) and rejecting bool(x, y) with the corresponding TypeError.
  - For int(x)/str(x)/etc., asserting the “C() accepts 0 positional sub-patterns but 1 were given” error from the no-match_args branch.
Once you’ve confirmed the desired flag coverage, you can proceed to draft tests that align with the actual implementation.
1306-1311: Flag coverage gaps in mappingproxy and list (action required)
The mapping (MAPPING) and sequence (SEQUENCE) flags are correctly driving pattern-matching in frame.rs, but our downstream inspection uncovered two missing flag annotations:
- mappingproxy
The PyMappingProxy class in vm/src/builtins/mappingproxy.rs currently reads:
 #[pyclass(module = false, name = "mappingproxy", traverse)]
 #[derive(Debug)]
 pub struct PyMappingProxy { … }
It must carry the MAPPING flag so that MatchMapping sees it. For consistency with PyDict, change it to:
 #[pyclass(
     module = false,
     name = "mappingproxy",
     traverse,
-    // missing flags
+    flags(BASETYPE, MAPPING, _MATCH_SELF),
 )]
 #[derive(Debug)]
 pub struct PyMappingProxy { … }
This ensures subject.class().slots.flags.contains(PyTypeFlags::MAPPING) returns true for mappingproxy.
- list
The PyList class in vm/src/builtins/list.rs currently reads:
 #[pyclass(module = false, name = "list", unhashable = true, traverse)]
 #[derive(Default)]
 pub struct PyList { … }
It likewise needs the SEQUENCE flag so that MatchSequence recognizes lists. Update it to:
 #[pyclass(
     module = false,
     name = "list",
     unhashable = true,
     traverse,
-    // missing flags
+    flags(BASETYPE, SEQUENCE, _MATCH_SELF),
 )]
 #[derive(Default)]
 pub struct PyList { … }
This makes subject.class().slots.flags.contains(PyTypeFlags::SEQUENCE) true for lists.
All other builtins (dict/mappingproxy for mappings and only tuple, and now list, for sequences) are correctly flagged, and bytes/bytearray/range remain unflagged as intended. Apply these changes to fully align flag coverage with CPython’s pattern-matching rules in both the mapping check (lines 1306–1311) and sequence check (lines 1317–1323) of vm/src/frame.rs.
⛔ Skipped due to learnings
Learnt from: youknowone PR: RustPython/RustPython#6110 File: vm/src/frame.rs:1311-1316 Timestamp: 2025-08-26T05:20:54.523Z Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Learnt from: youknowone PR: RustPython/RustPython#6110 File: vm/src/frame.rs:1311-1316 Timestamp: 2025-08-26T05:20:54.523Z Learning: In RustPython's pattern matching implementation, only certain builtin types should have the SEQUENCE flag: list and tuple are confirmed sequences. The user youknowone indicated that bytes, bytearray are not considered sequences in this context, even though they implement sequence-like protocols.
vm/src/builtins/int.rs (1)
320-322: Add _MATCH_SELF to int: aligns with class-pattern “self” semantics
Good call adding the _MATCH_SELF flag so patterns like case int(x) bind the subject itself. This matches CPython’s _Py_TPFLAGS_MATCH_SELF behavior and is consistent with the wider PR adding this flag across builtins.
Follow-ups:
- Please make sure bool also sets _MATCH_SELF (bool is a subclass of int and typically participates in these patterns).
- Consider adding small runtime tests to cover:
  - match 42: case int(): ...
  - match 42: case int(x): assert x == 42
  - Negative: match 42: case int(x, y): ... should raise TypeError (a MATCH_SELF type accepts at most one positional sub-pattern) rather than silently fail to match.
compiler/codegen/src/compile.rs (6)
3486-3506: on_top accounting with wildcard subpatterns is consistent
You decrement pc.on_top before checking for a true wildcard, and explicitly Pop the wildcard value. Given we just unpacked total items and bumped on_top accordingly, consuming one stack item per iteration (wildcard or not) keeps on_top correct. No change requested.
3556-3579: Early length check + overflow guard: good prefiltering
- GetLen >= size is a reasonable fast-fail before the more precise MatchKeys.
- The overflow guard (size > i32::MAX + 1) mirrors CPython’s intent; casting to u32 afterward is safe.
LGTM.
3623-3650: MatchKeys + None check pipeline reads cleanly
- Building the keys tuple, calling MatchKeys, and testing against None with IsNot is straightforward and matches the VM changes described in the PR.
- Unpacking the resulting values tuple and updating pc.on_top accordingly looks consistent.
LGTM.
3936-3940: Singleton patterns should use identity (is); correct
Switching to TestOperation::Is for None/True/False matches CPython semantics. Good fix.
4010-4021: Guard compilation paths look correct for both non-default and default cases
- Non-default cases: compile guard, ToBool, and jump to the pattern-failure target; this matches the intended control flow.
- Default case: JumpIfFalseOrPop to end is the right behavior.
No issues spotted.
Also applies to: 4039-4043
3519-3536: Keep disallowing **_ in mapping patterns
According to the Python structural pattern matching specification, mapping patterns inherently ignore any extra keys not listed in the pattern, so a “rest wildcard” is already redundant. The PEPs and official documentation explicitly state that while **rest is supported to capture the remainder, using **_ as a wildcard is disallowed because it would have no semantic effect.
• PEP 636 (Tutorial) notes: “Mapping patterns… support a wildcard **rest. (But **_ would be redundant, so it is not allowed.)” (peps.python.org)
• PEP 622 (Specification) likewise defines **capture_pattern for mappings and states that **_ is invalid due to its no-op nature. (peps.python.org)
Please retain the existing syntax-error behavior for **_ in mapping patterns. This enforces clarity (avoiding a no-op wildcard) and keeps consistency with the language design.
Likely an incorrect or invalid review comment.
@arihant2math do you have time to take a look?
@ShaharNaveh Now the str(x) matching is also fixed
Summary by CodeRabbit
New Features
Bug Fixes
Tests