Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Jan 23, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for special method optimization and executor specialization.
  • Refactor

    • Simplified internal attribute access compilation and bytecode emission.
    • Updated opcode support for Python 3.14.2 alignment.
    • Removed redundant instruction variants for streamlined bytecode generation.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 23, 2026

Caution

Review failed

The pull request is closed.

📝 Walkthrough

Walkthrough

This PR refactors attribute and super attribute load instructions across the compiler stack. It introduces encoded opargs for LoadAttr/LoadSuperAttr operations, consolidates multiple pseudo-instruction variants into single encoded forms, removes public re-exports of encoding helpers, and updates bytecode instruction definitions accordingly.

Changes

Cohort / File(s) Summary
Compiler Codegen Emission
crates/codegen/src/compile.rs
Added 6 new private helper methods (emit_load_attr, emit_load_attr_method, emit_load_super_attr, emit_load_super_method, emit_load_zero_super_attr, emit_load_zero_super_method) that encode opargs and emit LoadAttr/LoadSuperAttr instructions. Replaced all direct instruction emissions with corresponding helper calls throughout the file. Added imports for encode_load_attr_arg and encode_load_super_attr_arg.
IR Instruction Finalization
crates/codegen/src/ir.rs
Removed public re-exports of encode_load_attr_arg and encode_load_super_attr_arg. Simplified CodeInfo::finalize_code to skip per-instruction encoding for already-encoded real instructions. Removed handling of LoadAttrMethod/LoadSuperMethod/LoadZeroSuper\* pseudo-instruction variants.
Bytecode Instruction Definitions
crates/compiler-core/src/bytecode/instruction.rs
Extended label_arg matching to include PopJumpIfNone/PopJumpIfNotNone. Modified LoadAttr stack_info to dynamically compute stack effect using decoded oparg. Removed RustPython-specific pseudo-instruction variants (LoadAttrMethod, LoadSuperMethod, LoadZeroSuperAttr, LoadZeroSuperMethod) from PseudoInstruction enum. Updated enum bounds and InstructionMetadata implementations accordingly.
Opcode Module
crates/stdlib/src/_opcode.rs
Changed HAVE_ARGUMENT constant from 44 to 43. Removed LoadAttrMethod from has_name pattern matching. Added ENABLE_SPECIALIZATION_FT constant and get_special_method_names() function. Extended get_nb_ops list and modified get_executor signature to include _offset: i32 parameter.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • Pseudo ops #6678: Directly related—both PRs introduce encode_load_attr_arg/decode_load_attr_arg helpers and consolidate LoadAttrMethod encoding into LoadAttr with encoded opargs across codegen/IR/bytecode layers.
  • Bytecode pseudo opcodes #6715: Directly related—both modify pseudo-instruction removal (LoadAttrMethod, LoadSuperMethod, LoadZeroSuper\* variants) and introduce encoded oparg handling in compile.rs/ir.rs/instruction.rs.
  • instruction improvements #6829: Directly related—both refactor attribute/super-attribute instruction representation by removing pseudo-instruction variants and switching to encoded LoadAttr/LoadSuper handling across the compilation pipeline.

Suggested reviewers

  • ShaharNaveh

Poem

🐰 Attributes now encode so tight,
No pseudo bloat, just bytes so right,
From compile through IR we weave,
Consolidation that we believe! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly describes the main objective of the PR, which is aligning pseudo operations to CPython 3.14.2 as evidenced by the removal of RustPython-specific pseudo-instruction variants and updates to match CPython standards.
Docstring Coverage ✅ Passed Docstring coverage is 91.67% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Contributor

github-actions bot commented Jan 23, 2026

Code has been automatically formatted

The code in this PR has been formatted using:

  • ruff format
    Please pull the latest changes before pushing again:
git pull origin pseudo-ops

@youknowone youknowone changed the title Align psuedo ops to CPython Align psuedo ops to CPython 3.14.2 Jan 23, 2026
@youknowone youknowone marked this pull request as ready for review January 23, 2026 16:10
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@crates/compiler-core/src/bytecode/instruction.rs`:
- Around line 474-483: The Arg<NameIdx> used for the LoadAttr operand currently
holds an encoded value which is confusing and error-prone; introduce a new
OpArgType called EncodedLoadAttrArg and replace uses of Arg<NameIdx> for
LoadAttr so the type signals that decoding is required. Update the emitter sites
(emit_load_attr, emit_load_attr_method) to produce EncodedLoadAttrArg (using
encode_load_attr_arg), update consumers (stack_effect implementation for
LoadAttr, disassembly, and the load_attr runtime) to accept EncodedLoadAttrArg
and call decode_load_attr_arg before extracting the name index, and adjust any
trait impls or type aliases so Arg<EncodedLoadAttrArg> integrates with the
existing OpArgType system.

In `@crates/stdlib/src/_opcode.rs`:
- Around line 154-156: Remove the public exposure of the internal macro by
deleting or disabling the #[pyattr] export for ENABLE_SPECIALIZATION_FT (and any
ENABLE_SPECIALIZATION) so they are not added to the opcode module; keep the
internal constants locally if needed but do not annotate them with #[pyattr] (or
change to private/internal symbol) and ensure only CPython-public attributes
(e.g., cmp_op, stack_effect, hascompare, opname, opmap, HAVE_ARGUMENT,
EXTENDED_ARG, hasarg, hasconst, hasname, hasjump, hasjrel, hasjabs, hasfree,
haslocal, hasexc) remain exported from the module.
🧹 Nitpick comments (1)
crates/compiler-core/src/bytecode/instruction.rs (1)

932-967: Pseudo-op range alignment looks consistent; do a clean build to avoid stale bytecode artifacts.
Based on learnings, please run a clean build and clear any stale .pyc / target/debug/build/rustpython-* artifacts after bytecode instruction changes.

Comment on lines 474 to +483
Self::LoadFromDictOrDeref(_) => 1,
Self::StoreSubscr => -3,
Self::DeleteSubscr => -2,
Self::LoadAttr { .. } => 0,
Self::LoadAttr { idx } => {
// Stack effect depends on method flag in encoded oparg
// method=false: pop obj, push attr → effect = 0
// method=true: pop obj, push (method, self_or_null) → effect = +1
let (_, is_method) = decode_load_attr_arg(idx.get(arg));
if is_method { 1 } else { 0 }
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Locate all LoadAttr construction sites and encode/decode usage
echo "=== LoadAttr construction/usage sites ==="
rg -n -C2 --type=rs 'LoadAttr\s*\{|Instruction::LoadAttr\b' crates/

echo -e "\n=== encode_load_attr_arg and decode_load_attr_arg usage ==="
rg -n -C2 --type=rs 'encode_load_attr_arg|decode_load_attr_arg' crates/

Repository: RustPython/RustPython

Length of output: 238


🏁 Script executed:

#!/bin/bash
# Locate all LoadAttr construction sites and encode/decode usage
echo "=== LoadAttr construction/usage sites ==="
rg -n -C2 'LoadAttr\s*\{|Instruction::LoadAttr' crates/

echo -e "\n=== encode_load_attr_arg and decode_load_attr_arg usage ==="
rg -n -C2 'encode_load_attr_arg|decode_load_attr_arg' crates/

echo -e "\n=== Files containing these terms ==="
rg -l 'LoadAttr|encode_load_attr_arg|decode_load_attr_arg' crates/

Repository: RustPython/RustPython

Length of output: 7418


🏁 Script executed:

#!/bin/bash
# Search for any direct LoadAttr construction without going through emit functions
echo "=== Search for direct LoadAttr { idx construction (potential bypass) ==="
rg -n 'LoadAttr\s*\{\s*idx\s*:' crates/

echo -e "\n=== Check emit_load_attr calls to confirm all usages go through encoding ==="
rg -n 'emit_load_attr' crates/

echo -e "\n=== Check Arg type definition ==="
rg -n 'pub struct Arg' crates/ -A 5

echo -e "\n=== Check if there are any other places calling Instruction::LoadAttr ==="
rg -n 'Instruction::LoadAttr' crates/ -C 3

Repository: RustPython/RustPython

Length of output: 8611


🏁 Script executed:

#!/bin/bash
# Get more context on Arg<T> and OpArgType definition
echo "=== Arg<T> definition and usage ==="
rg -n 'impl.*Arg<' crates/compiler-core/src/bytecode/instruction.rs -A 10 | head -50

echo -e "\n=== OpArgType trait definition ==="
rg -n 'pub trait OpArgType|pub enum OpArgType' crates/ -A 5

echo -e "\n=== NameIdx type definition ==="
rg -n 'type NameIdx|pub.*NameIdx' crates/compiler-core/src/bytecode/instruction.rs | head -10

echo -e "\n=== LoadAttr field type vs actual usage ==="
sed -n '167,169p' crates/compiler-core/src/bytecode/instruction.rs
echo "---"
sed -n '842,846p' crates/compiler-core/src/bytecode/instruction.rs

Repository: RustPython/RustPython

Length of output: 1779


🏁 Script executed:

#!/bin/bash
# Search for NameIdx definition
echo "=== NameIdx definition ==="
rg -n 'type NameIdx\s*=|pub.*NameIdx\s*=' crates/ -A 2

echo -e "\n=== Arg.get method implementation ==="
rg -n 'pub fn get\|pub const fn get' crates/compiler-core/src/bytecode/instruction.rs -A 8

echo -e "\n=== How idx.get is used in all LoadAttr contexts ==="
rg -n 'idx\.get\(' crates/ -B 2 -A 1

echo -e "\n=== Check if NameIdx implements OpArgType ==="
rg -n 'impl.*OpArgType.*NameIdx\|NameIdx.*OpArgType' crates/

Repository: RustPython/RustPython

Length of output: 13278


🏁 Script executed:

#!/bin/bash
# Find the .get() method on Arg<T>
echo "=== Search for .get method on Arg<T> ==="
rg -n '\.get\s*\(' crates/compiler-core/src/bytecode/instruction.rs -B 5 | head -40

echo -e "\n=== Full Arg implementation ==="
sed -n '1177,1230p' crates/compiler-core/src/bytecode/instruction.rs

echo -e "\n=== Check for OpArgType impl for NameIdx/u32 ==="
rg -n 'impl.*OpArgType' crates/compiler-core/src/bytecode/oparg.rs -A 10 | head -80

Repository: RustPython/RustPython

Length of output: 5623


Type safety for encoded LOAD_ATTR args is currently correct but could be improved.

All 2 emission sites (emit_load_attr, emit_load_attr_method) properly use encode_load_attr_arg, and all 3 consumption sites (stack_effect, disassembly, load_attr runtime) properly decode before using the name index. However, the field type Arg<NameIdx> is misleading since it contains an encoded value, not a raw name index. Since Arg<u32>.get() just returns the raw u32 without decoding, a future maintainer could naively use it as a direct index. Consider introducing a dedicated EncodedLoadAttrArg type implementing OpArgType to make the encoding requirement explicit in the type system.

🤖 Prompt for AI Agents
In `@crates/compiler-core/src/bytecode/instruction.rs` around lines 474 - 483, The
Arg<NameIdx> used for the LoadAttr operand currently holds an encoded value
which is confusing and error-prone; introduce a new OpArgType called
EncodedLoadAttrArg and replace uses of Arg<NameIdx> for LoadAttr so the type
signals that decoding is required. Update the emitter sites (emit_load_attr,
emit_load_attr_method) to produce EncodedLoadAttrArg (using
encode_load_attr_arg), update consumers (stack_effect implementation for
LoadAttr, disassembly, and the load_attr runtime) to accept EncodedLoadAttrArg
and call decode_load_attr_arg before extracting the name index, and adjust any
trait impls or type aliases so Arg<EncodedLoadAttrArg> integrates with the
existing OpArgType system.

}

// Real instructions are already encoded by compile.rs
if info.instr.real().is_some() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd replace the line 214, no need to check twice for if it's real/pseudo

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@crates/stdlib/src/_opcode.rs`:
- Around line 320-326: The current get_special_method_names function only
returns the four context-manager names; replace or extend that array to match
CPython's full _Py_SpecialMethods list so callers receive the complete set of
special method names. Locate or create a static Rust slice/const (e.g.,
SPECIAL_METHODS or _PY_SPECIAL_METHODS) containing all CPython special method
strings, then change get_special_method_names to iterate over that slice (using
the existing .into_iter().map(|x| vm.ctx.new_str(x).into()).collect()) to
produce the Vec<PyObjectRef>; ensure the symbol name get_special_method_names
remains unchanged and the returned type stays Vec<PyObjectRef>.

Comment on lines 320 to +326
#[pyfunction]
fn get_executor(_code: PyObjectRef, vm: &VirtualMachine) -> PyResult<PyObjectRef> {
// TODO
fn get_special_method_names(vm: &VirtualMachine) -> Vec<PyObjectRef> {
["__enter__", "__exit__", "__aenter__", "__aexit__"]
.into_iter()
.map(|x| vm.ctx.new_str(x).into())
.collect()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cat -n crates/stdlib/src/_opcode.rs | sed -n '310,330p'

Repository: RustPython/RustPython

Length of output: 754


🏁 Script executed:

cd crates/stdlib && cargo clippy --all-targets 2>&1 | head -100

Repository: RustPython/RustPython

Length of output: 174


🏁 Script executed:

rg "get_special_method_names\|__enter__\|__exit__" crates/stdlib/src/_opcode.rs -A 2 -B 2

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

rg "get_intrinsic1_descs|get_intrinsic2_descs" crates/stdlib/src/_opcode.rs -A 6

Repository: RustPython/RustPython

Length of output: 639


🏁 Script executed:

rg "class.*Manager|__enter__|__exit__|__aenter__|__aexit__" --type rs | head -30

Repository: RustPython/RustPython

Length of output: 91


🏁 Script executed:

cat -n crates/stdlib/src/_opcode.rs | sed -n '1,50p'

Repository: RustPython/RustPython

Length of output: 1893


🏁 Script executed:

rg "get_special_method_names" crates/stdlib/

Repository: RustPython/RustPython

Length of output: 168


🏁 Script executed:

rg "__enter__|__exit__|__aenter__|__aexit__" crates/stdlib/src/_opcode.rs

Repository: RustPython/RustPython

Length of output: 125


🏁 Script executed:

git log --oneline -10 crates/stdlib/src/_opcode.rs

Repository: RustPython/RustPython

Length of output: 99


🏁 Script executed:

find crates -name "*test*" -type f | xargs grep -l "special_method\|__enter__\|__exit__" 2>/dev/null | head -5

Repository: RustPython/RustPython

Length of output: 159


🏁 Script executed:

rg "def get_special_method_names|special_method" crates/stdlib/src/_opcode.rs -B 5 -A 10

Repository: RustPython/RustPython

Length of output: 446


🏁 Script executed:

cat -n crates/stdlib/src/_opcode.rs | sed -n '320,350p'

Repository: RustPython/RustPython

Length of output: 776


🌐 Web query:

CPython _opcode module get_special_method_names function

💡 Result:

In CPython, _opcode.get_special_method_names() is a CPython-internal helper that returns a Python list of the interpreter’s “special method” (dunder) names.

Implementation (CPython 3.14): it allocates a list of length SPECIAL_MAX + 1 and fills it with _Py_SpecialMethods[i].name for i = 0..SPECIAL_MAX. The returned list index corresponds to CPython’s internal “special method” id. [1]

API docstring in the generated clinic header: “Return a list of special method names.” [2]

Use:

import _opcode
names = _opcode.get_special_method_names()

Sources: [1] [2]


🏁 Script executed:

rg "ASYNC_WITH|WITH_CLEANUP|context.*manager" crates/ --type-list | head -10

Repository: RustPython/RustPython

Length of output: 454


🏁 Script executed:

cat -n crates/stdlib/src/_opcode.rs | sed -n '280,330p'

Repository: RustPython/RustPython

Length of output: 2069


Incomplete implementation: Returns only context manager special methods, not all special methods.

The function syntactically follows the established pattern (array → .into_iter().map().collect()), but does not match CPython's get_special_method_names(). CPython's implementation returns a comprehensive list of all special method names (indexed from an internal _Py_SpecialMethods array), while this implementation returns only the four context manager methods: __enter__, __exit__, __aenter__, __aexit__. This implementation is incomplete relative to CPython and may cause compatibility issues if callers expect the full set of special method names.

🤖 Prompt for AI Agents
In `@crates/stdlib/src/_opcode.rs` around lines 320 - 326, The current
get_special_method_names function only returns the four context-manager names;
replace or extend that array to match CPython's full _Py_SpecialMethods list so
callers receive the complete set of special method names. Locate or create a
static Rust slice/const (e.g., SPECIAL_METHODS or _PY_SPECIAL_METHODS)
containing all CPython special method strings, then change
get_special_method_names to iterate over that slice (using the existing
.into_iter().map(|x| vm.ctx.new_str(x).into()).collect()) to produce the
Vec<PyObjectRef>; ensure the symbol name get_special_method_names remains
unchanged and the returned type stays Vec<PyObjectRef>.

Copy link
Contributor

@ShaharNaveh ShaharNaveh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@github-actions
Copy link
Contributor

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

[+] lib: cpython/Lib/_opcode_metadata.py

dependencies:

  • _opcode_metadata

dependent tests: (13 tests)
- [ ] opcode: test__opcode test_code test_dis test_peepholer
- [ ] dis: test_ast test_compile test_compiler_assemble test_dtrace test_fstring test_inspect test_opcache test_patma test_positional_only_arg

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated

@youknowone youknowone merged commit 483e4a2 into RustPython:main Jan 24, 2026
12 of 14 checks passed
@youknowone youknowone deleted the pseudo-ops branch January 24, 2026 07:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants