
Rusty Python
Python interpreter, reimagined.
About the dude standing here
● Juhun “RangHo” Lee
● Professional procrastinator
● Commits bullshit for a living
● Oh shit another hipster again
Twitter: @RangHo_777
GitHub: @RangHo
I like programming languages.
Like, a lot.
Rust is pretty cool…… Right?
● Compile-time memory safety
● Zero-cost abstraction
● Fearless concurrency
● Runtime engine not required
● FFI-able design
● Embed-able toolchain
● IT IS THE BEST COMPILED LANGUAGE EVER!!!11!!1!!!!
And Python is quite nice…… Isn’t it?
● Feature-rich standard library
● Partially prototype-based OOP
● (Relatively) Easy metaprogramming
● Multi-paradigm
● Syntax is hip af
● It works on your machine as well™
● IT IS THE BEST SCRIPTING LANGUAGE EVER!!!!!11!!
Imagination time!
● I have a Rust
○ which is hip in and of itself
● I have a Python
○ which used to be hip in and of itself
UH!
● Rust + Python! (or something)
○ which has to be hip af right? SO INTERESTING OMFG
“Alright, let’s try imagining something fun for once.” - Some random dude
Heap Hip overflow — how?
● The big question: HOW DO WE DOUBLE THE HIP?
● Two of many ways to achieve hip²
1. Python-based: Building Rust for Python
2. Rust-based: Building Python for Rust
Building Rust for Python
Stage 1
Python ❤️ Extension
● Python can be extended using C or C++
● Extensions: Python functions implemented in native code
● Python is slow af, so these all lean on native code
○ Keras
○ TensorFlow
○ PyTorch
○ NumPy
○ SciPy
○ ...anything that requires heavy calculations
Where Python loses its strength…
● C/C++ extension requires...
○ Manual memory management and fun times free()-ing stuff
○ Manual reference counting with Py_INCREF and Py_DECREF macros
○ Saying bye bye to memory safety
“Why bother using Python, when you have C?”
Here comes a new challenger!
● Rust can fix these issues
○ Memory management? -> Leave that to the Borrow Checker™
○ Reference counting? -> Leave that to the Borrow Checker™
○ Memory safety? -> Leave that to the Borrow Checker™
● Make native Python more Python-y!
Sure, but we want the juice of it
● The most important stuff: PERFORMANCE!
● A simple implementation of the Sieve of Eratosthenes
Performance Battle (ver. Python)
● Only 11 lines!
● Basically looks like pseudocode
○ (and it basically is pseudocode)
● It takes about 25 ms to sieve out 100,000 numbers
○ i7-9700K, btw
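The code on this slide did not survive the transcript, so here is a minimal sketch of what a plain-Python sieve could look like. The function name `sieve` and its return format are assumptions, not the speaker's original code, and the timing will depend on your machine.

```python
def sieve(limit):
    """Return all primes <= limit using the Sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    n = 2
    while n * n <= limit:
        if is_prime[n]:
            # Cross out every multiple of n, starting at n * n.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
        n += 1
    return [i for i, prime in enumerate(is_prime) if prime]


primes = sieve(30)  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

To reproduce the kind of measurement quoted above, something like `timeit.timeit(lambda: sieve(100_000), number=100)` from the standard library would be a reasonable starting point.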
Performance Battle (ver. C)
● Relatively massive
● Some if statements for memory safety
○ I forgot the malloc safety check as well
● It takes about 700 µs to sieve out 100,000 numbers
○ Same, i7-9700K
Performance Battle (ver. Rust + PyO3)
● Simpler than the C code
● Resembles Python more
● Requires two functions
○ Usually they are separated
● It takes about 670 µs to sieve out 100,000 numbers
○ Again, i7-9700K
Performance battle result
● Jesus Christ, Python is slow
● In most cases, Rust is as fast as C
● In most cases, Rust requires less memory-related code
=> Rust is good enough to replace C for Python extensions!
Building Python for Rust
Stage 2
There are loads of Pythons out there
● CPython - The “Reference”
● PyPy - The Ouroboros of Infinity
● MicroPython - The Featherweight Warrior
● Jython - Python with a cup of coffee
● IronPython - Python got Microsoft’d
Why another one?
“One of the reasons is that… I wanted to learn Rust.”
- Windel Bouwman
Learning Rust by making a Python interpreter
● Currently, the RustPython project has 5M+ lines of code
● In the beginning, it used to be really simple
○ https://github.com/windelbouwman/rspython
● Now it is a fully (kinda) functional Python 3.5 interpreter
Yeah, but why?
Let’s pull out the C equivalent of p[key], where p is a dict:
PyObject *PyDict_GetItem(PyObject *p, PyObject *key);
● PyObject *p: from the dict object...
● PyObject *key: ...find by key...
● Return value: ...and return the borrowed reference
Yeah, but why? (Cont’d)
PyObject *PyDict_GetItem(PyObject *p, PyObject *key);
● PyObject *p: from the dict object...
● PyObject *key: ...find by key...
● Return value: ...if no match, return NULL (no exception)
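The NULL-without-exception behaviour can be poked at from Python itself. This is a rough sketch that assumes CPython, where `ctypes.pythonapi` exposes the C API; the helper name `raw_lookup` and the sample dict are made up for illustration.

```python
import ctypes

# Assumption: this runs on CPython, where ctypes.pythonapi wraps the C API.
get_item = ctypes.pythonapi.PyDict_GetItem
get_item.argtypes = [ctypes.py_object, ctypes.py_object]
# Read the result as a raw pointer so that NULL shows up as None
# instead of being converted into a Python object.
get_item.restype = ctypes.c_void_p


def raw_lookup(d, key):
    """Return the address of d[key] as an int, or None if the call gave NULL."""
    return get_item(d, key)


value = object()
table = {"hit": value}

found = raw_lookup(table, "hit")     # address of the borrowed reference
missing = raw_lookup(table, "miss")  # NULL, and no KeyError is raised
```

On CPython, `found` should match `id(value)`, while `missing` is just `None`: the caller has to check for it by hand, which is exactly the job Rust's `Result<T, E>` makes explicit.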
Yeah, but why? (Cont’d)
Because Python is a GC’d language:
PyObject *PyDict_GetItem(PyObject *p, PyObject *key);
● Each of the three PyObject * values (p, key, and the return value) keeps a reference counter of its own
Yeah, but why? (Cont’d)
● Here are some Rust features:
○ Rust ships with Borrow Checker to enforce strict borrowing rules.
○ Rust has a type called Result<T, E> to indicate a recoverable error.
○ Rust has a type called Rc to simplify reference counting.
● Sounds familiar yet?
Boy-meets-girl: an all-time classic
● What a Python interpreter gains from Rust
○ Borrow Checker
○ Rust Standard Library
○ Cargo and Rust Packages
○ Memory Safety
○ WebAssembly
● The Holy Grail of Hipness
How RustPython sees Python
RustPython Interpreter Design
[Diagram: Source -> AST -> Bytecode]
Lexer & Parser
● Python grammar is painful
● Not a context-free language
○ e.g. indentation
● Two ways to solve the issue:
1. Parser-lexer feedback loop
2. “Terminalize” indentation
● Possible to form a context-free grammar out of this spec
○ INDENT and DEDENT tokens
○ Lexing rules are not CF
○ Parsing rules can be CF
○ An LL(2) parser can be constructed
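The "terminalize" trick is easy to see with CPython's own tokenizer, which turns indentation changes into INDENT and DEDENT tokens before the parser ever sees them. A small sketch using the standard library's `tokenize` module (this shows CPython's tokenizer, not RustPython's lexer; `SOURCE` and `token_names` are illustrative names):

```python
import io
import token
import tokenize

SOURCE = "if x:\n    y = 1\nz = 2\n"


def token_names(source):
    """Return the token type names that the tokenizer emits for source."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [token.tok_name[tok.type] for tok in tokens]


names = token_names(SOURCE)
# The indented block shows up as an INDENT token before "y"
# and a DEDENT token before "z".
```

Once whitespace has been flattened into ordinary tokens like this, the grammar on top of them no longer needs to count spaces.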
Compiler
● The Python virtual machine only understands bytecode
● AST to bytecode
● Bytecode often resides in memory
● py_compile.compile(file)
○ Python 2 -> ./xxx.pyc
○ Python 3 -> ./__pycache__/xxx.<tag>.pyc (e.g. xxx.cpython-38.pyc)
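The same source -> AST -> bytecode pipeline can be walked by hand in CPython. A minimal sketch, with `demo.py` and the variable names chosen purely for illustration:

```python
import ast
import os
import py_compile
import tempfile

SOURCE = "answer = 6 * 7\n"

# Step 1: source text -> AST.
tree = ast.parse(SOURCE, "demo.py")
# Step 2: AST -> in-memory code object holding the bytecode.
code = compile(tree, "demo.py", "exec")
namespace = {}
exec(code, namespace)

# Step 3 (optional): persist the bytecode to disk, as py_compile does.
with tempfile.TemporaryDirectory() as workdir:
    source_path = os.path.join(workdir, "demo.py")
    with open(source_path, "w") as handle:
        handle.write(SOURCE)
    pyc_path = py_compile.compile(source_path)
    # Usually "__pycache__" and something like "demo.cpython-XY.pyc".
    pyc_dir = os.path.basename(os.path.dirname(pyc_path))
    pyc_name = os.path.basename(pyc_path)
```

Steps 1 and 2 never touch the disk, which is the "bytecode often resides in memory" point; only step 3 produces a .pyc file.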
Bytecode
● Python bytecode is not standardized
● Bytecode is subject to change
● Yet CPython has pretty comprehensive documentation
○ https://docs.python.org/3/library/dis.html
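The `dis` module linked above can also list the bytecode for any function. A quick sketch; `add_one` is just a throwaway example, and the exact opcode names differ between CPython versions, which is the "not standardized" point in action:

```python
import dis


def add_one(x):
    return x + 1


# Collect (opcode name, argument) pairs for the function's bytecode.
listing = [(ins.opname, ins.argrepr) for ins in dis.get_instructions(add_one)]
opnames = [name for name, _ in listing]
```

Calling `dis.dis(add_one)` prints the same information as a human-readable table.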
Some remarks about Bytecode
● Python can be used without .py source code
○ The .pyc file has all the info
● Recovery of the original source code is possible
○ e.g. DDLC
● Python bytecode is not optimized well
○ “Python is about having the simplest, dumbest compiler imaginable.” - Guido van Rossum, our Savior
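To back up the "no .py needed" remark: CPython will happily import a module from its .pyc alone. A rough sketch, where the module name `sourceless_demo` and the helper `run_without_source` are invented for this example:

```python
import importlib
import os
import py_compile
import sys
import tempfile


def run_without_source():
    """Compile a tiny module, delete its .py file, then import the .pyc alone."""
    with tempfile.TemporaryDirectory() as workdir:
        source_path = os.path.join(workdir, "sourceless_demo.py")
        with open(source_path, "w") as handle:
            handle.write("def greet():\n    return 'hello from bytecode'\n")
        # Write the .pyc where the source used to live (the "legacy" location),
        # which is where the import system looks for source-less modules.
        pyc_path = os.path.join(workdir, "sourceless_demo.pyc")
        py_compile.compile(source_path, cfile=pyc_path)
        os.remove(source_path)

        sys.path.insert(0, workdir)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module("sourceless_demo")
            return module.greet()
        finally:
            sys.path.remove(workdir)
            sys.modules.pop("sourceless_demo", None)


message = run_without_source()
```

The function still runs after the source file is gone, which is also why shipping only .pyc files does little to hide the original code.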
Optimization Checklist
✓ Constant folding
✓ Immutable allocation optimization
✗ Unused local variable elimination
✗ Unnecessary intermediate object elimination
✗ Loop optimization
✗ Tail recursion optimization
✗ ...and pretty much anything else
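The one checked box that is easy to verify is constant folding. A small sketch against CPython (the variable name `seconds_per_day` is arbitrary, and the opcode names checked here are the ones CPython has used for multiplication across versions):

```python
import dis

# 60 * 60 * 24 is written as three constants in the source.
code = compile("seconds_per_day = 60 * 60 * 24", "<demo>", "exec")

# If constant folding happened, the product is already stored as a constant
# and no multiplication instruction is left in the bytecode.
folded = 86400 in code.co_consts
opnames = [ins.opname for ins in dis.get_instructions(code)]
has_multiply = any(name in ("BINARY_MULTIPLY", "BINARY_OP") for name in opnames)
```

On CPython, `folded` should come out true and `has_multiply` false: the arithmetic was done at compile time.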
RustPython Bytecode
● RustPython does not produce a bytecode file…
○ Say bye to marshal
● Separated into a crate
○ rustpython-bytecode
● Rather simple architecture
○ no INPLACE_* or other advanced stuff
● Massive dispatch loop
Virtual Machine
● Reads bytecode, executes the darn thing
● Keeps track of runtime info
○ Frames
○ Imported modules
○ Settings
○ Context
○ All builtins
Virtual Machine: the good parts
● Honestly, the bytecode routine is not fun at all
○ Read byte -> match instruction -> dispatch -> go back
● The real fun stuff
○ How PyObject is implemented
○ __builtin__ hellfire mess
○ ByRef and ByVal in Python
○ Rust HashMap v. Python __hash__
○ Metaprogramming
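For the record, the "read -> match -> dispatch -> go back" routine looks roughly like this. It is a toy stack machine written in Python for illustration; the opcode names are hypothetical and are not RustPython's actual instruction set.

```python
def run(program):
    """Execute a list of (opcode, argument) pairs on a tiny stack machine."""
    stack = []
    pc = 0  # index of the next instruction to read
    while pc < len(program):
        opcode, arg = program[pc]  # read
        pc += 1
        if opcode == "PUSH":  # match, then dispatch
            stack.append(arg)
        elif opcode == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif opcode == "JUMP_IF_ZERO":
            if stack.pop() == 0:
                pc = arg
        elif opcode == "RETURN":
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        # ...and go back to the top of the loop
    return None


# (2 + 3) * 4
PROGRAM = [
    ("PUSH", 2),
    ("PUSH", 3),
    ("ADD", None),
    ("PUSH", 4),
    ("MUL", None),
    ("RETURN", None),
]
result = run(PROGRAM)  # 20
```

A real interpreter has the same shape, just with a much longer chain of opcode cases and frames instead of a single stack.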
Alright, now what?
● Why RustPython instead of CPython?
○ Native WebAssembly support
○ Guaranteed memory safety
○ Clean, readable codebase
○ Lots of things to learn
● Some projects started picking up
○ pyckitup 2D game engine
○ codingworkshops.org educational website
Can it really stand against CPython?
● Native integration is not working yet
● Still lots of improvements needed
○ Some features work on 🅱️in🅱️ows only
○ It targets WASM but...
● Not very optimized
○ around 16x slower than CPython
○ Sub-optimal data structure design
Final Thoughts
Alright, I know, let’s face it
● Rust is not the best
○ Any *good* programmer can make manageable code
○ Compilation takes eternity
○ Honestly, wtf is borrow checking
○ “OK, here’s langserver, deal with this crap”
○ Static linkage creates bloated binaries
○ Except libc, because reasons
Alright, I know, let’s face it (cont’d)
● Python is not the best
○ Honestly the “scope-by-whitespace” is an unintelligible mess
○ lol no proper threading
○ Why no switch-case? WHY?
○ async is a function but no, it’s a generator, but still a function
○ Stop using Python2 already goddammit
○ Basically no optimization is made during compilation stage
Bottom line
● Rust can replace C, but cannot replace C++
○ Modern C++ has many features that help manage memory
○ Partial functional programming support is neat
○ Template-based metaprogramming is scalable enough
● Rust has potential
○ Data science
○ Compilation time is getting faster (for real)
○ Still better than golang, no?
Bottom line
● Python has its uses, but nothing more
○ Pseudocode must remain pseudocode
○ Great scripting engine to make simple CLI tools
■ Ruby, no one cares about that language anymore :(
■ “Perl is worse than Python because people wanted it worse.”
■ JavaScript, we don’t want another 120+MB of dependencies
■ Shell Script is useful, but complex logic is painful
● Stop giving Python life support
○ Python 2 has fallen cold-dead, long live Python 3
○ Stop shoving more features into the poor thing
Still, I am a hipster-wannabe madman.
(Objectively speaking)
