Feb 26
A standard assembly format for Python bytecode
Background
Currently, CPython’s internal bytecode format stores an instruction with no argument as 1 byte, an instruction with a small argument as 3 bytes (the opcode plus a 16-bit argument), and an instruction with a large argument as 6 bytes (strictly, a 3-byte EXTENDED_ARG prefix followed by the 3-byte real instruction). Although bytecode is implementation-specific, many other implementations (PyPy, MicroPython, …) use CPython’s bytecode format or a variation on it.
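To make the 1/3/6-byte layout concrete, here is a minimal sketch of a decoder for the variable-width format described above. The function name split_instructions is hypothetical. The constants HAVE_ARGUMENT = 90 and EXTENDED_ARG = 144 are taken from CPython 3.x's opcode module and should be treated as assumptions for other versions. The sample opcodes 1 and 100 are only illustrative values on either side of the HAVE_ARGUMENT threshold.

```python
# Assumed values from CPython 3.x's opcode module; other versions may differ.
HAVE_ARGUMENT = 90   # opcodes >= this value are followed by a 2-byte argument
EXTENDED_ARG = 144   # prefix opcode carrying the high 16 bits of a large argument


def split_instructions(code_bytes):
    """Return a list of (offset, size, opcode, arg) tuples.

    An EXTENDED_ARG prefix and the instruction that follows it are
    reported as a single 6-byte instruction.
    """
    result = []
    i = 0
    while i < len(code_bytes):
        start = i
        op = code_bytes[i]
        arg = None
        if op >= HAVE_ARGUMENT:
            # The argument is stored as a little-endian 16-bit value.
            arg = code_bytes[i + 1] | (code_bytes[i + 2] << 8)
            i += 3
            if op == EXTENDED_ARG:
                # The prefix holds the high bits; the real instruction follows.
                op = code_bytes[i]
                arg = (arg << 16) | code_bytes[i + 1] | (code_bytes[i + 2] << 8)
                i += 3
        else:
            i += 1
        result.append((start, i - start, op, arg))
    return result


# One no-arg instruction, one small-arg instruction, one large-arg instruction.
sample = bytes([1, 100, 5, 0, 144, 1, 0, 100, 2, 0])
for entry in split_instructions(sample):
    print(entry)
# (0, 1, 1, None)
# (1, 3, 100, 5)
# (4, 6, 100, 65538)
```

The sketch handles only a single EXTENDED_ARG prefix and does no bounds checking on truncated input.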
Python exposes as much of this as possible to user code. For example, you can write a decorator that takes a function’s __code__ member, reads its co_code bytecode string and other attributes, builds a different code object, and replaces the function’s __code__ with it.
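As a rough illustration of that kind of decorator, the sketch below rewrites a function's constants rather than its bytecode, since co_consts is far less version-sensitive than co_code. The decorator name swap_constant is hypothetical, and the example assumes Python 3.8 or later for the code object's replace() method.

```python
def swap_constant(old, new):
    """Hypothetical decorator: replace one constant in a function's code object."""
    def decorator(func):
        code = func.__code__
        # Build a new constants tuple, swapping matches of the same type only
        # (so that, for example, 1 is not confused with True).
        new_consts = tuple(
            new if type(c) is type(old) and c == old else c
            for c in code.co_consts
        )
        # code.replace() returns a copy of the code object with the given
        # fields changed; assigning it back swaps the function's behavior.
        func.__code__ = code.replace(co_consts=new_consts)
        return func
    return decorator


@swap_constant("hello", "goodbye")
def greet():
    return "hello"


print(greet())                        # goodbye
print(type(greet.__code__.co_code))   # <class 'bytes'>
```

Editing co_code itself follows the same pattern (read the attribute, build a modified copy, assign it back), but the byte-level details depend on the interpreter version.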