What exactly is the difference between bytecode and a parse tree, specifically the one used by Perl? Do they actually refer to the same concept, or is there a distinction?
I'm familiar with the concept of bytecode from Python and Java, but when reading about Perl, I've learned that it supposedly executes a parse tree (instead of bytecode) in its interpreter.
If there actually is a distinction, what are the reasons for Perl not using bytecode (or Python not using parse trees)? Is it mainly historical, or are there differences between the languages that necessitate a different compilation/execution model? Could Perl (with reasonable effort and execution performance) be implemented by using a bytecode interpreter?
The bytecode is a low-level platform-independent representation of your source code, however, it is not the binary machine code and cannot be run by the target machine directly. In fact, it is a set of instructions for a virtual machine which is called the Python Virtual Machine (PVM).
Byte code is the intermediate code compiled and executed by a virtual machine (VM). Byte code can be used unchanged on any platform on which the VM operates.
A parse tree is a record of the rules (and tokens) used to match some input text whereas a syntax tree records the structure of the input and is insensitive to the grammar that produced it. Combining the above two definitions, An Abstract Syntax Tree describes the parse tree logically.
Unlike human-readable source code, bytecodes are compact numeric codes, constants, and references (normally numeric addresses) that encode the result of compiler parsing and performing semantic analysis of things like type, scope, and nesting depths of program objects.
What Perl uses is not a parse tree, at least not how Wikipedia defines it. It's an opcode tree.
>perl -MO=Concise -E"for (1..10) { say $i }"
g <@> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 49 -e:1) v:%,{,2048 ->3
f <2> leaveloop vK/2 ->g
7 <{> enteriter(next->c last->f redo->8) lKS/8 ->d
- <0> ex-pushmark s ->3
- <1> ex-list lK ->6
3 <0> pushmark s ->4
4 <$> const[IV 1] s ->5
5 <$> const[IV 10] s ->6
6 <#> gv[*_] s ->7
- <1> null vK/1 ->f
e <|> and(other->8) vK/1 ->f
d <0> iter s ->e
- <@> lineseq vK ->-
8 <;> nextstate(main 47 -e:1) v:%,2048 ->9
b <@> say vK ->c
9 <0> pushmark s ->a
- <1> ex-rv2sv sK/1 ->b
a <#> gvsv[*i] s ->b
c <0> unstack v ->d
-e syntax OK
Except, despite being called a tree, it's not really a tree. Notice the arrows? It's because it's actually a list-like graph of opcodes (like any other executable).
>perl -MO=Concise,-exec -E"for (1..10) { say $i }"
1 <0> enter
2 <;> nextstate(main 49 -e:1) v:%,{,2048
3 <0> pushmark s
4 <$> const[IV 1] s
5 <$> const[IV 10] s
6 <#> gv[*_] s
7 <{> enteriter(next->c last->f redo->8) lKS/8
d <0> iter s
e <|> and(other->8) vK/1
8 <;> nextstate(main 47 -e:1) v:%,2048
9 <0> pushmark s
a <#> gvsv[*i] s
b <@> say vK
c <0> unstack v
goto d
f <2> leaveloop vK/2
g <@> leave[1 ref] vKP/REFC
-e syntax OK
The difference between Perl's opcodes and Java's bytecodes is that Java's bytecodes are designed to be serialisable (stored in a file).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With