I wrote a virtual machine language to assembly translator for a computer systems course I'm taking (using the nand2tetris curriculum). I originally wrote it in Python, but since I'm learning D, I thought I'd translate it. D is fairly close to Python syntactically, so it wasn't too difficult. I assumed that D, being a performance language and compiled, would be at least as fast as Python and on a large file, would be much faster. But the opposite is true! Despite identical algorithms, D consistently performed slightly slower than python when I constructed a very large file to compile. On a file approximately 500000 lines long, python consistently took about 2.6 seconds to complete, while D consistently took about 3. This isn't an enormous gap, but it's notable that python would be at all faster.
I don't want to suggest that I'm naive enough to think that python is actually faster than D overall; however, in this instance it seems at the very least that D is not intuitively faster. I'd appreciate some input on possible sources of performance decreases in my D code. I think the bottleneck might be on IO operations, but I'm not sure.
The source code is below. The details aren't that important; some assembly language templates are allocated and then there's a linear pass through the virtual machine language, translating each instruction to an equivalent block of assembly code.
EDIT: after recompiling the D code using dmd -O -release -inline -m64
, D comes out the victor with a time of 2.20s on the input. However, the question still remains of why with almost identical code, D seems to perform slower than python.
EDIT 2: using the advice from below, I switched from using a simple list of strings, to using an appender!string()
, and improved the time by a noticeable amount. It's worth mentioning however, that if you have a bunch of strings in an appender
, do not write them to a file with a command like:
auto outputfile = File("foo.txt","w");
foreach(str; my_appender.data)
outputfile.write(str);
Rather instead, write something like:
auto outputfile = File("foo.txt","w");
outputfile.write(my_appender.data);
The second example will give you a small performance increase over using a simple string[]
. But using the first one gave me a huge performance hit, doubling the execution time.
Changing to an appender!string()
, the compilation of the aforementioned large file took about 2.75 seconds (to Python's 2.8), where the original had taken about 3. Doing this, and also using the optimization flags in dmd
, gave a total compilation time of 1.98
seconds! :)
Python:
#!/usr/bin/python
import sys
operations_dict = {"add":"+", "sub":"-",
"and":"&", "or":"|",
"not":"!", "neg":"-",
"lt":"JLT", "gt":"JGT",
"eq":"JEQ", "leq":"JLE",
"geq":"JGE"}
vars_dict = {"this":("THIS","M"),
"that":("THAT","M"),
"argument":("ARG","M",),
"local":("LCL","M",),
"static":("f.%d","M",),
"temp":("TEMP","A",)}
start = "@SP\nAM=M-1\n"
end = "@SP\nM=M+1\n"
binary_template = start + "D=M\n\
@SP\n\
AM=M-1\n\
M=M%sD\n" + end
unary_template = start + "M=%sM\n" + end
comp_template = start + "D=M\n\
@SP\n\
AM=M-1\n\
D=M-D\n\
@COMP.%d.TRUE\n\
D;%s\n\
@COMP.%d.FALSE\n\
0;JMP\n\
(COMP.%d.TRUE)\n\
@SP\n\
A=M\n\
M=-1\n\
@SP\n\
M=M+1\n\
@COMP.%d.END\n\
0;JMP\n\
(COMP.%d.FALSE)\n\
@SP\n\
A=M\n\
M=0\n" + end + "(COMP.%d.END)\n"
push_tail_template = "@SP\n\
A=M\n\
M=D\n\
@SP\n\
M=M+1\n"
push_const_template = "@%d\nD=A\n" + push_tail_template
push_var_template = "@%d\n\
D=A\n\
@%s\n\
A=%s+D\n\
D=M\n" + push_tail_template
push_staticpointer_template = "@%s\nD=M\n" + push_tail_template
pop_template = "@%d\n\
D=A\n\
@%s\n\
D=%s+D\n\
@R13\n\
M=D\n\
@SP\n\
AM=M-1\n\
D=M\n\
@R13\n\
A=M\n\
M=D\n"
pop_staticpointer_template = "@SP\n\
AM=M-1\n\
D=M\n\
@%s\n\
M=D"
type_dict = {"add":"arithmetic", "sub":"arithmetic",
"and":"arithmetic", "or":"arithmetic",
"not":"arithmetic", "neg":"arithmetic",
"lt":"arithmetic", "gt":"arithmetic",
"eq":"arithmetic", "leq":"arithmetic",
"geq":"arithmetic",
"push":"memory", "pop":"memory"}
binary_ops = ["add", "sub", "and", "or"]
unary_ops = ["not", "neg"]
comp_ops = ["lt", "gt", "eq", "leq", "geq"]
op_count = 0
line_count = 0
output = ["// Assembly file generated by my awesome VM compiler\n"]
def compile_operation(op):
global line_count
if (op[0:2] == "//") or (len(op.split()) == 0):
return ""
# print "input: " + op
operation = op.split()[0]
header = "// '" + op + "' (line " + str(line_count) + ")\n"
line_count += 1
if type_dict[operation] == "arithmetic":
return header + compile_arithmetic(op)
elif type_dict[operation] == "memory":
return header + compile_memory(op)
def compile_arithmetic(op):
global op_count
out_string = ""
if op in comp_ops:
out_string += comp_template % (op_count, operations_dict[op], op_count, \
op_count, op_count, op_count, op_count)
op_count += 1
elif op in unary_ops:
out_string += unary_template % operations_dict[op]
else:
out_string += binary_template % operations_dict[op]
return out_string
def compile_memory(op):
global output
instructions = op.split()
inst = instructions[0]
argtype = instructions[1]
val = int(instructions[2])
if inst == "push":
if argtype == "constant":
return push_const_template % val
elif argtype == "static":
return push_staticpointer_template % ("f." + str(val))
elif argtype == "pointer":
if val == 0:
return push_staticpointer_template % ("THIS")
else:
return push_staticpointer_template % ("THAT")
else:
return push_var_template % (val, vars_dict[argtype][0], vars_dict[argtype][1])
elif inst == "pop":
if argtype != "constant":
if argtype == "static":
return pop_staticpointer_template % ("f." + str(val))
elif argtype == "pointer":
if val == 0:
return pop_staticpointer_template % "THIS"
else:
return pop_staticpointer_template % "THAT"
else:
return pop_template % (val, vars_dict[argtype][0], vars_dict[argtype][1])
def main():
global output
if len(sys.argv) == 1:
inputfname = "test.txt"
else:
inputfname = sys.argv[1]
outputfname = inputfname.split('.')[0] + ".asm"
inputf = open(inputfname)
output += ["// Input filename: %s\n" % inputfname]
for line in inputf.readlines():
output += [compile_operation(line.strip())]
outputf = open(outputfname, 'w')
for outl in output:
outputf.write(outl)
outputf.write("(END)\n@END\n0;JMP");
inputf.close()
outputf.close()
print "Output written to " + outputfname
if __name__ == "__main__":
main()
D:
import std.stdio, std.string, std.conv, std.format, std.c.stdlib;
string[string] operations_dict, type_dict;
string[][string] vars_dict;
string[] arithmetic, memory, comp_ops, unary_ops, binary_ops, lines, output;
string start, end, binary_template, unary_template,
comp_template, push_tail_template, push_const_template,
push_var_template, push_staticpointer_template,
pop_template, pop_staticpointer_template;
int op_count, line_count;
void build_dictionaries() {
vars_dict = ["this":["THIS","M"],
"that":["THAT","M"],
"argument":["ARG","M"],
"local":["LCL","M"],
"static":["f.%d","M"],
"temp":["TEMP","A"]];
operations_dict = ["add":"+", "sub":"-",
"and":"&", "or":"|",
"not":"!", "neg":"-",
"lt":"JLT", "gt":"JGT",
"eq":"JEQ", "leq":"JLE",
"geq":"JGE"];
type_dict = ["add":"arithmetic", "sub":"arithmetic",
"and":"arithmetic", "or":"arithmetic",
"not":"arithmetic", "neg":"arithmetic",
"lt":"arithmetic", "gt":"arithmetic",
"eq":"arithmetic", "leq":"arithmetic",
"geq":"arithmetic",
"push":"memory", "pop":"memory"];
binary_ops = ["add", "sub", "and", "or"];
unary_ops = ["not", "neg"];
comp_ops = ["lt", "gt", "eq", "leq", "geq"];
}
bool is_in(string s, string[] list) {
foreach (str; list)
if (str==s) return true;
return false;
}
void build_strings() {
start = "@SP\nAM=M-1\n";
end = "@SP\nM=M+1\n";
binary_template = start ~ "D=M\n"
"@SP\n"
"AM=M-1\n"
"M=M%sD\n" ~ end;
unary_template = start ~ "M=%sM\n" ~ end;
comp_template = start ~ "D=M\n"
"@SP\n"
"AM=M-1\n"
"D=M-D\n"
"@COMP.%s.TRUE\n"
"D;%s\n"
"@COMP.%s.FALSE\n"
"0;JMP\n"
"(COMP.%s.TRUE)\n"
"@SP\n"
"A=M\n"
"M=-1\n"
"@SP\n"
"M=M+1\n"
"@COMP.%s.END\n"
"0;JMP\n"
"(COMP.%s.FALSE)\n"
"@SP\n"
"A=M\n"
"M=0\n" ~ end ~ "(COMP.%s.END)\n";
push_tail_template = "@SP\n"
"A=M\n"
"M=D\n"
"@SP\n"
"M=M+1\n";
push_const_template = "@%s\nD=A\n" ~ push_tail_template;
push_var_template = "@%s\n"
"D=A\n"
"@%s\n"
"A=%s+D\n"
"D=M\n" ~ push_tail_template;
push_staticpointer_template = "@%s\nD=M\n" ~ push_tail_template;
pop_template = "@%s\n"
"D=A\n"
"@%s\n"
"D=%s+D\n"
"@R13\n"
"M=D\n"
"@SP\n"
"AM=M-1\n"
"D=M\n"
"@R13\n"
"A=M\n"
"M=D\n";
pop_staticpointer_template = "@SP\n"
"AM=M-1\n"
"D=M\n"
"@%s\n"
"M=D";
}
void init() {
op_count = 0;
line_count = 0;
output = ["// Assembly file generated by my awesome VM compiler\n"];
build_strings();
build_dictionaries();
}
string compile_operation(string op) {
if (op.length == 0 || op[0..2] == "//")
return "";
string operation = op.split()[0];
string header = "// '" ~ op ~ "' (line " ~ to!string(line_count) ~ ")\n";
++line_count;
if (type_dict[operation] == "arithmetic")
return header ~ compile_arithmetic(op);
else
return header ~ compile_memory(op);
}
string compile_arithmetic(string op) {
if (is_in(op, comp_ops)) {
string out_string = format(comp_template, op_count, operations_dict[op], op_count,
op_count, op_count, op_count, op_count);
op_count += 1;
return out_string;
} else if (is_in(op, unary_ops))
return format(unary_template, operations_dict[op]);
else
return format(binary_template, operations_dict[op]);
}
string compile_memory(string op) {
string[] instructions = op.split();
string inst = instructions[0];
string argtype = instructions[1];
int val = to!int(instructions[2]);
if (inst == "push") {
if (argtype == "constant") {
return format(push_const_template, val);
} else if (argtype == "static")
return format(push_staticpointer_template, ("f." ~ to!string(val)));
else if (argtype == "pointer")
if (val == 0)
return format(push_staticpointer_template, "THIS");
else
return format(push_staticpointer_template, "THAT");
else
return format(push_var_template, val, vars_dict[argtype][0], vars_dict[argtype][1]);
} else {
if (argtype != "constant") {
if (argtype == "static")
return format(pop_staticpointer_template, ("f." ~ to!string(val)));
else if (argtype == "pointer") {
if (val == 0)
return format(pop_staticpointer_template, "THIS");
else
return format(pop_staticpointer_template, "THAT");
}
else
return format(pop_template, val, vars_dict[argtype][0], vars_dict[argtype][1]);
} else {
return "";
}
}
}
void main(string args[]) {
init();
if (args.length < 2) {
writefln("usage: %s <filename>", args[0]);
exit(0);
}
string inputfname = args[1];
string outputfname = args[1].split(".")[0] ~ ".asm";
auto inputf = File(inputfname, "r");
output ~= format("// Input filename: %s\n", inputfname);
foreach (line; inputf.byLine) {
output ~= compile_operation(to!string(line).strip);
}
inputf.close();
auto outputf = File(outputfname, "w");
foreach (outl; output)
outputf.write(outl);
outputf.write("(END)\n@END\n0;JMP");
outputf.close();
writeln("Compilation successful. Output written to " ~ outputfname);
}
The source code compiled to byte code is then executed in Python's virtual machine one by one, to carry out the operations. The virtual machine is an internal component of Python. Internally Python code is interpreted during run time rather than being compiled to native code hence it is a bit slower.
Unlike other popular programming languages including C# or JAVA, Python is dynamically typed and an interpreted language. It is slow primarily due to its dynamic nature and versatility.
Python programs are generally expected to run slower than Java programs, but they also take much less time to develop. Python programs are typically 3-5 times shorter than equivalent Java programs. This difference can be attributed to Python's built-in high-level data types and its dynamic typing.
Python is incredibly popular because it's easy to learn, versatile, and has thousands of useful libraries for data science. But one thing it is not is fast.
For output
variable use Appender
(docs):
import std.array : appender;
void main() {
auto output = appender!string("// Assembly file generated by my awesome VM compiler\n");
//...
output.put(format("// Input filename: %s\n", inputfname));
foreach (line; inputf.byLine) {
output.put(compile_operation(line.to!string().strip()));
}
//...
outputf.write(output.data());
//...
}
Also, I suggest that you should change your type_dict
to something like int[string]
and use it with integer constants.
int[string] type_dict;
const TYPE_ARITHMETIC = 0,
TYPE_MEMORY = 1;
//...
type_dict = ["add": TYPE_ARITHMETIC, "push": TYPE_MEMORY]; // etc
//...
//...
if (type_dict[operation] == TYPE_ARITHMETIC) {
//...
}
//...
Use canFind
method (docs) instead of custom is_in
. Or even try SortedRange
(docs).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With