I am comparing a cart-and-pole simulation in Python 3.7 and Julia 1.2. In Python the simulation is written as a class, as seen below, and in Julia it is just a function. I am getting a consistent 0.2-second time to solve in Julia, which is much slower than Python. I do not understand Julia well enough to see why. My guess is it has something to do with compilation or the way the loop is set up.
import math
import random
from collections import namedtuple

RAD_PER_DEG = 0.0174533
DEG_PER_RAD = 57.2958

State = namedtuple('State', 'x x_dot theta theta_dot')


class CartPole:
    """ Model for the dynamics of an inverted pendulum
    """
    def __init__(self):
        self.gravity = 9.8
        self.masscart = 1.0
        self.masspole = 0.1
        self.length = 0.5  # actually half the pole's length
        self.force_mag = 10.0
        self.tau = 0.02  # seconds between state updates
        self.x = 0
        self.x_dot = 0
        self.theta = 0
        self.theta_dot = 0

    @property
    def state(self):
        return State(self.x, self.x_dot, self.theta, self.theta_dot)

    def reset(self, x=0, x_dot=0, theta=0, theta_dot=0):
        """ Reset the model of a cartpole system to its initial conditions
        theta is in radians
        """
        self.x = x
        self.x_dot = x_dot
        self.theta = theta
        self.theta_dot = theta_dot

    def step(self, action):
        """ Move the state of the cartpole simulation forward one time unit
        """
        total_mass = self.masspole + self.masscart
        pole_masslength = self.masspole * self.length
        force = self.force_mag if action else -self.force_mag
        costheta = math.cos(self.theta)
        sintheta = math.sin(self.theta)
        temp = (force + pole_masslength * self.theta_dot ** 2 * sintheta) / total_mass
        # theta acceleration
        theta_dotdot = (
            (self.gravity * sintheta - costheta * temp)
            / (self.length *
               (4.0/3.0 - self.masspole * costheta * costheta /
                total_mass)))
        # x acceleration
        x_dotdot = temp - pole_masslength * theta_dotdot * costheta / total_mass
        self.x += self.tau * self.x_dot
        self.x_dot += self.tau * x_dotdot
        self.theta += self.tau * self.theta_dot
        self.theta_dot += self.tau * theta_dotdot
        return self.state
To run the simulation, the following code was used:

from cartpole import CartPole
import time

cp = CartPole()
start = time.time()
for i in range(100000):
    cp.step(True)
end = time.time()
print(end-start)
The Julia code is:

function cartpole(state, action)
    """Cart and Pole simulation in discrete time
    Inputs: cartpole( state, action )
    state: 1X4 array [cart_position, cart_velocity, pole_angle, pole_velocity]
    action: Boolean True or False where True is a positive force and False is a negative force
    """
    gravity = 9.8
    masscart = 1.0
    masspole = 0.1
    l = 0.5  # actually half the pole's length
    force_mag = 10.0
    tau = 0.02  # seconds between state updates
    # x = 0
    # x_dot = 0
    # theta = 0
    # theta_dot = 0
    x = state[1]
    x_dot = state[2]
    theta = state[3]
    theta_dot = state[4]
    total_mass = masspole + masscart
    pole_massl = masspole * l
    if action == 0
        force = force_mag
    else
        force = -force_mag
    end
    costheta = cos(theta)
    sintheta = sin(theta)
    temp = (force + pole_massl * theta_dot^2 * sintheta) / total_mass
    # theta acceleration
    theta_dotdot = (gravity * sintheta - costheta * temp) / (l * (4.0/3.0 - masspole * costheta * costheta / total_mass))
    # x acceleration
    x_dotdot = temp - pole_massl * theta_dotdot * costheta / total_mass
    x += tau * x_dot
    x_dot += tau * x_dotdot
    theta += tau * theta_dot
    theta_dot += tau * theta_dotdot
    new_state = [x x_dot theta theta_dot]
    return new_state
end
The run code is:

@time include("cartpole.jl")

function run_sim()
    """Runs the cartpole simulation
    No inputs or outputs
    Use with @time run_sim() for timing purposes.
    """
    state = [0 0 0 0]
    for i = 1:100000
        state = cartpole( state, 0)
        #print(state)
        #print("\n")
    end
end
@time run_sim()
Your Python version takes 0.21s on my laptop. Here are timing results for the original Julia version on the same system:
julia> @time run_sim()
0.222335 seconds (654.98 k allocations: 38.342 MiB)
julia> @time run_sim()
0.019425 seconds (100.00 k allocations: 10.681 MiB, 37.52% gc time)
julia> @time run_sim()
0.010103 seconds (100.00 k allocations: 10.681 MiB)
julia> @time run_sim()
0.012553 seconds (100.00 k allocations: 10.681 MiB)
julia> @time run_sim()
0.011470 seconds (100.00 k allocations: 10.681 MiB)
julia> @time run_sim()
0.025003 seconds (100.00 k allocations: 10.681 MiB, 52.82% gc time)
The first run includes JIT compilation and takes ~0.2 s, whereas each run after that takes 10-20 ms. That breaks down into roughly 10 ms of actual compute time and roughly 10 ms of garbage-collection time triggered every four calls or so. That means that Julia is about 10-20x faster than Python, excluding JIT compilation time, which is not bad for a straight port.
Why not count JIT time when benchmarking? Because you don't actually care about how long it takes to run fast programs like benchmarks. You're timing small benchmark problems to extrapolate how long it will take to run larger problems where speed really matters. JIT compilation time is proportional to the amount of code you're compiling, not to problem size. So when solving larger problems that you actually care about, the JIT compilation will still only take 0.2 s, which is a negligible fraction of execution time for large problems.
Now, let's see about making the Julia code even faster. This is actually very simple: use a tuple instead of a row vector for your state. So initialize the state as

state = (0, 0, 0, 0)
and then update the state similarly:
new_state = (x, x_dot, theta, theta_dot)
That's it, otherwise the code is identical. For this version the timings are:
julia> @time run_sim()
0.132459 seconds (479.53 k allocations: 24.020 MiB)
julia> @time run_sim()
0.008218 seconds (4 allocations: 160 bytes)
julia> @time run_sim()
0.007230 seconds (4 allocations: 160 bytes)
julia> @time run_sim()
0.005379 seconds (4 allocations: 160 bytes)
julia> @time run_sim()
0.008773 seconds (4 allocations: 160 bytes)
The first run still includes JIT time. Subsequent runs are now 5-10ms, which is about 25-40x faster than the Python version. Note that there are almost no allocations—small, fixed numbers of allocations are just for return values and won't trigger GC if this is called from other code in a loop.
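For reference, here is a sketch of the full tuple-based version. The physics is exactly the code from the question; only the two lines that touch the state container change, since state[1]-style indexing works the same on tuples as on arrays:

```julia
function cartpole(state, action)
    gravity = 9.8
    masscart = 1.0
    masspole = 0.1
    l = 0.5  # actually half the pole's length
    force_mag = 10.0
    tau = 0.02  # seconds between state updates
    x = state[1]  # tuple indexing is identical to array indexing
    x_dot = state[2]
    theta = state[3]
    theta_dot = state[4]
    total_mass = masspole + masscart
    pole_massl = masspole * l
    if action == 0
        force = force_mag
    else
        force = -force_mag
    end
    costheta = cos(theta)
    sintheta = sin(theta)
    temp = (force + pole_massl * theta_dot^2 * sintheta) / total_mass
    theta_dotdot = (gravity * sintheta - costheta * temp) / (l * (4.0/3.0 - masspole * costheta * costheta / total_mass))
    x_dotdot = temp - pole_massl * theta_dotdot * costheta / total_mass
    x += tau * x_dot
    x_dot += tau * x_dotdot
    theta += tau * theta_dot
    theta_dot += tau * theta_dotdot
    # changed line: a tuple instead of a row vector is immutable and
    # stack-allocated, so the hot loop does no heap allocation per step
    new_state = (x, x_dot, theta, theta_dot)
    return new_state
end

function run_sim()
    state = (0, 0, 0, 0)  # changed line: tuple instead of [0 0 0 0]
    for i = 1:100000
        state = cartpole(state, 0)
    end
end
```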
Okay, so I've just run your Python and Julia code, and I get different results: 1.41 s for 10m iterations for Julia, 25.5 seconds for 10m iterations for Python. Already, Julia is 18x faster!
I think perhaps the issue is that @time is not accurate when run in global scope - you need multi-second timings for it to be accurate enough. You can use the package BenchmarkTools to get accurate timings of small functions.
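For example, assuming BenchmarkTools has been installed (import Pkg; Pkg.add("BenchmarkTools")) and run_sim is already defined, usage looks roughly like this:

```julia
using BenchmarkTools

# @btime runs the function many times and reports the minimum time,
# so JIT compilation and GC noise are excluded from the reported number
@btime run_sim()

# @benchmark shows the full timing distribution plus allocation counts
@benchmark run_sim()
```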