Simulation of suicide burn in openai-gym's LunarLander

I want to simulate a suicide burn to learn and understand rocket landing. OpenAI Gym already has a LunarLander environment, which is used for training reinforcement learning agents. I am using this environment to simulate a suicide burn in Python. I extracted the coordinates (x, y) from the first two values of the environment's state vector. Treating the y coordinate as the altitude, I calculated the velocity and acceleration of the falling lander using these equations:

velocity (v) = delta_y / delta_t
acceleration (a) = delta_v / delta_t

As the simulation advances step by step, the time difference delta_t was taken as 1. I could not find the gravity parameter of LunarLander, so I gave it a default value of g = 1. Then, using the equation below from this reddit comment,

altitude to start suicide burn = [ (current altitude)(acceleration of gravity) + (1/2)(current velocity)^2 ] / (acceleration of engines)

I tried to calculate the altitude at which to start the suicide burn. This is my full Python code. I plan to use only two of the four possible actions: 0 (do nothing) and 2 (fire main engine).

import gym
env = gym.make('LunarLander-v2')
env.seed(0)

g = 1
delta_t = 1
action = 0

state = env.reset()

# x0 = state[0]
y0 = state[1]
v0 = 0

for t in range(3000):
    state, reward, done, _  = env.step(action)
    y = state[1]
    if done or y <0:
        break
    v = (y-y0)/delta_t  # velocity
    a = (v - v0)/delta_t # acceleration

    # (altitude to start suicide burn) = [ (current altitude)(acceleration of gravity) + (1/2)(current velocity)^2 ] / (acceleration of engines)
    alt_burn = [y*g+0.5*v*v]/a

    v0 = v
    y0 = y

    print(" y",round(y,5)," v",round(v,5)," a",round(a,5)," Alt_burn",round(alt_burn[0],5))

The output looks something like this:

 y 1.41542  v 0.00196  a 0.00196  Alt_burn 722.35767
 y 1.41678  v 0.00136  a -0.0006  Alt_burn -2362.78166
 y 1.41754  v 0.00076  a -0.0006  Alt_burn -2362.63867
 y 1.4177  v 0.00016  a -0.0006  Alt_burn -2362.43506
 y 1.41726  v -0.00044  a -0.0006  Alt_burn -2362.64046
 y 1.41622  v -0.00104  a -0.0006  Alt_burn -2359.03148
 y 1.41458  v -0.00164  a -0.0006  Alt_burn -2358.17355
 y 1.41233  v -0.00224  a -0.0006  Alt_burn -2353.50518
 y 1.40949  v -0.00284  a -0.0006  Alt_burn -2349.24118
 y 1.40605  v -0.00344  a -0.0006  Alt_burn -2343.51016
 y 1.40201  v -0.00404  a -0.0006  Alt_burn -2336.31535
 y 1.39737  v -0.00464  a -0.0006  Alt_burn -2329.04954

If we look at the altitude (y), it is a very small value, less than 1.5, whereas the calculated altitudes at which to start the suicide burn are very large. How can I solve this problem?

The reddit comments only mention when to start the engine, not when to stop it. Does anyone know the math for cutting the engine dynamically?

Eka asked Jan 04 '20 10:01


1 Answer

There are two issues with your code:

  1. delta_t should be 1.0/50.0, according to the lunar_lander source:
 FPS = 50
 #...
 self.world.Step(1.0/FPS, 6*30, 2*30)

The Box2D documentation describes the typical time step as follows:

 # [...]Typically we use a time step of 1/60 of a
 # second (60Hz) and 6 velocity/2 position iterations. This provides a 
 # high quality simulation in most game scenarios.
 timeStep = 1.0 / 60
  2. The acceleration of the engines should not be taken to be the lander's current acceleration, as it is in
alt_burn = [y*g+0.5*v*v]/a

but by the impulse of the engines, defined here as

MAIN_ENGINE_POWER  = 13.0

The mass of the lander is not specified in the creation of the lander body, but using

env.lander.mass

we can find the mass to be 4.82. Given this, the burn altitude, with the engine acceleration in the denominator, is given by [1]

alt_burn = (y * g + 0.5 * v*v) / (13.0 / env.lander.mass * 0.5)
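
One way to see where an expression of this form comes from is an energy balance. This is only a sketch under assumptions the answer does not state: purely vertical motion, constant gravity g, a constant engine acceleration a_e (thrust divided by mass, the denominator above), and the engine firing continuously from the burn altitude h_b down to touchdown at zero speed. Here h and v are the current altitude and velocity; the symbol names are chosen for illustration.

```latex
\begin{align}
\underbrace{g h + \tfrac{1}{2} v^{2}}_{\text{energy per unit mass now}}
  &= \underbrace{a_e \, h_b}_{\text{work per unit mass done by the engine}} \\
h_b &= \frac{g h + \tfrac{1}{2} v^{2}}{a_e}
\end{align}
```

The second line has the same shape as the alt_burn expression, with y in place of h.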

If we run the code with the above modifications,

import gym
env = gym.make('LunarLander-v2')
env.seed(0)

g = 1.0
delta_t = 1.0/50.0
action = 0

state = env.reset()

y0 = state[1]

for t in range(3000):
    state, reward, done, _  = env.step(action)
    y = state[1]
    v = (y - y0) / delta_t
    y0 = y
    if done or y < 0:
        break

    alt_burn = (y*g+0.5*v*v)/(13.0 / env.lander.mass * 0.5)

    print(" y",round(y,5)," v",round(v,5)," Alt_burn",round(alt_burn,5))

we get a much more reasonable answer for the burn altitude:

 y 1.41542  v 0.09797  Alt_burn 1.05242
 y 1.41678  v 0.06799  Alt_burn 1.05158
 y 1.41754  v 0.03799  Alt_burn 1.05097
 y 1.4177  v 0.00799  Alt_burn 1.05057
 y 1.41726  v -0.02201  Alt_burn 1.0504
 y 1.41622  v -0.05202  Alt_burn 1.05045
 y 1.41458  v -0.08202  Alt_burn 1.05073
 y 1.41233  v -0.11202  Alt_burn 1.05123
 y 1.40949  v -0.14202  Alt_burn 1.05194
 y 1.40605  v -0.17202  Alt_burn 1.05289
 y 1.40201  v -0.20202  Alt_burn 1.05405
 y 1.39737  v -0.23202  Alt_burn 1.05544
 y 1.39213  v -0.26203  Alt_burn 1.05705
 y 1.38629  v -0.29202  Alt_burn 1.05887
 ...
 y 0.56683  v -1.58312  Alt_burn 1.34863
 y 0.53456  v -1.61311  Alt_burn 1.36025
 y 0.5017  v -1.64311  Alt_burn 1.37209
 y 0.46824  v -1.67311  Alt_burn 1.38416
 y 0.43418  v -1.70311  Alt_burn 1.39644
 y 0.39951  v -1.73311  Alt_burn 1.40895
 y 0.36425  v -1.76311  Alt_burn 1.42168
 y 0.32839  v -1.79311  Alt_burn 1.43463
 y 0.29193  v -1.82311  Alt_burn 1.44781
 y 0.25487  v -1.85311  Alt_burn 1.46121
 y 0.2172  v -1.88311  Alt_burn 1.47483
 y 0.17894  v -1.91311  Alt_burn 1.48867
 y 0.14008  v -1.94311  Alt_burn 1.50274
 y 0.10062  v -1.97311  Alt_burn 1.51702
 y 0.06055  v -2.00311  Alt_burn 1.53154
 y 0.01989  v -2.03311  Alt_burn 1.54626
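
As a sanity check on g = 1.0, which was a default rather than a measured value, the free-fall acceleration can be estimated from the printed velocities themselves. A minimal sketch, assuming the first rows of the output above and delta_t = 1/50; the variable names are illustrative:

```python
# Velocities copied from the first six rows of the output above.
velocities = [0.09797, 0.06799, 0.03799, 0.00799, -0.02201, -0.05202]
delta_t = 1.0 / 50.0  # time step taken from FPS = 50 in the lunar_lander source

# Finite-difference acceleration between consecutive steps.
accels = [(v1 - v0) / delta_t for v0, v1 in zip(velocities, velocities[1:])]

# Gravity points downward, so flip the sign to get a positive magnitude.
g_estimate = -sum(accels) / len(accels)
print(round(g_estimate, 3))
```

The velocities drop by about 0.03 per step, which corresponds to roughly 1.5 state units per second squared, so it may be worth trying that value for g instead of 1.0.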

The execution of the suicide burn [2] is now trivial: we simply need to activate the engines when the lander is below the burn altitude and deactivate them when it is below a cutoff altitude.

import gym
env = gym.make('LunarLander-v2')
env.seed(0)

g = 1.0
delta_t = 1.0/50.0
action = 0

state = env.reset()

y0 = state[1]
v0 = 0
cut_off = 0.01

for t in range(3000):
    env.render()
    state, reward, done, _  = env.step(action)
    y = state[1]
    v = (y - y0)/delta_t
    # Note: v == 0.001 is an exact float comparison and is unlikely to ever trigger.
    if done or y < 0 or v == 0.001:
        break

    alt_burn = (y*g+0.5*v*v)/(13.0 / env.lander.mass * 0.5)

    v0 = v
    y0 = y
    if y < alt_burn and y > cut_off:
        action = 2
    else:
        action = 0

    print(" y",round(y,5)," v",round(v,5)," Alt_burn",round(alt_burn,5))

And the lander will hoverslam:

[Image: the lander performing the hoverslam]
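
The decision rule used in the loop above can also be isolated into a pure function, which makes it possible to check it without running the environment. This is a minimal sketch: the function names are hypothetical, the mass is hard-coded to the rounded 4.82 reported above instead of being read from env.lander.mass, and g = 1.0 is the same default as in the answer.

```python
MAIN_ENGINE_POWER = 13.0  # constant quoted from the lunar_lander source
LANDER_MASS = 4.82        # rounded value of env.lander.mass reported above


def burn_altitude(y, v, g=1.0, mass=LANDER_MASS):
    """Altitude below which the main engine should fire (same formula as above)."""
    engine_accel = MAIN_ENGINE_POWER / mass * 0.5
    return (y * g + 0.5 * v * v) / engine_accel


def choose_action(y, v, cut_off=0.01):
    """Return 2 (fire main engine) between the cutoff and burn altitudes, else 0."""
    return 2 if cut_off < y < burn_altitude(y, v) else 0


# Rows taken from the printed output above.
print(choose_action(1.41542, 0.09797))   # high above the burn altitude
print(choose_action(0.56683, -1.58312))  # below the burn altitude
```

Because the mass is rounded, burn_altitude differs from the printed Alt_burn values in the third decimal place.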


[1] Shamefully, I can't find a source for this equation. I have a degree in Applied Physics with an emphasis in Astrophysics and over 1000 hours of KSP - so I am absolutely certain this equation is correct, but I cannot for the life of me remember where I got it from.

[2] This brand of suicide burn is also referred to as a hoverslam. SpaceX coined this term because when landing a Falcon 9 booster, one engine at minimum thrust generates a TWR greater than 1.00, meaning there is no capacity for a powered descent like that used by the Apollo missions. Consequently, there is only an instantaneous hover - hence hoverslam.

William Miller answered Nov 15 '22 04:11