I have a custom environment for keras-rl with the following configuration in the constructor:
import numpy
from gym import spaces

def __init__(self, data):
    # Start counting episodes from the first one
    self.episode = 1
    # Store the data used to compute rewards
    self.data = data
    # Observations are unbounded: low and high are -inf/+inf vectors
    self.low = numpy.array([-numpy.inf])
    self.high = numpy.array([+numpy.inf])
    self.observation_space = spaces.Box(self.low, self.high, dtype=numpy.float32)
    # Define the action space as 3 discrete actions (I want them to be 0, 1 and 2)
    self.action_space = spaces.Discrete(3)
    self.currentObservation = 0
    self.limit = len(data)
    # Initialize the values to be returned by the environment
    self.reward = None
As you can see, my agent can perform 3 actions. Depending on the action, a different reward is calculated in the step() function below:
def step(self, action):
    assert self.action_space.contains(action)
    # Reset the reward for this step
    self.reward = 0
    # Get the potential gain for the current observation
    self.possibleGain = self.data.iloc[self.currentObservation]['delta_next_day']
    # If the action is 1, the reward is the gain minus the operation cost
    # (self.operationCost is defined elsewhere in my environment)
    if action == 1:
        self.reward = self.possibleGain - self.operationCost
    # If the action is 2, the reward is the negated gain minus the operation cost
    elif action == 2:
        self.reward = (-self.possibleGain) - self.operationCost
    # If the action is 0, there is no reward
    elif action == 0:
        self.reward = 0
    # Finish the episode and move to the next observation, wrapping around at the end of the data
    self.done = True
    self.episode += 1
    self.currentObservation += 1
    if self.currentObservation >= self.limit:
        self.currentObservation = 0
    # Return the observation, the reward, the done flag and an empty info dict
    return self.getObservation(), self.reward, self.done, {}
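getObservation() is not shown here; it just builds the observation for the current row, roughly like this (simplified):

def getObservation(self):
    # Wrap the current row's value in a 1-element float32 array,
    # matching the Box observation space declared above
    return numpy.array([self.data.iloc[self.currentObservation]['delta_next_day']],
                       dtype=numpy.float32)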
The problem is that if I print the actions at every episode, they are 0, 2, and 4, while I want them to be 0, 1, and 2. How can I force the agent to recognize only these 3 actions with keras-rl?
I am not sure why self.action_space = spaces.Discrete(3) is giving you the actions 0, 2, and 4; I cannot reproduce the error with the code snippet you posted.
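For reference, sampling directly from a Discrete(3) space only ever yields 0, 1, or 2:

import gym

actions = gym.spaces.Discrete(3)
for i in range(10):
    print(actions.sample())  # always prints 0, 1 or 2

so the 0/2/4 values are likely coming from somewhere else in your setup. As a workaround, I would suggest defining your action space as a Box with an integer dtype, with low=0 and high=2 so the sampled actions are exactly 0, 1 and 2 (np.int is deprecated in recent NumPy, so use a concrete integer dtype such as np.int64):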
self.action_space = gym.spaces.Box(low=np.array([0]), high=np.array([2]), dtype=np.int64)
And this is the kind of output I get when I sample from that action space (the exact values will vary from run to run):
import gym
import numpy as np

actions = gym.spaces.Box(low=np.array([0]), high=np.array([2]), dtype=np.int64)
for i in range(10):
    print(actions.sample())
[1]
[0]
[2]
[2]
[0]
[1]
[1]
[2]
[0]
[1]
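One caveat with this workaround: a sample from a Box space is a one-element numpy array (e.g. array([2])), not a plain int, so it is safest to unwrap the action before the == comparisons in your step(). A minimal sketch:

import numpy as np

action = np.array([2], dtype=np.int64)  # what a Box sample looks like
action = int(action.item())             # unwrap to a plain Python int
print(action == 2)                      # True

If you would rather keep spaces.Discrete(3), it is also worth double-checking the wiring on the keras-rl side: the nb_actions you pass to the agent and the number of units in the model's final layer should both match env.action_space.n (3 here). A minimal sketch of that wiring with a DQNAgent, assuming `env` is your environment and `model` is your Keras network (both defined elsewhere):

from rl.agents.dqn import DQNAgent
from rl.memory import SequentialMemory
from rl.policy import BoltzmannQPolicy

nb_actions = env.action_space.n  # 3 for spaces.Discrete(3)
memory = SequentialMemory(limit=50000, window_length=1)
policy = BoltzmannQPolicy()
# the model's final Dense layer must have nb_actions units
dqn = DQNAgent(model=model, nb_actions=nb_actions, memory=memory,
               nb_steps_warmup=10, target_model_update=1e-2, policy=policy)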
Hope this helps!