
What is the optimal winning strategy for this modified blackjack game?


Questions

Is there a best value to stay on so that I win the greatest percentage of games possible? If so, what is it?

Edit: Is there an exact probability of winning that can be calculated for a given limit, independent of whatever the opponent does? (I haven't done probability & statistics since college). I'd be interested in seeing that as an answer to contrast it with my simulated results.

Edit: Fixed bugs in my algorithm, updated result table.

Background

I've been playing a modified blackjack game with some rather annoying rule tweaks from the standard rules. I've italicized the rules that are different from the standard blackjack rules, as well as included the rules of blackjack for those not familiar.

Modified Blackjack Rules

1. Exactly two human players (dealer is irrelevant)
2. Each player is dealt two cards face down
   - Neither player _ever_ knows the value of _any_ of the opponent's cards
   - Neither player knows the value of the opponent's hand until _both_ have finished the hand
3. The goal is to come as close to a score of 21 as possible. Outcomes:
   - If players A and B have identical scores, the game is a draw
   - If players A and B both have a score over 21 (a bust), the game is a draw
   - If player A's score is <= 21 and player B has busted, player A wins
   - If player A's score is greater than player B's, and neither has busted, player A wins
   - Otherwise, player A has lost (B has won).
4. Cards are worth:
   - Cards 2 through 10 are worth the corresponding number of points
   - Cards J, Q, K are worth 10 points
   - An Ace is worth 1 or 11 points
5. Each player may request additional cards one at a time until:
   - The player doesn't want any more (stay)
   - The player's score, with any Aces counted as 1, exceeds 21 (bust)
   - Neither player knows how many cards the other has used at any time
6. Once both players have either stayed or busted, the winner is determined per rule 3 above.
7. After each hand the entire deck is reshuffled and all 52 cards are in play again

What is a deck of cards?

A deck of cards consists of 52 cards, four each of the following 13 values:

2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K, A

No other property of the cards is relevant.

A Ruby representation of this is:

CARDS = ((2..11).to_a+[10]*3)*4
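
As a quick sanity check on that representation, the sketch below counts what the one-liner produces. It assumes only the `CARDS` expression above, with aces stored as 11 and the face cards folded into the ten-valued cards.

```ruby
# One 52-card deck as point values: 2..10 and the ace (stored as 11) once
# per suit, plus three extra 10s per suit standing in for J, Q, K.
CARDS = ((2..11).to_a + [10] * 3) * 4

puts CARDS.size       # total number of cards
puts CARDS.count(10)  # ten-valued cards: 10, J, Q, K in each of four suits
puts CARDS.count(11)  # aces
```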

Algorithm

I've been approaching this as follows:

- I will always want to hit if my score is 2 through 11, as it is impossible to bust
- For each of the scores 12 through 21 I will simulate N hands against an opponent
  - For these N hands, the score will be my "limit". Once I reach the limit or greater, I will stay.
  - My opponent will follow the exact same strategy
  - I will simulate N hands for every permutation of the sets (12..21), (12..21)
- Print the difference in wins and losses for each permutation as well as the net win loss difference

Here is the algorithm implemented in Ruby:

#!/usr/bin/env ruby
class Array
  def shuffle
    sort_by { rand }
  end

  def shuffle!
    self.replace shuffle
  end

  def score
    sort.each_with_index.inject(0){|s,(c,i)|
      s+c > 21 - (size - (i + 1)) && c==11 ? s+1 : s+c
    }
  end
end

N=(ARGV[0]||100_000).to_i
NDECKS = (ARGV[1]||1).to_i

CARDS = ((2..11).to_a+[10]*3)*4*NDECKS
CARDS.shuffle

my_limits = (12..21).to_a
opp_limits = my_limits.dup

puts " " * 55 + "opponent_limit"
printf "my_limit |"
opp_limits.each do |result|
  printf "%10s", result.to_s
end
printf "%10s", "net"
puts

printf "-" * 8 + " |"
print "  " + "-" * 8
opp_limits.each do |result|
  print "  " + "-" * 8
end
puts

win_totals = Array.new(10)
win_totals.map! { Array.new(10) }

my_limits.each do |my_limit|
  printf "%8s |", my_limit
  $stdout.flush
  opp_limits.each do |opp_limit|

    if my_limit == opp_limit # will be a tie, skip
      win_totals[my_limit-12][opp_limit-12] = 0
      print "        --"
      $stdout.flush
      next
    elsif win_totals[my_limit-12][opp_limit-12] # if previously calculated, print
      printf "%10d", win_totals[my_limit-12][opp_limit-12]
      $stdout.flush
      next
    end

    win = 0
    lose = 0
    draw = 0

    N.times {
      cards = CARDS.dup.shuffle
      my_hand = [cards.pop, cards.pop]
      opp_hand = [cards.pop, cards.pop]

      # hit until I hit limit
      while my_hand.score < my_limit
        my_hand << cards.pop
      end

      # hit until opponent hits limit
      while opp_hand.score < opp_limit
        opp_hand << cards.pop
      end

      my_score = my_hand.score
      opp_score = opp_hand.score
      my_score = 0 if my_score > 21 
      opp_score = 0 if opp_score > 21

      if my_hand.score == opp_hand.score
        draw += 1
      elsif my_score > opp_score
        win += 1
      else
        lose += 1
      end
    }

    win_totals[my_limit-12][opp_limit-12] = win-lose
    win_totals[opp_limit-12][my_limit-12] = lose-win # shortcut for the inverse

    printf "%10d", win-lose
    $stdout.flush
  end
  printf "%10d", win_totals[my_limit-12].inject(:+)
  puts
end

Usage

ruby blackjack.rb [num_iterations] [num_decks]

The script defaults to 100,000 iterations and 1 deck (NDECKS falls back to 1 in the code above). 100,000 iterations take about 5 minutes on a fast MacBook Pro.

Output (N = 100 000)

                                                       opponent_limit
my_limit |        12        13        14        15        16        17        18        19        20        21       net
-------- |  --------  --------  --------  --------  --------  --------  --------  --------  --------  --------  --------
      12 |        --     -7666    -13315    -15799    -15586    -10445     -2299     12176     30365     65631     43062
      13 |      7666        --     -6962    -11015    -11350     -8925      -975     10111     27924     60037     66511
      14 |     13315      6962        --     -6505     -9210     -7364     -2541      8862     23909     54596     82024
      15 |     15799     11015      6505        --     -5666     -6849     -4281      4899     17798     45773     84993
      16 |     15586     11350      9210      5666        --     -6149     -5207       546     11294     35196     77492
      17 |     10445      8925      7364      6849      6149        --     -7790     -5317      2576     23443     52644
      18 |      2299       975      2541      4281      5207      7790        --    -11848     -7123      8238     12360
      19 |    -12176    -10111     -8862     -4899      -546      5317     11848        --    -18848     -8413    -46690
      20 |    -30365    -27924    -23909    -17798    -11294     -2576      7123     18848        --    -28631   -116526
      21 |    -65631    -60037    -54596    -45773    -35196    -23443     -8238      8413     28631        --   -255870

Interpretation

This is where I struggle. I'm not quite sure how to interpret this data. At first glance it seems like always staying at 16 or 17 is the way to go, but I'm not sure if it's that easy. I think it's unlikely that an actual human opponent would stay on 12, 13, and possibly 14, so should I throw out those opponent_limit values? Also, how can I modify this to take into account the variability of a real human opponent? For example, a real human is likely to stay on 15 just based on a "feeling" and may also hit on 18 based on a "feeling".


asked Feb 20 '10 07:02

hobodave

1 Answer

I'm suspicious of your results. For example, if the opponent aims for 19, your data says that the best way to beat him is to hit until you reach 20. This does not pass a basic smell test. Are you sure you don't have a bug? If my opponent is striving for 19 or better, my strategy would be to avoid busting at all costs: stay on anything 13 or higher (maybe even 12?). Going for 20 has to be wrong -- and not just by a small margin, but by a lot.

How do I know that your data is bad? Because the blackjack game you are playing isn't unusual. It's the way a dealer plays in most casinos: the dealer hits up to a target and then stops, regardless of what the other players hold in their hands. What is that target? Stand on hard 17 and hit soft 17. When you get rid of the bugs in your script, it should confirm that the casinos know their business.

When I make the following replacements to your code:

# Replace scoring method.
def score
  s = inject(0) { |sum, c| sum + c }
  return s if s < 21
  n_aces = find_all { |c| c == 11 }.size
  while s > 21 and n_aces > 0
    s -= 10
    n_aces -= 1
  end
  return s
end

# Replace section of code determining hand outcome.
my_score  = my_hand.score
opp_score = opp_hand.score
my_score  = 0 if my_score  > 21
opp_score = 0 if opp_score > 21
if my_score == opp_score
  draw += 1
elsif my_score > opp_score
  win += 1
else
  lose += 1
end
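
To see what the replacement scoring rule does on a few hands, here is a minimal standalone sketch. The method name `hand_score` is an assumption made for illustration (the replacement above is defined as `Array#score`); the logic is meant to mirror it: aces start at 11 and are demoted to 1 only while the hand would otherwise bust.

```ruby
# Standalone version of the scoring rule (hypothetical name `hand_score`).
# Aces are stored as 11 and demoted to 1, one at a time, while the total
# is above 21.
def hand_score(cards)
  total = cards.sum
  aces  = cards.count(11)
  while total > 21 && aces > 0
    total -= 10
    aces  -= 1
  end
  total
end

p hand_score([11, 10])     # ace + ten
p hand_score([11, 11, 9])  # two aces + nine: one ace is demoted
p hand_score([10, 6, 11])  # the ace has to count as 1
p hand_score([10, 9, 5])   # bust, no ace available to demote
```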

The results agree with the behavior of casino dealers: 17 is the optimal target.

n=10000
                                                       opponent_limit
my_limit |        12        13        14        15        16        17        18        19        20        21       net
-------- |  --------  --------  --------  --------  --------  --------  --------  --------  --------  --------  --------
      12 |        --      -843     -1271     -1380     -1503     -1148      -137      1234      3113      6572
      13 |       843        --      -642     -1041     -1141      -770       -93      1137      2933      6324
      14 |      1271       642        --      -498      -784      -662        93      1097      2977      5945
      15 |      1380      1041       498        --      -454      -242      -100       898      2573      5424
      16 |      1503      1141       784       454        --      -174        69       928      2146      4895
      17 |      1148       770       662       242       174        --        38       631      1920      4404
      18 |       137        93       -93       100       -69       -38        --       489      1344      3650
      19 |     -1234     -1137     -1097      -898      -928      -631      -489        --       735      2560
      20 |     -3113     -2933     -2977     -2573     -2146     -1920     -1344      -735        --      1443
      21 |     -6572     -6324     -5945     -5424     -4895     -4404     -3650     -2560     -1443        --
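
The simulated table can be cross-checked against an exact calculation if two simplifications are accepted. Both are assumptions of this sketch, not rules of the game: the deck is treated as infinite (every draw is one of 13 equally likely ranks), and the two players' hands are treated as independent. The numbers will therefore differ slightly from a single-deck simulation. The names `final_dist` and `edge` are made up for this example.

```ruby
# ASSUMPTIONS: infinite deck, independent players. Aces are stored as 1 here.
RANKS = (1..9).to_a + [10] * 4  # 13 equally likely ranks; 10, J, Q, K all count 10

# Best score for a hand, given its hard total (aces as 1) and whether it has an ace.
def best_score(hard, has_ace)
  has_ace && hard + 10 <= 21 ? hard + 10 : hard
end

# Exact distribution of final scores for "hit until score >= limit".
# Busts are recorded as 0. Valid for limits of 12 or more, where a player
# never stops on fewer than two cards.
def final_dist(limit, hard = 0, has_ace = false, memo = {})
  key = [hard, has_ace]
  return memo[key] if memo.key?(key)

  score = best_score(hard, has_ace)
  memo[key] =
    if hard > 21
      { 0 => Rational(1) }
    elsif score >= limit
      { score => Rational(1) }
    else
      dist = Hash.new(Rational(0))
      RANKS.each do |rank|
        final_dist(limit, hard + rank, has_ace || rank == 1, memo).each do |final, prob|
          dist[final] += prob / 13
        end
      end
      dist
    end
end

# Expected (wins - losses) per hand for my limit against the opponent's limit.
def edge(my_limit, opp_limit)
  mine   = final_dist(my_limit)
  theirs = final_dist(opp_limit)
  mine.sum do |my_score, my_prob|
    theirs.sum { |opp_score, opp_prob| my_prob * opp_prob * (my_score <=> opp_score) }
  end
end

(12..21).each do |limit|
  printf("%2d vs 17: %+.4f\n", limit, edge(limit, 17).to_f)
end
```

Multiplying an `edge` value by the number of simulated hands gives a figure comparable to the matching cell of the table above.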

Some miscellaneous comments:

The current design is inflexible. With just a little refactoring, you could achieve a clean separation between the operation of the game (dealing, shuffling, keeping running stats) and player decision making. This would allow you to test various strategies against each other. Currently, your strategies are embedded in loops that are all tangled up in the game operation code. Your experimentation would be better served by a design that allowed you to create new players and set their strategy at will.
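
One possible shape for that separation is sketched below. It is an illustration, not the only reasonable design, and every name in it (`stay_at`, `moody`, `play_hand`, `simulate`) is hypothetical. A strategy is a lambda that receives the current score and answers "hit?"; the game code never looks inside it. The `moody` strategy is an assumed model of a player acting on a "feeling": at each decision it compares the score with a limit drawn at random from a list.

```ruby
DECK = ((2..11).to_a + [10] * 3) * 4

# Aces are stored as 11 and demoted to 1 while the hand would bust.
def hand_score(cards)
  total = cards.sum
  aces  = cards.count(11)
  while total > 21 && aces > 0
    total -= 10
    aces  -= 1
  end
  total
end

# Strategy: keep hitting until the score reaches a fixed limit.
def stay_at(limit)
  ->(score, _rng) { score < limit }
end

# Strategy (assumed model): a fresh random limit at every decision.
def moody(limits)
  ->(score, rng) { score < limits.sample(random: rng) }
end

# Game operation: draw cards for one player; a bust is reported as 0.
def play_hand(deck, hand, strategy, rng)
  hand << deck.pop while hand_score(hand) <= 21 && strategy.call(hand_score(hand), rng)
  score = hand_score(hand)
  score > 21 ? 0 : score
end

# Play n hands of strategy A against strategy B with a freshly shuffled deck each time.
def simulate(n, strat_a, strat_b, rng: Random.new(1))
  tally = { win: 0, lose: 0, draw: 0 }
  n.times do
    deck   = DECK.shuffle(random: rng)
    hand_a = deck.pop(2)
    hand_b = deck.pop(2)
    a = play_hand(deck, hand_a, strat_a, rng)
    b = play_hand(deck, hand_b, strat_b, rng)
    tally[a > b ? :win : (a < b ? :lose : :draw)] += 1
  end
  tally
end

p simulate(10_000, stay_at(17), stay_at(19))
p simulate(10_000, stay_at(17), moody([15, 16, 17, 18]))
```

With this split, testing a new kind of opponent means writing one more lambda and leaving the dealing and bookkeeping code alone.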

answered Sep 20 '22 10:09

FMc


Tags:

language-agnostic

algorithm

ruby

probability

playing-cards