how can I manage a huge list of 100+ million strings? How can i begin to work with such a huge list?
example large list:
cards = [
"2s","3s","4s","5s","6s","7s","8s","9s","10s","Js","Qs","Ks","As"
"2h","3h","4h","5h","6h","7h","8h","9h","10h","Jh","Qh","Kh","Ah"
"2d","3d","4d","5d","6d","7d","8d","9d","10d","Jd","Qd","Kd","Ad"
"2c","3c","4c","5c","6c","7c","8c","9c","10c","Jc","Qc","Kc","Ac"
]
from itertools import combinations
cardsInHand = 7
hands = list(combinations(cards, cardsInHand))
print str(len(hands)) + " hand combinations in texas holdem poker"
With lots and lots of memory. Python lists and strings are actually reasonably efficient, so provided you've got the memory, it shouldn't be an issue.
That said, if what you're storing are specifically poker hands, you can definitely come up with more compact representations. For example, you can use one byte to encode each card, which means you only need one 64 bit int to store an entire hand. You could then store these in a NumPy array, which would be significantly more efficient than a Python list.
For example:
>>> cards_to_bytes = dict((card, num) for (num, card) in enumerate(cards))
>>> import numpy as np
>>> hands = np.zeros(133784560, dtype='7int8') # 133784560 == 52c7
>>> for num, hand in enumerate(itertools.combinations(cards, 7)):
... hands[num] = [cards_to_bytes[card] for card in hand]
And to speed up that last line a bit: hands[num] = map(cards_to_bytes.__getitem__, hand)
This will only require 7 * 133784560 = ~1gb of memory… And that could be cut down if you pack four cards into each byte (I don't know the syntax for doing that off the top of my head…)
If you just want to loop over all possible hands to count them or to find one with a certain property, there is no need to store them all in memory.
You can just use the iterator and not convert to a list:
from itertools import combinations
cardsInHand = 7
hands = combinations(cards, cardsInHand)
n = 0
for h in hands:
n += 1
# or do some other stuff here
print n, "hand combinations in texas holdem poker."
85900584 hand combinations in texas holdem poker.
Another memory-less option which allow you to create a stream of data for processing however you like is to use generators. For example.
Print the total number of hands:
sum (1 for x in combinations(cards, 7))
Print the number of hands with the ace of clubs in it:
sum (1 for x in combinations(cards, 7) if 'Ac' in x)
There's often a trade-off between how long you spend coding and how long your code takes to run. If you're just trying to get something done quickly and don't expect it to run frequently, an approach like you're suggesting is fine. Just make the list huge -- if you don't have enough RAM, your system will churn virtual memory, but you'll probably get your answer faster than learning how to write a more sophisticated solution.
But if this is a system that you expect to be used on a regular basis, you should figure out something other than storing everything in RAM. An SQL database is probably what you want. They can be very complex, but because they are nearly ubiquitous there are plenty of excellent tutorials out there.
You might look to a well-documented framework like django which simplifies access to a database through an ORM layer.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With