
Implement a classic martingale using Python and Pandas

I want to implement a classic martingale using Python and Pandas in a betting system.

Let's say the DataFrame is defined like this:

df = pd.DataFrame(np.random.randint(0,2,100)*2-1, columns=['TossResults'])

so it contains toss results (-1 = lose, 1 = win).

I would like to change the stake (the amount I wager on each bet) using a classic martingale.

The initial stake is 1.

If I lose, the stake will be 2 times the previous stake (multiplier=2).

If I win, the stake will be reset to stake_initial.

I wrote a function:

def stake_martingale_classical(stake_previous, result_previous, multiplier, stake_initial):
    if result_previous == -1: # lose: scale up the previous stake
        stake = stake_previous * multiplier
    elif result_previous == 1: # win: reset to the initial stake
        stake = stake_initial
    else:
        raise ValueError('Error result_previous must be equal to 1 (win) or -1 (lose)')
    return stake

but I don't know how to implement it efficiently using Pandas. I tried this:

initial_stake = 1
df['Stake'] = None
df.loc[0, 'Stake'] = initial_stake
df['TossResultsPrevious'] = df['TossResults'].shift(1) # shifting-lagging
df['StakePrevious'] = df['Stake'].shift(1) # shifting-lagging

but now I need to apply this multi-parameter function along axis 0.

I don't know how to proceed!

I have seen the pandas.DataFrame.applymap function, but it seems to accept only a one-parameter function.

Maybe I'm wrong and using the shift function is not a good idea.

working4coins asked Feb 07 '13 15:02



2 Answers

One slight change of interpretation: you need to mark a loss as 1 and a win as 0.

The first step is to find the edges of the losing runs (steps and edges in the code). You then take the difference between successive step sizes, which gives the length of each losing run, and insert those values back into a copy of the original data (toss2). The cumsum of toss2 then gives the current length of your losing streak, so your next bet is 2 ** cumsum(toss2).

The numpy version is faster than the pandas version, but the factor depends on N (~8 for N=100 and ~2 for N > 10000).
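
The streak-length idea can be checked on a short hand-made sequence. The sketch below is not the method used in this answer: it swaps in a groupby-based way of counting consecutive losses, and the nine tosses are made up for illustration. It only shows that "next bet = 2 ** current losing streak" produces the expected stakes.

```python
import pandas as pd

# Made-up toss results, using this answer's encoding: 1 = loss, 0 = win
toss = pd.Series([0, 1, 1, 1, 0, 0, 1, 1, 0])

# Every win starts a new group, so losses accumulate only within a run
group = (toss == 0).cumsum()
streak = toss.groupby(group).cumsum()   # current losing-streak length

next_bet = 2 ** streak                  # stake to place after each toss
print(pd.DataFrame({'toss': toss, 'streak': streak, 'next_bet': next_bet}))
```

The streak column resets to 0 on every win, so next_bet falls back to 1 there and doubles on each consecutive loss.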


pandas

Using pandas.Series:

import numpy as np
import pandas as pd
toss = np.random.randint(0,2,100)

toss = pd.Series(toss)

steps = (toss.cumsum() * toss).diff() # mask out the cumsum where we won [0 1 2 3 0 0 4 5 6 ... ]
edges = steps < 0 # find where the cumsum steps down -> where we won
dsteps = steps[edges].diff() # find the (negative) length of each losing streak
dsteps.iloc[0] = steps[edges].iloc[0] # fix the length of the first run, which is now NaN
toss2 = toss.copy() # get a copy of the toss series
toss2[edges] = dsteps.astype(int).to_numpy() # insert the lengths of the losing streaks into the copy of the toss results
bets = 2 ** (toss2).cumsum() # compute the wagers

res = pd.DataFrame({'toss': toss,
                    'toss2': toss2,
                    'runs': toss2.cumsum(),
                    'next_bet': bets})

numpy

This is the pure numpy version (my native language, as it were). It takes a bit of finagling to get the arrays to line up, which pandas does for you.

import numpy as np

toss = np.random.randint(0,2,100)

steps = np.diff(np.cumsum(toss) * toss)
edges = steps < 0
edges_shift = np.append(False, edges) # np.diff drops one element; pad so the mask lines up with toss
init_step = steps[edges][0]
toss2 = np.array(toss)
toss2[edges_shift] = np.append(init_step, np.diff(steps[edges]))
bets = 2 ** np.cumsum(toss2)

fmt_dict = {1:'l', 0:'w'}
for t, b in zip(toss, bets):
    print(fmt_dict[t] + '-> {0:d}'.format(b))
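
To see what each intermediate array holds, the same steps can be traced on a short sequence. This is only an illustrative sketch: the nine-element input is made up, and it assumes the boolean mask is padded to the full length of toss with np.append(False, edges), since recent NumPy versions reject a boolean index whose length differs from the array.

```python
import numpy as np

toss = np.array([0, 1, 1, 1, 0, 0, 1, 1, 0])  # made-up data: 1 = loss, 0 = win

masked = np.cumsum(toss) * toss        # [0 1 2 3 0 0 4 5 0]
steps = np.diff(masked)                # [ 1  1  1 -3  0  4  1 -5]
edges = steps < 0                      # True where a losing run has just ended
edges_shift = np.append(False, edges)  # realign with toss (diff drops one element)

toss2 = toss.copy()
# Negative run lengths [-3, -2] go at the wins that end each losing run
toss2[edges_shift] = np.append(steps[edges][0], np.diff(steps[edges]))

streak = np.cumsum(toss2)              # current losing-streak length
bets = 2 ** streak                     # next bet after each toss
print(toss2)
print(streak)
print(bets)
```

Each negative entry in toss2 cancels the losses accumulated so far, which is why the cumulative sum drops back to zero at every win.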

pandas output

In [65]: res
Out[65]: 
    next_bet  runs  toss  toss2
0          1     0     0      0
1          2     1     1      1
2          4     2     1      1
3          8     3     1      1
4         16     4     1      1
5          1     0     0     -4
6          1     0     0      0
7          2     1     1      1
8          4     2     1      1
9          1     0     0     -2
10         1     0     0      0
11         2     1     1      1
12         4     2     1      1
13         1     0     0     -2
14         1     0     0      0
15         2     1     1      1
16         1     0     0     -1
17         1     0     0      0
18         2     1     1      1
19         1     0     0     -1
20         1     0     0      0
21         1     0     0      0
22         2     1     1      1
23         1     0     0     -1
24         2     1     1      1
25         1     0     0     -1
26         1     0     0      0
27         1     0     0      0
28         2     1     1      1
29         4     2     1      1
30         1     0     0     -2
31         2     1     1      1
32         4     2     1      1
33         1     0     0     -2
34         1     0     0      0
35         1     0     0      0
36         1     0     0      0
37         2     1     1      1
38         4     2     1      1
39         1     0     0     -2
40         2     1     1      1
41         4     2     1      1
42         8     3     1      1
43         1     0     0     -3
44         1     0     0      0
45         1     0     0      0
46         1     0     0      0
47         2     1     1      1
48         1     0     0     -1
49         2     1     1      1
50         1     0     0     -1
51         1     0     0      0
52         1     0     0      0
53         1     0     0      0
54         1     0     0      0
55         2     1     1      1
56         1     0     0     -1
57         1     0     0      0
58         1     0     0      0
59         1     0     0      0
60         1     0     0      0
61         2     1     1      1
62         1     0     0     -1
63         2     1     1      1
64         4     2     1      1
65         8     3     1      1
66        16     4     1      1
67        32     5     1      1
68         1     0     0     -5
69         2     1     1      1
70         1     0     0     -1
71         2     1     1      1
72         4     2     1      1
73         1     0     0     -2
74         2     1     1      1
75         1     0     0     -1
76         1     0     0      0
77         2     1     1      1
78         4     2     1      1
79         1     0     0     -2
80         1     0     0      0
81         2     1     1      1
82         1     0     0     -1
83         1     0     0      0
84         1     0     0      0
85         1     0     0      0
86         2     1     1      1
87         4     2     1      1
88         8     3     1      1
89        16     4     1      1
90        32     5     1      1
91        64     6     1      1
92         1     0     0     -6
93         1     0     0      0
94         1     0     0      0
95         1     0     0      0
96         2     1     1      1
97         1     0     0     -1
98         1     0     0      0
99         1     0     0      0

numpy output

(different seed than the pandas results)

(result -> next bet):
w->  1
l->  2
w->  1
w->  1
l->  2
w->  1
l->  2
w->  1
l->  2
l->  4
w->  1
l->  2
w->  1
l->  2
l->  4
w->  1
w->  1
w->  1
l->  2
l->  4
l->  8
w->  1
l->  2
l->  4
w->  1
l->  2
l->  4
w->  1
w->  1
l->  2
w->  1
w->  1
w->  1
w->  1
l->  2
l->  4
w->  1
w->  1
l->  2
l->  4
l->  8
w->  1
w->  1
l->  2
l->  4
w->  1
w->  1
w->  1
w->  1
w->  1
w->  1
l->  2
w->  1
l->  2
w->  1
l->  2
w->  1
w->  1
w->  1
w->  1
w->  1
w->  1
l->  2
l->  4
l->  8
l->  16
w->  1
l->  2
l->  4
w->  1
w->  1
w->  1
w->  1
l->  2
w->  1
w->  1
l->  2
w->  1
w->  1
w->  1
l->  2
w->  1
w->  1
w->  1
w->  1
w->  1
w->  1
l->  2
l->  4
l->  8
w->  1
w->  1
l->  2
l->  4
l->  8
w->  1
l->  2
l->  4
w->  1
l->  2
tacaswell answered Nov 15 '22 08:11



Pandas gets its biggest efficiency wins when you can use vectorized operations, but I think this problem requires iteration. A solution using pandas:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randint(0,2,100)*2-1, columns=['TossResults'])
initial_stake = 1
df['Stake'] = initial_stake

# double the stake after each loss; rows that follow a win keep initial_stake
for i in range(1, df.shape[0]):
    if df.loc[i-1, 'TossResults'] == -1:
        df.loc[i, 'Stake'] = 2 * df.loc[i-1, 'Stake']
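
The same iteration can also be run over a plain NumPy array and assigned back to the DataFrame in one step, which avoids writing to the DataFrame element by element. The sketch below is one possible way to do that, not part of the original answer: the function name martingale_stakes and the sample results are hypothetical, and the multiplier and stake_initial parameters mirror the ones in the question.

```python
import numpy as np

def martingale_stakes(results, stake_initial=1, multiplier=2):
    """Return the stake placed on each bet, given results of -1 (lose) or 1 (win)."""
    stakes = np.empty(len(results), dtype=float)
    stake = stake_initial
    for i, result in enumerate(results):
        stakes[i] = stake                 # stake placed on bet i
        if result == -1:                  # lose: scale up for the next bet
            stake = stake * multiplier
        elif result == 1:                 # win: reset for the next bet
            stake = stake_initial
        else:
            raise ValueError("results must contain only 1 (win) or -1 (lose)")
    return stakes

# Made-up results for illustration
results = np.array([1, -1, -1, -1, 1, 1, -1, -1, 1])
print(martingale_stakes(results))
```

With the question's DataFrame this could be used as df['Stake'] = martingale_stakes(df['TossResults'].to_numpy()).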
DanB answered Nov 15 '22 08:11
