Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Aggregating customer spend without any customer ID

I have 2 columns as below. The first column is spend, and the second column is months from offer. Unfortunately there is no ID to identify each customer. In the case below, there are three customers. e.g. The first 5 rows represent customer 1, the next 3 rows are customer 2, and then final 7 rows are customer 3. You can tell by looking at the months_from_offer, which go from -x to x months for each customer (x is not necessarily the same for each customer, as shown here where x=2,1,3 respectively for customers 1,2,3).

What I am looking to do is calculate the difference in post offer spend vs pre-offer spend for each customer. I don't care about the individual customers themselves, but I would like an overview - e.g. 10 customers had a post/pre difference in between $0-$100.

As an example with the data below, to calculate the post/pre offer difference for customer 1, it is -$10 - $32 + $23 + $54 = $35

for customer 2: -$21 + $87 = $66

for customer 3: -$12 - $83 - $65 + $80 + $67 + $11 = -$2

spend  months_from_offer
$10    -2
$32    -1
$43     0
$23     1
$54     2
$21    -1
$23     0
$87     1
$12    -3
$83    -2
$65    -1
$21     0
$80     1
$67     2
$11     3
like image 623
novice Avatar asked Dec 30 '22 23:12

novice


1 Answers

You can identify the customers using the following and then groupby customer:

df['customer'] = df['months_from_offer'].cumsum().shift().eq(0).cumsum().add(1)
#Another way to calculate customer per @teylyn method
#df['customer'] = np.sign(df['months_from_offer']).diff().lt(0).cumsum().add(1)

df['amount'] = df['spend'].str[1:].astype(int) * np.sign(df['months_from_offer']
df.groupby('customer')['amount'].sum().reset_index()

Output:

   customer  amount
0         1      35
1         2      66
2         3      -2

How it is done:

  spend  months_from_offer  customer  amount
0    $10                 -2         1     -10
1    $32                 -1         1     -32
2    $43                  0         1       0
3    $23                  1         1      23
4    $54                  2         1      54
5    $21                 -1         2     -21
6    $23                  0         2       0
7    $87                  1         2      87
8    $12                 -3         3     -12
9    $83                 -2         3     -83
10   $65                 -1         3     -65
11   $21                  0         3       0
12   $80                  1         3      80
13   $67                  2         3      67
14   $11                  3         3      11
  • Calculate 'customer' column using cumsum, shift and eq and add to start at customer 1.
  • Calculate 'amount' using string manipulation and multiply by np.sign 'month from offer'
  • sum 'amount' with groupby 'customer'
like image 113
Scott Boston Avatar answered Jan 06 '23 02:01

Scott Boston