
Decrementing money balance stored on master server from numerous backends? (distributed counter, eh?)

I have some backend servers located in two different datacenters (in the USA and in Europe). These servers just deliver ads on a CPM basis.

Besides that, I have a big, fat master MySQL server that stores the money balances of advertisers' ad campaigns. Again, all ad campaigns are delivered on a CPM basis.

On every impression served from any of the backends, I have to decrement the ad campaign's money balance by the impression price.

For example, the price per impression is 1 cent. Backend A has delivered 50 impressions and will decrement the money balance by 50 cents. Backend B has delivered 30 impressions and will decrement the money balance by 30 cents.

So, the main problems as I see them are:

  • The backends serve about 2-3K impressions every second, so decrementing the money balance on the fly in MySQL is not a good idea, imho.

  • The backends are located in US and EU datacenters, while the MySQL master server is located in the USA. Network latency could be a problem for [EU backend] <-> [US master].

As possible solutions I see:

  • Using Cassandra as distributed counter storage. I will try to avoid this solution for as long as possible.

  • Having each backend reserve part of the money. For example, backend A connects to the master and tries to reserve $1. Once $1 is reserved and stored locally on the backend (in a local Redis, for example), decrementing it is extremely fast. The main problem I see is returning the money from the backend to the master server if the backend is removed from the delivery scheme ("disconnected" from the balancer). Anyway, it seems to be a very nice solution and would let me stay within the current technology stack.

  • Any suggestions?

UPD: One important addition. It is not that important to deliver ad impressions with high precision. We can deliver more impressions than requested, but never fewer.
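
The reservation idea from the list above could look roughly like the following. This is only a minimal Python sketch under stated assumptions: `Master` stands in for the master MySQL server, the `local` dictionary stands in for the local Redis, and the class names, method names, and the 100-cent chunk size are all hypothetical. Amounts are kept in integer cents.

```python
class Master:
    """Hypothetical stand-in for the master MySQL server (balances in cents)."""

    def __init__(self, balances):
        self.balances = dict(balances)

    def reserve(self, campaign_id, cents):
        """Grant up to `cents` from the campaign balance; may grant less, or 0."""
        granted = min(cents, self.balances[campaign_id])
        self.balances[campaign_id] -= granted
        return granted

    def release(self, campaign_id, cents):
        """Take back the unspent part of a reservation."""
        self.balances[campaign_id] += cents


class Backend:
    """Hypothetical ad server; `local` plays the role of the local Redis."""

    def __init__(self, master, chunk_cents=100):
        self.master = master
        self.chunk_cents = chunk_cents
        self.local = {}

    def serve(self, campaign_id, price_cents):
        """Return True if the impression is paid for by the local reservation."""
        if self.local.get(campaign_id, 0) < price_cents:
            # Local reservation exhausted: one round trip to the master.
            granted = self.master.reserve(campaign_id, self.chunk_cents)
            self.local[campaign_id] = self.local.get(campaign_id, 0) + granted
        if self.local[campaign_id] < price_cents:
            return False  # campaign balance is used up
        self.local[campaign_id] -= price_cents
        return True

    def shutdown(self):
        """Return unspent money when the backend leaves the balancer."""
        for campaign_id, cents in self.local.items():
            self.master.release(campaign_id, cents)
        self.local.clear()


master = Master({"campaign-1": 250})
backend_a, backend_b = Backend(master), Backend(master)
served_a = sum(backend_a.serve("campaign-1", 1) for _ in range(50))
served_b = sum(backend_b.serve("campaign-1", 1) for _ in range(30))
backend_a.shutdown()
backend_b.shutdown()
print(served_a, served_b, master.balances["campaign-1"])
```

With this scheme a backend contacts the master once per chunk rather than once per impression. A backend that dies without calling `shutdown()` would strand its reservation, so a real setup would probably need some expiry or reconciliation step, which this sketch leaves out.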

Kirzilla asked Jan 21 '14 05:01


2 Answers

How about instead of decrementing balance, you keep a log of all reported work from each backend, and then calculate balance when you need it by subtracting the sum of all reported work from the campaign's account?

Tables:

campaign (campaign_id, budget, ...)
impressions (campaign_id, backend_id, count, ...)

Report work:

INSERT INTO impressions (campaign_id, backend_id, count)
VALUES ($campaign_id, $backend_id, $served_impressions);

Calculate balance of a campaign only when necessary:

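SELECT campaign.budget - COALESCE(SUM(impressions.count), 0) * $impression_price AS balance
FROM campaign LEFT JOIN impressions USING (campaign_id)
WHERE campaign.campaign_id = $campaign_id
GROUP BY campaign.campaign_id, campaign.budget;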
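
To try the log-and-sum approach end to end, here is a small runnable sketch. It assumes Python's built-in `sqlite3` module with an in-memory database in place of MySQL, a budget stored in integer cents, and two hypothetical helper functions, `report_work` and `balance`. The figures reuse the example from the question (50 and 30 impressions at 1 cent each).

```python
import sqlite3

# In-memory SQLite stands in for the master MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE campaign (campaign_id INTEGER PRIMARY KEY, budget INTEGER);
    CREATE TABLE impressions (campaign_id INTEGER, backend_id TEXT, count INTEGER);
""")
conn.execute("INSERT INTO campaign (campaign_id, budget) VALUES (1, 1000)")


def report_work(campaign_id, backend_id, served_impressions):
    """Append one batch of served impressions; nothing is updated in place."""
    conn.execute(
        "INSERT INTO impressions (campaign_id, backend_id, count) VALUES (?, ?, ?)",
        (campaign_id, backend_id, served_impressions),
    )


def balance(campaign_id, impression_price):
    """Budget minus the sum of all reported work, computed only on demand."""
    row = conn.execute(
        """
        SELECT c.budget - COALESCE(SUM(i.count), 0) * ?
        FROM campaign AS c
        LEFT JOIN impressions AS i ON i.campaign_id = c.campaign_id
        WHERE c.campaign_id = ?
        GROUP BY c.campaign_id, c.budget
        """,
        (impression_price, campaign_id),
    ).fetchone()
    return row[0]


report_work(1, "A", 50)
report_work(1, "B", 30)
print(balance(1, 1))
```

Because each backend only appends rows, batches can be reported every few seconds instead of once per impression, and concurrent reports never contend for the same balance row.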
lanzz answered Sep 21 '22 00:09


This is perhaps the most classical ad-serving/impression-counting problem out there. You're basically trying to balance a few goals:

  1. Not under-serving ad inventory, thus not making as much money as you could.
  2. Not over-serving ad inventory, thus serving for free since you can't charge the customer for your mistake.
  3. Not serving the impressions too quickly, because customers usually want an ad to run throughout a given calendar period, and serving them all in one hour between 2 and 3 AM makes those customers unhappy and doesn't do them any good.

This is tricky because you don't necessarily know how many impressions will be available for a given spot (since it depends on traffic), and it gets even more tricky if you do CPC instead of CPM, since you then introduce another unknowable variable of click-through rate.

There isn't a single "right" pattern for this, but what I have seen to be successful through my years of consulting is:

  • Treat the backend database as your authoritative store. Partition it by customer as necessary to support your goals for scalability and fault tolerance (limiting possible outages to a fraction of customers). The database knows that you have an ad insertion order for e.g. 1000 impressions over the course of 7 days. It is periodically updated (minutes to hours) to reflect the remaining inventory and some basic stats to bootstrap the cache in case of cache loss, such as actual

  • Don't bother with money balances at the ad server level. Deal with impression counts, rates, and targets only. Settle that to money balances after the fact through logging and offline processing.

  • Serve ad inventory from a very lightweight and fast cache (near the web servers) which caches the impression remaining count and target serving velocity of an insertion order, and calculates the actual serving velocity.

  • Log all served impressions with relevant data.

  • Periodically collect serving velocities and push them back to the database.

  • Periodically collect logs and calculate actual served inventory and push it back to the database. (You may need to recalculate from logs due to outages, DoSes, spam, etc.)
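
The cache described in the list above might be sketched as follows. This is a hedged illustration, not the setup the answer describes in detail: the class `InsertionOrderCache`, its method names, and the pacing rule (serve only while the served count is below target velocity times elapsed time) are assumptions made for the example.

```python
import time


class InsertionOrderCache:
    """Hypothetical cache entry for one insertion order on an ad server."""

    def __init__(self, remaining, target_per_second, now=None):
        self.remaining = remaining                  # impressions left, loaded from the database
        self.target_per_second = target_per_second  # target serving velocity
        self.started = time.monotonic() if now is None else now
        self.served = 0      # impressions served since this entry was loaded
        self.unflushed = 0   # served but not yet pushed back to the database

    def actual_velocity(self, now=None):
        """Impressions per second actually served since the entry was loaded."""
        now = time.monotonic() if now is None else now
        elapsed = now - self.started
        return self.served / elapsed if elapsed > 0 else 0.0

    def try_serve(self, now=None):
        """Serve one impression unless inventory is gone or we are ahead of schedule."""
        now = time.monotonic() if now is None else now
        allowed_so_far = self.target_per_second * (now - self.started)
        if self.remaining <= 0 or self.served >= allowed_so_far:
            return False
        self.remaining -= 1
        self.served += 1
        self.unflushed += 1
        return True

    def flush(self, now=None):
        """Return (served count, actual velocity) for the periodic push to the database."""
        count, self.unflushed = self.unflushed, 0
        return count, self.actual_velocity(now)


# Ten seconds after loading, a target of 2 impressions/second allows 20 in total.
cache = InsertionOrderCache(remaining=1000, target_per_second=2.0, now=0.0)
results = [cache.try_serve(now=10.0) for _ in range(25)]
print(results.count(True), cache.remaining, cache.flush(now=10.0))
```

The ad server only touches in-memory counters on the hot path; the values returned by `flush` are what a periodic job would push back to the database, with the logs remaining the source for any later recalculation.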

jeremycole answered Sep 22 '22 00:09