MySQL
Lets say there is a credit card processing company. Every time a credit card is used a row gets inserted into a table.
create table tran(
id int,
tran_dt datetime,
card_id int,
merchant_id int,
amount int
);
One wants to know what cards have been used 3+ times in any 15 minute window at the same merchant.
My attempt:
select card_id, date(tran_dt), hour(tran_dt), merchant_id, count(*)
from tran
group by card_id, date(tran_dt), hour(tran_dt), merchant_id
having count(*)>=3
The first problem is that would give excessive transactions per hour, not per a 15 minute window. The second problem is that would not catch transactions that cross the hour mark ie at 1:59pm and 2:01pm.
To make this simpler, it would ok to split up the hour into 5 minute increments. So we would not have to check 1:00-1:15pm, 1:01-1:16pm, etc. It would be ok to check 1:00-1:15pm, 1:05-1:20pm, etc., if that is easier.
Any ideas how to fix the sql? I have a feeling maybe I need sql window functions, that are not yet available in MySQL. Or write a stored procedure that can look at each 15 block.
http://sqlfiddle.com/#!9/f2d74/1
SQL Server COUNT() Function The COUNT() function returns the number of records returned by a select query. Note: NULL values are not counted.
We use system function now() to get the latest datetime value, and INTERVAL clause to calculate a date 7 days in the past.
For MySQL 5.7 or later, run this query to determine the THREAD_ID of the blocking transaction by the substituting blocking_thread processlist ID: SELECT THREAD_ID FROM performance_schema. threads WHERE PROCESSLIST_ID = blocking_thread; 4.
You can convert the date/time to seconds and do arithmetic on the seconds to get the value within a 15 minute clock interval:
select card_id, min(date(tran_dt)) as first_charge_time, merchant_id, count(*)
from tran
group by card_id, floor(to_seconds(tran_dt) / (60 * 15)), merchant_id
having count(*) >= 3;
The above uses to_seconds()
. In earlier versions of MySQL, you can use unix_timestamp()
.
Getting any 15 minute interval is more challenging. You can express the query as:
select t1.*, count(*) as numTransactions
from tran t1 join
tran t2
on t1.merchant_id = t2.merchanti_d and
t1.card_id = t2.card_id and
t2.tran_dt >= t1.tran_dt and
t2.tran_dt < t1.tran_dt + interval 15 minute
group by t1.id
having numTransactions >= 3;
Performance of this query might be problematic. An index on trans(card_id, merchant_id, tran_dt)
should help a lot.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With