Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fetch range from days

Tags:

php

mysql

I have this table structure:

EDIT more complex example: add hidden range

category|   day      |   a   |
--------|------------|-------|
1       | 2012-01-01 |   4   |
1       | 2012-01-02 |   4   |
1       | 2012-01-03 |   4   |
1       | 2012-01-04 |   4   |
1       | 2012-01-05 |   5   |
1       | 2012-01-06 |   5   |
1       | 2012-01-07 |   5   |
1       | 2012-01-08 |   4   |
1       | 2012-01-09 |   4   |
1       | 2012-01-10 |   4   |
1       | 2012-01-11 |   5   |
1       | 2012-01-12 |   5   |
1       | 2012-01-16 |   5   |
1       | 2012-01-17 |   5   |
1       | 2012-01-18 |   5   |
1       | 2012-01-19 |   5   |
...

with 'category-day' as unique keys. I would extract a range of dates, for each category, according with column "a" and given limit range, like so:

1,2012-01-01|2012-01-04,4
1,2012-01-05|2012-01-07,5
1,2012-01-08|2012-01-10,4
1,2012-01-11|2012-01-12,5
1,2012-01-13|2012-01-15,0
1,2012-01-16|2012-01-19,5

or similar.

I search the best way for do it. Using only mysql preferably but also with a little bit of php.

NOTE1: not all day are inserted: between two days non-contiguos could not be other days. In this case I would in output the missed range with column "a" = 0.

NOTE2: I did it with a simple query and some rows of php but I don't like it because my simple algorithm need a cycle for each day in range multiplied for each category found. If range is too big and there are too much categories, that's not so good.

FINAL EDIT: OK! After reading all comments and answers, I think not exists a valid, efficient and, at same time, readable solution. So Mosty Mostacho answer is a no 100% valid solution, but it has 100% valid suggestions. Thank you all.

like image 978
Luca Rainone Avatar asked Mar 30 '12 15:03

Luca Rainone


2 Answers

New edit:

As I told you in a comment, I strongly recommend you to use the quick query and then process the missing dates in PHP as that would be faster and more readable:

select
  concat(@category := category, ',', min(day)) col1,
  concat(max(day), ',', @a := a) col2
from t, (select @category := '', @a := '', @counter := 0) init
where @counter := @counter + (category != @category or a != @a)
group by @counter, category, a

However, if you still want to use the query version, then try this:

select
  @counter := @counter + (category != @category or a != @a) counter,
  concat(@category := category, ',', min(day)) col1,
  concat(max(day), ',', @a := a) col2
from (
  select distinct s.day, s.category, coalesce(t1.a, 0) a
  from (
    select (select min(day) from t) + interval val - 1 day day, c.category
    from seq s, (select distinct category from t) c
    having day <= (select max(day) from t)
  ) s
  left join t t1 on s.day = t1.day and s.category = t1.category
  where s.day between (
    select min(day) from t t2
    where s.category = t2.category) and (
    select max(day) from t t2
    where s.category = t2.category)
  order by s.category, s.day
) t, (select @category := '', @a := '', @counter := 0) init
group by counter, category, a
order by category, min(day)

Note that MySQL won't allow you to create data on the fly, unless you hardcode UNIONS, for example. This is an expensive process that's why I strongly suggest you to create a table with only an integer field with values from 1 to X, where X is, at least the maximum amount of dates that separate the min(day) and max(day) from your table. If you're not sure about that date, just add 100,000 numbers and you'll be able to generate range periods for over 200 years. In the previous query, this table is seq and the column it has is val.

This results in:

+--------------+--------------+
|     COL1     |     COL2     |
+--------------+--------------+
| 1,2012-01-01 | 2012-01-04,4 |
| 1,2012-01-05 | 2012-01-07,5 |
| 1,2012-01-08 | 2012-01-10,4 |
| 1,2012-01-11 | 2012-01-12,5 |
| 1,2012-01-13 | 2012-01-15,0 |
| 1,2012-01-16 | 2012-01-19,5 |
+--------------+--------------+

Ok, I'm lying. The result is actually returning a counter column. Just disregard it, as removing it (using a derived table) would be even less performant!

like image 121
Mosty Mostacho Avatar answered Oct 18 '22 14:10

Mosty Mostacho


and here's a one liner brutality for you :) (Note: Change the "datt" table name.)

select dd.category,
dd.day as start_day,
(select dp.day from 
    (
        select 1 as n,d1.category,d1.day,d1.a from datt d1 where not exists (
            select * from datt where day = d1.day - INTERVAL 1 DAY and a=d1.a
        )
        union
        select 2 as n,d1.category,d1.day,d1.a from datt d1 where not exists (
            select * from datt where day = d1.day + INTERVAL 1 DAY and a=d1.a
        )
    ) dp where dp.day >= dd.day - INTERVAL (n-2) DAY order by day asc limit 0,1) 
as end_day,
dd.a from (
    select 1 as n,d1.category,d1.day,d1.a from datt d1 where not exists (
        select * from datt where day = d1.day - INTERVAL 1 DAY and a=d1.a
    )
    union
    select 2 as n,d1.category,d1.day,d1.a from datt d1 where not exists (
        select * from datt where day = d1.day + INTERVAL 1 DAY and a=d1.a
    )
) dd
where n=1

and it's output is :

|| 1 || 2012-01-01 || 2012-01-01 || 4 ||
|| 1 || 2012-01-03 || 2012-01-04 || 4 ||
|| 1 || 2012-01-05 || 2012-01-07 || 5 ||
|| 1 || 2012-01-08 || 2012-01-10 || 4 ||
|| 1 || 2012-01-11 || 2012-01-12 || 5 ||

Note: Thats the result for non-existing 2012-01-02 in a 01-12 day table.

like image 39
Taha Paksu Avatar answered Oct 18 '22 14:10

Taha Paksu