Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Done: for Z companies, calculate average value of A over X days. How do I now: return value of B for day X for company with highest average of A?

Tags:

sql

php

mysql

(I'll first explain my problem. The below table (any example query) is available at http://sqlfiddle.com/#!2/8ec17/4 though)

I have a table with stock information, as follows:

sp100_id  _date         bullishness  returnpct
----------------------------------------------
1         2011-03-16    1.01         -0.33
1         2011-03-17    0.85         -1.28
1         2011-03-18    0.89          1.25
1         2011-03-21    1.46          1.21
1         2011-03-22    0.39         -2.53
2         2011-03-16    3.07          1.27
2         2011-03-17    2.09         -0.80
2         2011-03-18    0.91         -0.12
2         2011-03-21    1.50          0.00
2         2011-03-22    2.62          1.10
3         2011-03-16    0.73         -1.13
3         2011-03-17    1.13          1.21
3         2011-03-18    1.12          0.45
3         2011-03-21    1.00          1.01
3         2011-03-22    1.00         -0.53
4         2011-03-16    0.40          1.10
4         2011-03-17    2.40          0.03
4         2011-03-18    3.16         -0.10
4         2011-03-21    0.86          0.50
4         2011-03-22    1.00          0.10

What I need is:

  • for each 3 consecutive days, calculate the average bullishness for each company (sp100_id)
  • take the 3rd day returnpct of the company with the highest average bullishness and store it
  • at the end, calculate the total returnpct by adding up the stored returnpcts

There are 5 _dates in this example, so there folling pairs of consecutive days have to be made:

  • 2011-03-16, 2011-03-17, 2011-03-18
  • 2011-03-17, 2011-03-18, 2011-03-21 (please note, 2011-03-19 isn't in the table)
  • 2011-03-18, 2011-03-21, 2011-03-22

To get back to what I need:

  • For the first three days, the "winner" is company 2 with an average bullsihness of (3.07 + 2.09 + 0.91) / 3 = 2.0233 and 3rd day returnpct of -0.12. For the other 2 "dateranges" the winners are company 4 (with average bullsihness 2.14 and returnpct 0.50) and company 2 again (bullsihness 1.67, returnpct 1.10)
  • The values -0.12, 0.50, 1.10 should be stored
  • The total returnpct would then be -0.12 + 0.50 + 1.10 = 1.48, which should be returned from the query (or script)

Problem 1: Concering the above example, my query returns the 1st day returnpct (1.27) instead of the 3rd day returnpct (-0.12). How can I change this?

Problem 2: In the sqlfiddle, I hardcoded the first group of 3 consecutive days. How can I automate this (perhaps using php) so that I don't have to type all queries manually? Please note there are dates missing in the table. As in the example, the script should just take the next available date in the table (so after 2011-03-18 comes 2011-03-21, not 2011-03-19 since it is not in the table)

Problem 3: In the example I use 3 consecutive days, but ideally a script can be easily changed to take any other number of consecutive days (2, 4, or 8, for instance). Consequently, the returnpct of respectively the 2nd, 4th or 8th day should be stored.

Who can help me out with some of the issues I'm experiencing here? Any help is greatly appreciated :-)

like image 393
Pr0no Avatar asked Aug 18 '12 12:08

Pr0no


1 Answers

The following query goes a long way toward what you want to do. It calculates the 3-day average, then orders within each date by the highest average:

SELECT s.sp100_id, s._date,
       (s.bullishness+splus1.bullishness+splus2.bullishness)/3 as avgb,
       splus2.returnpct
FROM (select s3.*,
             (select min(_date)
                    from stocks s4
                    where s4.sp100_id  = s3.sp100_id and
                          s4._date > s3.dateplus1
                   ) as dateplus2
      from (select s.*,
                   (select min(_date)
                    from stocks s2
                    where s2.sp100_id  = s.sp100_id and
                          s2._date > s._date
                   ) as dateplus1
            from stocks s
           ) s3
     ) s left outer join
     stocks splus1
     on s.sp100_id = splus1.sp100_id and
        s.dateplus1 = splus1._date left outer join
     stocks splus2
     on s.sp100_id = splus2.sp100_id and
         s.dateplus2 = splus2._date
order by 2, 3 desc

At this point, using mysql gets tedious. This would be much easier in a database that supported analytic/windows functions and the "with" statement (which is just about every database apart from mysql: Oracle, Postgres, DB2, SQL Server, for example).

You can do (3) in MySQL with a single query, but it is a pain. You might want to do that in the application layer.

By the way, thank you for setting up a SQL Fiddle. For than reason alone, I'll upvote the question.

like image 179
Gordon Linoff Avatar answered Sep 30 '22 18:09

Gordon Linoff