High performance approach to greatest-n-per-group SQL query

Question

I'm attempting to build an infrastructure for quickly running regressions on demand, pulling apache requests from a database that contains all historic activity on our webservers. To improve coverage by making sure that we still regress requests from our smaller clients, I'd like to ensure a distribution of requests by retrieving at most n (for the sake of this question, say 10) requests for each client.

I found a number of similar questions answered here, the closest of which seemed to be SQL query to return top N rows per ID across a range of IDs, but the answers were for the most part performance-agnostic solutions that I had already tried. For instance, the row_number() analytic function gets us exactly the data we're looking for:

SELECT
    *
FROM
    (
    SELECT
        dailylogdata.*,
        row_number() over (partition by dailylogdata.contextid order by occurrencedate) rn
    FROM
        dailylogdata
    WHERE
        shorturl in (?)
    )
WHERE
    rn <= 10;

However, given that this table contains millions of entries for a given day and this approach necessitates reading all rows from the index that match our selection criteria in order to apply the row_number analytic function, performance is terrible. We end up selecting nearly a million rows, only to throw out the vast majority of them because their row_number exceeded 10. Stats from executing the above query:

|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|| Id  | Operation                            | Name                    | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem | Used-Tmp||
|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
||   0 | SELECT STATEMENT                     |                         |      1 |        |  12222 |00:09:08.94 |     895K|    584K|    301 |       |       |          |         ||
||*  1 |  VIEW                                |                         |      1 |   4427K|  12222 |00:09:08.94 |     895K|    584K|    301 |       |       |          |         ||
||*  2 |   WINDOW SORT PUSHED RANK            |                         |      1 |   4427K|  13536 |00:09:08.94 |     895K|    584K|    301 |  2709K|   743K|   97M (1)|    4096 ||
||   3 |    PARTITION RANGE SINGLE            |                         |      1 |   4427K|    932K|00:22:27.90 |     895K|    584K|      0 |       |       |          |         ||
||   4 |     TABLE ACCESS BY LOCAL INDEX ROWID| DAILYLOGDATA            |      1 |   4427K|    932K|00:22:27.61 |     895K|    584K|      0 |       |       |          |         ||
||*  5 |      INDEX RANGE SCAN                | DAILYLOGDATA_URLCONTEXT |      1 |  17345 |    932K|00:00:00.75 |    1448 |      0 |      0 |       |       |          |         ||
|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|                                                                                                                                                                                 |
|Predicate Information (identified by operation id):                                                                                                                              |
|---------------------------------------------------                                                                                                                              |
|                                                                                                                                                                                 |
|   1 - filter("RN"<=:SYS_B_2)                                                                                                                                                    |
|   2 - filter(ROW_NUMBER() OVER ( PARTITION BY "DAILYLOGDATA"."CONTEXTID" ORDER BY "OCCURRENCEDATE")<=:SYS_B_2)                                                                  |
|   5 - access("SHORTURL"=:P1)                                                                                                                                                    |
|                                                                                                                                                                                 |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

However, if instead we only query for the first 10 results for a specific contextid, we can execute this dramatically faster:

SELECT
    *
FROM
    (
    SELECT
        dailylogdata.*
    FROM
        dailylogdata
    WHERE
        shorturl in (?)
        and contextid = ?
    )
WHERE
    rownum <= 10;

Stats from running this query:

|-------------------------------------------------------------------------------------------------------------------------|
|| Id  | Operation                           | Name                    | Starts | E-Rows | A-Rows |   A-Time   | Buffers ||
|-------------------------------------------------------------------------------------------------------------------------|
||   0 | SELECT STATEMENT                    |                         |      1 |        |     10 |00:00:00.01 |      14 ||
||*  1 |  COUNT STOPKEY                      |                         |      1 |        |     10 |00:00:00.01 |      14 ||
||   2 |   PARTITION RANGE SINGLE            |                         |      1 |     10 |     10 |00:00:00.01 |      14 ||
||   3 |    TABLE ACCESS BY LOCAL INDEX ROWID| DAILYLOGDATA            |      1 |     10 |     10 |00:00:00.01 |      14 ||
||*  4 |     INDEX RANGE SCAN                | DAILYLOGDATA_URLCONTEXT |      1 |      1 |     10 |00:00:00.01 |       5 ||
|-------------------------------------------------------------------------------------------------------------------------|
|                                                                                                                         |
|Predicate Information (identified by operation id):                                                                      |
|---------------------------------------------------                                                                      |
|                                                                                                                         |
|   1 - filter(ROWNUM<=10)                                                                                                |
|   4 - access("SHORTURL"=:P1 AND "CONTEXTID"=TO_NUMBER(:P2))                                                             |
|                                                                                                                         |
+-------------------------------------------------------------------------------------------------------------------------+

In this instance, Oracle is smart enough to stop retrieving data after getting 10 results. I could gather a complete set of contextids and programatically generate a query consisting of one instance of this query for each contextid and union all the whole mess together, but given the sheer number of contextids, we might run into an internal Oracle limitation, and even if not, this approach reeks of kludge.

Does anyone know of an approach that maintains the simplicity of the first query, while retaining performance commensurate with the second query? Also note that I don't actually care about retrieving a stable set of rows; so long as they satisfy my criteria, they're fine for the purposes of a regression.

Edit: Adam Musch's suggestion did the trick. I'm appending performance results with his changes up here since I can't fit them in a comment response to his answer. I'm also using a larger data set for testing this time, here are the (cached) stats from my original row_number approach for comparison:

|-------------------------------------------------------------------------------------------------------------------------------------------------|
|| Id  | Operation                     | Name              | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem ||
|-------------------------------------------------------------------------------------------------------------------------------------------------|
||   0 | SELECT STATEMENT              |                   |      1 |        |  12624 |00:00:22.34 |    1186K|    931K|       |       |          ||
||*  1 |  VIEW                         |                   |      1 |   1163K|  12624 |00:00:22.34 |    1186K|    931K|       |       |          ||
||*  2 |   WINDOW NOSORT               |                   |      1 |   1163K|   1213K|00:00:21.82 |    1186K|    931K|  3036M|    17M|          ||
||   3 |    TABLE ACCESS BY INDEX ROWID| TWTEST            |      1 |   1163K|   1213K|00:00:20.41 |    1186K|    931K|       |       |          ||
||*  4 |     INDEX RANGE SCAN          | TWTEST_URLCONTEXT |      1 |   1163K|   1213K|00:00:00.81 |    8568 |      0 |       |       |          ||
|-------------------------------------------------------------------------------------------------------------------------------------------------|
|                                                                                                                                                 |
|Predicate Information (identified by operation id):                                                                                              |
|---------------------------------------------------                                                                                              |
|                                                                                                                                                 |
|   1 - filter("RN"<=10)                                                                                                                          |
|   2 - filter(ROW_NUMBER() OVER ( PARTITION BY "CONTEXTID" ORDER BY  NULL )<=10)                                                                 |
|   4 - access("SHORTURL"=:P1)                                                                                                                    |
+-------------------------------------------------------------------------------------------------------------------------------------------------+

I've taken the liberty of boiling down Adam's suggestion a bit; here's the modified query...

select
    *
from
    twtest
where
    rowid in (
    select
            rowid
    from (
            select
                    rowid,
                    shorturl,
                    row_number() over (partition by shorturl, contextid
                                                      order by null) rn
            from
                    twtest
    )
    where rn <= 10
    and shorturl in (?)
);

...and stats from its (cached) evaluation:

|--------------------------------------------------------------------------------------------------------------------------------------|
|| Id  | Operation                   | Name              | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem ||
|--------------------------------------------------------------------------------------------------------------------------------------|
||   0 | SELECT STATEMENT            |                   |      1 |        |  12624 |00:00:01.33 |   19391 |       |       |          ||
||   1 |  NESTED LOOPS               |                   |      1 |      1 |  12624 |00:00:01.33 |   19391 |       |       |          ||
||   2 |   VIEW                      | VW_NSO_1          |      1 |   1163K|  12624 |00:00:01.27 |    6770 |       |       |          ||
||   3 |    HASH UNIQUE              |                   |      1 |      1 |  12624 |00:00:01.27 |    6770 |  1377K|  1377K| 5065K (0)||
||*  4 |     VIEW                    |                   |      1 |   1163K|  12624 |00:00:01.25 |    6770 |       |       |          ||
||*  5 |      WINDOW NOSORT          |                   |      1 |   1163K|   1213K|00:00:01.09 |    6770 |   283M|  5598K|          ||
||*  6 |       INDEX RANGE SCAN      | TWTEST_URLCONTEXT |      1 |   1163K|   1213K|00:00:00.40 |    6770 |       |       |          ||
||   7 |   TABLE ACCESS BY USER ROWID| TWTEST            |  12624 |      1 |  12624 |00:00:00.04 |   12621 |       |       |          ||
|--------------------------------------------------------------------------------------------------------------------------------------|
|                                                                                                                                      |
|Predicate Information (identified by operation id):                                                                                   |
|---------------------------------------------------                                                                                   |
|                                                                                                                                      |
|   4 - filter("RN"<=10)                                                                                                               |
|   5 - filter(ROW_NUMBER() OVER ( PARTITION BY "SHORTURL","CONTEXTID" ORDER BY NULL NULL )<=10)                                       |
|   6 - access("SHORTURL"=:P1)                                                                                                         |
|                                                                                                                                      |
|Note                                                                                                                                  |
|-----                                                                                                                                 |
|   - dynamic sampling used for this statement (level=2)                                                                               |
|                                                                                                                                      |
+--------------------------------------------------------------------------------------------------------------------------------------+

As advertised, we're only accessing the dailylogdata table for the fully-filtered rows. I'm concerned that it appears to still be doing a full scan of the urlcontext index based on the number of rows it claims to be selecting (1213K), but given that it's using only using 6770 buffers (this number remains constant even if I increase the number of context-specific results) this may be misleading.

Adam Musch · Accepted Answer

This is kind of a janky solution, but it seems to do what you want: shortcut the index scan as soon as possible, and not read data until it's been qualified both by filtering conditioning and top-n query criteria.

Note that it was tested with a shorturl = condition, not a shorturl IN condition.

with rowid_list as
(select rowid
   from (select *
           from (select rowid,
                        row_number() over (partition by shorturl, contextid
                                           order by null) rn
                   from dailylogdata
                )
          where rn <= 10
        )
  where shorturl = ? 
)
select * 
  from dailylogdata
 where rowid in (select rowid from rowid_list)

The with clause grabs the first 10 rowids filtering a WINDOW NOSORT for each unique combination of shorturl and contextid that meets your criteria. Then it loops over that set of rowids, fetching each by rowid.

----------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                      |     1 |   286 |  1536   (1)| 00:00:19 |
|   1 |  NESTED LOOPS               |                      |     1 |   286 |  1536   (1)| 00:00:19 |
|   2 |   VIEW                      | VW_NSO_1             |   136K|  1596K|   910   (1)| 00:00:11 |
|   3 |    HASH UNIQUE              |                      |     1 |  3326K|            |          |
|*  4 |     VIEW                    |                      |   136K|  3326K|   910   (1)| 00:00:11 |
|*  5 |      WINDOW NOSORT          |                      |   136K|  2794K|   910   (1)| 00:00:11 |
|*  6 |       INDEX RANGE SCAN      | TABLE_REDACTED_INDEX |   136K|  2794K|   910   (1)| 00:00:11 |
|   7 |   TABLE ACCESS BY USER ROWID| TABLE_REDACTED       |     1 |   274 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("RN"<=10)
   5 - filter(ROW_NUMBER() OVER ( PARTITION BY "CLIENT_ID","SCE_ID" ORDER BY NULL NULL
              )<=10)
   6 - access("TABLE_REDACTED"."SHORTURL"=:b1)

High performance approach to greatest-n-per-group SQL query

Tags:

performance

sql

greatest-n-per-group

oracle

oltp

Trevor

1 Answers

Adam Musch

Recent Activity

Donate For Us

High performance approach to greatest-n-per-group SQL query

Tags:

performance

sql

greatest-n-per-group

oracle

oltp

Trevor

1 Answers

Adam Musch

Related questions

Recent Activity

Donate For Us