
Paging in SQL with LIMIT/OFFSET sometimes results in duplicates on different pages

Tags:

sql

I'm developing an online gallery with voting and have separate tables for pictures and votes (for every vote I store the ID of the picture and the ID of the voter). The tables are related like this: PICTURE <--(1:n, using VOTE.picture_id)-- VOTE. I would like to query the pictures table and sort the output by number of votes. This is what I do:

SELECT
    picture.votes_number,
    picture.creation_date,
    picture.author_id,
    picture.author_nickname,
    picture.id,
    picture.url,
    picture.name,
    picture.width,
    picture.height,
    coalesce(anon_1."totalVotes", 0)
FROM picture
LEFT OUTER JOIN
    (SELECT
        vote.picture_id as pid,
        count(*) AS "totalVotes"
     FROM vote
     WHERE vote.device_id = <this is the query parameter>
     GROUP BY pid) AS anon_1
ON picture.id = anon_1.pid
ORDER BY picture.votes_number DESC
LIMIT 10
OFFSET 0

OFFSET is different for different pages, of course.

However, some pictures (same ID) are displayed on more than one page. I guess the reason is the sorting, but I can't construct a better query that avoids the duplicates. Could anybody give me a hint?

Thanks in advance!

Sergey Mikhanov asked Jan 16 '10 15:01



2 Answers

Do you execute one query per page displayed? If so, I suspect that the database doesn't guarantee a consistent order for items with the same number of votes. The first query may return { item 1, item 2 } and a second query may return { item 2, item 1 } if both items have the same number of votes. If those items happen to sit at positions 10 and 11, straddling the page boundary, the same item may appear on page 1 and then again on page 2.

I had such a problem once. If that's also your case, append an extra column to the ORDER BY clause to ensure a consistent ordering of items with the same number of votes, e.g.:

ORDER BY picture.votes_number DESC, picture.id
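
To illustrate the tie-breaker idea, here is a minimal sketch using Python's built-in sqlite3 module rather than the asker's actual database. The reduced picture table, the sample data, and the fetch_page helper are all assumptions made up for the demonstration. With a unique column such as id as the last sort key, the ordering is total, so on unchanged data the pages cannot overlap.

```python
import sqlite3

def fetch_page(conn, page, page_size=10):
    """Return the picture ids on a zero-based page (hypothetical helper)."""
    rows = conn.execute(
        "SELECT id FROM picture "
        "ORDER BY votes_number DESC, id ASC "  # id breaks ties between equal vote counts
        "LIMIT ? OFFSET ?",
        (page_size, page * page_size),
    ).fetchall()
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
# Reduced, assumed version of the picture table: only the columns needed here.
conn.execute(
    "CREATE TABLE picture (id INTEGER PRIMARY KEY, votes_number INTEGER NOT NULL)"
)
# 25 made-up pictures; many of them share the same vote count.
conn.executemany(
    "INSERT INTO picture (id, votes_number) VALUES (?, ?)",
    [(i, i % 3) for i in range(1, 26)],
)

pages = [fetch_page(conn, p) for p in range(3)]
seen = [pid for page in pages for pid in page]
# Every id shows up exactly once across the three pages.
print(len(seen), len(set(seen)))
```

Any unique column works as the final sort key; the primary key is simply the most convenient choice.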

ewernli answered Sep 24 '22 11:09


The simplest explanation is that some data was added, or some votes were cast, while you were looking at different pages.

I am sure that if you sorted by ID or creation_date, this issue would go away.

I.e., there is no issue with your code.
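
The effect described in this answer can be reproduced with a small sketch, again using Python's sqlite3 module with a made-up, reduced picture table and a hypothetical fetch_page helper (none of this comes from the asker's setup). A vote cast between two page requests reorders the rows, so an item already shown on the first page slides onto the second, even though the ORDER BY itself is deterministic.

```python
import sqlite3

PAGE_SIZE = 2

def fetch_page(conn, page):
    """Return the picture ids on a zero-based page (hypothetical helper)."""
    rows = conn.execute(
        "SELECT id FROM picture "
        "ORDER BY votes_number DESC, id ASC "
        "LIMIT ? OFFSET ?",
        (PAGE_SIZE, page * PAGE_SIZE),
    ).fetchall()
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE picture (id INTEGER PRIMARY KEY, votes_number INTEGER NOT NULL)"
)
# Four made-up pictures with distinct vote counts, so there are no ties.
conn.executemany(
    "INSERT INTO picture (id, votes_number) VALUES (?, ?)",
    [(1, 10), (2, 9), (3, 8), (4, 7)],
)

first_page = fetch_page(conn, 0)
# Between the two requests, picture 4 receives votes and jumps to the top.
conn.execute("UPDATE picture SET votes_number = 11 WHERE id = 4")
second_page = fetch_page(conn, 1)

# Ids that the user would see on both pages.
repeated = set(first_page) & set(second_page)
print(first_page, second_page, repeated)
```

Sorting by a column that votes cannot change, such as id, avoids this particular reshuffle, which is presumably what the suggestion above relies on.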

BarsMonster answered Sep 24 '22 11:09