Hey, I have 2 tables in PostgreSql:
1 - documents: id, title 2 - updates: id, document_id, date
and some data:
documents:
| 1 | Test Title |
updates:
| 1 | 1 | 2006-01-01 | | 2 | 1 | 2007-01-01 | | 3 | 1 | 2008-01-01 |
So All updates are pointing to the same document, but all with different dates for the updates.
What I am trying to do is to do a select from the documents table, but also include the latest update based on the date.
How should a query like this look like? This is the one I currently have, but I am listing all updates, and not the latest one as the one I need:
SELECT * FROM documents,updates WHERE documents.id=1 AND documents.id=updates.document_id ORDER BY date
To include; The reason I need this in the query is that I want to order by the date from the updates template!
Edit: This script is heavily simplified, so I should be able to create a query that returns any number of results, but including the latest updated date. I was thinking of using a inner join or left join or something like that!?
First, specify columns from both tables that you want to select data in the SELECT clause. Second, specify the main table i.e., table A in the FROM clause. Third, specify the second table (table B ) in the INNER JOIN clause and provide a join condition after the ON keyword.
In SQL, to fetch data from multiple tables, the join operator is used. The join operator adds or removes rows in the virtual table that is used by SQL server to process data before the other steps of the query consume the data.
You can replace the JOIN keyword with a comma in the FROM clause. What do you do next? There's no ON keyword for you to state the joining condition as there would be when using JOIN , e.g., on which two columns you want to join the tables. In this method, you simply use a WHERE clause to do so.
Use PostgreSQL
extension DISTINCT ON
:
SELECT DISTINCT ON (documents.id) * FROM document JOIN updates ON updates.document_id = document_id ORDER BY documents.id, updates.date DESC
This will take the first row from each document.id
cluster in ORDER BY
order.
Test script to check:
SELECT DISTINCT ON (documents.id) * FROM ( VALUES (1, 'Test Title'), (2, 'Test Title 2') ) documents (id, title) JOIN ( VALUES (1, 1, '2006-01-01'::DATE), (2, 1, '2007-01-01'::DATE), (3, 1, '2008-01-01'::DATE), (4, 2, '2009-01-01'::DATE), (5, 2, '2010-01-01'::DATE) ) updates (id, document_id, date) ON updates.document_id = documents.id ORDER BY documents.id, updates.date DESC
You may create a derived table which contains only the most recent "updates" records per document_id, and then join "documents" against that:
SELECT d.id, d.title, u.update_id, u."date" FROM documents d LEFT JOIN -- JOIN "documents" against the most recent update per document_id ( SELECT recent.document_id, id AS update_id, recent."date" FROM updates INNER JOIN (SELECT document_id, MAX("date") AS "date" FROM updates GROUP BY 1) recent ON updates.document_id = recent.document_id WHERE updates."date" = recent."date" ) u ON d.id = u.document_id;
This will handle "un-updated" documents, like so:
pg=> select * from documents; id | title ----+------- 1 | foo 2 | bar 3 | baz (3 rows) pg=> select * from updates; id | document_id | date ----+-------------+------------ 1 | 1 | 2009-10-30 2 | 1 | 2009-11-04 3 | 1 | 2009-11-07 4 | 2 | 2009-11-09 (4 rows) pg=> SELECT d.id ... id | title | update_id | date ----+-------+-----------+------------ 1 | foo | 3 | 2009-11-07 2 | bar | 4 | 2009-11-09 3 | baz | | (3 rows)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With