I am converting some SQL logic from T-SQL (written in SSMS) to Amazon Redshift. I believe Redshift is a fork of Postgres 8.0.2, so the statement below may not be possible unless using Postgres 9.1:
WITH CTE_ID AS
(
SELECT FULL_NAME, COUNT(DISTINCT ID) as ID_COUNT, MAX(ID) AS MAX_ID
FROM MEMBERS
GROUP BY FULL_NAME
HAVING COUNT(DISTINCT ID) > 1
)
UPDATE a
SET a.ID = b.MAX_ID
FROM MEMBERS a
INNER JOIN CTE_ID b
ON a.FULL_NAME = b.FULL_NAME
If this feature is not supported by Amazon Redshift, would my best option be to create a new "temporary" table and populate it with the values the CTE would generate?
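For reference, the temporary-table fallback described above might look roughly like the sketch below. It assumes the same MEMBERS table and columns as the query above; the table name tmp_max_id is hypothetical and chosen only for illustration. This is not verified against any particular Redshift release.

```sql
-- Sketch only: tmp_max_id is a hypothetical name, not part of the original schema.
-- Step 1: materialize what the CTE would have produced.
CREATE TEMP TABLE tmp_max_id AS
SELECT full_name, MAX(id) AS max_id
FROM members
GROUP BY full_name
HAVING COUNT(DISTINCT id) > 1;

-- Step 2: update the target table from the temporary table.
-- The target table is named once; the SET column is not table-qualified.
UPDATE members
SET id = t.max_id
FROM tmp_max_id t
WHERE members.full_name = t.full_name;

-- Step 3: clean up (temporary tables are also dropped at session end).
DROP TABLE tmp_max_id;
```

The temporary table costs an extra write, so it is mainly worth considering if the intermediate result is reused by more than one statement.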
Specifies a temporary named result set, known as a common table expression (CTE). This is derived from a simple query and defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE or MERGE statement. This clause can also be used in a CREATE VIEW statement as part of its defining SELECT statement.
You can update a table by referencing information in other tables. List these other tables in the FROM clause or use a subquery as part of the WHERE condition. Tables listed in the FROM clause can have aliases. If you need to include the target table of the UPDATE statement in the list, use an alias.
A recursive common table expression (CTE) is a CTE that references itself. A recursive CTE is useful in querying hierarchical data, such as organization charts that show reporting relationships between employees and managers. See Example: Recursive CTE.
The correct syntax is: UPDATE table_name SET column = { expression | DEFAULT } [,...] So your UPDATE statement should look as follows: update t1 set val1 = val3 from t2 inner join t3 on t2.
You can re-write the query as a derived table as mentioned by @a_horse_with_no_name:
UPDATE MEMBERS
SET a.ID = b.MAX_ID
FROM MEMBERS a
INNER JOIN (
SELECT FULL_NAME, COUNT(DISTINCT ID) as ID_COUNT, MAX(ID) AS MAX_ID
FROM MEMBERS
GROUP BY FULL_NAME
HAVING COUNT(DISTINCT ID) > 1
) b
ON a.FULL_NAME = b.FULL_NAME
Existing answers (including the accepted) are invalid. This should work:
UPDATE members AS a
SET id = b.max_id
FROM (
SELECT full_name, max(id) AS max_id
FROM members
GROUP BY full_name
HAVING count(DISTINCT id) > 1
) b
WHERE a.full_name = b.full_name
AND a.id IS DISTINCT FROM b.max_id;
No need for a CTE (though possible). A subquery is simpler.
The target table is only listed once. You'd only repeat it in the FROM clause with a (different) alias for special needs.
Target columns in the SET list cannot be table-qualified.
Unquoted names are folded to lower case in Redshift. UPPER case spelling only adds confusion.
I added the predicate AND a.id IS DISTINCT FROM b.max_id to skip updates on rows that would not change. (Expensive no-op.) You'd only want those in exotic cases to trigger (undeclared) side effects.
More in the Redshift manual for UPDATE.
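Before running the UPDATE, it may help to preview which rows it would touch. The following is a minimal sketch that reuses the same subquery and predicates as the statement above; the column aliases current_id and new_id are hypothetical names added for readability.

```sql
-- Dry run: list the rows the UPDATE above would change, without modifying anything.
-- current_id and new_id are illustrative aliases, not columns of members.
SELECT a.full_name,
       a.id     AS current_id,
       b.max_id AS new_id
FROM   members a
JOIN  (
   SELECT full_name, max(id) AS max_id
   FROM   members
   GROUP  BY full_name
   HAVING count(DISTINCT id) > 1
   ) b ON a.full_name = b.full_name
WHERE  a.id IS DISTINCT FROM b.max_id
ORDER  BY a.full_name, a.id;
```

If the preview returns no rows, the UPDATE would be a no-op and can be skipped.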