I have 2 tables - course that contains id and name of the courses and tagCourse that contains tags for each course.
course tagcourse
------------ ----------------
PK id_course PK tag
name PK, FK id_course
I'd like to write a function that searches courses by given array of tags and returns them ordered by quantity of matching tags. However I don't know how to write it correctly and in an efficient way. Please help me.
ie.
CREATE OR REPLACE FUNCTION searchByTags(tags varchar[])
RETURNS SETOF.....
RETURN QUERY SELECT * FROM course c INNER JOIN tagcourse tc ON c.id_course = tc.id_course
WHERE ??? ORDER BY ???
END....
Yes, order is preserved.
PostgreSQL allows us to define a table column as an array type. The array must be of a valid data type such as integer, character, or user-defined types. To insert values into an array column, we use the ARRAY constructor.
The ORDER BY statement in SQL is used to sort the fetched data in either ascending or descending according to one or more columns. By default ORDER BY sorts the data in ascending order. We can use the keyword DESC to sort the data in descending order and the keyword ASC to sort in ascending order.
The ORDER BY clause in PostgreSQL is used together with the SELECT statement to sort table data. The table data can either be sorted in ascending or descending order. By default, the data is sorted in ascending order.
CREATE OR REPLACE FUNCTION search_by_tags(tags varchar[])
RETURNS TABLE (id_course integer, name text, tag_ct integer)
LANGUAGE sql AS
$func$
SELECT id_course, c.name, ct.tag_ct
FROM (
SELECT tc.id_course, count(*)::int AS tag_ct
FROM unnest($1) x(tag)
JOIN tagcourse tc USING (tag)
GROUP BY 1 -- first aggregate ..
) AS ct
JOIN course c USING (id_course) -- .. then join
ORDER BY ct.tag_ct DESC -- more columns to break ties?
$func$;
Use unnest()
to produce a table from your input array, like already demonstrated by @Clodoaldo.
You don't need plpgsql for this. Simpler with a plain SQL function.
I use unnest($1)
(with positional parameter) instead of unnest(tags)
, since the later is only valid for PostgreSQL 9.2+ in SQL functions (unlike plpgsql). The manual:
In the older numeric approach, arguments are referenced using the syntax
$n
:$1
refers to the first input argument,$2
to the second, and so on. This will work whether or not the particular argument was declared with a name.
count()
returns bigint
. You need to cast it to int
to match the declared return type or declare the the returned column as bigint
to begin with.
Perfect occasion to simplify the syntax a bit with USING
(equi-joins): USING (tag)
instead of ON tc.tag = c.tag
.
It's regularly faster to first aggregate, then join to another table. Reduces the needed join operations.
To address @Clodoaldo's comments, here is a fiddle demonstrating the difference:
db<>fiddle here
Old sqlfiddle
OTOH, if you aggregate after the join, you don't need a subquery. Shorter, but probably slower:
SELECT c.id_course, c.name, count(*)::int AS tag_ct
FROM unnest($1) x(tag)
JOIN tagcourse tc USING (tag)
JOIN course c USING (id_course)
GROUP BY 1
ORDER BY 3 DESC; -- more columns to break ties?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With