My title is terrible, and that's probably why I'm not finding what I want on Google.
What I'm trying to do is export some data from an old in-house blog so I can import it into something else. My issue is that while I can kind of create the sort of JOIN I'm looking for, the match in the second table can contain multiple rows, so I end up with tons of duplicate data. I need to take the results from the second table and concat those (if there are multiple matches) into a single field in the query result. There is no need for a WHERE constraint on the query, I'm trying to retrieve the entire blog_posts table.
Hopefully this abbreviated layout of the table structure will help illustrate:
blog_posts blog_categories
---------------------------------------
post_id post_id
post_content category_id
post_author
And here's some sample data.
blog_posts table data:
post_id post_content post_author
----------------------------------
1 foo1 bob
2 foo2 bob
3 foo3 fred
blog_categories table data:
post_id category_id
--------------------
1 1
1 2
1 6
2 1
3 2
3 4
And what my ideal results would look like would be this:
post_id post_content post_author category_ids
------------------------------------------------
1 foo1 bob 1,2,6
2 foo2 bob 1
3 foo3 fred 2,4
The closest I could get was a simple join like this:
SELECT
blog_posts.post_id,
blog_posts.post_content,
blog_posts.post_author,
blog_categories.category_id
FROM blog_posts
INNER JOIN blog_categories
ON blog_posts.post_id = blog_categories.post_id
But that returns matches in the blog_posts table multiple times (one time for each category_id that matches).
Is there any way to accomplish what I want using just SQL? I'm thinking some sort of sub-select would work, but what I can't wrap my head around how that would work - I know I'd essentially want to do a select in my "loop" for the category ids using the current post id, but the syntax for that escapes me. It need not be efficient, this is a one-time operation.
The group_concat() function does exactly what you need:
SELECT
blog_posts.post_id,
blog_posts.post_content,
blog_posts.post_author,
group_concat(blog_categories.category_id)
FROM blog_posts
JOIN blog_categories ON blog_posts.post_id = blog_categories.post_id
GROUP BY 1, 2, 3
You want to GROUP BY blog_posts.post_id, blog_posts.post_content, blog_posts.post_author. And then use an aggregate function (http://en.wikipedia.org/wiki/Aggregate_function) to take ALL the blog_categories.category_id values from each group and turn it into a single string.
Which DBMS are you using? For Postgres you could probably simply use an array as aggregate function:
SELECT
blog_posts.post_id,
blog_posts.post_content,
blog_posts.post_author,
ARRAY_AGG(blog_categories.category_id)
FROM blog_posts
INNER JOIN blog_categories ON blog_posts.post_id = blog_categories.post_id
GROUP BY
blog_posts.post_id,
blog_posts.post_content,
blog_posts.post_author
Or use ARRAY_TO_STRING(ARRAY_AGG(blog_categories.category_id), ',') to get a comma separated string.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With