I have the following query:
SELECT mutations.id, genes.loc FROM mutations, genes where mutations.id=genes.id;
It outputs this:
| SL2.50ch02_51014904 | intergenic |
| SL2.50ch02_51014907 | upstream |
| SL2.50ch02_51014907 | downstream |
| SL2.50ch02_51014907 | intergenic |
| SL2.50ch02_51014911 | upstream |
| SL2.50ch02_51014911 | downstream |
My desired output is this:
| SL2.50ch02_51014904 | intergenic |
| SL2.50ch02_51014907 | upstream,downstream,intergenic |
| SL2.50ch02_51014911 | upstream,downstream |
I thought GROUP_CONCAT would be useful for this. However, when I run this:
SELECT mutations.id, GROUP_CONCAT(distinct(genes.loc)) FROM mutations, genes WHERE mutations.id=genes.id;
I get a single row like this:
SL2.50ch02_51014904 | downstream,intergenic,upstream
How can I solve this?
You need to add GROUP BY:
SELECT m.id, GROUP_CONCAT(distinct(g.loc))
FROM mutations m JOIN
genes g
ON m.id = g.id
GROUP BY m.id;
Along the way, you should learn a couple of other things:
Always use explicit JOIN syntax. A simple rule: never use commas in the FROM clause.
Use table aliases (here, m and g). They make the query easier to write and to read.
You forgot the GROUP BY:
SELECT
mutations.id,
GROUP_CONCAT(DISTINCT(genes.loc))
FROM
mutations, genes
WHERE
mutations.id=genes.id
GROUP BY
mutations.id
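Both answers come down to the same fix: without GROUP BY, MySQL treats the whole result as one group, so GROUP_CONCAT collapses everything into a single row. The sketch below reproduces this with the sample data from the question. The table definitions are a hypothetical minimal schema (the column types and sizes are assumptions, and the real tables presumably have more columns). It also adds ORDER BY inside GROUP_CONCAT, since the order of concatenated values is otherwise not guaranteed.

```sql
-- Hypothetical minimal schema, just enough to reproduce the question.
CREATE TABLE mutations (id VARCHAR(32) PRIMARY KEY);
CREATE TABLE genes (id VARCHAR(32), loc VARCHAR(16));

-- Sample rows taken from the output shown in the question.
INSERT INTO mutations (id) VALUES
  ('SL2.50ch02_51014904'),
  ('SL2.50ch02_51014907'),
  ('SL2.50ch02_51014911');

INSERT INTO genes (id, loc) VALUES
  ('SL2.50ch02_51014904', 'intergenic'),
  ('SL2.50ch02_51014907', 'upstream'),
  ('SL2.50ch02_51014907', 'downstream'),
  ('SL2.50ch02_51014907', 'intergenic'),
  ('SL2.50ch02_51014911', 'upstream'),
  ('SL2.50ch02_51014911', 'downstream');

-- One output row per mutation id; the locations within each group are
-- de-duplicated, sorted alphabetically, and joined with commas.
SELECT m.id,
       GROUP_CONCAT(DISTINCT g.loc ORDER BY g.loc SEPARATOR ',') AS locs
FROM mutations m
JOIN genes g ON m.id = g.id
GROUP BY m.id
ORDER BY m.id;
```

With alphabetical ordering, the second id yields downstream,intergenic,upstream rather than the upstream,downstream,intergenic shown in the desired output. Reproducing that exact order would need an explicit ordering column, which the question does not show.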