I have the following table in my AWS QLDB table:
INSERT INTO Testing << {
'MyId': 1,
'MyList': [ 'Item1', 'Item2', 'Item3']
},
{
'MyId': 2,
'MyList': [ 'Item2', 'Item3', 'Item4']
},
{
'MyId': 3,
'MyList': [ 'Item4', 'Item5', 'Item6']
}
>>
I need to be able to get all documents which contain a list of items (which do not come from another table), which I'm using the following query for:
SELECT *
FROM Testing AS t,
t.MyList AS l
WHERE l IN ('Item1', 'Item2', 'Item4')
However, this gives the following output
+------+---------------------------+---------+
| MyId | MyList | _3 |
+------+---------------------------+---------+
| 3 | ["Item4","Item5","Item6"] | "Item4" |
+------+---------------------------+---------+
| 1 | ["Item1","Item2","Item3"] | "Item1" |
+------+---------------------------+---------+
| 1 | ["Item1","Item2","Item3"] | "Item2" |
+------+---------------------------+---------+
| 2 | ["Item2","Item3","Item4"] | "Item2" |
+------+---------------------------+---------+
| 2 | ["Item2","Item3","Item4"] | "Item4" |
+------+---------------------------+---------+
I want to be able to just get three distinct rows. It would appear based on the error I get when trying to use DISTINCT
that QLDB doesn't support it, but I also would prefer using *
in my SELECT
, So I'm trying to use GROUP BY
SELECT *
FROM Testing AS t,
t.MyList AS l
WHERE l IN ('Item1', 'Item2', 'Item4')
GROUP BY t.MyId
But this gives the following error:
Start query error
Semantic Error: at line , column : No such variable named '$__partiql__group_by_1_item_0'; No such variable named '$__partiql__group_by_1_item_0' (Service: AmazonQLDBSession; Status Code: 400; Error Code: BadRequestException; Request ID: 65vrQHytqHdEL3o9Ym9Xn4)
Neither DISTINCT
nor GROUP BY
is not currently supported in QLDB, as indicated by omission from the SELECT reference. Unfortunately the error message is misleading in this case.
Assuming that your MyId
column is unique itself, you can express what you want by filtering the lists by checking for inclusion of each item in the list, something like:
SELECT *
FROM Testing AS t
WHERE 'Item1' IN t.MyList OR 'Item2' IN t.MyList OR 'Item3' IN t.MyList
If you want to further filter the produced lists, you can add a nested SELECT
with your original filter as described here:
SELECT t.MyId, (SELECT VALUE l FROM t.MyList WHERE l IN ('Item1', 'Item2', 'Item3'))
FROM Testing AS t
WHERE 'Item1' IN t.MyList OR 'Item2' IN t.MyList OR 'Item3' IN t.MyList
While this is a bit awkward, you likely wouldn't want to do a GROUP BY
anyway, as that would group across the entire data set (requiring a complete aggregation), when the items
for each t
are contained inside of it.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With