In R2 and R3, I can use unique to remove duplicate items from a series:
>> a: [1 2 2 3]
>> length? a
== 4
>> length? unique a
== 3
How can I perform the same operation on a series of objects? e.g.,
b: reduce [
    make object! [a: 1]
    make object! [b: 2]
    make object! [b: 2]
    make object! [c: 3]
]
>> length? b
== 4
>> length? unique b
== 4 ; (I'd like it to be 3)
The equality check used by UNIQUE and the other set operations appears to be implemented by Cmp_Value, which compares two objects by subtracting their frame pointers. If that subtraction is zero (i.e. the two values are the SAME? object), the comparison is considered a match:
f-series.c Line 283, R3-Alpha open source release
If you look at the surrounding code, you'll see a call to Cmp_Block in that same routine. Cmp_Block does a recursive comparison and honors case sensitivity, hence the difference between how blocks and objects behave:
Cmp_Block() in f-series.c
Given that it is written that way, if you would like a UNIQUE operation based on field-by-field comparison of objects rather than on their identity, there's no way to do it besides writing your own routine that calls EQUAL?, or modifying the C code.
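To make the distinction concrete, here is a minimal C sketch of the two comparison styles. It is not the R3-Alpha source: the `Field` and `Object` types and the `cmp_identity` and `cmp_fields` functions are hypothetical stand-ins, and the identity check tests pointer equality instead of subtracting pointers as the answer describes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an object: an ordered list of named integer fields. */
typedef struct {
    const char *name;
    int value;
} Field;

typedef struct {
    const Field *fields;
    size_t count;
} Object;

/* Identity comparison: returns 0 only when both pointers refer to the very
 * same object, which is the behavior the answer attributes to objects. */
static int cmp_identity(const Object *a, const Object *b)
{
    return a == b ? 0 : 1;
}

/* Structural comparison: returns 0 when both objects have the same fields,
 * in the same order, with equal names and equal values. */
static int cmp_fields(const Object *a, const Object *b)
{
    if (a->count != b->count)
        return 1;
    for (size_t i = 0; i < a->count; i++) {
        if (strcmp(a->fields[i].name, b->fields[i].name) != 0)
            return 1;
        if (a->fields[i].value != b->fields[i].value)
            return 1;
    }
    return 0;
}
```

Two separately created objects that both hold `b: 2` differ under `cmp_identity` but match under `cmp_fields`, which mirrors why UNIQUE keeps both of them.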
Here is a short hack not requiring changing the C source, which does a MAP-EACH over the output of UNIQUE. The body filters out any EQUAL? objects that have already been seen (because when the body of a MAP-EACH returns unset, it adds nothing to the result):
my-unique: function [array [block!]] [
    objs: copy []
    map-each item unique array [
        if object? :item [
            foreach obj objs [
                if equal? item obj [unset 'item break]
            ]
            unless unset? :item [append objs item]
        ]
        :item ;-- if unset, map-each adds nothing to result
    ]
]
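The same idea (keep a list of items already seen, and drop any later item that is field-equal to one of them) can be sketched in C. This is an illustration under stated assumptions only: `Obj`, `obj_equal`, and `unique_by_fields` are hypothetical names, and each object is reduced to a single key/value pair.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical one-field object, just enough to illustrate the idea. */
typedef struct {
    const char *key;
    int value;
} Obj;

/* Field-by-field equality, playing the role EQUAL? plays in the Rebol code. */
static int obj_equal(const Obj *a, const Obj *b)
{
    return strcmp(a->key, b->key) == 0 && a->value == b->value;
}

/* Compacts items[0..n) in place, keeping the first of each group of
 * field-equal objects. Returns the new length. The scan over the kept
 * prefix makes this quadratic in the worst case. */
static size_t unique_by_fields(Obj *items, size_t n)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        int seen = 0;
        for (size_t j = 0; j < kept; j++) {
            if (obj_equal(&items[i], &items[j])) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            items[kept++] = items[i];
    }
    return kept;
}
```

The kept prefix of the array serves the same purpose as the `objs` block above. A linear scan is used because, as with the BLOCK! in the Rebol version, no hash keyed on field contents is assumed to be available.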
Unfortunately, you have to use a BLOCK! and not a MAP! to keep track of the objects as you go, because MAP! does not currently allow objects as keys. Even if it did, it would probably have the same issue of not hashing field-equal objects the same.
(Note: Fixing this and other issues is on the radar of the Ren-C branch, which, in addition to now being the fastest Rebol interpreter with fundamental fixes, also has a few enhancements to the set operations. Discussions are in chat.)