I have this list:
thresholds <- list(
list(color="red", value=100),
list(color="blue", value=50),
list(color="orange", value=100),
list(color="green", value=1),
list(color="orange", value=50)
)
I want to order it by the "value" field of each element and discard duplicates so that no two elements have the same "value" field in the resulting list (the element that gets picked when there's a tie doesn't matter).
sort and unique don't work with complex lists and don't permit a custom ordering. How can I achieve the desired result?
First of all, in this particular case, the actual vector to order is:
values <- sapply(thresholds, function (t) t$value)
# values == c(100, 50, 100, 1, 50)
You can adjust the function inside sapply to your needs (for instance, do the appropriate casting depending on whether you want to sort in numeric or alphabetical order, etc.).
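As a sketch of what "adjusting the function" might look like, the snippet below extracts the key once as a number and once as a string. It uses vapply instead of sapply only so that the result type is checked; the names values_num and values_chr are illustrative, not part of the question.

```r
thresholds <- list(
  list(color = "red", value = 100),
  list(color = "blue", value = 50),
  list(color = "orange", value = 100),
  list(color = "green", value = 1),
  list(color = "orange", value = 50)
)

# Numeric key: sorts 1 < 50 < 100
values_num <- vapply(thresholds, function(t) as.numeric(t$value), numeric(1))

# Character key: sorts alphabetically, so "1" < "100" < "50"
values_chr <- vapply(thresholds, function(t) as.character(t$value), character(1))

order(values_num)  # 4 2 5 1 3
order(values_chr)  # 4 1 3 2 5
```

vapply stops with an error if an element does not yield exactly one value of the declared type, which can surface malformed list entries early.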
From this point, if we were to keep the duplicates, the answer would simply be:
thresholds[order(values)]
order returns the permutation of indices that sorts "values": its first element is the index of the smallest value, its second the index of the next smallest, and so on. Here order(values) is 4 2 5 1 3. Then, thresholds[order(values)] returns the elements of thresholds in that order, i.e. with values 1 50 50 100 100.
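A small aside on what order computes, since it is easy to confuse with rank: order gives the indices that sort the vector, whereas rank gives each element's position in the sorted result. For the "values" vector in this answer the two happen to coincide, so the sketch below uses a separate illustrative vector x where they differ.

```r
x <- c(30, 10, 20)

order(x)     # 2 3 1  (indices of the smallest, middle, largest element)
rank(x)      # 3 1 2  (position each element would take once sorted)
x[order(x)]  # 10 20 30

# Indexing with order() sorts; indexing with rank() generally does not.
x[rank(x)]   # 20 30 10
```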
However, since we want to remove duplicates, it cannot be as simple as that. unique won't work on thresholds, and if we apply it to values, we lose the correspondence with the indices in the original list.
The solution is to use another function, namely duplicated. When applied to a vector, duplicated returns a logical vector indicating, for each element, whether the same value already occurs at an earlier position. For instance, duplicated(values) would return FALSE FALSE TRUE FALSE TRUE. Because the filter is applied to the ordering, duplicated has to be evaluated on the values in sorted order, values[order(values)], so that each flag lines up with the corresponding index.
The solution is therefore:
ordering <- order(values)
nodups <- ordering[!duplicated(values[ordering])]
thresholds[nodups]
or as a one-liner:
thresholds[order(values)[!duplicated(values[order(values)])]]
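One point worth checking is that the duplicate filter is aligned with the sorted order. The sketch below wraps the approach in a helper (sort_unique_by is a hypothetical name, and it assumes a numeric key) and then uses an illustrative vector v to show how a filter computed on the unsorted values can keep a duplicate and drop a distinct value.

```r
# Hypothetical helper: sort a list by a numeric key and keep one element per key.
sort_unique_by <- function(x, key) {
  keys <- vapply(x, key, numeric(1))
  ord <- order(keys)
  # duplicated() is evaluated on the keys in sorted order, so it lines up with ord
  x[ord[!duplicated(keys[ord])]]
}

thresholds <- list(
  list(color = "red", value = 100),
  list(color = "blue", value = 50),
  list(color = "orange", value = 100),
  list(color = "green", value = 1),
  list(color = "orange", value = 50)
)

result <- sort_unique_by(thresholds, function(t) t$value)
vapply(result, function(t) t$value, numeric(1))  # 1 50 100

# Why the alignment matters, on an illustrative vector:
v <- c(2, 1, 1)
naive   <- order(v)[!duplicated(v)]            # filter computed in original order
aligned <- order(v)[!duplicated(v[order(v)])]  # filter computed in sorted order
v[naive]    # 1 1  (duplicate kept, 2 lost)
v[aligned]  # 1 2
```

On the thresholds data both variants happen to give the same answer, which is why the misalignment is easy to miss.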
Adding another alternative, for completeness, regarding the "custom sort"/"custom unique" part of the question. By defining methods for certain functions (as seen in ?xtfrm) we can apply custom sort and unique functions to any list (or other object).
First, a "class" attribute needs to be added:
class(thresholds) = "thresholds"
Then, define the necessary custom functions:
"==.thresholds" = function(x, y) return(x[[1]][["value"]] == y[[1]][["value"]])
">.thresholds" = function(x, y) return(x[[1]][["value"]] > y[[1]][["value"]])
"[.thresholds" = function(x, i) return(structure(.subset(x, i), class = class(x)))
is.na.thresholds = function(x) return(is.na(x[[1]][["value"]]))
Now, we can apply sort:
sort(thresholds)
Finally, add a custom unique function:
duplicated.thresholds = function(x, ...) return(duplicated(sapply(x, function(elt) elt[["value"]])))
unique.thresholds = function(x, ...) return(x[!duplicated(x)])
And:
sort(unique(thresholds))
(Similar answers and more information here and here)