Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fast case-insensitive sorting in Elixir

Tags:

elixir

Hi fellow Elixir programmers.

I have list of about 2.500 music tracks that I would like to sort by different parameters, for example the title of the track.

The sorting should be case insensitive.

The code below works, but takes about 100ms to 130ms to sort the list. Is there a faster way to do it? A reference for me is Node.js which does it in about 25ms when using Array.prototype.sort

Edit: Sorry, I actually mistimed the performance. The sorting happens in about 30ms. However, I would still like your opinion: Can the sorting be done faster?

Thanks.

defmodule MusicServer.Tracks.SortTracks do
  def sort_tracks(tracks, "title", "desc") do
    Enum.sort(tracks, fn track1, track2 ->
      first_char(track1["title"]) <= first_char(track2["title"])
    end)
  end

  def first_char(string) do
    string
    |> String.at(0)
    |> String.downcase()
  end
end

An example of the data structure:

[
  %{
    "artist" => "Rolling Stones",
    "title" => "Start It Up",
    "bpm" => 100,
    "createdAt" => "2018-04-27T09:08:04.428Z",
    "updatedAt" => "2018-07-14T14:28:17.771Z"
  },
  %{
    "artist" => "Al Green",
    "title" => "Let's Stay Together",
    "bpm" => 123,
    "createdAt" => "2018-04-27T09:08:04.428Z",
    "updatedAt" => "2018-07-14T14:28:17.771Z"
  },
  ...
]
like image 599
skovmand Avatar asked Aug 10 '18 13:08

skovmand


1 Answers

Enum.sort will call the comparator function n log(n) times which means first_char will be called 2n log(n) times which might be the bottleneck here. To reduce calls to first_char, you can switch to Enum.sort_by which calls the function once for each element and then caches its value when sorting:

Enum.sort_by(tracks, fn track -> first_char(track["title"]) end)

For a list of length 2,500, the number of calls to first_char would reduce from over 50k to 2.5k. Of course sort_by will have to do work work allocating data structure to store the computed values but it should still be faster for this input. You should carefully benchmark this yourself before using it!

like image 116
Dogbert Avatar answered Oct 31 '22 06:10

Dogbert