Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Custom ORDER BY to ignore 'the'

Tags:

sql

sorting

mysql

I'm trying to sort a list of titles, but currently there's a giant block of titles which start with 'The '. I'd like the 'The ' to be ignored, and the sort to work off the second word. Is that possible in SQL, or do I have to do custom work on the front end?

For example, current sorting:

  • Airplane
  • Children of Men
  • Full Metal Jacket
  • Pulp Fiction
  • The Fountain
  • The Great Escape
  • The Queen
  • Zardoz

Would be better sorted:

  • Airplane
  • Children of Men
  • The Fountain
  • Full Metal Jacket
  • The Great Escape
  • Pulp Fiction
  • The Queen
  • Zardoz

Almost as if the records were stored as 'Fountain, The', and the like. But I don't want to store them that way if I can, which is of course the crux of the problem.

like image 342
dimo414 Avatar asked Aug 17 '09 01:08

dimo414


People also ask

How to specify Custom ORDER BY in SQL?

By default SQL ORDER BY sort, the column in ascending order but when the descending order is needed ORDER BY DESC can be used. In case when we need a custom sort then we need to use a CASE statement where we have to mention the priorities to get the column sorted.

What does order by do by default?

Introduction to ORDER BY. By default, the order of rows in the output of an SQL query is arbitrary. If you want to sort the output in a particular order, you'll need to use the ORDER BY keyword. The rows are sorted according to one or more columns specified in the ORDER BY clause.


3 Answers

Best is to have a computed column to do this, so that you can index the computed column and order by that. Otherwise, the sort will be a lot of work.

So then you can have your computed column as:

CASE WHEN title LIKE 'The %' THEN stuff(title,1,4,'') + ', The' ELSE title END

Edit: If STUFF isn't available in MySQL, then use RIGHT or SUBSTRING to remove the leading 4 characters. But still try to use a computed column if possible, so that indexing can be better. The same logic should be applicable to rip out "A " and "An ".

Rob

like image 161
Rob Farley Avatar answered Sep 20 '22 10:09

Rob Farley


Something like:

ORDER BY IF(LEFT(title,2) = "A ",
            SUBSTRING(title FROM 3),
            IF(LEFT(title,3) = "An ",
              SUBSTRING(title FROM 4),
              IF(LEFT(title,4) = "The ",
                 SUBSTRING(title FROM 5),
                 title)))

But given the overhead of doing this more than a few times, you're really better off storing the title sort value in another column...

like image 27
great_llama Avatar answered Sep 21 '22 10:09

great_llama


I think you could do something like

ORDER BY REPLACE(TITLE, 'The ', '')

although this would replace any occurrence of 'The ' with '', not just the first 'The ', although I don't think this would affect very much.

like image 45
MusiGenesis Avatar answered Sep 22 '22 10:09

MusiGenesis