Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Ignoring stopwords while sorting SQL Server 2008 records

I have a table that has a book title field. I would like to be able to sort the records like this:

  1. The Ancient Alligator
  2. Aunt Annie's Alligator
  3. A Complete Guide to Alligators
  4. Countrified Alligators
  5. Don't Touch the Alligator!
  6. An Effortless Alligator Hunt

and so on, ignoring "A", "An", & "The" when they appear as the first word of the title. (They could also be ignored anywhere in the title.)

I know these are stopwords in SQL Server 2008, so they can be ignored if someone uses them in a search.

But is there a way to make them ignored by ORDER BY? (If it makes a difference, the query will use a LinqDataSource in ASP.NET.)

Thanks!

like image 455
AEzzell Avatar asked Oct 24 '22 08:10

AEzzell


2 Answers

Computing a sort key by using replace() won't scale if you have a large number of records.

The best way is to add an additional table field containing the title with A/An/The etc prefixes removed and make sure it has an index to speed up sorting. Then you can just order by this new field but display the original unchanged field.

like image 138
Tony O'Hagan Avatar answered Oct 27 '22 10:10

Tony O'Hagan


Something like this perhaps.

;with T(Title) as
(
  select 'The Ancient Alligator'          union all
  select 'Aunt Annie''s Alligator'        union all
  select 'A Complete Guide to Alligators' union all
  select 'Countrified Alligators'         union all
  select 'Don''t Touch the Alligator!'    union all
  select 'An Effortless Alligator Hunt'
)

select Title
from T
order by replace(
         replace(
         replace(T.Title, 
         'A ', ''), 
         'An ', ''), 
         'The ', '')

Result:

Title
------------------------------
The Ancient Alligator
Aunt Annie's Alligator
A Complete Guide to Alligators
Countrified Alligators
Don't Touch the Alligator!
An Effortless Alligator Hunt
like image 24
Mikael Eriksson Avatar answered Oct 27 '22 11:10

Mikael Eriksson