
What is the best way to select string fields based on character ranges?

I need to add the ability for users of my software to select records by character ranges.
How can I write a query that returns all widgets from a table whose name falls in a range such as Ba-Bi?

Currently I'm using greater than and less than operators, so the above example would become:

select * from widget
where name >= 'ba' and name < 'bj'

Notice how I have "incremented" the last character of the upper bound from i to j so that "bike" would not be left out.

Is there a generic way to find the next character after a given character based on the field's collation, or would it be safer to add a second condition?

select * from widget
where name >= 'ba'
and (name < 'bi' or name like 'bi%')

My application needs to support localization. How sensitive is this kind of query to different character sets?

I also need to support both MSSQL and Oracle. What are my options for ensuring that character casing is ignored no matter what language appears in the data?

Nathan Baulch asked Sep 30 '08 10:09


1 Answer

Let's skip directly to localization. Would you say "aa" >= "ba"? Probably not, but Danish and Norwegian collations traditionally treat "aa" as "å", which sorts after "z". Also, you can't assume that casing can be ignored the same way in every language. Case mapping is explicitly language-dependent, the best-known example being Turkish: the uppercase of i is İ, and the lowercase of I is ı.

Now, your SQL database defines the result of <, =, etc. by a "collation order", which is definitely language-specific. So you should control it explicitly, for every query. A Turkish collation order will put those i's where they belong (in Turkish). You can't rely on the default collation.
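To make the point about naming the collation in every query concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in for MSSQL or Oracle, so the collation names (BINARY, NOCASE) are SQLite's and NOCASE only folds ASCII letters. The helper widgets_in_range, the whitelist, and the sample rows are hypothetical and not part of the question.

```python
import sqlite3

# Collation names cannot be bound as query parameters, so restrict them
# to a known whitelist before interpolating them into the SQL text.
ALLOWED_COLLATIONS = {"BINARY", "NOCASE"}  # SQLite built-ins (assumption: SQLite stand-in)


def widgets_in_range(conn, lower, upper_exclusive, collation):
    """Return widget names with lower <= name < upper_exclusive,
    compared under an explicitly named collation."""
    if collation not in ALLOWED_COLLATIONS:
        raise ValueError(f"unsupported collation: {collation}")
    sql = (
        "SELECT name FROM widget "
        f"WHERE name COLLATE {collation} >= ? "
        f"AND name COLLATE {collation} < ?"
    )
    rows = conn.execute(sql, (lower, upper_exclusive))
    # Sort in Python so the result order does not depend on the database.
    return sorted(row[0] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE widget (name TEXT)")
conn.executemany(
    "INSERT INTO widget (name) VALUES (?)",
    [(n,) for n in ["apple", "Bat", "bat", "Bike", "bike", "boat", "cat"]],
)

# Same bounds, different collation, different answer.
binary_result = widgets_in_range(conn, "ba", "bj", "BINARY")
nocase_result = widgets_in_range(conn, "ba", "bj", "NOCASE")
print(binary_result)
print(nocase_result)
```

Under the byte-wise BINARY collation the capitalized names fall outside the range because "B" sorts before "b", while NOCASE includes them. On a real server the same idea would rely on that product's own collation names and syntax, which this sketch does not attempt to reproduce.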

As for the "increment part", don't bother. Stick to >= and <=.

MSalters answered Oct 08 '22 04:10