Question

I need to add the ability for users of my software to select records by character ranges.
How can I write a query that returns all widgets from a table whose name falls in the range Ba-Bi for example?

Currently I'm using greater than and less than operators, so the above example would become:

select * from widget
where name >= 'ba' and name < 'bj'

Notice how I have "incremented" the last character of the upper bound from i to j so that "bike" would not be left out.

Is there a generic way to find the next character after a given character based on the field's collation or would it be safer to create a second condition?

select * from widget
where name >= 'ba'
and (name < 'bi' or name like 'bi%')

My application needs to support localization. How sensitive is this kind of query to different character sets?

I also need to support both MSSQL and Oracle. What are my options for ensuring that character casing is ignored no matter what language appears in the data?

Was it helpful?

Solution

Let's skip directly to localization. Would you say "aa" >= "ba" ? Probably not, but that is where it sorts in Sweden. Also, you simply can't assume that you can ignore casing in any language. Casing is explicitly language-dependent, with the most common example being Turkish: uppercase i is İ. Lowercase I is ı.

Now, your SQL DB defines the result of <, == etc by a "collation order". This is definitely language specific. So, you should explicitly control this, for every query. A Turkish collation order will put those i's where they belong (in Turkish). You can't rely on the default collation.

As for the "increment part", don't bother. Stick to >= and <=.

OTHER TIPS

For MSSQL see this thread: http://bytes.com/forum/thread483570.html .

For Oracle, it depends on your Oracle version, as Oracle 10 now supports regex(p) like queries: http://www.psoug.org/reference/regexp.html (search for regexp_like ) and see this article: http://www.oracle.com/technology/oramag/webcolumns/2003/techarticles/rischert_regexp_pt1.html

HTH

Frustratingly, the Oracle substring function is SUBSTR(), whilst it SQL-Server it's SUBSTRING().

You could write a simple wrapper around one or both of them so that they share the same function name + prototype.

Then you can just use

MY_SUBSTRING(name, 2) >= 'ba' AND MY_SUBSTRING(name, 2) <= 'bi'

or similar.

You could use this...

select * from widget
where name Like 'b[a-i]%'

This will match any row where the name starts with b, the second character is in the range a to i, and any other characters follow.

I think that I'd go with something simple like appending a high-sorting string to the end of the upper bound. Something like:

select * from widgetwhere name >= 'ba' and name <= 'bi'||'~'

I'm not sure that would survive EBCDIC conversion though

You could also do it like this:

select * from widget
where left(name, 2) between 'ba' and 'bi'

If your criteria length changes (as you seemed to indicate in a comment you left), the query would need to have the length as an input also:

declare @CriteriaLength int
set @CriteriaLength = 4
select * from widget
where left(name, @CriteriaLength) between 'baaa' and 'bike'
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top