Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Explode range of integers out for joining in SQL

I have one table that stores a range of integers in a field, sort of like a print range, (e.g. "1-2,4-7,9-11"). This field could also contain a single number.

My goal is to join this table to a second one that has discrete values instead of ranges.

So if table one contains

1-2,5
9-15
7

And table two contains

1
2
3
4
5
6
7
8
9
10

The result of the join would be

1-2,5   1
1-2,5   2
1-2,5   5
7       7
9-15    9
9-15    10

Working in SQL Server 2008 R2.

like image 632
Christoph Avatar asked Dec 21 '22 05:12

Christoph


2 Answers

Use a string split function of your choice to split on comma. Figure out the min/max values and join using between.

SQL Fiddle

MS SQL Server 2012 Schema Setup:

create table T1(Col1 varchar(10))
create table T2(Col2 int)

insert into T1 values
('1-2,5'),
('9-15'),
('7')

insert into T2 values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)

Query 1:

select T1.Col1,
       T2.Col2
from T2
  inner join (
              select T1.Col1,
                     cast(left(S.Item, charindex('-', S.Item+'-')-1) as int) MinValue,
                     cast(stuff(S.Item, 1, charindex('-', S.Item), '') as int) MaxValue
              from T1
                cross apply dbo.Split(T1.Col1, ',') as S
             ) as T1
    on T2.Col2 between T1.MinValue and T1.MaxValue

Results:

|  COL1 | COL2 |
----------------
| 1-2,5 |    1 |
| 1-2,5 |    2 |
| 1-2,5 |    5 |
|  9-15 |    9 |
|  9-15 |   10 |
|     7 |    7 |
like image 106
Mikael Eriksson Avatar answered Jan 10 '23 08:01

Mikael Eriksson


Like everybody has said, this is a pain to do natively in SQL Server. If you must then I think this is the proper approach.

First determine your rules for parsing the string, then break down the process into well-defined and understood problems.
Based on your example, I think this is the process:

  1. Separate comma separated values in the string into rows
  2. If the data does not contain a dash, then it's finished (it's a standalone value)
  3. If it does contain a dash, parse the left and right sides of the dash
  4. Given the left and right sides (the range) determine all the values between them into rows

I would create a temp table to populate the parsing results into which needs two columns:
SourceRowID INT, ContainedValue INT

and another to use for intermediate processing:
SourceRowID INT, ContainedValues VARCHAR

Parse your comma-separated values into their own rows using a CTE like this Step 1 is now a well-defined and understood problem to solve:

Turning a Comma Separated string into individual rows

So your result from the source
'1-2,5'
will be:
'1-2'
'5'

From there, SELECT from that processing table where the field does not contain a dash. Step 2 is now a well-defined and understood problem to solve These are standalone numbers and can go straight into the results temp table. The results table should also get the ID reference to the original row.

Next would be to parse the values to the left and right of the dash using CHARINDEX to locate it, then the appropriate LEFT and RIGHT functions as needed. This will give you the starting and ending value.

Here is a relevant question for accomplishing this step 3 is now a well-defined and understood problem to solve:

T-SQL substring - separating first and last name

Now you have separated the starting and ending values. Use another function which can explode this range. Step 4 is now a well-defined and understood problem to solve:

SQL: create sequential list of numbers from various starting points

SELECT all N between @min and @max

What is the best way to create and populate a numbers table?

and, also, insert it into the temp table.

Now what you should have is a temp table with every value in the exploded range.

Simply JOIN that to the other table on the values now, then to your source table on the ID reference and you're there.

like image 41
Matthew Avatar answered Jan 10 '23 06:01

Matthew