SELECT DISTINCT
    REPLACE(REPLACE(REPLACE(REPLACE(Category, ' & ', '-'), '/', '-'), ', ', '-'), ' ', '-') AS Department
FROM
    Inv WITH (NOLOCK)
I am asking because I am a junior ETL engineer and want to develop good habits.
Obviously the nesting could get even deeper in many circumstances.
SELECT REPLACE(REPLACE(REPLACE(REPLACE('3*[4+5]/{6-8}', '[', '('), ']', ')'), '{', '('), '}', ')');
Here the REPLACE function is nested four levels deep. Each call replaces one search string, and the result of the inner call becomes the input of the call that wraps it.
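When every replacement swaps one character for one character, as in the bracket example above, the nesting can be avoided with TRANSLATE. This is a minimal sketch and assumes SQL Server 2017 or later, since TRANSLATE is not available in earlier versions:

```sql
-- Assumes SQL Server 2017+. TRANSLATE maps each character in the second
-- argument to the character at the same position in the third argument.
SELECT TRANSLATE('3*[4+5]/{6-8}', '[]{}', '()()') AS Expression;
-- Returns 3*(4+5)/(6-8), the same result as the four nested REPLACE calls.
```

TRANSLATE only handles single-character substitutions, and the two character lists must be the same length, so it does not cover multi-character search strings such as ' & '.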
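SQL Server REPLACE() Function
The REPLACE() function replaces all occurrences of a substring within a string with a new substring. Note: the search follows the collation of the input string, so it is case-insensitive under a case-insensitive collation (a common default) and case-sensitive otherwise.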
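Whether REPLACE matches regardless of case depends on the collation of the input string. The sketch below shows the difference by forcing a collation with COLLATE; the literal 'Tea & TEA' and the column aliases are made up for illustration:

```sql
-- Hypothetical input; both collation names are standard Windows collations.
SELECT
    REPLACE('Tea & TEA' COLLATE Latin1_General_CI_AS, 'tea', 'x') AS CaseInsensitive, -- x & x
    REPLACE('Tea & TEA' COLLATE Latin1_General_CS_AS, 'tea', 'x') AS CaseSensitive;   -- Tea & TEA (no match)
```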
You can use REPLACE in an UPDATE statement.
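A sketch of what that could look like for the Inv table from the question, assuming Category is an updatable character column. The WHERE clause is an addition here, so that rows without a slash are not rewritten:

```sql
-- Assumes the Inv table and Category column from the question above.
UPDATE Inv
SET Category = REPLACE(Category, '/', '-')
WHERE Category LIKE '%/%';  -- only touch rows that actually contain a slash
```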
The nested replace is fine, but as the nesting level increases, the readability of your code goes down. If I had a large number of characters to replace, I would opt for something cleaner, like the table-driven approach below.
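declare @Category varchar(25)
set @Category = 'ABC & DEF/GHI, LMN OP'
-- nested replace: returns 'ABC-DEF-GHI-LMN-OP'
select replace(replace(replace(replace(@Category, ' & ', '-'), '/', '-'), ', ', '-'), ' ', '-') as Department
-- table driven: one row per search/replacement pair
declare @t table (ReplaceThis varchar(10), WithThis varchar(10))
insert into @t
values (' & ', '-'),
('/', '-'),
(', ', '-'),
(' ', '-')
-- the assignment runs once per matching row, so @Category accumulates every replacement;
-- this depends on the rows being processed one at a time in insert order, which SQL Server does not guarantee
select @Category = replace(@Category, ReplaceThis, isnull(WithThis, ''))
from @t
where charindex(ReplaceThis, @Category) > 0;
select @Category as [Department]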
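For a fixed set of replacements applied to a column rather than a variable, another way to keep the query readable is to give each step its own line with CROSS APPLY. This is a sketch of an alternative layout, not something proposed in the answers here; the aliases s1 to s4 and v are made up for illustration:

```sql
-- Assumes the Inv table and Category column from the question.
-- Each CROSS APPLY performs one replacement on the previous step's result.
SELECT DISTINCT s4.Department
FROM Inv
CROSS APPLY (SELECT REPLACE(Inv.Category, ' & ', '-') AS v) AS s1
CROSS APPLY (SELECT REPLACE(s1.v, '/', '-') AS v) AS s2
CROSS APPLY (SELECT REPLACE(s2.v, ', ', '-') AS v) AS s3
CROSS APPLY (SELECT REPLACE(s3.v, ' ', '-') AS Department) AS s4;
```

The order of the steps is explicit here, which matters because the single-space replacement has to run after ' & ' and ', ' have been handled.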
You might be better off using the SQLCLR and a regex. http://blogs.msdn.com/b/sqlclr/archive/2005/06/29/regex.aspx
Certainly that can be more maintainable and flexible.
As for performance, built-in functions are typically hard to beat, but with many REPLACE operations the CLR may outperform them. You will have to benchmark.
I notice you said you are doing this in SSIS. In that case, you can use a variety of other methods within your data flows, including a script task with a regex. As a general rule, you need to assess each operation and decide whether it should be done in the query that brings data into the data flow or in the data flow itself. Some operations, like filtering, can be better done at the source, while others, like aggregating, might be better done in the data flow, especially if they are stateful and depend on any kind of running data.